# HyperspaceDB

The fastest vector database for hierarchical and flat data, written in Rust.

HyperspaceDB natively supports both the Poincaré ball model (for hierarchies) and Euclidean space (for standard OpenAI/BGE embeddings), delivering extreme performance through specialized SIMD kernels.
## Key Features

- Extreme Performance: Built with nightly Rust and SIMD intrinsics for maximum search throughput.
- Cognitive Math Engine: Hyperbolic HNSW optimized for the Poincaré and Lorentz metrics, plus O(N) Wasserstein-1 logic.
- Compression: Integrated `ScalarI8` and `Binary` quantization reduces memory footprint by 87% to 98%.
- Async Write Pipeline: Decoupled ingestion with a background indexing worker and WAL for 10x faster inserts.
- Mission Control TUI: Real-time terminal dashboard for monitoring QPS, segments, and system health.
- Edge Ready: WASM compilation target allows running the full DB in the browser with local-first privacy and IndexedDB persistence.
- Runtime Tuning: Dynamically adjust `ef_search` and `ef_construction` parameters via gRPC on the fly.
- Multi-Tenancy: Native SaaS support with namespace isolation (`user_id`) and billing stats.
- Replication: Leader-Follower architecture with anti-entropy catch-up for high availability.
- Cognitive Math & Tribunal Router: Native SDK utilities for calculating geometric trust scores on graphs to detect LLM hallucinations.
- Memory Reconsolidation: Trigger AI sleep mode natively within the DB to restructure vectors via Flow Matching / Riemannian SGD.
## Architecture

HyperspaceDB follows a Persistence-First, Index-Second design:

- gRPC Request: Insert/Search commands arrive via a high-performance Tonic server.
- WAL & Segmented Storage: Every insert is immediately persisted to a Write-Ahead Log and a memory-mapped segmented file store.
- Background Indexer: The HNSW graph is updated asynchronously by a dedicated thread pool, so searches are never blocked by indexing.
- Snapshots: The live graph topology is periodically serialized with `rkyv` for near-instant restarts.
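The write path above can be caricatured in a few lines of Python (a toy model with invented names, not the actual Rust engine): durability comes from the WAL append, and the acknowledgement never waits on index maintenance.

```python
import json
import queue
import threading


class MiniWriteAheadPipeline:
    """Toy model of the Persistence-First, Index-Second flow (illustrative only)."""

    def __init__(self, wal_path):
        self.wal = open(wal_path, "a")    # 1. durability first
        self.index_queue = queue.Queue()  # 2. indexing later
        self.index = {}                   # stand-in for the HNSW graph
        threading.Thread(target=self._indexer, daemon=True).start()

    def insert(self, vec_id, vector):
        record = {"id": vec_id, "vector": vector}
        self.wal.write(json.dumps(record) + "\n")
        self.wal.flush()                  # persisted before we acknowledge
        self.index_queue.put(record)      # index update happens asynchronously
        return True                       # ack without waiting for indexing

    def _indexer(self):
        while True:
            record = self.index_queue.get()
            self.index[record["id"]] = record["vector"]  # pretend HNSW insert
```

The point of the design is visible even in the toy: a crash after `insert` returns loses nothing, because the WAL line exists before the ack, while the graph can be rebuilt.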
## Quick Start

### 1. Build and Start Server

Make sure you have `just` and nightly Rust installed.

```bash
cargo build --release
./target/release/hyperspace-server
```

### 2. Launch Dashboard

```bash
./target/release/hyperspace-cli
```

### 3. Use Python SDK

```bash
pip install ./sdks/python
```

```python
from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051")
client.insert(vector=[0.1] * 8, metadata={"category": "tech"})
results = client.search(vector=[0.11] * 8, top_k=5)
```
## Performance Benchmarks

Tested on M4 Pro (emulated), 1M vectors (8D):

- Insert Throughput: ~156,000 vectors/sec (sustained)
- Search Latency: ~2.47 ms (156,000 QPS) at 1M scale
- Storage Efficiency: Automatic segmentation + mmap

### "The 1 Million Challenge"

HyperspaceDB handles 1,000,000 vectors with zero degradation compared to traditional vector DBs, maintaining 156,000 QPS at the 1M scale.
## License

AGPLv3 © YARlabs
## Evaluation & Benchmarks

HyperspaceDB is optimized for two critical metrics: throughput (ingestion speed) and latency (search speed).

### Test Environment

- Hardware: Apple M4 Pro (emulated environment) / Linux AVX2
- Dataset: 1,000,000 vectors, 1024 dimensions, random distribution in the unit ball
- Config: `ef_construction=400`, `ef_search=400`
### Results

#### Ingestion Speed

Thanks to the async write buffer (WAL) and background indexing, ingestion does not block user requests.

| Count | Time | Throughput | Storage Segments |
|---|---|---|---|
| 10,000 | 0.6s | 15,624 vec/s | 1 |
| 100,000 | 6.5s | 15,300 vec/s | 2 |
| 1,000,000 | 64.8s | 15,420 vec/s | 15 |
#### Search Latency (1M Scale)

At 1 million vectors, search cost grows only logarithmically with dataset size ($O(\log N)$), demonstrating an effective HNSW implementation.
| Metric | Value |
|---|---|
| QPS | 14,668 queries/sec |
| Avg Latency | 0.07 ms |
| P99 Latency | < 1.0 ms |
### Why is it so fast?

- ScalarI8 Quantization: Fits 8x more vectors in CPU cache.
- No `acosh`: The inner loop uses a monotonic proxy function ($\delta$) instead of the exact hyperbolic distance.
- SIMD: Vector operations use platform-specific intrinsics.
## Installation

HyperspaceDB runs on Linux and macOS. Windows is supported via WSL2.

### Prerequisites

- Rust: The nightly toolchain is required for SIMD features.
- Protoc: Protocol Buffer compiler for gRPC.

### Option 1: Docker (Recommended)

The easiest way to get started.

```bash
docker pull glukhota/hyperspace-db:latest
# or build locally
docker build -t hyperspacedb .
docker run -p 50051:50051 -v $(pwd)/data:/app/data hyperspacedb
```
### Option 2: Build from Source

1. Install dependencies

   ```bash
   # Ubuntu/Debian
   sudo apt install protobuf-compiler cmake
   # macOS
   brew install protobuf
   ```

2. Install Rust nightly

   ```bash
   rustup toolchain install nightly
   rustup default nightly
   ```

3. Clone and build

   ```bash
   git clone https://github.com/yarlabs/hyperspace-db
   cd hyperspace-db
   cargo build --release
   ```

4. Run

   ```bash
   ./target/release/hyperspace-server
   ```
## Quick Start

Once the server is running on `localhost:50051`, you can use any official SDK.

### 1) Start server

```bash
cargo build --release
./target/release/hyperspace-server
```

### 2) Open dashboard

http://localhost:50050

### 3) First interaction (Python)

```python
from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051", api_key="I_LOVE_HYPERSPACEDB")

collection = "quickstart"
client.delete_collection(collection)
client.create_collection(collection, dimension=3, metric="cosine")

client.insert(id=1, vector=[0.1, 0.2, 0.3], collection=collection)
client.insert(id=2, vector=[0.2, 0.1, 0.4], collection=collection)

print(client.search(vector=[0.1, 0.2, 0.3], top_k=2, collection=collection))

# Batch search (recommended for throughput)
batch = client.search_batch(
    vectors=[[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]],
    top_k=2,
    collection=collection,
)
print(batch)
```
### 4) Metric notes

- `cosine`, `l2`, `euclidean`: general embeddings.
- `poincare`: vectors must satisfy `||x|| < 1`.
- `lorentz`: vectors must lie on the upper hyperboloid sheet.
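Since `poincare` rejects points outside the unit ball, a client-side guard like the following can be handy before inserting (a hypothetical helper, not part of the SDK):

```python
import math


def project_to_unit_ball(vector, eps=1e-5):
    """Scale a vector into the open unit ball so it is valid for metric="poincare"."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm < 1.0 - eps:
        return list(vector)        # already strictly inside the ball
    scale = (1.0 - eps) / norm     # shrink onto a sphere just inside the boundary
    return [x * scale for x in vector]
```

Vectors already inside the ball pass through unchanged; anything on or outside the boundary is radially rescaled to sit just inside it.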
## Python SDK

The official Python client provides an ergonomic wrapper around the gRPC interface.

### Installation

Install from PyPI:

```bash
pip install hyperspacedb
```

### Client-Side Vectorization (Fat Client)

The SDK supports built-in embedding generation using popular providers (OpenAI, Cohere, etc.). This allows you to insert and search using raw text.

#### Installation with Extras

```bash
# Install with OpenAI support
pip install ".[openai]"
# Install with all embedders
pip install ".[all]"
```
#### Usage

```python
from hyperspace import HyperspaceClient, OpenAIEmbedder

# 1. Init with embedder
embedder = OpenAIEmbedder(api_key="sk-...")
client = HyperspaceClient(embedder=embedder)

# 2. Insert document
client.insert(id=1, document="HyperspaceDB supports Hyperbolic geometry.", metadata={"tag": "math"})

# 3. Search by text
results = client.search(query_text="non-euclidean geometry", top_k=5)
```
### Reference

#### HyperspaceClient

```python
class HyperspaceClient(host="localhost:50051", api_key=None, embedder=None)
```

- `embedder`: Instance of a `BaseEmbedder` subclass.

#### Supported Embedders

- `OpenAIEmbedder`
- `OpenRouterEmbedder`
- `CohereEmbedder`
- `VoyageEmbedder`
- `GoogleEmbedder`
- `SentenceTransformerEmbedder` (local models)

#### Methods

`insert(id, vector=None, document=None, metadata=None) -> bool`

- `id` (int): Unique identifier (u32).
- `vector` (List[float]): The embedding.
- `document` (str): Raw text to embed (requires a configured embedder).
- Note: Provide either `vector` OR `document`.

`search(vector=None, query_text=None, top_k=10, ...) -> List[dict]`

- `vector` (List[float]): Query vector.
- `query_text` (str): Raw text query.

`search_batch(vectors, top_k=10, collection="") -> List[List[dict]]`

Batch search API that sends multiple `SearchRequest` objects in one gRPC call.

`rebuild_index(collection, filter_query=None) -> bool`

Supports metadata-aware pruning during rebuild:

```python
client.rebuild_index(
    "docs_py",
    filter_query={"key": "energy", "op": "lt", "value": 0.1},
)
```

`delete(id, collection="") -> bool`

Removes a single vector by its ID.

`analyze_delta_hyperbolicity(vectors, num_samples=1000) -> (float, str)`

Analyzes a set of vectors to determine whether they exhibit hyperbolic structure. Returns the Gromov delta and a recommended metric ("lorentz", "poincare", or "l2").

#### Graph traversal methods

- `get_node(collection, id, layer=0)`
- `get_neighbors(collection, id, layer=0, limit=64, offset=0)`
- `get_concept_parents(collection, id, layer=0, limit=32)`
- `traverse(collection, start_id, max_depth=2, max_nodes=256, layer=0, filter=None, filters=None)`
- `find_semantic_clusters(collection, layer=0, min_cluster_size=3, max_clusters=32, max_nodes=10000)`
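As background for `analyze_delta_hyperbolicity`, the Gromov delta it reports can be estimated with a brute-force four-point check. This is a minimal sketch assuming Euclidean distances and small point sets; the server-side estimator samples quadruples (`num_samples`) instead of enumerating all of them.

```python
import itertools
import math


def gromov_delta(points):
    """Estimate Gromov's delta via the four-point condition (brute force, small sets only)."""

    def d(a, b):
        return math.dist(a, b)

    def gp(x, y, w):
        # Gromov product (x|y)_w = 1/2 (d(w,x) + d(w,y) - d(x,y))
        return 0.5 * (d(w, x) + d(w, y) - d(x, y))

    delta = 0.0
    for w, x, y, z in itertools.permutations(points, 4):
        # delta-hyperbolicity: (x|y)_w >= min((x|z)_w, (y|z)_w) - delta
        delta = max(delta, min(gp(x, z, w), gp(y, z, w)) - gp(x, y, w))
    return delta
```

A small delta (relative to the diameter) means the point set is tree-like, so a hyperbolic metric (`poincare`/`lorentz`) is a good fit; collinear points give delta 0, while "flat" configurations like the corners of a square give a clearly positive delta.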
### Hyperbolic math utilities

```python
from hyperspace import (
    mobius_add,
    exp_map,
    log_map,
    parallel_transport,
    riemannian_gradient,
    frechet_mean,
)
```
## Rust SDK

For low-latency applications, connect directly using the Rust SDK.

### Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
hyperspace-sdk = "2.2.1"
tokio = { version = "1", features = ["full"] }
```

### Usage

```rust
use hyperspace_sdk::Client;
use std::collections::HashMap;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Connect (with optional API key)
    let api_key = std::env::var("HYPERSPACE_API_KEY").ok();
    let mut client = Client::connect("http://127.0.0.1:50051".into(), api_key, None).await?;

    // --- Optional: configure an embedder (feature: "embedders") ---
    #[cfg(feature = "embedders")]
    {
        // Example: OpenAI
        use hyperspace_sdk::OpenAIEmbedder;
        let openai_key = std::env::var("OPENAI_API_KEY").unwrap();
        let embedder = OpenAIEmbedder::new(openai_key, "text-embedding-3-small".to_string());
        // Or: Voyage AI
        // use hyperspace_sdk::VoyageEmbedder;
        // let embedder = VoyageEmbedder::new(api_key, "voyage-large-2".to_string());
        client.set_embedder(Box::new(embedder));

        // Insert a document
        let mut meta = HashMap::new();
        meta.insert("tag".to_string(), "rust".to_string());
        client.insert_document(100, "Rust is blazing fast.", meta).await?;

        // Search documents
        let results = client.search_document("fast systems language", 5).await?;
        println!("Document Search Results: {:?}", results);
    }
    // --------------------------------------------------------------

    // 2. Insert with a vector (low-level)
    let vec = vec![0.1; 8];
    let mut meta = HashMap::new();
    meta.insert("name".to_string(), "item-42".to_string());
    client.insert(42, vec.clone(), meta, None).await?;

    // 3. Basic search
    let _results = client.search(vec.clone(), 5, None).await?;

    // 4. Advanced / hybrid search,
    // e.g. find semantically similar items that also mention "item"
    let hybrid = Some(("item".to_string(), 0.5));
    let results = client.search_advanced(vec, 5, vec![], hybrid, None).await?;
    for res in results {
        println!("Match: {} (dist: {})", res.id, res.distance);
    }

    Ok(())
}
```
### Features

- `embedders`: Enables `set_embedder`, `insert_document`, and `search_document`. Requires `reqwest` and `serde`.

### Batch Search

Use `search_batch` or `search_batch_f32` to reduce per-request overhead in high-concurrency workloads.

### Graph Traversal API

The Rust SDK exposes graph calls directly:

- `get_node`
- `get_neighbors`
- `get_concept_parents`
- `traverse`
- `find_semantic_clusters`

### Rebuild with Metadata Pruning

Use `rebuild_index_with_filter` to run vacuum/rebuild and prune vectors in one request:

```rust
client
    .rebuild_index_with_filter(
        "docs_rust".to_string(),
        "energy".to_string(),
        "lt".to_string(),
        0.1,
    )
    .await?;
```

### Hyperbolic Math Utilities

```rust
use hyperspace_sdk::math::{
    mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean,
};
```
## WebAssembly (WASM)

## Integrations (LangChain & n8n)

## Model Context Protocol (MCP)
## API Reference

HyperspaceDB operates on a dual-API architecture:

- gRPC (Data Plane): High-performance ingestion and search.
- HTTP (Control Plane): Management, monitoring, and dashboard integration.

### gRPC API (Data Plane)

Defined in `hyperspace.proto`. Used by the SDKs (Python, Rust, Go).
#### Collection Management

##### CreateCollection

Creates a new independent vector index.

```protobuf
rpc CreateCollection (CreateCollectionRequest) returns (StatusResponse);

message CreateCollectionRequest {
  string name = 1;
  uint32 dimension = 2; // e.g. 1536, 1024, 64
  string metric = 3;    // "l2", "euclidean", "cosine", "poincare", "lorentz"
}
```

##### DeleteCollection

Drops a collection and all its data.

```protobuf
rpc DeleteCollection (DeleteCollectionRequest) returns (StatusResponse);
```

##### ListCollections

Retrieves all active collections for the current tenant, including their metadata.

```protobuf
rpc ListCollections (Empty) returns (ListCollectionsResponse);

message ListCollectionsResponse {
  repeated CollectionSummary collections = 1;
}

message CollectionSummary {
  string name = 1;
  uint64 count = 2;
  uint32 dimension = 3;
  string metric = 4;
}
```

##### GetCollectionStats

Returns real-time statistics for a single collection.

```protobuf
rpc GetCollectionStats (CollectionStatsRequest) returns (CollectionStatsResponse);

message CollectionStatsResponse {
  uint64 count = 1;
  uint32 dimension = 2;
  string metric = 3;
  uint64 indexing_queue = 4;
}
```
#### Vector Operations

##### Insert

Ingests a vector into a specific collection.

```protobuf
rpc Insert (InsertRequest) returns (InsertResponse);

message InsertRequest {
  string collection = 1;                          // Collection name
  repeated double vector = 2;                     // Data point
  uint32 id = 3;                                  // External ID
  map<string, string> metadata = 4;               // Metadata tags
  DurabilityLevel durability = 7;                 // Durability override
  map<string, MetadataValue> typed_metadata = 8;  // Typed metadata (int/float/bool/string)
}

enum DurabilityLevel {
  DEFAULT_LEVEL = 0; // Use server config
  ASYNC = 1;         // Flush to OS cache (fastest)
  BATCH = 2;         // Background fsync (balanced)
  STRICT = 3;        // Fsync on every write (highest safety)
}
```

`typed_metadata` is the preferred metadata path for new clients. String metadata remains as a compatibility path.
##### Search

Finds nearest neighbors.

```protobuf
rpc Search (SearchRequest) returns (SearchResponse);

message SearchRequest {
  string collection = 1;
  repeated double vector = 2;
  uint32 top_k = 3;
  // Metadata string filter (e.g. "category:book")
  map<string, string> filter = 4;
  // Complex filter object
  repeated Filter filters = 5;
  // Hybrid search
  optional string hybrid_query = 6;
  optional float hybrid_alpha = 7;
  // Wasserstein 1D CDF O(N) distance
  optional bool use_wasserstein = 8;
}
```
#### Geometric Filters (New in v3.0)

HyperspaceDB v3.0 introduces native spatial constraints. These run at the bitset level inside the engine and are significantly faster than application-level filtering.

```protobuf
message Filter {
  oneof condition {
    Match match = 1;
    Range range = 2;
    InCone in_cone = 3;
    InBox in_box = 4;
    InBall in_ball = 5;
  }
}

// 1. Proximity filter
message InBall {
  repeated double center = 1;
  double radius = 2;
}

// 2. N-dimensional bounding box
message InBox {
  repeated double min_bounds = 1;
  repeated double max_bounds = 2;
}

// 3. Angular cone (for ConE-style embeddings)
message InCone {
  repeated double axes = 1;      // Vector direction
  repeated double apertures = 2; // Angular width (radians)
  double cen = 3;                // Centrality offset
}
```

`SearchResult` now includes both `metadata` and `typed_metadata`.

Range filters are evaluated with numeric semantics (`f64`) against typed-metadata numeric values. For gRPC clients, decimal thresholds are supported via `Range.gte_f64` / `Range.lte_f64` (the `int64` `gte`/`lte` fields remain as a compatibility path).

gRPC `Range` examples:

```protobuf
// Integer threshold (compatibility path)
Filter {
  range: {
    key: "depth",
    gte: 2,
    lte: 10
  }
}

// Decimal threshold (recommended for typed numeric metadata)
Filter {
  range: {
    key: "energy",
    gte_f64: 0.8,
    lte_f64: 1.0
  }
}
```
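The containment semantics of the three geometric filters can be mimicked client-side, for instance to sanity-check a query before sending it. These are illustrative helpers only: the engine evaluates the filters on bitsets internally, and this sketch treats a single axis/aperture and ignores the `cen` offset.

```python
import math


def in_ball(v, center, radius):
    """InBall: point lies within `radius` of `center`."""
    return math.dist(v, center) <= radius


def in_box(v, min_bounds, max_bounds):
    """InBox: every coordinate lies within its [min, max] bound."""
    return all(lo <= x <= hi for x, lo, hi in zip(v, min_bounds, max_bounds))


def in_cone(v, axis, aperture):
    """InCone (simplified): angle between v and the axis is within `aperture` radians."""
    dot = sum(a * b for a, b in zip(v, axis))
    nv = math.sqrt(sum(x * x for x in v))
    na = math.sqrt(sum(x * x for x in axis))
    if nv == 0 or na == 0:
        return False
    cosang = max(-1.0, min(1.0, dot / (nv * na)))
    return math.acos(cosang) <= aperture
```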
##### SearchBatch

Finds nearest neighbors for multiple queries in a single RPC call.

```protobuf
rpc SearchBatch (BatchSearchRequest) returns (BatchSearchResponse);

message BatchSearchRequest {
  repeated SearchRequest searches = 1;
}

message BatchSearchResponse {
  repeated SearchResponse responses = 1;
}
```

Recommended for high-concurrency clients and benchmarks to reduce per-request gRPC overhead.
##### SubscribeToEvents

Streams CDC events for post-insert/delete hooks.

```protobuf
rpc SubscribeToEvents (EventSubscriptionRequest) returns (stream EventMessage);

enum EventType {
  EVENT_UNKNOWN = 0;
  VECTOR_INSERTED = 1;
  VECTOR_DELETED = 2;
}

message EventSubscriptionRequest {
  repeated EventType types = 1;
  optional string collection = 2;
}

message EventMessage {
  EventType type = 1;
  oneof payload {
    VectorInsertedEvent vector_inserted = 2;
    VectorDeletedEvent vector_deleted = 3;
  }
}
```

Use this stream to build external pipelines (audit, Elasticsearch sync, graph projections, Neo4j updaters). The SDKs (Python/TypeScript/Rust) expose convenience subscription methods for this stream.

Reliability notes:

- Stream consumers may lag under burst load; the server now handles lagged broadcast reads without dropping the whole stream task.
- Tune `HS_EVENT_STREAM_BUFFER` for higher event fan-out pressure.
##### Delete

Removes a single vector from a collection by its external ID.

```protobuf
rpc Delete (DeleteRequest) returns (DeleteResponse);

message DeleteRequest {
  string collection = 1;
  uint32 id = 2;
}

message DeleteResponse {
  bool success = 1;
}
```
#### Delta Sync Protocol

Advanced synchronization for consistency verification and recovery.

##### SyncHandshake

Computes the difference between client and server states using Merkle-like bucket hashes.

```protobuf
rpc SyncHandshake (SyncHandshakeRequest) returns (SyncHandshakeResponse);
```

##### SyncPull

Streams missing vectors from the server based on differing buckets.

```protobuf
rpc SyncPull (SyncPullRequest) returns (stream SyncVectorData);
```

##### SyncPush

Streams client-side unique vectors to the server to achieve global consistency.

```protobuf
rpc SyncPush (stream SyncVectorData) returns (SyncPushResponse);
```
#### MetadataValue (Typed Metadata)

```protobuf
message MetadataValue {
  oneof kind {
    string string_value = 1;
    int64 int_value = 2;
    double double_value = 3;
    bool bool_value = 4;
  }
}
```
#### Graph Traversal API (v2.3)

```protobuf
rpc GetNode (GetNodeRequest) returns (GraphNode);
rpc GetNeighbors (GetNeighborsRequest) returns (GetNeighborsResponse);
rpc GetConceptParents (GetConceptParentsRequest) returns (GetConceptParentsResponse);
rpc Traverse (TraverseRequest) returns (TraverseResponse);
rpc FindSemanticClusters (FindSemanticClustersRequest) returns (FindSemanticClustersResponse);
```

Key safety guards:

- `GetNeighborsRequest.limit` and `offset` for bounded pagination.
- `TraverseRequest.max_depth` and `max_nodes` to prevent unbounded graph walks.
- `FindSemanticClustersRequest.max_clusters` and `max_nodes` for bounded connected-component scans.

`TraverseRequest` is filter-aware and supports both:

- `filter` (map<string, string>)
- `filters` (Match/Range)

`GetNeighborsResponse` now includes `edge_weights`, where `edge_weights[i]` is the distance from the source node to `neighbors[i]`.
#### RebuildIndex with Pruning Filter (v2.2.1)

```protobuf
message RebuildIndexRequest {
  string name = 1;
  optional VacuumFilterQuery filter_query = 2;
}

message VacuumFilterQuery {
  string key = 1;
  string op = 2; // "lt" | "lte" | "gt" | "gte" | "eq" | "ne"
  double value = 3;
}
```

Use this API for pruning cycles when you need to rebuild an index and drop low-value vectors in one server-side operation.
#### TriggerReconsolidation (v3.0.1)

Triggers AI sleep mode (Riemannian SGD / Flow Matching) directly on the engine to algorithmically shift vectors.

```protobuf
rpc TriggerReconsolidation (ReconsolidationRequest) returns (StatusResponse);

message ReconsolidationRequest {
  string collection = 1;
  repeated double target_vector = 2;
  double learning_rate = 3;
}
```

#### InsertText (v3.0.1)

Inserts raw text to be embedded and stored on the server.

```protobuf
rpc InsertText (InsertTextRequest) returns (InsertResponse);

message InsertTextRequest {
  string collection = 1;
  string text = 2;
  uint32 id = 3;
  map<string, MetadataValue> typed_metadata = 4;
}
```

#### Vectorize (v3.0.1)

Converts text to a vector using the server's embedding engine.

```protobuf
rpc Vectorize (VectorizeRequest) returns (VectorizeResponse);

message VectorizeRequest {
  string text = 1;
  string metric = 2; // "l2", "cosine", "poincare", "lorentz"
}

message VectorizeResponse {
  repeated double vector = 1;
}
```

#### SearchText (v3.0.1)

Searches a collection using a text query.

```protobuf
rpc SearchText (SearchTextRequest) returns (SearchResponse);

message SearchTextRequest {
  string collection = 1;
  string text = 2;
  uint32 top_k = 3;
  repeated Filter filters = 4;
}
```
### HTTP API (Control Plane)

Served on port 50050 (default). All endpoints live under `/api`.

#### Authentication & Multi-Tenancy

Every request should include:

- `x-api-key`: API key (optional if auth is disabled, but recommended).
- `x-hyperspace-user-id`: Tenant identifier (e.g. `client_123`). If omitted, defaults to `default_admin`.
#### Cluster Status

`GET /api/cluster/status`

Returns the node's identity and topology role.

```json
{
  "node_id": "uuid...",
  "role": "Leader", // or "Follower"
  "upstream_peer": null,
  "downstream_peers": []
}
```

#### Swarm Peers (Gossip Protocol)

`GET /api/swarm/peers`

Returns active peers discovered via UDP multicast (edge-to-edge sync).

```json
{
  "gossip_enabled": true,
  "peer_count": 2,
  "peers": [...]
}
```
#### Node Status (Compatibility)

`GET /api/status`

Returns runtime status and node configuration. The dashboard uses this endpoint first, with a fallback to `/api/cluster/status`.

#### System Metrics

`GET /api/metrics`

Real-time system resource usage.

```json
{
  "cpu_usage_percent": 12,
  "ram_usage_mb": 512,
  "disk_usage_mb": 1024,
  "total_collections": 5,
  "total_vectors": 1000000
}
```
#### Admin / Billing (Since v2.0)

Requires `user_id: admin`.

`GET /api/admin/usage`

Returns a JSON map of `user_id -> usage_stats`:

```json
{
  "tenant_A": {
    "collection_count": 2,
    "vector_count": 1500,
    "disk_usage_bytes": 1048576
  }
}
```
#### List Collections

`GET /api/collections`

Returns a summary of all active collections.

```json
[
  {
    "name": "my_docs",
    "count": 1500,
    "dimension": 1536,
    "metric": "l2"
  }
]
```

#### Collection Search (HTTP Playground)

`POST /api/collections/{name}/search`

Convenience endpoint for dashboard/manual testing.

```json
{
  "vector": [0.1, 0.2, 0.3],
  "top_k": 5
}
```

#### Graph HTTP Endpoints (Dashboard / Tooling)

- `GET /api/collections/{name}/graph/node?id={id}&layer={layer}`
- `GET /api/collections/{name}/graph/neighbors?id={id}&layer={layer}&limit={limit}&offset={offset}`
- `GET /api/collections/{name}/graph/parents?id={id}&layer={layer}&limit={limit}`
- `POST /api/collections/{name}/graph/traverse`
- `POST /api/collections/{name}/graph/clusters`
## User Guide

### Server Configuration

HyperspaceDB is configured via environment variables or a `.env` file.

#### Core Settings

| Variable | Default | Description |
|---|---|---|
| `RUST_LOG` | `info` | Log level (`debug`, `info`, `error`) |
| `HS_PORT` | `50051` | gRPC listening port |
| `HS_HTTP_PORT` | `50050` | HTTP dashboard port |
| `HS_DATA_DIR` | `./data` | Path to store segments and WAL |
| `HS_IDLE_TIMEOUT_SEC` | `3600` | Inactivity time (seconds) before a collection is unloaded to disk |
| `HS_DIMENSION` | `1024` | Default vector dimensionality (8, 64, 768, 1024, 1536, 3072, 4096, 8192) |
| `HS_METRIC` | `cosine` | Distance metric (`cosine`, `poincare`, `l2`, `euclidean`, `lorentz`) |
| `HS_QUANTIZATION_LEVEL` | `none` | Compression (`none`, `scalar` (i8), `binary` (1-bit)) |
| `HS_STORAGE_FLOAT32` | `false` | Store raw vectors as f32 (mode=none) and promote to f64 in distance kernels |
| `HS_FAST_UPSERT_DELTA` | `0.0` | Fast-upsert L2 threshold. `0.0` disables; `0.001`..`0.05` is typical for iterative updates; too high a value can keep stale graph links |
| `HS_EVENT_STREAM_BUFFER` | `1024` | Broadcast ring size for CDC and replication streams |
| `HS_RERANK_ENABLED` | `false` | Enable exact top-K re-ranking after ANN candidate retrieval |
| `HS_RERANK_OVERSAMPLE` | `4` | Candidate multiplier used before exact re-rank (`top_k * factor`) |
| `HS_GPU_BATCH_ENABLED` | `false` | Enable runtime auto-dispatch policy for batch metric kernels |
| `HS_GPU_MIN_BATCH` | `128` | Minimum batch size for GPU offload policy |
| `HS_GPU_MIN_DIM` | `1024` | Minimum vector dimension for GPU offload policy |
| `HS_GPU_MIN_WORK` | `262144` | Minimum workload (`batch * dim`) for GPU offload |
| `HS_GPU_L2_ENABLED` | `true` | Enable GPU dispatch for the L2 batch kernel (requires the `gpu-runtime` feature) |
| `HS_GPU_COSINE_ENABLED` | `true` | Enable GPU dispatch for the cosine batch kernel (requires the `gpu-runtime` feature) |
| `HS_GPU_POINCARE_ENABLED` | `true` | Enable GPU dispatch for the Poincaré batch kernel (requires the `gpu-runtime` feature) |
| `HS_GPU_LORENTZ_ENABLED` | `true` | Enable GPU dispatch for the Lorentz float batch kernel (runtime path) |
| `HS_SEARCH_BATCH_INNER_CONCURRENCY` | `1` | Internal parallel fan-out in the SearchBatch handler (bounded) |
| `HS_SEARCH_CONCURRENCY` | `0` | Global concurrent search-task limit per collection (0 = auto by CPU cores, clamped to CPU*4) |
### Cloud Tiering (S3)

Enabled only when compiled with the `s3-tiering` feature.

| Variable | Default | Description |
|---|---|---|
| `HS_STORAGE_BACKEND` | `local` | `local` (all chunks on disk) or `s3` (offload cold chunks) |
| `HS_MAX_LOCAL_CACHE_GB` | `10` | Hard limit for the local disk cache, in gigabytes |
| `HS_S3_BUCKET` | - | Target S3 bucket name |
| `HS_S3_REGION` | `us-east-1` | AWS region |
| `HS_S3_ENDPOINT` | - | Custom endpoint (e.g. `http://minio:9000`) |
| `HS_S3_ACCESS_KEY` | - | S3 access key ID |
| `HS_S3_SECRET_KEY` | - | S3 secret access key |
| `HS_S3_MAX_RETRIES` | `5` | Retries for failed uploads/downloads |
| `HS_S3_UPLOAD_CONCURRENCY` | `4` | Semaphore-limited parallel uploads |
| `HS_WAL_SEGMENT_SIZE_MB` | `256` | Size before WAL rotation (influences chunk size) |
| `HS_CHUNK_PROBE_K` | `3` | Number of most-relevant chunks to search per query |
### HNSW Index Tuning

| Variable | Default | Description |
|---|---|---|
| `HS_HNSW_M` | `64` | Max connections per layer |
| `HS_HNSW_EF_CONSTRUCT` | `200` | Build quality (50-500). Higher = slower build, better recall |
| `HS_HNSW_EF_SEARCH` | `100` | Search beam width (10-500). Higher = slower search, better recall |
| `HS_FILTER_BRUTEFORCE_THRESHOLD` | `50000` | If the filtered candidate count is below this threshold, layer 0 uses exact brute force instead of graph traversal |
| `HS_INDEXER_CONCURRENCY` | `1` | See the README for threading strategies (0 = auto, 1 = serial) |
### Persistence & Durability

| Variable | Default | Description |
|---|---|---|
| `HYPERSPACE_WAL_SYNC_MODE` | `batch` | WAL sync strategy: `strict` (fsync), `batch` (100 ms lag), `async` (OS cache) |
| `HYPERSPACE_WAL_BATCH_INTERVAL` | `100` | Batch interval in milliseconds |
### Memory Management (Jemalloc)

HyperspaceDB uses Jemalloc for efficient memory allocation. Tune it via `MALLOC_CONF`:

- Low RAM (aggressive): `MALLOC_CONF=background_thread:true,dirty_decay_ms:0,muzzy_decay_ms:0`
- Balanced (default): `MALLOC_CONF=background_thread:true,dirty_decay_ms:5000,muzzy_decay_ms:5000`
### Security

| Variable | Default | Description |
|---|---|---|
| `HYPERSPACE_API_KEY` | - | If set, requires the `x-api-key` header on all requests |
### Multi-Tenancy

HyperspaceDB supports strict data isolation via the `x-hyperspace-user-id` header.

- Isolation: Every request with an `x-hyperspace-user-id` header operates within that user's private namespace.
- Internal Naming: Collections are stored internally as `userid_collectionname`.
- Default Admin: If `x-hyperspace-user-id` is omitted but a valid `x-api-key` is provided, the user is treated as `default_admin`.
- SaaS Integration: Gateways should inject this header after authenticating users.

### Lorentz Metric Notes

When `HS_METRIC=lorentz`, vectors must satisfy the hyperboloid constraints:

- `t > 0` (upper sheet)
- `-t^2 + x_1^2 + ... + x_n^2 = -1`
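A point with spatial coordinates `x` can be lifted onto the upper sheet by choosing the time component as `t = sqrt(1 + ||x||^2)`, which satisfies both constraints by construction (a standard construction; the helper name is ours):

```python
import math


def lift_to_hyperboloid(x):
    """Map spatial coordinates onto the upper hyperboloid sheet.

    Choosing t = sqrt(1 + ||x||^2) gives -t^2 + sum(x_i^2) = -1 with t > 0.
    """
    t = math.sqrt(1.0 + sum(v * v for v in x))
    return [t] + list(x)
```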
### Web Dashboard

HyperspaceDB includes a comprehensive web dashboard at http://localhost:50050.

Features:

- Cluster Status: View node role (Leader/Follower) and topology.
- Collections: Create, delete, and inspect collection statistics.
- Explorer: Search playground with filters and typed-metadata visibility.
- Graph Explorer: Query neighbors and concept-parent graph views from HNSW layers.
- Metrics: Real-time RAM and CPU usage.
### TUI Dashboard (Legacy)

For terminal-based monitoring:

```bash
./hyperspace-cli
```

Key controls:

- TAB: Switch tabs.
- [S]: Trigger snapshot.
- [V]: Trigger vacuum.
- [Q]: Quit.
## Embedding Service

## Advanced Features

### Federated Clustering (v1.2)

HyperspaceDB v1.2 introduces a federated Leader-Follower architecture. It goes beyond simple read replication by adding node identity, logical clocks, and topology awareness to support future edge-cloud synchronization scenarios.

#### Concepts

##### Node Identity

Every node in the cluster is assigned a persistent, unique UUID (`node_id`) upon first startup. This ID is used to track the origin of write operations in the replication log.

##### Roles

- Leader (Coordinator):
  - Accepts writes (`Insert`, `Delete`, `CreateCollection`).
  - Manages the cluster topology.
  - Streams WAL events to connected Followers.
- Follower (Replica):
  - Read-only.
  - Replicates state from the Leader in real time.
  - Can be promoted to Leader if needed.
- Edge Node (planned for v1.4):
  - Offline-first node that accumulates writes and syncs via Merkle trees when online.
#### Configuration

Leader: simply start the server. By default, it assumes the Leader role.

```bash
./hyperspace-server --port 50051
```

Follower: start with `--role follower` and point to the leader's URL.

```bash
./hyperspace-server --port 50052 --role follower --leader http://127.0.0.1:50051
```
#### Monitoring Topology

You can inspect the cluster state via the HTTP API on the dashboard port (default 50050).

Request:

```bash
curl http://localhost:50050/api/cluster/status
```

Response:

```json
{
  "node_id": "e8b37fde-6c60-427f-8a09-47103c2da80e",
  "role": "Leader",
  "upstream_peer": null,
  "downstream_peers": [],
  "logical_clock": 1234
}
```

This JSON response tells you:

- The node's unique ID.
- Its current role.
- Who it is following (if a Follower).
- Who is following it (if a Leader).
- The current logical timestamp of its database state.
### Edge-to-Edge Gossip Swarm (v3.0)

Beyond centralized replication, v3.0 introduces a decentralized peer-to-peer UDP swarm network. This feature is crucial for robotics and offline-first autonomous agents.

#### Features

- Zero-Configuration Topology: Nodes broadcast heartbeats via UDP (`tokio::net::UdpSocket`).
- Self-Healing: Unresponsive nodes (TTL > 30s) are automatically dropped from the registry.
- Auto-Discovery: Swarm nodes discover each other and exchange logical clocks and collection digests for Merkle delta sync.

#### Swarm Configuration

Add these variables to your environment or `.env` file to join the swarm:

```bash
# Enable the gossip listener on the specified local port
HS_GOSSIP_PORT=7946
# Bootstrap nodes to connect to
HS_GOSSIP_PEERS=192.168.1.10:7946,192.168.1.11:7946
```
#### Swarm State Monitoring

You can monitor the active mesh structure from the dashboard UI or plain HTTP:

Request:

```bash
curl http://localhost:50050/api/swarm/peers
```

Response:

```json
{
  "gossip_enabled": true,
  "peer_count": 1,
  "peers": [
    {
      "node_id": "a92jfe...",
      "addr": "192.168.1.10:50050",
      "http_port": 50050,
      "role": "Leader",
      "logical_clock": 4200,
      "collections": [
        {
          "name": "vision_system",
          "state_hash": 6712399120,
          "vector_count": 500
        }
      ],
      "last_seen_secs": 1729384910,
      "healthy": true
    }
  ]
}
```
### Hybrid Search

HyperspaceDB combines hyperbolic vector search with lexical (keyword) search to provide the best of both worlds. This is powered by Reciprocal Rank Fusion (RRF), which normalizes rankings from both engines and merges them.

#### Conceptual Flow

1. Vector Search: Finds semantically similar items (e.g. "smartphone" finds "iPhone").
2. Keyword Search: Finds exact token matches in metadata (e.g. "iphone" finds items with "iphone" in the title).
3. RRF Fusion: `Score = 1/(k + rank_vec) + 1/(k + rank_lex)`.
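The fusion rule can be sketched in a few lines (a hypothetical helper; `k=60` is the conventional RRF constant, and the engine's exact tie-breaking behavior is not specified here):

```python
def rrf_fuse(vector_ranking, lexical_ranking, k=60):
    """Merge two ranked ID lists with Reciprocal Rank Fusion.

    Score(id) = 1/(k + rank_vec) + 1/(k + rank_lex); ranks are 1-based,
    and an ID missing from one ranking simply contributes nothing there.
    """
    scores = {}
    for ranking in (vector_ranking, lexical_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Note how an item ranked first by the lexical engine but third by the vector engine can still win overall, which is exactly the behavior hybrid search is after.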
#### API Usage

Python:

```python
results = client.search(
    vector=query_vector,
    top_k=10,
    hybrid_query="apple macbook",  # lexical query
    hybrid_alpha=0.5,              # balance factor (RRF conventionally uses k=60; exposed here as alpha)
)
```

Rust:

```rust
let results = client
    .search_advanced(query_vector, 10, vec![], Some(("apple macbook".to_string(), 0.5)))
    .await?;
```

#### Tokenization

Currently, all string metadata values are automatically tokenized (split on whitespace, lowercased, alphanumeric only) and indexed in an inverted index.
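That tokenization scheme is easy to model client-side, e.g. to predict which keywords a metadata value will match (illustrative helpers, not the server code):

```python
import re
from collections import defaultdict


def tokenize(text):
    """Whitespace split, lowercase, alphanumeric-only (mirrors the described scheme)."""
    tokens = []
    for tok in text.lower().split():
        tok = re.sub(r"[^a-z0-9]", "", tok)  # strip punctuation etc.
        if tok:
            tokens.append(tok)
    return tokens


def build_inverted_index(docs):
    """docs: {doc_id: metadata_string} -> {token: set of doc_ids}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index
```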
π Vector Quantization
HyperspaceDB supports multiple storage modes to balance Precision vs Memory vs Speed. All modes operate transparently β no SDK changes required.
Quantization Modes
| Mode | Bits/dim | Compression | Recall@10 | Best For |
|---|---|---|---|---|
| None | 64 (f64) | 1Γ | 100% | Research, exact recall |
| ScalarI8 | 8 (i8) | 8Γ | ~98% | Production default |
| SQ8 Anisotropic | 8 (i8) | 8Γ | ~99%+ | Cosine / L2 (Sprint 6.2) |
| Binary | 1 (bit) | 64Γ | ~75β85% | Re-ranking, large datasets |
| Lorentz SQ8 | 8 (i8) + scale | ~8Γ | ~95β98% | Hyperboloid (Lorentz) metric |
| Zonal (MOND) | mixed | 30β40%β RAM | ~99% | Hyperbolic (core + boundary) |
1. ScalarI8 (Default)
The default mode. Coordinates are mapped from f64 to i8 β [-127, 127] via:
q_i = round(x_i * 127) // For PoincarΓ©: x_i β (-1, 1)
- Compression: 8× vs `f64`
- Recall: ~98% (@10 neighbors)
- Distance: dequantized at query time (`q_i / 127.0`)
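A quantize/dequantize roundtrip for this mode can be sketched as follows (illustrative Python, not the Rust kernel):

```python
def quantize_i8(x):
    # Map Poincaré coordinates x_i in (-1, 1) to i8 in [-127, 127]
    return [max(-127, min(127, round(v * 127))) for v in x]

def dequantize_i8(q):
    return [v / 127.0 for v in q]

x = [0.5, -0.25, 0.99]
q = quantize_i8(x)
x_hat = dequantize_i8(q)
```

The worst-case per-coordinate error is half a quantization step (0.5/127 ≈ 0.004), which is what keeps recall near 98%.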
2. SQ8 Anisotropic (Sprint 6.2 / 7.1 – ScaNN-Inspired)
Standard isotropic quantization applies uniform rounding to all dimensions, which distorts the direction (angle) of a vector. For Cosine/L2 metrics, angular error causes more recall degradation than magnitude error.
Anisotropic SQ8 penalizes orthogonal (directional) error far more than parallel (magnitude) error during the quantization refinement step.
Loss Function
$$L = |e_\parallel|^2 + t_w \cdot |e_\perp|^2$$
Where:
- $e_\parallel = (e \cdot \hat{x}) \hat{x}$: the projection of the quantization error onto the original vector direction
- $e_\perp$: the component orthogonal to the original vector
- $t_w = 10$ (anisotropy weight): penalizes directional error 10× more than magnitude error
Coordinate Descent Refinement
After the initial isotropic quantization, each coordinate is refined by Β±1 step in i8-space and the one minimizing the anisotropic loss is selected:
```rust
for i in 0..n {
    let mut best = q[i];
    let mut best_loss = f64::INFINITY;
    // Try the original code and one step up/down in i8-space
    for delta in [-1i16, 0, 1] {
        let candidate = (q[i] as i16 + delta).clamp(-127, 127) as i8;
        // e_parallel_sq / e_ortho_sq are recomputed for this candidate
        let loss = e_parallel_sq + t_weight * e_ortho_sq;
        if loss < best_loss {
            best_loss = loss;
            best = candidate;
        }
    }
    q[i] = best;
}
```
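A runnable Python sketch of the same refinement, with the loss recomputed per candidate (names like `aniso_loss` and `refine` are illustrative; the production code lives in Rust):

```python
import math

def aniso_loss(x, q, t_w=10.0):
    """Anisotropic loss of i8 codes q against the float vector x."""
    x_hat = [v / 127.0 for v in q]            # dequantized candidate
    e = [a - b for a, b in zip(x_hat, x)]     # quantization error
    norm = math.sqrt(sum(v * v for v in x))
    u = [v / norm for v in x]                 # unit direction of x
    proj = sum(a * b for a, b in zip(e, u))
    par_sq = proj * proj                      # ||e_parallel||^2
    perp_sq = sum(v * v for v in e) - par_sq  # ||e_perp||^2
    return par_sq + t_w * perp_sq

def refine(x, t_w=10.0):
    # Isotropic starting point, then per-coordinate +/-1 descent
    q = [max(-127, min(127, round(v * 127))) for v in x]
    for i in range(len(q)):
        start = q[i]
        best, best_loss = start, float("inf")
        for delta in (-1, 0, 1):
            cand = max(-127, min(127, start + delta))
            q[i] = cand
            loss = aniso_loss(x, q, t_w)
            if loss < best_loss:
                best, best_loss = cand, loss
        q[i] = best
    return q

x = [0.3, -0.7, 0.2]
q_iso = [round(v * 127) for v in x]   # plain isotropic baseline
q_aniso = refine(x)
```

Because `delta = 0` is always a candidate, each coordinate step can only keep or reduce the loss, so the refined codes are never worse than the isotropic baseline.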
Results
| Metric | Mode | Recall@10 Gain |
|---|---|---|
| Cosine | ScalarI8 → Anisotropic SQ8 | +5–8% |
| L2 | ScalarI8 → Anisotropic SQ8 | +3–5% |
Implementation
The anisotropic refinement is in `QuantizedHyperVector::from_float()` in `crates/hyperspace-core/src/vector.rs`.
3. Lorentz SQ8 (Dynamic-Range)
The Lorentz (hyperboloid) model has unbounded coordinates: the time component `x[0] = cosh(r)` grows exponentially. A fixed [-1, 1] mapping would saturate immediately.
Solution: per-vector dynamic-range scaling:

```
scale = max(|x_i|)
q_i   = round(x_i / scale * 127)   // i8
α     = scale                      // stored in the alpha field (f32)
```

Dequantization: `x̂_i = (q_i / 127.0) * α`
See the Lorentz SQ8 deep-dive for full details.
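In Python, the dynamic-range scheme looks like this (a sketch of the formulas above, not the `from_float_lorentz()` code itself):

```python
import math

def lorentz_sq8(x):
    """Per-vector dynamic-range SQ8: returns (i8 codes, alpha)."""
    scale = max(abs(v) for v in x)
    q = [max(-127, min(127, round(v / scale * 127))) for v in x]
    return q, scale

def lorentz_dequant(q, alpha):
    return [(v / 127.0) * alpha for v in q]

r = 5.0
x = [math.cosh(r), math.sinh(r), 0.0]   # time component dominates
q, alpha = lorentz_sq8(x)
x_hat = lorentz_dequant(q, alpha)
```

Because the scale is chosen per vector, the exponentially large time component never saturates the i8 range.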
4. Binary (1-bit)
Each coordinate is compressed to its sign bit. Distance uses Hamming distance.
- Compression: 64× vs `f64`
- Recall: ~75–85% (metric-dependent)
- Use case: first-pass candidate retrieval for re-ranking over very large datasets
- ⚠️ Not supported for Lorentz: the sign bit destroys hierarchical depth information
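A sign-bit encoder and Hamming distance can be sketched as follows (illustrative; the engine packs bits into machine words and uses popcount):

```python
def binarize(x):
    """Pack sign bits into an int: bit i is set iff x_i >= 0."""
    bits = 0
    for i, v in enumerate(x):
        if v >= 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    # Number of differing sign bits
    return bin(a ^ b).count("1")

a = binarize([0.2, -0.5, 0.9, -0.1])   # bits {0, 2}
b = binarize([0.1, 0.4, 0.8, -0.3])    # bits {0, 1, 2}
```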
5. Zonal Quantization β MOND (Sprint 6.3)
Inspired by Modified Newtonian Dynamics: near the center of hyperbolic space the metric is smooth, but it explodes near the horizon.
```rust
pub enum ZonalVector {
    Core(Vec<i8>),       // ||x|| < 0.5: compress to i8 (~8x RAM saving)
    Boundary(Vec<f64>),  // ||x|| >= 0.5: keep full precision
}
```
Enabled by a separate env var (independent of `HS_QUANTIZATION_LEVEL`):

```shell
HS_ZONAL_QUANTIZATION=true   # Enable MOND zonal storage
```
When enabled, `zonal_storage: DashMap<NodeId, ZonalVector>` completely replaces the standard mmap-based vector store. All read (`get_vector`) and write (`insert_to_storage`) paths are routed through `zonal_storage`.
- RAM reduction: ~30–40% for datasets where most vectors are near the origin (`||x|| < 0.5`)
- No precision loss at the boundary (where the metric is most sensitive)
- Compatible with all metrics, not just Poincaré
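The core/boundary split can be sketched as follows (a Python stand-in for the Rust `ZonalVector` enum):

```python
import math

def zonal_encode(x, threshold=0.5):
    """Core vectors (||x|| < threshold) -> i8 codes; Boundary -> full precision."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm < threshold:
        return ("core", [max(-127, min(127, round(v * 127))) for v in x])
    return ("boundary", list(x))

core = zonal_encode([0.1, 0.2])   # ||x|| ~ 0.22 -> compressed
edge = zonal_encode([0.7, 0.6])   # ||x|| ~ 0.92 -> kept exact
```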
Configuration
Quantization mode is set via an environment variable before creating a collection. The mode is saved in `meta.json` alongside each collection and applied on reload.

```shell
# Default (ScalarI8 with Anisotropic refinement)
HS_QUANTIZATION_LEVEL=scalar

# Binary (1-bit Hamming)
HS_QUANTIZATION_LEVEL=binary

# Full f64 precision (debugging / research)
HS_QUANTIZATION_LEVEL=none
```
⚠️ Note: the `--mode` CLI flag does not exist. Configuration is exclusively through `HS_QUANTIZATION_LEVEL` (env var or `.env` file). The mode is stored per-collection in `<data_dir>/<collection>/meta.json` at creation time.
Note: the Lorentz SQ8 path is selected automatically when a collection's metric is `lorentz`, regardless of `HS_QUANTIZATION_LEVEL`. The `from_float_lorentz()` encoder is dispatched by the index layer (`hyperspace-index/src/lib.rs`).
Choosing the Right Mode

```
Dataset characteristics
│
├─ Full precision required (research)? ────────▶ HS_QUANTIZATION_LEVEL=none
│
├─ Lorentz/Hyperbolic metric? ─────────────────▶ Automatic (dynamic-range SQ8)
│
├─ Memory-critical (>100M vectors)? ───────────▶ HS_QUANTIZATION_LEVEL=binary
│
├─ Cosine / L2, high recall needed? ───────────▶ HS_QUANTIZATION_LEVEL=scalar (default)
│        └─ Anisotropic refinement applied
└─ Hyperbolic, mixed density? ─────────────────▶ Zonal (MOND) via ZonalVector store
```
Security & Auth
HyperspaceDB includes built-in security features for production deployments.
API Authentication
We use a simple but effective API Key mechanism.
Enabling Auth
Set the `HYPERSPACE_API_KEY` environment variable when starting the server.

```shell
export HYPERSPACE_API_KEY="my-secret-key-123"
./hyperspace-server
```
If this variable is NOT set, authentication is disabled (dev mode).
Client Usage
Clients must pass the key in the x-api-key metadata header.
Python:
```python
client = HyperspaceClient(
    host="localhost:50051",
    api_key="my-secret-key-123",
    user_id="tenant_name"  # Optional: for multi-tenancy
)
```
Rust:
```rust
// Use the updated connect function
let client = Client::connect(
    "http://0.0.0.0:50051".to_string(),
    Some("my-secret-key-123".to_string()),
    Some("tenant_name".to_string()),
).await?;
```
Multi-Tenancy Isolation
Use x-hyperspace-user-id header to isolate data per user.
- Gateway Responsibility: Ensure your API Gateway validates user tokens and injects this header securely.
- Internal Scope: data created with a `user_id` is invisible to other users and to the default admin scope.
Security Implementation
- SHA-256 Hashing: the server computes `SHA256(env_key)` at startup and stores only the hash.
- Constant-Time Comparison: incoming keys are hashed and compared in constant time to prevent timing attacks.
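The same hash-once, compare-in-constant-time pattern in Python (sketch only; the server implements this in Rust):

```python
import hashlib
import hmac

def hash_key(key: str) -> str:
    return hashlib.sha256(key.encode()).hexdigest()

# Computed once at startup; the plaintext key is never stored
STORED_HASH = hash_key("my-secret-key-123")

def verify(incoming_key: str) -> bool:
    # hmac.compare_digest runs in constant time, defeating timing attacks
    return hmac.compare_digest(hash_key(incoming_key), STORED_HASH)
```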
HyperspaceDB Architecture Guide
HyperspaceDB is a specialized vector database designed for high-performance hyperbolic embedding search. This document details its internal architecture, storage format, and indexing strategies.
System Overview
The system follows a strict Command-Query Separation (CQS) pattern, tailored for write-heavy ingestion and latency-sensitive search.
```mermaid
graph TD
    Client["Client (gRPC)"] -->|Insert| S[Server Service]
    Client -->|Search| S
    subgraph Persistence Layer
        S -->|1. Append| WAL[Write-Ahead Log]
        S -->|2. Append| VS[Vector Store]
    end
    subgraph Indexing Layer
        S -->|3. Send ID| Q["Async Queue (Channel)"]
        Q -->|Pop| W[Indexer Worker]
        W -->|Update| HNSW["HNSW Graph (RAM)"]
    end
    subgraph Embedding Layer
        S -->|InsertText| EE[Embedding Service]
        EE -->|Chunking| BE[Embedding Backends]
    end
    subgraph Background Tasks
        Snap[Snapshotter] -->|Serialize| Disk["Index Snapshot (.snap)"]
    end
```
💾 Storage Layer (hyperspace-store)
1. Vector Storage (data/)
Vectors are stored in a segmented, append-only format using Memory-Mapped Files (mmap).
- Segments: data is split into chunks of 65,536 vectors ($2^{16}$).
- Files: `chunk_0.hyp`, `chunk_1.hyp`, etc.
- Quantization: vectors are optionally quantized (e.g., `ScalarI8`), reducing size from a 64-bit float to an 8-bit integer per dimension (8× compression).
2. Write-Ahead Log (wal.log)
Writes are durable. Every insert is appended to wal.log before being acknowledged. Upon restart, the WAL helps recover data that wasn't yet persisted in the Index Snapshot.
🕸 Indexing Layer (hyperspace-index)
Hyperbolic HNSW
We implement a modified Hierarchical Navigable Small World graph optimized for the Poincaré ball model.
- Distance Metric: the Poincaré distance formula: $$ d(u, v) = \text{acosh}\left(1 + 2 \frac{||u-v||^2}{(1-||u||^2)(1-||v||^2)}\right) $$
- Optimization: we compare $||u-v||^2$ and cached normalization factors $\alpha = 1/(1-||u||^2)$ to avoid expensive `acosh` calls during graph traversal.
- Locking: the graph uses a fine-grained `RwLock` per node layer, allowing concurrent searches and updates.
Dynamic Configuration
Parameters `ef_search` (search depth) and `ef_construction` (build quality) are stored in a global `AtomicUsize` config, allowing runtime tuning without restarts.
⚡️ Performance Traits
- Async Indexing: the client receives `OK` as soon as data hits the WAL; indexing happens in the background.
- Zero-Copy Read: search uses `mmap` to read quantized vectors directly from the OS page cache without heap allocation.
- SIMD Acceleration: distance calculations use `std::simd` (Portable SIMD) for a 4-8x speedup on supported CPUs (AVX2, NEON).
Lifecycle
1. Startup:
   - Load `index.snap` (rkyv zero-copy deserialization).
   - Replay `wal.log` for any missing vectors.
2. Runtime:
   - Serve read/write requests.
   - Background worker consumes the indexing queue.
   - Snapshotter periodically saves graph state.
3. Shutdown:
   - Stop accepting writes.
   - Drain the indexing queue.
   - Save a final snapshot.
   - Close file handles.
Memory Management & Stability
Cold Storage Architecture
HyperspaceDB implements a "Cold Storage" mechanism to handle large numbers of collections efficiently:
- Lazy Loading: collections are not loaded into RAM at startup; only metadata is scanned. The actual collection (vector index, storage) is instantiated from disk only upon the first `get()` request.
- Idle Eviction (Reaper): a background task runs every 60 seconds to scan for idle collections. Any collection not accessed for a configurable period (default: 1 hour) is automatically unloaded from memory to free up RAM.
- Graceful Shutdown: when a collection is evicted or deleted, its `Drop` implementation ensures that all associated background tasks (indexing, snapshotting) are immediately aborted, preventing resource leaks and panicked threads.
This architecture allows HyperspaceDB to support thousands of collections while keeping the active memory footprint low, scaling based on actual usage rather than total data.
Storage Format
HyperspaceDB uses a custom segmented file format designed for:
- Fast Appends (Zero seek time).
- Mmap Compatibility (OS manages caching).
- Space Efficiency (Quantization).
Segmentation
Data is split into "Chunks" of fixed size ($2^{16} = 65,536$ vectors). This avoids allocating one giant file and allows easier lifecycle management.
- `data/chunk_0.hyp`
- ...
LSM-Tree Segmentation
HyperspaceDB 3.0 adopts an LSM-Tree architecture. Data flows from hot memory to immutable on-disk segments:
1. MemTable (Hot): new vectors are indexed in an in-memory HNSW.
2. Immutable Chunks (Cold): when a WAL segment is rotated, the Flush Worker persists the MemTable into an immutable `.hyp` chunk. During this flush, the in-memory HNSW topology is rewritten into a Spatial Navigable Graph (Vamana / DiskANN format) to minimize page faults when read via mmap from SSDs.
3. Local vs Cloud: chunks can live on local NVMe or be tiered to S3.
S3 Cloud Tiering (Optional)
Using the s3-tiering feature, HyperspaceDB can offload cold chunks to an S3-compatible object store.
- LRU Cache: a byte-weighted cache (`HS_MAX_LOCAL_CACHE_GB`) manages how much data stays on local disk.
- Lazy Load: search queries automatically trigger a download if a required chunk is only present in the cloud.
- Backpressure: semaphore-limited concurrent downloads prevent IO/network saturation.
Directory Structure (Multi-Tenancy)
File Layout
Each `.hyp` file is a flat array of fixed-size records: no file headers, no inline metadata. Metadata is stored in the Index Snapshot or recovered from the layout.
Zonal Quantization (v3.0.1)
For hyperbolic collections, HyperspaceDB automatically applies Zonal Quantization (MOND theory) to vectors.
- Vectors near the origin ($||x|| < 0.5$) are tightly compressed as `i8` (Core).
- Vectors near the infinite boundary ($||x|| \to 1$) are preserved in pure `f64` (Boundary) to maintain the exact precision required for hierarchical routing.
Record Structure (ScalarI8)
When QuantizationMode::ScalarI8 is active (and vector is within the Core zone):
| Byte Offset | Content | Type |
|---|---|---|
| 0..N | Quantized Coordinates | `[i8; N]` |
| N..N+4 | Pre-computed Alpha | `f32` |
Total size per vector (for N=8): $8 + 4 = 12$ bytes. Without quantization (f64), it would be $8 \times 8 = 64$ bytes. Savings: ~81%.
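The layout is easy to verify with Python's `struct` module (little-endian assumed here for illustration):

```python
import struct

def pack_record(q, alpha):
    """ScalarI8 record: N i8 coordinates followed by an f32 alpha."""
    return struct.pack(f"<{len(q)}bf", *q, alpha)

rec = pack_record([10, -20, 30, 40, -50, 60, 70, -80], 1.25)
decoded = struct.unpack("<8bf", rec)   # 8 coords + alpha
```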
Optional raw f32 storage (v2.2.x)
For QuantizationMode::None, you can enable:
HS_STORAGE_FLOAT32=true
In this mode, raw vectors are stored as f32 in mmap and promoted to f64 in distance kernels.
This reduces raw-vector memory footprint by ~50% while preserving numerical behavior in hyperbolic math paths.
Write-Ahead Log (WAL)
Path: wal.log
The WAL ensures durability. Format:
- `id` (`u32`)
- `vector` (`[f64; N]`)
It is only read during startup if the Index Snapshot is older than the last WAL entry.
RAM Backend (WASM)
For WebAssembly deployments (hyperspace-wasm), the storage backend automatically switches to RAMVectorStore.
- Structure: uses `Vec<Arc<RwLock<Vec<u8>>>>` (heap memory) instead of memory-mapped files.
- Segmentation: the same chunking logic (64k vectors) is preserved. This allows the core HNSW index to use the same addressing logic (`id >> 16`, `id & 0xFFFF`) regardless of the backend.
- Persistence: achieved by serializing the "used" portion of segments into a `Vec<u8>` blob and storing it in the browser's IndexedDB.
- Pre-allocation: creating a DB instance pre-allocates the first chunk (64k * VectorSize bytes) to avoid frequent allocation calls during inserts.
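The shared chunk-addressing scheme reduces to two bit operations:

```python
CHUNK_BITS = 16               # 2**16 = 65,536 vectors per chunk

def locate(vector_id):
    """Split a global vector id into (chunk index, offset within chunk)."""
    return vector_id >> CHUNK_BITS, vector_id & 0xFFFF

chunk, offset = locate(70_000)   # 70,000 = 1 * 65,536 + 4,464
```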
The Hyperbolic Geometry
HyperspaceDB operates in the Poincaré ball and Lorentz (hyperboloid) models of hyperbolic geometry. These spaces are uniquely suited for hierarchical data (trees, graphs, taxonomies) because the amount of "space" available grows exponentially with the radius, similar to how the number of nodes in a tree grows with depth.
The Distance Formula
The distance $d(u, v)$ between two vectors $u, v$ in the Poincaré ball ($\mathbb{D}^n$) is defined as:
$$ d(u, v) = \text{arccosh}\left( 1 + 2 \frac{|u - v|^2}{(1 - |u|^2)(1 - |v|^2)} \right) $$
Where:
- $|u|$ is the Euclidean norm of vector $u$.
- The vectors must satisfy $|u| < 1$.
Optimization: The "Alpha" Trick
Calculating arccosh and divisions for every distance check in HNSW is expensive. HyperspaceDB optimizes this by pre-computing the curvature factors.
For every vector $x$, we store an additional scalar $\alpha_x$:
$$ \alpha_x = \frac{1}{1 - |x|^2} $$
This is stored alongside the quantized vector in our memory-mapped storage.
The Monotonicity Trick
Since $f(x) = \text{arccosh}(x)$ is a monotonically increasing function for $x \ge 1$, we do not need to compute the full arccosh during the Nearest Neighbor Search phase. We only need to compare the arguments:
$$ \delta(u, v) = |u - v|^2 \cdot \alpha_u \cdot \alpha_v $$
If $\delta(A) < \delta(B)$, then $d(A) < d(B)$.
HyperspaceDB performs all internal graph traversals using only $\delta$ (SIMD-optimized), and applies the heavy arccosh only when required by final ranking/output.
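The monotonicity claim is easy to check numerically: sorting by $\delta$ gives exactly the same order as sorting by the full distance (illustrative Python, not the SIMD kernel):

```python
import math
import random

def poincare_dist(u, v):
    du = sum((a - b) ** 2 for a, b in zip(u, v))
    nu = sum(a * a for a in u)
    nv = sum(b * b for b in v)
    return math.acosh(1 + 2 * du / ((1 - nu) * (1 - nv)))

def delta(u, v):
    alpha_u = 1 / (1 - sum(a * a for a in u))
    alpha_v = 1 / (1 - sum(b * b for b in v))
    return sum((a - b) ** 2 for a, b in zip(u, v)) * alpha_u * alpha_v

random.seed(0)
# Random points strictly inside the unit ball (4 dims, |coord| < 0.4)
pts = [[random.uniform(-0.4, 0.4) for _ in range(4)] for _ in range(50)]
q = [0.1, 0.2, -0.1, 0.0]
by_dist = sorted(pts, key=lambda p: poincare_dist(q, p))
by_delta = sorted(pts, key=lambda p: delta(q, p))
```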
Lorentz Model (Hyperboloid)
For Lorentz vectors x = (t, x1, ..., xn) and y = (s, y1, ..., yn):
$$ \langle x, y \rangle_L = -ts + \sum_i x_i y_i $$
Distance:
$$ d(x, y) = \operatorname{arcosh}\left(-\langle x, y \rangle_L\right) $$
Validation constraints:
- Upper sheet: $t > 0$
- Unit hyperboloid: $-t^2 + x_1^2 + \dots + x_n^2 = -1$
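A quick numeric check of the model, with points lifted onto the 1-D hyperboloid (illustrative only):

```python
import math

def lorentz_inner(x, y):
    # <x, y>_L = -ts + sum_i x_i y_i
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift(r):
    """A point at hyperbolic radius r on the 1-D hyperboloid (upper sheet, t > 0)."""
    return (math.cosh(r), math.sinh(r))

x, y = lift(1.0), lift(3.0)
d = math.acosh(-lorentz_inner(x, y))   # geodesic distance along the line
```

Both points satisfy $\langle x, x \rangle_L = -1$, and the distance between radii 1 and 3 comes out as exactly 2.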
Optimization: SQ8 Quantization
For the Lorentz model, HyperspaceDB implements a specialized 8-bit scalar quantization (SQ8) with dynamic range scaling and GPU/SIMD acceleration. See Lorentz Quantization Details.
SDK Hyperbolic Utilities (v2.2.1)
To keep core DB focused and still support geometry-heavy clients, SDKs include helpers:
- Python: `hyperspace.mobius_add`, `hyperspace.exp_map`, `hyperspace.log_map`
- Rust: `hyperspace_sdk::math::{mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean}`
- TypeScript: `HyperbolicMath.mobiusAdd` / `expMap` / `logMap` / `parallelTransport` / `riemannianGradient` / `frechetMean`
Fréchet mean support is useful for reconsolidation workflows where multiple nearby hyperbolic embeddings should be merged into one robust centroid.
These functions are useful for L-system growth, manifold transforms, and pre-insert vector shaping pipelines.
Geometric Search (Spatial Filters)
HyperspaceDB v3.0 introduces native geometric predicates. Unlike metadata filters, these are based on the vector's position in the embedding space.
1. The Ball Filter (Proximity)
Mathematical definition: $\{ v \in \mathbb{D}^n \mid d(c, v) \le r \}$. Used for finding all entities within a semantic radius of a concept center $c$.
2. The Box Filter (Constraints)
Mathematical definition: $\{ v \in \mathbb{R}^n \mid \forall i,\ \min_i \le v_i \le \max_i \}$. Used for bounding reasoning to a specific workspace (e.g., "only consider nodes in the 1st quadrant").
3. The Cone Filter (Angular Logic)
Mathematical definition (angular distance): $\{ v \in \mathbb{R}^n \mid \text{angle}(\text{axis}, v) \le \text{aperture} \}$. Inspired by ConE (Zhang & Wang, 2021), this filter allows for modeling logical entailment and hierarchy-aware fields of view. In HyperspaceDB, this is implemented as an $O(N)$ dot-product check against the aperture threshold.
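The O(N) check can be sketched as a cosine comparison (illustrative Python; the names are not the DB's API):

```python
import math

def in_cone(axis, v, aperture_rad):
    """True iff angle(axis, v) <= aperture: a single dot product plus norms."""
    dot = sum(a * b for a, b in zip(axis, v))
    na = math.sqrt(sum(a * a for a in axis))
    nv = math.sqrt(sum(b * b for b in v))
    # Compare cosines instead of angles: cos is decreasing on [0, pi]
    return dot / (na * nv) >= math.cos(aperture_rad)

inside = in_cone([1.0, 0.0], [1.0, 0.1], math.radians(30))   # ~5.7 degrees off-axis
outside = in_cone([1.0, 0.0], [0.0, 1.0], math.radians(30))  # 90 degrees off-axis
```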
Performance: Sequential Bitset Pruning
To ensure these filters don't slow down the engine, geometric intersection is performed efficiently during the candidate selection phase. We use a Bitset Pruning pattern:
- Generate a bitset of candidates satisfying the geometric query.
- Perform HNSW bitwise-AND intersection during the search phase.
- This allows for $O(1)$ rejection of candidates outside the region of interest.
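The pruning pattern reduces to integer bit operations (conceptual Python; the engine uses word-packed bitsets):

```python
def region_bitset(points, predicate):
    """Bit i is set iff candidate i satisfies the geometric predicate."""
    bits = 0
    for i, p in enumerate(points):
        if predicate(p):
            bits |= 1 << i
    return bits

def allowed(bits, candidate_id):
    # O(1) rejection during graph traversal
    return (bits >> candidate_id) & 1 == 1

pts = [[0.1, 0.1], [0.9, 0.9], [0.2, 0.0]]
# Ball filter of radius 0.5 around the origin
inside = region_bitset(pts, lambda p: sum(v * v for v in p) <= 0.25)
```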
Zero-Copy Hyperbolic HNSW
Our implementation of Hierarchical Navigable Small Worlds is unique in two ways:
- Metric: It natively speaks hyperbolic geometry.
- Concurrency: It uses fine-grained locking (
parking_lot::RwLock) on every node.
Graph Structure
The graph consists of Layers (0..max).
- Layer 0: Contains ALL vectors. This is the base ground truth.
- Layer N: Contains a random subset of vectors from Layer N-1.
This creates a skip-list-like structure for navigation.
The "Select Neighbors" Heuristic
When connecting a new node $U$ to neighbors in HNSW, we use a heuristic to ensure diversity.
Standard Euclidean HNSW checks:
- Add neighbor $V$ if $dist(U, V)$ is minimal.
- Skip $V$ if it is closer to an already selected neighbor than to $U$.
Hyperbolic Adaptation: we use the Poincaré distance for this check. Because the space expands exponentially, "diversity" is easier to achieve, but "closeness" is tricky: points near the boundary (norm $\approx$ 1) have massive distances even if they look close in Euclidean space.
Our heuristic strictly respects the Poincaré metric, preventing "short-circuiting" through the center of the ball unless mathematically valid.
Locking Strategy
We do not use a global lock.
- Reading: Search traverses nodes acquiring brief Read Locks.
- Writing: Indexer acquires Write Locks only on the specific adjacency lists (layers) it is modifying.
This allows insert and search to run in parallel with high throughput.
Batch Search Acceleration
For high-throughput batch search operations, HNSW can offload Minkowski distance computations to the GPU using WGSL compute shaders. This is particularly effective when combined with Lorentz SQ8 Quantization.