HyperspaceDB


The fastest vector database for hierarchical & flat data, written in Rust.
HyperspaceDB natively supports both the Poincaré ball model (for hierarchies) and Euclidean space (for standard OpenAI/BGE embeddings), delivering extreme performance through specialized SIMD kernels.


🚀 Key Features

  • ⚡️ Extreme Performance: Built with Nightly Rust and SIMD intrinsics for maximum search throughput.
  • 📐 Cognitive Math Engine: Hyperbolic HNSW optimized for the Poincaré and Lorentz metrics, plus O(N) Wasserstein-1 logic.
  • 📦 Compression: Integrated ScalarI8 and Binary quantization reduces memory footprint by 87% to 98%.
  • 🧵 Async Write Pipeline: Decoupled ingestion with a background indexing worker and WAL for 10x faster inserts.
  • 🖥️ Mission Control TUI: Real-time terminal dashboard for monitoring QPS, segments, and system health.
  • 🕸️ Edge Ready: WASM compilation target allows running the full DB in the browser with Local-First privacy and IndexedDB persistence.
  • 🛠️ Runtime Tuning: Dynamically adjust ef_search and ef_construction parameters on the fly via gRPC.
  • 🐙 Multi-Tenancy: Native SaaS support with namespace isolation (user_id) and billing stats.
  • 🔁 Replication: Leader-Follower architecture with Anti-Entropy catch-up for high availability.
  • ⚖️ Cognitive Math & Tribunal Router: Native SDK utilities for calculating geometric trust scores on graphs to detect LLM hallucinations.
  • 📑 Memory Reconsolidation: Trigger AI sleep mode natively within the DB to restructure vectors via Flow Matching / Riemannian SGD.

🛠 Architecture

HyperspaceDB follows a Persistence-First, Index-Second design:

  1. gRPC Request: Insert/Search commands arrive via a high-performance Tonic server.
  2. WAL & Segmented Storage: Every insert is immediately persisted to a Write-Ahead Log and a memory-mapped segmented file store.
  3. Background Indexer: The HNSW graph is updated asynchronously by a dedicated thread-pool, ensuring 0ms search blocking.
  4. Snapshots: Real-time graph topology is periodically serialized using rkyv for near-instant restarts.
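The Persistence-First write path above can be sketched in miniature (a hypothetical Python toy, not the actual Rust implementation): the insert call returns as soon as the WAL append completes, and a background worker updates the index later.

```python
import queue
import threading

class MiniStore:
    """Toy persistence-first write path: WAL append first, index later."""
    def __init__(self):
        self.wal = []                 # stands in for the on-disk Write-Ahead Log
        self.index = {}               # stands in for the HNSW graph
        self.pending = queue.Queue()  # hand-off to the background indexer
        self._worker = threading.Thread(target=self._index_loop, daemon=True)
        self._worker.start()

    def insert(self, vec_id, vector):
        self.wal.append((vec_id, vector))   # durable before acknowledging
        self.pending.put((vec_id, vector))  # indexing happens off the hot path
        return True                         # caller never waits on the index

    def _index_loop(self):
        while True:
            vec_id, vector = self.pending.get()
            self.index[vec_id] = vector     # real engine inserts into HNSW here
            self.pending.task_done()

store = MiniStore()
store.insert(1, [0.1, 0.2])
store.pending.join()  # wait for the background indexer (demo only)
```

The point of the design is that search and insert acknowledgement never block on graph construction; durability comes from the WAL, not the index.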

πŸƒ Quick Start

1. Build and Start Server

Make sure you have `just` and the Rust nightly toolchain installed.

cargo build --release
./target/release/hyperspace-server

2. Launch Dashboard

./target/release/hyperspace-cli

3. Use Python SDK

```
pip install ./sdks/python
```

```python
from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051")
client.insert(vector=[0.1]*8, metadata={"category": "tech"})
results = client.search(vector=[0.11]*8, top_k=5)
```

📊 Performance Benchmarks

Tested on M4 Pro (Emulated), 1M Vectors (8D)

  • Insert Throughput: ~156,000 vectors/sec (Sustained)
  • Search Latency: ~2.47ms (156,000 QPS) @ 1M scale
  • Storage Efficiency: Automatic segmentation + mmap

"The 1 Million Challenge"

HyperspaceDB handles 1,000,000 vectors without the degradation seen in traditional vector DBs, sustaining 156,000 QPS at the 1M scale.


📄 License

AGPLv3 © YARlabs

Evaluation & Benchmarks

HyperspaceDB is optimized for two critical metrics: Throughput (Ingestion speed) and Latency (Search speed).

Test Environment

  • Hardware: Apple M4 Pro (Emulated Environment) / Linux AVX2
  • Dataset: 1,000,000 vectors, 1024 Dimensions, Random Distribution in Unit Ball.
  • Config: ef_construction=400, ef_search=400

Results

🚀 Ingestion Speed

Thanks to the Async Write Buffer (WAL) and background indexing, ingestion does not block user requests.

| Count | Time | Throughput | Storage Segments |
|---|---|---|---|
| 10,000 | 0.6s | 15,624 vec/s | 1 |
| 100,000 | 6.5s | 15,300 vec/s | 2 |
| 1,000,000 | 64.8s | 15,420 vec/s | 15 |

πŸ” Search Latency (1M Scale)

At 1 million vectors, search cost grows only with graph depth ($O(\log N)$), demonstrating an effective HNSW implementation.

| Metric | Value |
|---|---|
| QPS | 14,668 queries/sec |
| Avg Latency | 0.07 ms |
| P99 Latency | < 1.0 ms |

Why is it so fast?

  1. ScalarI8 Quantization: Fits 8x more vectors in CPU cache.
  2. No acosh: Inner loop uses a monotonic proxy function ($\delta$).
  3. SIMD: Vector operations use platform-specific intrinsics.
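The "no acosh" trick relies on the standard Poincaré identity $d(x,y) = \operatorname{acosh}(1 + \delta)$ with $\delta = 2\lVert x-y\rVert^2 / ((1-\lVert x\rVert^2)(1-\lVert y\rVert^2))$: since acosh is monotonically increasing, ranking candidates by $\delta$ gives the same neighbor order without the expensive call. A plain-Python sketch (illustrative, not the engine's SIMD code):

```python
import math

def poincare_delta(x, y):
    """Monotonic proxy for the Poincaré distance d = acosh(1 + delta).
    Sorting by delta yields the same order as sorting by d."""
    diff = sum((a - b) ** 2 for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    return 2.0 * diff / ((1.0 - nx) * (1.0 - ny))

def poincare_distance(x, y):
    return math.acosh(1.0 + poincare_delta(x, y))

q = [0.1, 0.2]
points = [[0.3, 0.1], [0.05, 0.25], [0.6, 0.2]]
by_delta = sorted(points, key=lambda p: poincare_delta(q, p))
by_dist = sorted(points, key=lambda p: poincare_distance(q, p))
assert by_delta == by_dist  # identical ranking, cheaper comparison
```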

Installation

HyperspaceDB runs on Linux and macOS. Windows is supported via WSL2.

Prerequisites

  • Rust: Nightly toolchain is required for SIMD features.
  • Protoc: Protocol Buffer compiler for gRPC.

Option 1: Docker

The easiest way to get started.

docker pull glukhota/hyperspace-db:latest
# or build locally
docker build -t hyperspacedb .

docker run -p 50051:50051 -v $(pwd)/data:/app/data hyperspacedb

Option 2: Build from Source

  1. Install dependencies

    # Ubuntu/Debian
    sudo apt install protobuf-compiler cmake
    
    # macOS
    brew install protobuf
    
  2. Install Rust Nightly

    rustup toolchain install nightly
    rustup default nightly
    
  3. Clone and Build

    git clone https://github.com/yarlabs/hyperspace-db
    cd hyperspace-db
    cargo build --release
    
  4. Run

    ./target/release/hyperspace-server
    

Quick Start

Once the server is running on localhost:50051, you can use any official SDK.

1) Start server

cargo build --release
./target/release/hyperspace-server

2) Open dashboard

http://localhost:50050

3) First interaction (Python)

from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051", api_key="I_LOVE_HYPERSPACEDB")
collection = "quickstart"

client.delete_collection(collection)
client.create_collection(collection, dimension=3, metric="cosine")

client.insert(id=1, vector=[0.1, 0.2, 0.3], collection=collection)
client.insert(id=2, vector=[0.2, 0.1, 0.4], collection=collection)

print(client.search(vector=[0.1, 0.2, 0.3], top_k=2, collection=collection))

# Batch search (recommended for throughput)
batch = client.search_batch(
    vectors=[[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]],
    top_k=2,
    collection=collection,
)
print(batch)

4) Metric notes

  • cosine, l2, euclidean: general embeddings.
  • poincare: vectors must satisfy ||x|| < 1.
  • lorentz: vectors must be on upper hyperboloid sheet.
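A minimal client-side pre-check for these constraints might look like this (a hypothetical helper for illustration; it is not part of the SDK):

```python
import math

def check_vector(vector, metric):
    """Illustrative validation of the metric notes above (not SDK code)."""
    if metric == "poincare":
        norm = math.sqrt(sum(v * v for v in vector))
        return norm < 1.0                      # must lie inside the unit ball
    if metric == "lorentz":
        t, *rest = vector
        minkowski = -t * t + sum(v * v for v in rest)
        return t > 0 and abs(minkowski + 1.0) < 1e-9  # upper hyperboloid sheet
    return True                                # cosine / l2 / euclidean: no constraint

assert check_vector([0.3, 0.4], "poincare")            # ||x|| = 0.5 < 1
assert not check_vector([0.8, 0.8], "poincare")        # ||x|| > 1
assert check_vector([math.sqrt(2.0), 1.0], "lorentz")  # -2 + 1 = -1
```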

Python SDK

The official Python client provides an ergonomic wrapper around the gRPC interface.

Installation

Install from PyPI:

pip install hyperspacedb

Client-Side Vectorization (Fat Client)

The SDK supports built-in embedding generation using popular providers (OpenAI, Cohere, etc.). This allows you to insert and search using raw text.

Installation with Extras

# Install with OpenAI support
pip install "hyperspacedb[openai]"

# Install with all embedder integrations
pip install "hyperspacedb[all]"

Usage

from hyperspace import HyperspaceClient, OpenAIEmbedder

# 1. Init with Embedder
embedder = OpenAIEmbedder(api_key="sk-...")
client = HyperspaceClient(embedder=embedder)

# 2. Insert Document
client.insert(id=1, document="HyperspaceDB supports Hyperbolic geometry.", metadata={"tag": "math"})

# 3. Search by Text
results = client.search(query_text="non-euclidean geometry", top_k=5)

Reference

HyperspaceClient

class HyperspaceClient(host="localhost:50051", api_key=None, embedder=None)
  • embedder: Instance of BaseEmbedder subclass.

Supported Embedders

  • OpenAIEmbedder
  • OpenRouterEmbedder
  • CohereEmbedder
  • VoyageEmbedder
  • GoogleEmbedder
  • SentenceTransformerEmbedder (Local models)

Methods

insert(id, vector=None, document=None, metadata=None) -> bool

  • id (int): Unique identifier (u32).
  • vector (List[float]): The embedding.
  • document (str): Raw text to embed (requires configured embedder).
  • Note: Provide either vector OR document.

search(vector=None, query_text=None, top_k=10, ...) -> List[dict]

  • vector (List[float]): Query vector.
  • query_text (str): Raw text query.

search_batch(vectors, top_k=10, collection="") -> List[List[dict]]

Batch search API that sends multiple SearchRequest objects in one gRPC call.

rebuild_index(collection, filter_query=None) -> bool

Supports metadata-aware pruning during rebuild:

client.rebuild_index(
    "docs_py",
    filter_query={"key": "energy", "op": "lt", "value": 0.1},
)

delete(id, collection="") -> bool

Removes a single vector by its ID.

analyze_delta_hyperbolicity(vectors, num_samples=1000) -> (float, str)

Analyzes a set of vectors to determine if they exhibit hyperbolic structure. Returns the Gromov delta and a recommended metric ("lorentz", "poincare", or "l2").
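The four-point condition behind this analysis can be sketched in plain Python (an illustrative estimator; the SDK's sampling strategy and the thresholds behind its metric recommendation are not specified here):

```python
import math
import random

def gromov_delta(points, dist, num_samples=500, seed=0):
    """Four-point estimate of the Gromov delta. A small delta relative to
    the diameter suggests tree-like structure, i.e. a hyperbolic metric
    may fit the data better. Illustrative only."""
    rng = random.Random(seed)
    delta = 0.0
    for _ in range(num_samples):
        w, x, y, z = (rng.choice(points) for _ in range(4))
        def gp(a, b):
            # Gromov product (a|b) with base point w
            return 0.5 * (dist(w, a) + dist(w, b) - dist(a, b))
        delta = max(delta, min(gp(x, z), gp(y, z)) - gp(x, y))
    return delta

euclid = math.dist

# Points on a line form a 0-hyperbolic (tree-like) metric space
line = [[float(i), 0.0] for i in range(10)]
print(gromov_delta(line, euclid))  # 0.0 for a line
```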

Graph traversal methods

  • get_node(collection, id, layer=0)
  • get_neighbors(collection, id, layer=0, limit=64, offset=0)
  • get_concept_parents(collection, id, layer=0, limit=32)
  • traverse(collection, start_id, max_depth=2, max_nodes=256, layer=0, filter=None, filters=None)
  • find_semantic_clusters(collection, layer=0, min_cluster_size=3, max_clusters=32, max_nodes=10000)

Hyperbolic math utilities

from hyperspace import (
    mobius_add,
    exp_map,
    log_map,
    parallel_transport,
    riemannian_gradient,
    frechet_mean,
)
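As a reference for the semantics of `mobius_add`, here is a plain-Python version of the standard Möbius addition formula on the Poincaré ball (illustrative only; use the SDK utility in practice):

```python
def mobius_add(x, y):
    """Reference Möbius addition x ⊕ y on the Poincaré ball.
    The result always stays inside the unit ball."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    denom = 1.0 + 2.0 * dot + nx * ny
    cx = (1.0 + 2.0 * dot + ny) / denom
    cy = (1.0 - nx) / denom
    return [cx * a + cy * b for a, b in zip(x, y)]

z = mobius_add([0.3, 0.0], [0.0, 0.4])
assert sum(v * v for v in z) < 1.0                       # stays in the ball
assert mobius_add([0.3, 0.0], [0.0, 0.0]) == [0.3, 0.0]  # y = 0 is the identity
```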

Rust SDK

For low-latency applications, connect directly using the Rust SDK.

Installation

Add to your Cargo.toml:

[dependencies]
hyperspace-sdk = "2.2.1"
tokio = { version = "1", features = ["full"] }

Usage

use hyperspace_sdk::Client;
use std::collections::HashMap;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Connect (with optional API Key)
    let api_key = std::env::var("HYPERSPACE_API_KEY").ok();
    let mut client = Client::connect(
        "http://127.0.0.1:50051".into(),
        api_key,
        None
    ).await?;

    // --- Optional: Configure Embedder (Feature: "embedders") ---
    #[cfg(feature = "embedders")]
    {
        // Example: OpenAI
        use hyperspace_sdk::OpenAIEmbedder;
        let openai_key = std::env::var("OPENAI_API_KEY").unwrap();
        let embedder = OpenAIEmbedder::new(openai_key, "text-embedding-3-small".to_string());
        
        // Or: Voyage AI
        // use hyperspace_sdk::VoyageEmbedder;
        // let embedder = VoyageEmbedder::new(api_key, "voyage-large-2".to_string());

        client.set_embedder(Box::new(embedder));
        
        // Insert Document
        let mut meta = HashMap::new();
        meta.insert("tag".to_string(), "rust".to_string());
        client.insert_document(100, "Rust is blazing fast.", meta).await?;
        
        // Search Document
        let results = client.search_document("fast systems language", 5).await?;
        println!("Document Search Results: {:?}", results);
    }
    // -----------------------------------------------------------

    // 2. Insert with Vector (Low-Level)
    let vec = vec![0.1; 8];
    let mut meta = HashMap::new();
    meta.insert("name".to_string(), "item-42".to_string());
    
    client.insert(42, vec.clone(), meta, None).await?;

    // 3. Basic Search
    let results = client.search(vec.clone(), 5, None).await?;
    
    // 4. Advanced / Hybrid Search
    // e.g. Find semantically similar items that also mention "item"
    let hybrid = Some(("item".to_string(), 0.5)); 
    let results = client.search_advanced(vec, 5, vec![], hybrid, None).await?;
    
    for res in results {
        println!("Match: {} (dist: {})", res.id, res.distance);
    }
    
    Ok(())
}

Features

  • embedders: Enables set_embedder, insert_document, and search_document. Requires reqwest and serde.

Use search_batch or search_batch_f32 to reduce per-request overhead in high-concurrency workloads.

Graph Traversal API

Rust SDK exposes graph calls directly:

  • get_node
  • get_neighbors
  • get_concept_parents
  • traverse
  • find_semantic_clusters

Rebuild with Metadata Pruning

Use rebuild_index_with_filter to run vacuum/rebuild and prune vectors in one request:

client
    .rebuild_index_with_filter(
        "docs_rust".to_string(),
        "energy".to_string(),
        "lt".to_string(),
        0.1,
    )
    .await?;

Hyperbolic Math Utilities

use hyperspace_sdk::math::{
    mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean,
};

WebAssembly (WASM)

Integrations (LangChain & n8n)

Model Context Protocol (MCP)

API Reference

HyperspaceDB operates on a Dual-API architecture:

  1. gRPC (Data Plane): High-performance ingestion and search.
  2. HTTP (Control Plane): Management, monitoring, and dashboard integration.

📑 gRPC API (Data Plane)

Defined in hyperspace.proto. Used by SDKs (Python, Rust, Go).

Collection Management

CreateCollection

Creates a new independent vector index.

rpc CreateCollection (CreateCollectionRequest) returns (StatusResponse);

message CreateCollectionRequest {
  string name = 1;
  uint32 dimension = 2; // e.g. 1536, 1024, 64
  string metric = 3;    // "l2", "euclidean", "cosine", "poincare", "lorentz"
}

DeleteCollection

Drops a collection and all its data.

rpc DeleteCollection (DeleteCollectionRequest) returns (StatusResponse);

ListCollections

Retrieves all active collections for the current tenant, including their metadata.

rpc ListCollections (Empty) returns (ListCollectionsResponse);

message ListCollectionsResponse {
  repeated CollectionSummary collections = 1;
}

message CollectionSummary {
  string name = 1;
  uint64 count = 2;
  uint32 dimension = 3;
  string metric = 4;
}

GetCollectionStats

Returns real-time statistics for a single collection.

rpc GetCollectionStats (CollectionStatsRequest) returns (CollectionStatsResponse);

message CollectionStatsResponse {
  uint64 count = 1;
  uint32 dimension = 2;
  string metric = 3;
  uint64 indexing_queue = 4;
}

Vector Operations

Insert

Ingests a vector into a specific collection.

rpc Insert (InsertRequest) returns (InsertResponse);

message InsertRequest {
  string collection = 1;      // Collection name
  repeated double vector = 2; // Data point
  uint32 id = 3;              // External ID
  map<string, string> metadata = 4; // Metadata tags
  DurabilityLevel durability = 7; // Durability override
  map<string, MetadataValue> typed_metadata = 8; // Typed metadata (int/float/bool/string)
}

enum DurabilityLevel {
  DEFAULT_LEVEL = 0; // Use server config
  ASYNC = 1;         // Flush OS cache (Fastest)
  BATCH = 2;         // Background fsync (Balanced)
  STRICT = 3;        // Fsync every write (High Safety)
}

typed_metadata is the preferred metadata path for new clients. String metadata remains as a compatibility path.

Search

Finds nearest neighbors.

rpc Search (SearchRequest) returns (SearchResponse);

message SearchRequest {
  string collection = 1;
  repeated double vector = 2;
  uint32 top_k = 3;
  // Metadata string filter (e.g. "category:book")
  map<string, string> filter = 4;
  // Complex filter object
  repeated Filter filters = 5;
  // Hybrid search
  optional string hybrid_query = 6;
  optional float hybrid_alpha = 7;
  // Wasserstein 1D CDF O(N) distance
  optional bool use_wasserstein = 8;
}
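As background for the `use_wasserstein` flag: for two equal-length 1-D samples, the Wasserstein-1 (CDF) distance reduces to the mean absolute difference of their sorted values, a linear scan once the inputs are sorted (a plain-Python sketch, independent of the engine's kernel):

```python
def wasserstein_1d(a, b):
    """W1 between two equal-length 1-D samples: the optimal transport
    plan pairs order statistics, so the distance is the mean absolute
    difference of sorted values."""
    assert len(a) == len(b)
    sa, sb = sorted(a), sorted(b)
    return sum(abs(x - y) for x, y in zip(sa, sb)) / len(a)

assert wasserstein_1d([0.0, 1.0], [1.0, 0.0]) == 0.0  # same distribution
assert wasserstein_1d([0.0, 0.0], [1.0, 1.0]) == 1.0  # shifted by 1
```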

Geometric Filters (New in v3.0)

HyperspaceDB v3.0 introduces native spatial constraints. These run on the bitset level inside the engine and are significantly faster than application-level filtering.

```protobuf
message Filter {
  oneof condition {
    Match match = 1;
    Range range = 2;
    InCone in_cone = 3;
    InBox in_box = 4;
    InBall in_ball = 5;
  }
}

// 1. Proximity Filter
message InBall {
  repeated double center = 1;
  double radius = 2;
}

// 2. N-Dimensional Bounding Box
message InBox {
  repeated double min_bounds = 1;
  repeated double max_bounds = 2;
}

// 3. Angular Cone (for ConE-style embeddings)
message InCone {
  repeated double axes = 1;      // Vector direction
  repeated double apertures = 2; // Angular width (radians)
  double cen = 3;                // Centrality offset
}
```

`SearchResult` now includes both `metadata` and `typed_metadata`.
Range filters are evaluated with numeric semantics (`f64`) against typed metadata numeric values.
For gRPC clients, decimal thresholds are supported via `Range.gte_f64` / `Range.lte_f64` (`gte/lte` `int64` remains as compatibility path).

gRPC `Range` examples:

```protobuf
// Integer threshold (compatibility path)
Filter {
  range: {
    key: "depth",
    gte: 2,
    lte: 10
  }
}

// Decimal threshold (recommended for typed numeric metadata)
Filter {
  range: {
    key: "energy",
    gte_f64: 0.8,
    lte_f64: 1.0
  }
}
```

SearchBatch

Finds nearest neighbors for multiple queries in a single RPC call.

rpc SearchBatch (BatchSearchRequest) returns (BatchSearchResponse);

message BatchSearchRequest {
  repeated SearchRequest searches = 1;
}

message BatchSearchResponse {
  repeated SearchResponse responses = 1;
}

Recommended for high-concurrency clients and benchmarks to reduce per-request gRPC overhead.

SubscribeToEvents

Streams CDC events for post-insert/delete hooks.

rpc SubscribeToEvents (EventSubscriptionRequest) returns (stream EventMessage);

enum EventType {
  EVENT_UNKNOWN = 0;
  VECTOR_INSERTED = 1;
  VECTOR_DELETED = 2;
}

message EventSubscriptionRequest {
  repeated EventType types = 1;
  optional string collection = 2;
}

message EventMessage {
  EventType type = 1;
  oneof payload {
    VectorInsertedEvent vector_inserted = 2;
    VectorDeletedEvent vector_deleted = 3;
  }
}

Use this stream to build external pipelines (audit, Elasticsearch sync, graph projections, Neo4j updaters). SDKs (Python/TypeScript/Rust) expose convenience subscription methods for this stream.

Reliability note:

  • Stream consumers may lag under burst load; the server handles lagged broadcast reads without dropping the whole stream task.
  • Tune HS_EVENT_STREAM_BUFFER for higher event fan-out pressure.

Delete

Removes a single vector from a collection by its external ID.

rpc Delete (DeleteRequest) returns (DeleteResponse);

message DeleteRequest {
  string collection = 1;
  uint32 id = 2;
}

message DeleteResponse {
  bool success = 1;
}

πŸ” Delta Sync Protocol

Advanced synchronization for consistency verification and recovery.

SyncHandshake

Computes the difference between client and server states using Merkle-like bucket hashes.

rpc SyncHandshake (SyncHandshakeRequest) returns (SyncHandshakeResponse);

SyncPull

Streams missing vectors from the server based on differing buckets.

rpc SyncPull (SyncPullRequest) returns (stream SyncVectorData);

SyncPush

Streams client-side unique vectors to the server to achieve global consistency.

rpc SyncPush (stream SyncVectorData) returns (SyncPushResponse);

MetadataValue (Typed Metadata)

message MetadataValue {
  oneof kind {
    string string_value = 1;
    int64 int_value = 2;
    double double_value = 3;
    bool bool_value = 4;
  }
}

Graph Traversal API (v2.3)

rpc GetNode (GetNodeRequest) returns (GraphNode);
rpc GetNeighbors (GetNeighborsRequest) returns (GetNeighborsResponse);
rpc GetConceptParents (GetConceptParentsRequest) returns (GetConceptParentsResponse);
rpc Traverse (TraverseRequest) returns (TraverseResponse);
rpc FindSemanticClusters (FindSemanticClustersRequest) returns (FindSemanticClustersResponse);

Key safety guards:

  • GetNeighborsRequest.limit and offset for bounded pagination.
  • TraverseRequest.max_depth and max_nodes to prevent unbounded graph walks.
  • FindSemanticClustersRequest.max_clusters and max_nodes for bounded connected-component scans.

TraverseRequest is filter-aware and supports both:

  • filter (map<string,string>)
  • filters (Match / Range)

GetNeighborsResponse now includes edge_weights, where edge_weights[i] is the distance from source node to neighbors[i].

RebuildIndex with pruning filter (v2.2.1)

message RebuildIndexRequest {
  string name = 1;
  optional VacuumFilterQuery filter_query = 2;
}

message VacuumFilterQuery {
  string key = 1;
  string op = 2; // "lt" | "lte" | "gt" | "gte" | "eq" | "ne"
  double value = 3;
}

Use this API for pruning cycles when you need to rebuild an index and drop low-value vectors in one server-side operation.

TriggerReconsolidation (v3.0.1)

Trigger AI Sleep Mode (Riemannian SGD / Flow Matching) directly on the engine to algorithmically shift vectors.

rpc TriggerReconsolidation (ReconsolidationRequest) returns (StatusResponse);

message ReconsolidationRequest {
  string collection = 1;
  repeated double target_vector = 2;
  double learning_rate = 3;
}

InsertText (v3.0.1)

Inserts raw text to be embedded and stored on the server.

rpc InsertText (InsertTextRequest) returns (InsertResponse);

message InsertTextRequest {
  string collection = 1;
  string text = 2;
  uint32 id = 3;
  map<string, MetadataValue> typed_metadata = 4;
}

Vectorize (v3.0.1)

Converts text to a vector using the server's embedding engine.

rpc Vectorize (VectorizeRequest) returns (VectorizeResponse);

message VectorizeRequest {
  string text = 1;
  string metric = 2; // "l2", "cosine", "poincare", "lorentz"
}

message VectorizeResponse {
  repeated double vector = 1;
}

SearchText (v3.0.1)

Searches the collection using a text query.

rpc SearchText (SearchTextRequest) returns (SearchResponse);

message SearchTextRequest {
  string collection = 1;
  string text = 2;
  uint32 top_k = 3;
  repeated Filter filters = 4;
}

🌐 HTTP API (Control Plane)

Served on port 50050 (default). All endpoints under /api.

Authentication & Multi-Tenancy

Every request should include:

  • x-api-key: API Key (optional if disabled, but recommended)
  • x-hyperspace-user-id: Tenant Identifier (e.g. client_123). If omitted, defaults to default_admin.

Cluster Status

GET /api/cluster/status

Returns the node's identity and topology role.

{
  "node_id": "uuid...",
  "role": "Leader", // or "Follower"
  "upstream_peer": null,
  "downstream_peers": []
}

Swarm Peers (Gossip Protocol)

GET /api/swarm/peers

Returns active peers discovered via UDP multicast (Edge-to-Edge Sync).

{
  "gossip_enabled": true,
  "peer_count": 2,
  "peers": [...]
}

Node Status (Compatibility)

GET /api/status

Returns runtime status and node configuration. Dashboard uses this endpoint first, with fallback to /api/cluster/status.

System Metrics

GET /api/metrics

Real-time system resource usage.

{
    "cpu_usage_percent": 12,
    "ram_usage_mb": 512,
    "disk_usage_mb": 1024,
    "total_collections": 5,
    "total_vectors": 1000000
}

Admin / Billing (Since v2.0)

Requires user_id: admin

GET /api/admin/usage

Returns JSON map of user_id -> usage_stats:

{
  "tenant_A": {
    "collection_count": 2,
    "vector_count": 1500,
    "disk_usage_bytes": 1048576
  }
}

List Collections

GET /api/collections

Returns summary of all active collections.

[
  {
    "name": "my_docs",
    "count": 1500,
    "dimension": 1536,
    "metric": "l2"
  }
]

Collection Search (HTTP Playground)

POST /api/collections/{name}/search

Convenience endpoint for dashboard/manual testing.

{
  "vector": [0.1, 0.2, 0.3],
  "top_k": 5
}

Graph HTTP Endpoints (Dashboard / tooling)

  • GET /api/collections/{name}/graph/node?id={id}&layer={layer}
  • GET /api/collections/{name}/graph/neighbors?id={id}&layer={layer}&limit={limit}&offset={offset}
  • GET /api/collections/{name}/graph/parents?id={id}&layer={layer}&limit={limit}
  • POST /api/collections/{name}/graph/traverse
  • POST /api/collections/{name}/graph/clusters

User Guide

Server Configuration

HyperspaceDB is configured via environment variables or a .env file.

Core Settings

| Variable | Default | Description |
|---|---|---|
| RUST_LOG | info | Log level (debug, info, error) |
| HS_PORT | 50051 | gRPC listening port |
| HS_HTTP_PORT | 50050 | HTTP Dashboard port |
| HS_DATA_DIR | ./data | Path to store segments and WAL |
| HS_IDLE_TIMEOUT_SEC | 3600 | Inactivity time (seconds) before a collection unloads to disk |
| HS_DIMENSION | 1024 | Default vector dimensionality (8, 64, 768, 1024, 1536, 3072, 4096, 8192) |
| HS_METRIC | cosine | Distance metric (cosine, poincare, l2, euclidean, lorentz) |
| HS_QUANTIZATION_LEVEL | none | Compression (none, scalar (i8), binary (1-bit)) |
| HS_STORAGE_FLOAT32 | false | Store raw vectors as f32 (mode=none) and promote to f64 in distance kernels |
| HS_FAST_UPSERT_DELTA | 0.0 | Fast upsert L2 threshold. 0.0 disables; typical 0.001..0.05 for iterative updates; too high can keep stale graph links |
| HS_EVENT_STREAM_BUFFER | 1024 | Broadcast ring size for CDC and replication streams |
| HS_RERANK_ENABLED | false | Enable exact top-K re-ranking after ANN candidate retrieval |
| HS_RERANK_OVERSAMPLE | 4 | Candidate multiplier used before exact re-rank (top_k * factor) |
| HS_GPU_BATCH_ENABLED | false | Enable runtime auto-dispatch policy for batch metric kernels |
| HS_GPU_MIN_BATCH | 128 | Minimum batch size for GPU offload policy |
| HS_GPU_MIN_DIM | 1024 | Minimum vector dimension for GPU offload policy |
| HS_GPU_MIN_WORK | 262144 | Minimum workload (batch * dim) for GPU offload |
| HS_GPU_L2_ENABLED | true | Enable GPU dispatch for L2 batch kernel (requires gpu-runtime feature) |
| HS_GPU_COSINE_ENABLED | true | Enable GPU dispatch for cosine batch kernel (requires gpu-runtime feature) |
| HS_GPU_POINCARE_ENABLED | true | Enable GPU dispatch for Poincaré batch kernel (requires gpu-runtime feature) |
| HS_GPU_LORENTZ_ENABLED | true | Enable GPU dispatch for Lorentz float batch kernel (runtime path) |
| HS_SEARCH_BATCH_INNER_CONCURRENCY | 1 | Internal parallel fan-out in SearchBatch handler (bounded) |
| HS_SEARCH_CONCURRENCY | 0 | Global concurrent search-task limit per collection (0 = auto by CPU cores, clamped to CPU*4) |

Cloud Tiering (S3)

Enabled only when compiled with s3-tiering feature.

| Variable | Default | Description |
|---|---|---|
| HS_STORAGE_BACKEND | local | local (all chunks on disk) or s3 (offload cold chunks) |
| HS_MAX_LOCAL_CACHE_GB | 10 | Hard limit for local disk cache in Gigabytes |
| HS_S3_BUCKET | - | Target S3 bucket name |
| HS_S3_REGION | us-east-1 | AWS Region |
| HS_S3_ENDPOINT | - | Custom endpoint (e.g. http://minio:9000) |
| HS_S3_ACCESS_KEY | - | S3 Access Key ID |
| HS_S3_SECRET_KEY | - | S3 Secret Access Key |
| HS_S3_MAX_RETRIES | 5 | Retries for failed uploads/downloads |
| HS_S3_UPLOAD_CONCURRENCY | 4 | Semaphore-limited parallel uploads |
| HS_WAL_SEGMENT_SIZE_MB | 256 | Size before WAL rotation (influences chunk size) |
| HS_CHUNK_PROBE_K | 3 | Number of most relevant chunks to search per query |

HNSW Index Tuning

| Variable | Default | Description |
|---|---|---|
| HS_HNSW_M | 64 | Max connections per layer |
| HS_HNSW_EF_CONSTRUCT | 200 | Build quality (50-500). Higher = slower build, better recall. |
| HS_HNSW_EF_SEARCH | 100 | Search beam width (10-500). Higher = slower search, better recall. |
| HS_FILTER_BRUTEFORCE_THRESHOLD | 50000 | If the filtered candidate count is below this threshold, layer-0 uses exact brute-force instead of graph traversal |
| HS_INDEXER_CONCURRENCY | 1 | Check README for threading strategies (0=Auto, 1=Serial) |

Persistence & Durability

| Variable | Default | Description |
|---|---|---|
| HYPERSPACE_WAL_SYNC_MODE | batch | WAL sync strategy: strict (fsync), batch (100ms lag), async (OS cache) |
| HYPERSPACE_WAL_BATCH_INTERVAL | 100 | Batch interval in milliseconds |

Memory Management (Jemalloc)

HyperspaceDB uses Jemalloc for efficient memory allocation. Tune it via MALLOC_CONF:

  • Low RAM (Aggressive): MALLOC_CONF=background_thread:true,dirty_decay_ms:0,muzzy_decay_ms:0
  • Balanced (Default): MALLOC_CONF=background_thread:true,dirty_decay_ms:5000,muzzy_decay_ms:5000

Security

| Variable | Default | Description |
|---|---|---|
| HYPERSPACE_API_KEY | - | If set, requires the x-api-key header for all requests |

Multi-Tenancy

HyperspaceDB supports strict data isolation via the x-hyperspace-user-id header.

  • Isolation: Every request with an x-hyperspace-user-id header operates within that user's private namespace.
  • Internal Naming: Collections are stored internally as userid_collectionname.
  • Default Admin: If x-hyperspace-user-id is omitted but a valid x-api-key is provided, the user is treated as default_admin.
  • SaaS Integration: Gateways should inject this header after authenticating users.

Lorentz metric notes

When HS_METRIC=lorentz, vectors must satisfy hyperboloid constraints:

  • t > 0 (upper sheet)
  • -t^2 + x_1^2 + ... + x_n^2 = -1
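The second constraint determines t from the spatial coordinates, so a Euclidean vector can always be lifted onto the valid sheet before insertion (an illustrative helper, not part of the server):

```python
import math

def lift_to_hyperboloid(x):
    """Map a Euclidean vector onto the upper hyperboloid sheet by
    solving -t^2 + ||x||^2 = -1 for t > 0."""
    t = math.sqrt(1.0 + sum(v * v for v in x))
    return [t] + list(x)

p = lift_to_hyperboloid([0.6, 0.8])
t, *space = p
assert t > 0
assert abs(-t * t + sum(v * v for v in space) + 1.0) < 1e-12
```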

Web Dashboard

HyperspaceDB includes a comprehensive Web Dashboard at http://localhost:50050.

Features:

  • Cluster Status: View node role (Leader/Follower) and topology.
  • Collections: Create, delete, and inspect collection statistics.
  • Explorer: Search playground with filters and typed metadata visibility.
  • Graph Explorer: Query neighbors and concept-parent graph views from HNSW layers.
  • Metrics: Real-time RAM and CPU usage.

TUI Dashboard (Legacy)

For terminal-based monitoring:

./hyperspace-cli

Key Controls

  • TAB: Switch tabs.
  • [S]: Trigger snapshot.
  • [V]: Trigger vacuum.
  • [Q]: Quit.

Embedding Service

Advanced Features

🤝 Federated Clustering (v1.2)

HyperspaceDB v1.2 introduces a Federated Leader-Follower architecture. This goes beyond simple read-replication, introducing Node Identity, Logical Clocks, and Topology Awareness to support future Edge-Cloud synchronization scenarios.

Concepts

Node Identity

Every node in the cluster is assigned a persistent, unique UUID (node_id) upon first startup. This ID is used to track the origin of write operations in the replication log.

Roles

  • Leader (Coordinator):
    • Accepts Writes (Insert, Delete, CreateCollection).
    • Manages the Cluster Topology.
    • Streams WAL events to connected Followers.
  • Follower (Replica):
    • Read-Only.
    • Replicates state from the Leader in real-time.
    • Can be promoted to Leader if needed.
  • Edge Node (Planned v1.4):
    • Offline-first node that accumulates writes and syncs via Merkle Trees when online.

Configuration

Leader

Simply start the server. By default, it assumes the Leader role.

./hyperspace-server --port 50051

Follower

Start with --role follower and point to the leader's URL.

./hyperspace-server --port 50052 --role follower --leader http://127.0.0.1:50051

Monitoring Topology

You can inspect the cluster state via the HTTP API on the Dashboard port (default 50050).

Request:

curl http://localhost:50050/api/cluster/status

Response:

{
  "node_id": "e8b37fde-6c60-427f-8a09-47103c2da80e",
  "role": "Leader",
  "upstream_peer": null,
  "downstream_peers": [],
  "logical_clock": 1234
}

This JSON response tells you:

  • The node's unique ID.
  • Its current role.
  • Who it is following (if Follower).
  • Who is following it (if Leader).
  • The current logical timestamp of its database state.

Edge-to-Edge Gossip Swarm (v3.0)

Beyond centralized replication, v3.0 introduces a decentralized Peer-to-Peer UDP Swarm network. This feature is crucial for robotics and offline-first autonomous agents.

Features

  • Zero-Configuration Topology: Nodes broadcast heartbeat logs via UDP (tokio::net::UdpSocket).
  • Self-Healing: Unresponsive nodes (TTL > 30s) are automatically dropped from the registry.
  • Auto-Discovery: Swarm nodes discover each other and exchange Logical Clocks and Collection Digests for the Merkle Delta Sync.

Swarm Configuration

Add these variables to your environment or .env file to start joining the global Swarm:

# Enable the Gossip listener on the specified local port
HS_GOSSIP_PORT=7946

# Bootstrapping nodes to connect to
HS_GOSSIP_PEERS=192.168.1.10:7946,192.168.1.11:7946

Swarm State Monitoring

You can monitor the active mesh structure from the dashboard UI or standard HTTP:

Request:

curl http://localhost:50050/api/swarm/peers

Response:

{
  "gossip_enabled": true,
  "peer_count": 1,
  "peers": [
    {
      "node_id": "a92jfe...",
      "addr": "192.168.1.10:50050",
      "http_port": 50050,
      "role": "Leader",
      "logical_clock": 4200,
      "collections": [
        {
          "name": "vision_system",
          "state_hash": 6712399120,
          "vector_count": 500
        }
      ],
      "last_seen_secs": 1729384910,
      "healthy": true
    }
  ]
}

🧠 Hybrid Search

HyperspaceDB combines Hyperbolic Vector Search with Lexical (Keyword) Search to provide the best of both worlds.

This is powered by Reciprocal Rank Fusion (RRF), which normalizes scores from both engines and merges them.

Conceptual Flow

  1. Vector Search: Finds semantically similar items (e.g. "smartphone" finds "iPhone").
  2. Keyword Search: Finds exact token matches in metadata (e.g. "iphone" finds items with "iphone" in title).
  3. RRF Fusion: Score = 1/(k + rank_vec) + 1/(k + rank_lex).
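The RRF formula in step 3 can be implemented in a few lines (an illustrative sketch; the value of k and tie-breaking details in the engine are assumptions):

```python
def rrf_fuse(vector_ids, keyword_ids, k=60):
    """Reciprocal Rank Fusion over two ranked id lists.
    score(id) = 1/(k + rank_vec) + 1/(k + rank_lex); ranks are 1-based,
    and an id missing from one list simply contributes nothing there."""
    scores = {}
    for ranking in (vector_ids, keyword_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(vector_ids=[1, 2, 3], keyword_ids=[3, 1, 4])
assert fused[0] == 1  # ranked high by both engines wins
```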

API Usage

Python

results = client.search(
    vector=query_vector,
    top_k=10,
    hybrid_query="apple macbook",  # Lexical query
    hybrid_alpha=0.5               # Balance factor (default 60.0 in RRF usually, but exposed as alpha here)
)

Rust

let results = client.search_advanced(
    query_vector,
    10,
    vec![],
    Some(("apple macbook".to_string(), 0.5)) // (lexical query, alpha)
).await?;

Tokenization

Currently, all string metadata values are automatically tokenized (split by whitespace, lowercase, alphanumeric) and indexed in an inverted index.

πŸ“‰ Vector Quantization

HyperspaceDB supports multiple storage modes to balance Precision vs Memory vs Speed. All modes operate transparently; no SDK changes are required.


Quantization Modes

| Mode | Bits/dim | Compression | Recall@10 | Best For |
|------|----------|-------------|-----------|----------|
| None | 64 (f64) | 1× | 100% | Research, exact recall |
| ScalarI8 | 8 (i8) | 8× | ~98% | Production default |
| SQ8 Anisotropic | 8 (i8) | 8× | ~99%+ | Cosine / L2 (Sprint 6.2) |
| Binary | 1 (bit) | 64× | ~75–85% | Re-ranking, large datasets |
| Lorentz SQ8 | 8 (i8) + scale | ~8× | ~95–98% | Hyperboloid (Lorentz) metric |
| Zonal (MOND) | mixed | 30–40% less RAM | ~99% | Hyperbolic (core + boundary) |

1. ScalarI8 (Default)

The default mode. Coordinates are mapped from f64 to i8 ∈ [-127, 127] via:

q_i = round(x_i * 127)       // For Poincaré: x_i ∈ (-1, 1)

  • Compression: 8× vs f64
  • Recall: ~98% (@10 neighbors)
  • Distance: Dequantized at query time (q_i / 127.0)
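The round trip can be sketched in Python (illustrative only; the engine does this in Rust with SIMD):

```python
def quantize_i8(x):
    """Map Poincaré coordinates x_i in (-1, 1) to i8 in [-127, 127]."""
    return [max(-127, min(127, round(xi * 127))) for xi in x]

def dequantize_i8(q):
    """Recover approximate f64 coordinates at query time."""
    return [qi / 127.0 for qi in q]
```

The quantization error per coordinate is bounded by half a step, i.e. about 1/254 ≈ 0.004 in coordinate space.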

2. SQ8 Anisotropic (Sprint 6.2 / 7.1, ScaNN-Inspired)

Standard isotropic quantization applies uniform rounding to all dimensions, which distorts the direction (angle) of a vector. For Cosine/L2 metrics, angular error causes more recall degradation than magnitude error.

Anisotropic SQ8 penalizes orthogonal (directional) error far more than parallel (magnitude) error during the quantization refinement step.

Loss Function

$$L = |e_\parallel|^2 + t_w \cdot |e_\perp|^2$$

Where:

  • $e_\parallel = (e \cdot \hat{x}) \hat{x}$: projection of quantization error onto the original vector direction
  • $e_\perp$: component orthogonal to the original vector
  • $t_w = 10$ (anisotropy weight): penalizes directional error 10× more than magnitude error
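The error decomposition is easy to verify numerically. A minimal Python sketch of the loss (function name illustrative) shows that a purely directional error is penalized 10× more than a magnitude error of equal size:

```python
import math

def anisotropic_loss(x, x_hat, t_w=10.0):
    """L = |e_par|^2 + t_w * |e_perp|^2 for quantization error e = x_hat - x."""
    e = [a - b for a, b in zip(x_hat, x)]
    norm_x = math.sqrt(sum(v * v for v in x))
    unit = [v / norm_x for v in x]                   # direction of the original vector
    proj = sum(ei * ui for ei, ui in zip(e, unit))   # signed length of e along x
    e_par_sq = proj * proj                           # |e_parallel|^2
    e_perp_sq = sum(ei * ei for ei in e) - e_par_sq  # |e|^2 - |e_par|^2
    return e_par_sq + t_w * e_perp_sq
```

For x = (1, 0), an error of 0.1 along x costs 0.01, while the same 0.1 orthogonal to x costs 0.1.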

Coordinate Descent Refinement

After the initial isotropic quantization, each coordinate is refined by Β±1 step in i8-space and the one minimizing the anisotropic loss is selected:

// After the initial isotropic quantization, refine each coordinate in turn.
for i in 0..n {
    let mut best = q[i];
    let mut best_loss = f64::MAX;
    for delta in [-1i16, 0, 1] {
        let candidate = (q[i] as i16 + delta).clamp(-127, 127) as i8;
        // Recompute the error decomposition with `candidate` substituted for q[i]
        let loss = e_parallel_sq + t_weight * e_ortho_sq;
        if loss < best_loss {
            best_loss = loss;
            best = candidate;
        }
    }
    q[i] = best;
}

Results

| Metric | Mode | Recall@10 Gain |
|--------|------|----------------|
| Cosine | ScalarI8 → Anisotropic SQ8 | +5–8% |
| L2 | ScalarI8 → Anisotropic SQ8 | +3–5% |

Implementation

The anisotropic refinement is in QuantizedHyperVector::from_float() in crates/hyperspace-core/src/vector.rs.


3. Lorentz SQ8 (Dynamic-Range)

The Lorentz (hyperboloid) model has unbounded coordinates: the time component x[0] = cosh(r) grows exponentially. A fixed [-1, 1] mapping would saturate immediately.

Solution: Per-vector dynamic-range scaling:

scale = max(|x_i|)
q_i   = round(x_i / scale * 127)   // i8
α     = scale                        // stored in alpha field (f32)

Dequantization: x̃_i = (q_i / 127.0) * α
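The per-vector scaling can be sketched as follows (an illustrative Python model of the scheme; the actual encoder is `from_float_lorentz()` in Rust):

```python
def lorentz_sq8(x):
    """Per-vector dynamic-range SQ8: scale by max |x_i|, store alpha = scale."""
    scale = max(abs(xi) for xi in x)
    q = [round(xi / scale * 127) for xi in x]  # i8 in [-127, 127]
    return q, scale

def lorentz_dequant(q, scale):
    """x~_i = (q_i / 127) * alpha."""
    return [qi / 127.0 * scale for qi in q]
```

Because the scale is chosen per vector, the large time component x[0] = cosh(r) never saturates: it always maps to exactly ±127.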

See Lorentz SQ8 deep-dive for full details.


4. Binary (1-bit)

Each coordinate is compressed to its sign bit. Distance uses Hamming distance.

  • Compression: 64× vs f64
  • Recall: ~75–85% (metric-dependent)
  • Use case: First-pass re-ranking candidate retrieval over very large datasets
  • ⚠️ Not supported for Lorentz: sign destroys hierarchical depth information
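Sign-bit packing and Hamming distance can be sketched in Python (illustrative; production code packs into machine words and uses popcount):

```python
def to_sign_bits(x):
    """Pack sign bits into an int: bit i is 1 iff x_i >= 0."""
    bits = 0
    for i, xi in enumerate(x):
        if xi >= 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Hamming distance between two bit-packed vectors (XOR + popcount)."""
    return bin(a ^ b).count("1")
```

Two vectors that agree in sign on every coordinate have distance 0, regardless of magnitude, which is exactly why this mode loses hierarchical depth information in the Lorentz model.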

5. Zonal Quantization β€” MOND (Sprint 6.3)

Inspired by Modified Newtonian Dynamics: near the center of hyperbolic space the metric is smooth, but it explodes near the horizon.

#![allow(unused)]
fn main() {
pub enum ZonalVector {
    Core(Vec<i8>),       // ||x|| < 0.5: compress to i8 (~8x RAM saving)
    Boundary(Vec<f64>),  // ||x|| >= 0.5: keep full precision
}
}

Enabled by a separate env var (independent of HS_QUANTIZATION_LEVEL):

HS_ZONAL_QUANTIZATION=true   # Enable MOND zonal storage

When enabled, zonal_storage: DashMap<NodeId, ZonalVector> completely replaces the standard mmap-based vector store. All read (get_vector) and write (insert_to_storage) paths are routed through zonal_storage.

  • RAM reduction: ~30–40% for datasets where most vectors are near the origin (||x|| < 0.5)
  • No precision loss at the boundary (where the metric is most sensitive)
  • Compatible with all metrics, not just Poincaré
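The zone decision itself is a simple norm check. A Python sketch of the encoding rule (names illustrative; the real store is the `ZonalVector` enum shown above):

```python
import math

CORE_RADIUS = 0.5  # ||x|| threshold separating Core from Boundary

def zonal_encode(x):
    """Core vectors (near origin) compress to i8; boundary vectors keep f64."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm < CORE_RADIUS:
        return ("Core", [round(v * 127) for v in x])   # ~8x RAM saving
    return ("Boundary", list(x))                        # full precision
```

Since hyperbolic distance is most sensitive near the horizon, precision is spent exactly where the metric explodes.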

Configuration

Quantization mode is set via environment variable before creating a collection. The mode is saved in meta.json alongside each collection and applied on reload.

# Default (ScalarI8 with Anisotropic refinement)
HS_QUANTIZATION_LEVEL=scalar

# Binary (1-bit Hamming)
HS_QUANTIZATION_LEVEL=binary

# Full f64 precision (debugging / research)
HS_QUANTIZATION_LEVEL=none

⚠️ Note: The --mode CLI flag does not exist. Configuration is exclusively through HS_QUANTIZATION_LEVEL (env var or .env file). The mode is stored per-collection in <data_dir>/<collection>/meta.json at creation time.

Note: The Lorentz SQ8 path is selected automatically when a collection's metric is lorentz, regardless of HS_QUANTIZATION_LEVEL. The from_float_lorentz() encoder is dispatched by the index layer (hyperspace-index/src/lib.rs).


Choosing the Right Mode

Dataset characteristics
    │
    ├─ Full precision required (research)? ───────→ HS_QUANTIZATION_LEVEL=none
    │
    ├─ Lorentz/Hyperbolic metric? ────────────────→ Automatic (dynamic-range SQ8)
    │
    ├─ Memory-critical (>100M vectors)? ──────────→ HS_QUANTIZATION_LEVEL=binary
    │
    ├─ Cosine / L2, high recall needed? ──────────→ HS_QUANTIZATION_LEVEL=scalar (default)
    │                                                 → Anisotropic refinement applied
    └─ Hyperbolic, mixed density? ────────────────→ Zonal (MOND) via ZonalVector store

Lorentz SQ8 & GPU

πŸ”’ Security & Auth

HyperspaceDB includes built-in security features for production deployments.

API Authentication

We use a simple but effective API Key mechanism.

Enabling Auth

Set the HYPERSPACE_API_KEY environment variable when starting the server.

export HYPERSPACE_API_KEY="my-secret-key-123"
./hyperspace-server

If this variable is NOT set, authentication is disabled (dev mode).

Client Usage

Clients must pass the key in the x-api-key metadata header.

Python:

client = HyperspaceClient(
    host="localhost:50051", 
    api_key="my-secret-key-123",
    user_id="tenant_name"  # Optional: For multi-tenancy
)

Rust:

// Use the updated connect function
let client = Client::connect(
    "http://0.0.0.0:50051".to_string(),
    Some("my-secret-key-123".to_string()),
    Some("tenant_name".to_string())
).await?;

Multi-Tenancy Isolation

Use x-hyperspace-user-id header to isolate data per user.

  • Gateway Responsibility: Ensure your API Gateway validates user tokens and injects this header securely.
  • Internal Scope: Data created with a user_id is invisible to other users and the default admin scope.

Security Implementation

  • SHA-256 Hashing: The server computes SHA256(env_key) at startup and stores only the hash.
  • Constant-Time Comparison: Incoming keys are hashed and compared to prevent timing attacks.

S3 Cloud Tiering

Data Safety & Durability

HyperspaceDB Architecture Guide

HyperspaceDB is a specialized vector database designed for high-performance hyperbolic embedding search. This document details its internal architecture, storage format, and indexing strategies.


πŸ— System Overview

The system follows a strict Command-Query Separation (CQS) pattern, tailored for write-heavy ingestion and latency-sensitive search.

graph TD
    Client["Client (gRPC)"] -->|Insert| S[Server Service]
    Client -->|Search| S
    
    subgraph Persistence Layer
        S -->|1. Append| WAL[Write-Ahead Log]
        S -->|2. Append| VS[Vector Store]
    end
    
    subgraph Indexing Layer
        S -->|3. Send ID| Q["Async Queue (Channel)"]
        Q -->|Pop| W[Indexer Worker]
        W -->|Update| HNSW["HNSW Graph (RAM)"]
    end

    subgraph Embedding Layer
        S -->|InsertText| EE[Embedding Service]
        EE -->|Chunking| BE[Embedding Backends]
    end
    
    subgraph Background Tasks
        Snap[Snapshotter] -->|Serialize| Disk["Index Snapshot (.snap)"]
    end

πŸ’Ύ Storage Layer (hyperspace-store)

1. Vector Storage (data/)

Vectors are stored in a segmented, append-only format using Memory-Mapped Files (mmap).

  • Segments: Data is split into chunks of 65,536 vectors (2^16).
  • Files: chunk_0.hyp, chunk_1.hyp, etc.
  • Quantization: Vectors are optionally quantized (e.g., ScalarI8), reducing size from 64-bit float to 8-bit integer per dimension (8x compression).

2. Write-Ahead Log (wal.log)

Writes are durable. Every insert is immediately persisted to wal.log before being acknowledged. Upon restart, the WAL helps recover data that wasn't yet persisted in the Index Snapshot.


πŸ•Έ Indexing Layer (hyperspace-index)

Hyperbolic HNSW

We implement a modified Hierarchical Navigable Small World graph optimized for the Poincaré Ball model.

  • Distance Metric: Poincaré distance formula: $$ d(u, v) = \text{acosh}\left(1 + 2 \frac{||u-v||^2}{(1-||u||^2)(1-||v||^2)}\right) $$
  • Optimization: We compare $||u-v||^2$ and cached normalization factors $\alpha = 1/(1-||u||^2)$ to avoid expensive acosh calls during graph traversal.
  • Locking: The graph uses fine-grained RwLock per node layer, allowing concurrent searches and updates.

Dynamic Configuration

Parameters ef_search (search depth) and ef_construction (build quality) are stored in AtomicUsize global config, allowing runtime tuning without restarts.


⚑️ Performance Traits

  1. Async Indexing: Client receives OK as soon as data hits the WAL. Indexing happens in the background.
  2. Zero-Copy Read: Search uses mmap to read quantized vectors directly from OS cache without heap allocation.
  3. SIMD Acceleration: Distance calculations use std::simd (Portable SIMD) for 4-8x speedup on supported CPUs (AVX2, Neon).

πŸ”„ Lifecycle

  1. Startup:
    • Load index.snap (Rkyv zero-copy deserialization).
    • Replay wal.log for any missing vectors.
  2. Runtime:
    • Serve read/write requests.
    • Background worker consumes indexing queue.
    • Snapshotter periodically saves graph state.
  3. Shutdown:
    • Stop accepting writes.
    • Drain indexing queue.
    • Save final snapshot.
    • Close file handles.

Memory Management & Stability

Cold Storage Architecture

HyperspaceDB implements a "Cold Storage" mechanism to handle large numbers of collections efficiently:

  1. Lazy Loading: Collections are not loaded into RAM at startup. Instead, only metadata is scanned. The actual collection (vector index, storage) is instantiated from disk only upon the first get() request.
  2. Idle Eviction (Reaper): A background task runs every 60 seconds to scan for idle collections. Any collection not accessed for a configurable period (default: 1 hour) is automatically unloaded from memory to free up RAM.
  3. Graceful Shutdown: When a collection is evicted or deleted, its Drop implementation ensures that all associated background tasks (indexing, snapshotting) are immediately aborted, preventing resource leaks and panicked threads.

This architecture allows HyperspaceDB to support thousands of collections while keeping the active memory footprint low, scaling based on actual usage rather than total data.

Storage Format

HyperspaceDB uses a custom segmented file format designed for:

  1. Fast Appends (Zero seek time).
  2. Mmap Compatibility (OS manages caching).
  3. Space Efficiency (Quantization).

Segmentation

Data is split into "Chunks" of fixed size ($2^{16} = 65,536$ vectors). This avoids allocating one giant file and allows easier lifecycle management.

  • data/chunk_0.hyp
  • ...

LSM-Tree Segmentation

HyperspaceDB 3.0 adopts an LSM-Tree architecture. Data flows from hot memory to immutable on-disk segments:

  1. MemTable (Hot): New vectors are indexed in an in-memory HNSW.
  2. Immutable Chunks (Cold): When a WAL segment is rotated, the Flush Worker persists the MemTable into an immutable .hyp chunk. During this flush, the in-memory HNSW topology is re-written into a Spatial Navigable Graph (Vamana / DiskANN format) to minimize page faults when read via mmap from SSDs.
  3. Local vs Cloud: Chunks can live on local NVMe or be tiered to S3.

S3 Cloud Tiering (Optional)

Using the s3-tiering feature, HyperspaceDB can offload cold chunks to an S3-compatible object store.

  • LRU Cache: A byte-weighted cache (HS_MAX_LOCAL_CACHE_GB) manages how much data stays on local disk.
  • Lazy Load: Search queries automatically trigger a download if a required chunk is only present in the cloud.
  • Backpressure: Semaphore-limited concurrent downloads prevent IO/network saturation.

Directory Structure (Multi-Tenancy)

File Layout

Each .hyp file is a flat array of fixed-size records. No headers, no metadata. Metadata is stored in the Index Snapshot or recovered from layout.

Zonal Quantization (v3.0.1)

For hyperbolic collections, HyperspaceDB automatically applies Zonal Quantization (MOND theory) to vectors.

  • Vectors near the origin ($||x|| < 0.5$) are tightly compressed as i8 (Core).
  • Vectors near the infinite boundary ($||x|| \to 1$) are preserved in pure f64 (Boundary) to maintain the exact precision required for hierarchical routing.

Record Structure (ScalarI8)

When QuantizationMode::ScalarI8 is active (and vector is within the Core zone):

| Byte Offset | Content | Type |
|-------------|---------|------|
| 0..N | Quantized Coordinates | [i8; N] |
| N..N+4 | Pre-computed Alpha | f32 |

Total size per vector (for N=8): $8 + 4 = 12$ bytes. Without quantization (f64), it would be $8 \times 8 = 64$ bytes. Savings: ~81%.
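The record layout can be verified with a quick Python sketch using `struct` (illustrative; the real format is written by the Rust store):

```python
import struct

def pack_record(q, alpha):
    """[i8; N] quantized coordinates followed by a little-endian f32 alpha."""
    return struct.pack(f"<{len(q)}bf", *q, alpha)

def unpack_record(buf, n):
    """Inverse of pack_record for an N-dimensional record."""
    vals = struct.unpack(f"<{n}bf", buf)
    return list(vals[:n]), vals[n]
```

For N=8 the packed record is exactly 8 + 4 = 12 bytes, matching the table above.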

Optional raw f32 storage (v2.2.x)

For QuantizationMode::None, you can enable:

  • HS_STORAGE_FLOAT32=true

In this mode, raw vectors are stored as f32 in mmap and promoted to f64 in distance kernels.
This reduces raw-vector memory footprint by ~50% while preserving numerical behavior in hyperbolic math paths.

Write-Ahead Log (WAL)

Path: wal.log

The WAL ensures durability. Format:

  • id (u32)
  • vector ([f64; N])

It is only read during startup if the Index Snapshot is older than the last WAL entry.

RAM Backend (WASM)

For WebAssembly deployments (hyperspace-wasm), the storage backend automatically switches to RAMVectorStore.

  • Structure: Uses Vec<Arc<RwLock<Vec<u8>>>> (Heap Memory) instead of memory-mapped files.
  • Segmentation: The same chunking logic (64k vectors) is preserved. This allows the core HNSW index to use the same addressing logic (id >> 16, id & 0xFFFF) regardless of the backend.
  • Persistence: Persistence is achieved by serializing the "used" portion of segments into a Vec<u8> blob and storing it in the browser's IndexedDB.
  • Pre-allocation: Creating a DB instance pre-allocates the first chunk (64k * VectorSize bytes) to avoid frequent allocation calls during inserts.
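The shared addressing logic is just a bit split. A Python sketch of the scheme referenced above (function name illustrative):

```python
CHUNK_BITS = 16  # 2^16 = 65,536 vectors per chunk

def locate(vector_id):
    """Split a global vector ID into (chunk index, offset within chunk)."""
    return vector_id >> CHUNK_BITS, vector_id & 0xFFFF
```

Because both the mmap backend and the WASM RAM backend use the same split, the HNSW index never needs to know which backend it is talking to.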

The Hyperbolic Geometry

HyperspaceDB operates in the Poincaré ball and Lorentz (hyperboloid) models of hyperbolic geometry. This space is uniquely suited for hierarchical data (trees, graphs, taxonomies) because the amount of "space" available grows exponentially with the radius, similar to how the number of nodes in a tree grows with depth.

The Distance Formula

The distance $d(u, v)$ between two vectors $u, v$ in the Poincaré ball ($\mathbb{D}^n$) is defined as:

$$ d(u, v) = \text{arccosh}\left( 1 + 2 \frac{|u - v|^2}{(1 - |u|^2)(1 - |v|^2)} \right) $$

Where:

  • $|u|$ is the Euclidean norm of vector $u$.
  • The vectors must satisfy $|u| < 1$.

Optimization: The "Alpha" Trick

Calculating arccosh and divisions for every distance check in HNSW is expensive. HyperspaceDB optimizes this by pre-computing the curvature factors.

For every vector $x$, we store an additional scalar $\alpha_x$:

$$ \alpha_x = \frac{1}{1 - |x|^2} $$

This is stored alongside the quantized vector in our memory-mapped storage.

The Monotonicity Trick

Since $f(x) = \text{arccosh}(x)$ is a monotonically increasing function for $x \ge 1$, we do not need to compute the full arccosh during the Nearest Neighbor Search phase. We only need to compare the arguments:

$$ \delta(u, v) = |u - v|^2 \cdot \alpha_u \cdot \alpha_v $$

If $\delta(A) < \delta(B)$, then $d(A) < d(B)$.

HyperspaceDB performs all internal graph traversals using only $\delta$ (SIMD-optimized), and applies the heavy arccosh only when required by final ranking/output.
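Both tricks can be checked numerically. This Python sketch (illustrative; the engine runs the equivalent in SIMD Rust) shows that ordering by $\delta$ agrees with ordering by the full distance:

```python
import math

def norm_sq(x):
    return sum(v * v for v in x)

def alpha(x):
    """Cached curvature factor alpha_x = 1 / (1 - ||x||^2)."""
    return 1.0 / (1.0 - norm_sq(x))

def delta(u, v, alpha_u, alpha_v):
    """Cheap monotone surrogate: ||u - v||^2 * alpha_u * alpha_v."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) * alpha_u * alpha_v

def poincare_dist(u, v):
    """Full Poincaré distance, applied only for final ranking."""
    d_sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * d_sq / ((1 - norm_sq(u)) * (1 - norm_sq(v))))
```

Since $d = \operatorname{acosh}(1 + 2\delta)$ and acosh is increasing, comparing $\delta$ values is equivalent to comparing distances.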

Lorentz Model (Hyperboloid)

For Lorentz vectors x = (t, x1, ..., xn) and y = (s, y1, ..., yn):

$$ \langle x, y \rangle_L = -ts + \sum_i x_i y_i $$

Distance:

$$ d(x, y) = \operatorname{arcosh}\left(-\langle x, y \rangle_L\right) $$

Validation constraints:

  • upper sheet: t > 0
  • unit hyperboloid: -t^2 + x_1^2 + ... + x_n^2 = -1

Optimization: SQ8 Quantization

For the Lorentz model, HyperspaceDB implements a specialized 8-bit scalar quantization (SQ8) with dynamic range scaling and GPU/SIMD acceleration. See Lorentz Quantization Details.

SDK Hyperbolic Utilities (v2.2.1)

To keep core DB focused and still support geometry-heavy clients, SDKs include helpers:

  • Python: hyperspace.mobius_add, hyperspace.exp_map, hyperspace.log_map
  • Rust: hyperspace_sdk::math::{mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean}
  • TypeScript: HyperbolicMath.mobiusAdd/expMap/logMap/parallelTransport/riemannianGradient/frechetMean

Fréchet mean support is useful for reconsolidation workflows where multiple nearby hyperbolic embeddings should be merged into one robust centroid.

These functions are useful for L-system growth, manifold transforms, and pre-insert vector shaping pipelines.

Geometric Search (Spatial Filters)

HyperspaceDB v3.0 introduces native geometric predicates. Unlike metadata filters, these are based on the vector's position in the embedding space.

1. The Ball Filter (Proximity)

Mathematical definition: $\{ v \in \mathbb{D}^n \mid d(c, v) \le r \}$. Used for finding all entities within a semantic radius of a concept center $c$.

2. The Box Filter (Constraints)

Mathematical definition: $\{ v \in \mathbb{R}^n \mid \forall i, \min_i \le v_i \le \max_i \}$. Used for bounding reasoning to a specific workspace (e.g., "only consider nodes in the 1st quadrant").

3. The Cone Filter (Angular Logic)

Mathematical definition (angular distance): $\{ v \in \mathbb{R}^n \mid \text{angle}(\text{axis}, v) \le \text{aperture} \}$. Inspired by ConE (Zhang & Wang, 2021), this filter allows for modeling logical entailment and hierarchy-aware FOV. In HyperspaceDB, this is implemented as an $O(N)$ dot-product check against the aperture threshold.
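The dot-product check behind the cone filter can be sketched as follows (an illustrative Python model; the engine evaluates this in Rust during candidate selection):

```python
import math

def in_cone(v, axis, aperture_rad):
    """Keep v iff angle(axis, v) <= aperture, via a single dot product.

    cos is decreasing on [0, pi], so angle <= aperture is equivalent to
    cos(angle) >= cos(aperture); no arccos call is needed per candidate.
    """
    dot = sum(a * b for a, b in zip(axis, v))
    norm_a = math.sqrt(sum(a * a for a in axis))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_a * norm_v) >= math.cos(aperture_rad)
```

Comparing cosines instead of angles mirrors the monotonicity trick used for Poincaré distances: the expensive inverse function is never evaluated in the hot path.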

Performance: Sequential Bitset Pruning

To ensure these filters don't slow down the engine, geometric intersection is performed efficiently during the candidate selection phase. We use a Bitset Pruning pattern:

  1. Generate a bitset of candidates satisfying the geometric query.
  2. Intersect it with HNSW candidates (bitwise AND) during the search phase.
  3. This allows $O(1)$ rejection of candidates outside the region of interest.
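The steps above can be sketched with plain integers as bitsets (illustrative Python; the engine uses packed word arrays):

```python
def geometric_bitset(vectors, predicate):
    """Step 1: one bit per vector, set iff it satisfies the geometric filter."""
    bits = 0
    for i, v in enumerate(vectors):
        if predicate(v):
            bits |= 1 << i
    return bits

def allowed(bits, candidate_id):
    """Steps 2-3: O(1) membership test during HNSW traversal."""
    return (bits >> candidate_id) & 1 == 1
```

Building the bitset is a single $O(N)$ pass; every rejection during graph traversal afterwards is a single shift-and-mask.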

Zero-Copy Hyperbolic HNSW

Our implementation of Hierarchical Navigable Small Worlds is unique in two ways:

  1. Metric: It natively speaks hyperbolic geometry.
  2. Concurrency: It uses fine-grained locking (parking_lot::RwLock) on every node.

Graph Structure

The graph consists of Layers (0..max).

  • Layer 0: Contains ALL vectors. This is the base ground truth.
  • Layer N: Contains a random subset of vectors from Layer N-1.

This creates a skip-list-like structure for navigation.

The "Select Neighbors" Heuristic

When connecting a new node $U$ to neighbors in HNSW, we use a heuristic to ensure diversity.

Standard Euclidean HNSW checks:

  • Add neighbor $V$ if $dist(U, V)$ is minimal.
  • Skip $V$ if it is closer to an already selected neighbor than to $U$.

Hyperbolic Adaptation: We use the Poincaré distance for this check. Because the space expands exponentially, "diversity" is easier to achieve, but "closeness" is tricky because points near the boundary (norm $\approx$ 1) have massive distances even if they look close in Euclidean space.

Our heuristic strictly respects the Poincaré metric, preventing "short-circuiting" through the center of the ball unless mathematically valid.

Locking Strategy

We do not use a global lock.

  • Reading: Search traverses nodes acquiring brief Read Locks.
  • Writing: Indexer acquires Write Locks only on the specific adjacency lists (layers) it is modifying.

This allows insert and search to run in parallel with high throughput.

Batch Search Acceleration

For high-throughput batch search operations, HNSW can offload Minkowski distance computations to the GPU using WGSL compute shaders. This is particularly effective when combined with Lorentz SQ8 Quantization.

GPU Acceleration Roadmap