HyperspaceDB


The fastest vector database for hierarchical & flat data, written in Rust.
HyperspaceDB natively supports both the Poincaré ball model (for hierarchies) and Euclidean space (for standard OpenAI/BGE embeddings), delivering extreme performance through specialized SIMD kernels.


🚀 Key Features

  • ⚡️ Extreme Performance: Built with Nightly Rust and SIMD intrinsics for maximum search throughput.
  • 📐 Cognitive Math Engine: Hyperbolic HNSW optimized for the Poincaré and Lorentz metrics, plus O(N) Wasserstein-1 logic.
  • 📦 Compression: Integrated ScalarI8 and Binary quantization reduces memory footprint by 87% to 98%.
  • 🧵 Async Write Pipeline: Decoupled ingestion with a background indexing worker and WAL for 10x faster inserts.
  • 🖥️ Mission Control TUI: Real-time terminal dashboard for monitoring QPS, segments, and system health.
  • 🕸️ Edge Ready: WASM compilation target allows running the full DB in the browser with Local-First privacy and IndexedDB persistence.
  • 🛠️ Runtime Tuning: Dynamically adjust ef_search and ef_construction parameters on the fly via gRPC.
  • 🐙 Multi-Tenancy: Native SaaS support with namespace isolation (user_id) and billing stats.
  • 🔁 Replication: Leader-Follower architecture with Anti-Entropy catch-up for high availability.
  • ⚖️ Cognitive Math & Tribunal Router: Native SDK utilities for calculating geometric trust scores on graphs to detect LLM hallucinations.
  • 📑 Memory Reconsolidation: Trigger AI sleep mode natively within the DB to restructure vectors via Flow Matching / Riemannian SGD.

🛠 Architecture

HyperspaceDB follows a Persistence-First, Index-Second design:

  1. gRPC Request: Insert/Search commands arrive via a high-performance Tonic server.
  2. WAL & Segmented Storage: Every insert is immediately persisted to a Write-Ahead Log and a memory-mapped segmented file store.
  3. Background Indexer: The HNSW graph is updated asynchronously by a dedicated thread-pool, ensuring 0ms search blocking.
  4. Snapshots: Real-time graph topology is periodically serialized using rkyv for near-instant restarts.
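The Persistence-First write path above can be sketched in miniature (a hypothetical Python toy, not the actual Rust implementation): the insert call returns as soon as the WAL append completes, and a background worker updates the index later.

```python
import queue
import threading

class MiniStore:
    """Toy persistence-first write path: WAL append first, index later."""
    def __init__(self):
        self.wal = []                 # stands in for the on-disk Write-Ahead Log
        self.index = {}               # stands in for the HNSW graph
        self.pending = queue.Queue()  # hand-off to the background indexer
        self._worker = threading.Thread(target=self._index_loop, daemon=True)
        self._worker.start()

    def insert(self, vec_id, vector):
        self.wal.append((vec_id, vector))   # durable before acknowledging
        self.pending.put((vec_id, vector))  # indexing happens off the hot path
        return True                         # caller never waits on the index

    def _index_loop(self):
        while True:
            vec_id, vector = self.pending.get()
            self.index[vec_id] = vector     # real engine inserts into HNSW here
            self.pending.task_done()

store = MiniStore()
store.insert(1, [0.1, 0.2])
store.pending.join()  # wait for the background indexer (demo only)
```

The point of the design is that search and insert acknowledgement never block on graph construction; durability comes from the WAL, not the index.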

πŸƒ Quick Start

1. Build and Start Server

Make sure you have `just` and the Rust nightly toolchain installed.

cargo build --release
./target/release/hyperspace-server

2. Launch Dashboard

./target/release/hyperspace-cli

3. Use Python SDK

```
pip install ./sdks/python
```

```python
from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051")
client.insert(vector=[0.1]*8, metadata={"category": "tech"})
results = client.search(vector=[0.11]*8, top_k=5)
```

📊 Performance Benchmarks

Tested on M4 Pro (Emulated), 1M Vectors (8D)

  • Insert Throughput: ~156,000 vectors/sec (Sustained)
  • Search Latency: ~2.47ms (156,000 QPS) @ 1M scale
  • Storage Efficiency: Automatic segmentation + mmap

"The 1 Million Challenge"

HyperspaceDB handles 1,000,000 vectors without the degradation seen in traditional vector DBs, sustaining 156,000 QPS at the 1M scale.


📄 License

AGPLv3 © YARlabs

Evaluation & Benchmarks

HyperspaceDB is optimized for two critical metrics: Throughput (Ingestion speed) and Latency (Search speed).

Test Environment

  • Hardware: Apple M4 Pro (Emulated Environment) / Linux AVX2
  • Dataset: 1,000,000 vectors, 1024 Dimensions, Random Distribution in Unit Ball.
  • Config: ef_construction=400, ef_search=400

Results

🚀 Ingestion Speed

Thanks to the Async Write Buffer (WAL) and background indexing, ingestion does not block user requests.

| Count | Time | Throughput | Storage Segments |
|---|---|---|---|
| 10,000 | 0.6s | 15,624 vec/s | 1 |
| 100,000 | 6.5s | 15,300 vec/s | 2 |
| 1,000,000 | 64.8s | 15,420 vec/s | 15 |

πŸ” Search Latency (1M Scale)

At 1 million vectors, search cost grows only with graph depth ($O(\log N)$), demonstrating an effective HNSW implementation.

| Metric | Value |
|---|---|
| QPS | 14,668 queries/sec |
| Avg Latency | 0.07 ms |
| P99 Latency | < 1.0 ms |

Why is it so fast?

  1. ScalarI8 Quantization: Fits 8x more vectors in CPU cache.
  2. No acosh: Inner loop uses a monotonic proxy function ($\delta$).
  3. SIMD: Vector operations use platform-specific intrinsics.
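The "no acosh" trick relies on the standard Poincaré identity $d(x,y) = \operatorname{acosh}(1 + \delta)$ with $\delta = 2\lVert x-y\rVert^2 / ((1-\lVert x\rVert^2)(1-\lVert y\rVert^2))$: since acosh is monotonically increasing, ranking candidates by $\delta$ gives the same neighbor order without the expensive call. A plain-Python sketch (illustrative, not the engine's SIMD code):

```python
import math

def poincare_delta(x, y):
    """Monotonic proxy for the Poincaré distance d = acosh(1 + delta).
    Sorting by delta yields the same order as sorting by d."""
    diff = sum((a - b) ** 2 for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    return 2.0 * diff / ((1.0 - nx) * (1.0 - ny))

def poincare_distance(x, y):
    return math.acosh(1.0 + poincare_delta(x, y))

q = [0.1, 0.2]
points = [[0.3, 0.1], [0.05, 0.25], [0.6, 0.2]]
by_delta = sorted(points, key=lambda p: poincare_delta(q, p))
by_dist = sorted(points, key=lambda p: poincare_distance(q, p))
assert by_delta == by_dist  # identical ranking, cheaper comparison
```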

Installation

HyperspaceDB runs on Linux and macOS. Windows is supported via WSL2.

Prerequisites

  • Rust: Nightly toolchain is required for SIMD features.
  • Protoc: Protocol Buffer compiler for gRPC.

Option 1: Docker

The easiest way to get started.

docker pull glukhota/hyperspace-db:latest
# or build locally
docker build -t hyperspacedb .

docker run -p 50051:50051 -v $(pwd)/data:/app/data hyperspacedb

Option 2: Build from Source

  1. Install dependencies

    # Ubuntu/Debian
    sudo apt install protobuf-compiler cmake
    
    # macOS
    brew install protobuf
    
  2. Install Rust Nightly

    rustup toolchain install nightly
    rustup default nightly
    
  3. Clone and Build

    git clone https://github.com/yarlabs/hyperspace-db
    cd hyperspace-db
    cargo build --release
    
  4. Run

    ./target/release/hyperspace-server
    

Quick Start

Once the server is running on localhost:50051, you can use any official SDK.

1) Start server

cargo build --release
./target/release/hyperspace-server

2) Open dashboard

http://localhost:50050

3) First interaction (Python)

from hyperspace import HyperspaceClient

client = HyperspaceClient("localhost:50051", api_key="I_LOVE_HYPERSPACEDB")
collection = "quickstart"

client.delete_collection(collection)
client.create_collection(collection, dimension=3, metric="cosine")

client.insert(id=1, vector=[0.1, 0.2, 0.3], collection=collection)
client.insert(id=2, vector=[0.2, 0.1, 0.4], collection=collection)

print(client.search(vector=[0.1, 0.2, 0.3], top_k=2, collection=collection))

# Batch search (recommended for throughput)
batch = client.search_batch(
    vectors=[[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]],
    top_k=2,
    collection=collection,
)
print(batch)

4) Metric notes

  • cosine, l2, euclidean: general embeddings.
  • poincare: vectors must satisfy ||x|| < 1.
  • lorentz: vectors must be on upper hyperboloid sheet.
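A minimal client-side pre-check for these constraints might look like this (a hypothetical helper for illustration; it is not part of the SDK):

```python
import math

def check_vector(vector, metric):
    """Illustrative validation of the metric notes above (not SDK code)."""
    if metric == "poincare":
        norm = math.sqrt(sum(v * v for v in vector))
        return norm < 1.0                      # must lie inside the unit ball
    if metric == "lorentz":
        t, *rest = vector
        minkowski = -t * t + sum(v * v for v in rest)
        return t > 0 and abs(minkowski + 1.0) < 1e-9  # upper hyperboloid sheet
    return True                                # cosine / l2 / euclidean: no constraint

assert check_vector([0.3, 0.4], "poincare")            # ||x|| = 0.5 < 1
assert not check_vector([0.8, 0.8], "poincare")        # ||x|| > 1
assert check_vector([math.sqrt(2.0), 1.0], "lorentz")  # -2 + 1 = -1
```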

Python SDK

The official Python client provides an ergonomic wrapper around the gRPC interface.

Installation

Install from PyPI:

pip install hyperspacedb

Client-Side Vectorization (Fat Client)

The SDK supports built-in embedding generation using popular providers (OpenAI, Cohere, etc.). This allows you to insert and search using raw text.

Installation with Extras

# Install with OpenAI support
pip install "hyperspacedb[openai]"

# Install with all embedder integrations
pip install "hyperspacedb[all]"

Usage

from hyperspace import HyperspaceClient, OpenAIEmbedder

# 1. Init with Embedder
embedder = OpenAIEmbedder(api_key="sk-...")
client = HyperspaceClient(embedder=embedder)

# 2. Insert Document
client.insert(id=1, document="HyperspaceDB supports Hyperbolic geometry.", metadata={"tag": "math"})

# 3. Search by Text
results = client.search(query_text="non-euclidean geometry", top_k=5)

Reference

HyperspaceClient

class HyperspaceClient(host="localhost:50051", api_key=None, embedder=None)
  • embedder: Instance of BaseEmbedder subclass.

Supported Embedders

  • OpenAIEmbedder
  • OpenRouterEmbedder
  • CohereEmbedder
  • VoyageEmbedder
  • GoogleEmbedder
  • SentenceTransformerEmbedder (Local models)

Methods

insert(id, vector=None, document=None, metadata=None) -> bool

  • id (int): Unique identifier (u32).
  • vector (List[float]): The embedding.
  • document (str): Raw text to embed (requires configured embedder).
  • Note: Provide either vector OR document.

search(vector=None, query_text=None, top_k=10, ...) -> List[dict]

  • vector (List[float]): Query vector.
  • query_text (str): Raw text query.

search_batch(vectors, top_k=10, collection="") -> List[List[dict]]

Batch search API that sends multiple SearchRequest objects in one gRPC call.

rebuild_index(collection, filter_query=None) -> bool

Supports metadata-aware pruning during rebuild:

client.rebuild_index(
    "docs_py",
    filter_query={"key": "energy", "op": "lt", "value": 0.1},
)

delete(id, collection="") -> bool

Removes a single vector by its ID.

analyze_delta_hyperbolicity(vectors, num_samples=1000) -> (float, str)

Analyzes a set of vectors to determine if they exhibit hyperbolic structure. Returns the Gromov delta and a recommended metric ("lorentz", "poincare", or "l2").
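The four-point condition behind this analysis can be sketched in plain Python (an illustrative estimator; the SDK's sampling strategy and the thresholds behind its metric recommendation are not specified here):

```python
import math
import random

def gromov_delta(points, dist, num_samples=500, seed=0):
    """Four-point estimate of the Gromov delta. A small delta relative to
    the diameter suggests tree-like structure, i.e. a hyperbolic metric
    may fit the data better. Illustrative only."""
    rng = random.Random(seed)
    delta = 0.0
    for _ in range(num_samples):
        w, x, y, z = (rng.choice(points) for _ in range(4))
        def gp(a, b):
            # Gromov product (a|b) with base point w
            return 0.5 * (dist(w, a) + dist(w, b) - dist(a, b))
        delta = max(delta, min(gp(x, z), gp(y, z)) - gp(x, y))
    return delta

euclid = math.dist

# Points on a line form a 0-hyperbolic (tree-like) metric space
line = [[float(i), 0.0] for i in range(10)]
print(gromov_delta(line, euclid))  # 0.0 for a line
```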

Graph traversal methods

  • get_node(collection, id, layer=0)
  • get_neighbors(collection, id, layer=0, limit=64, offset=0)
  • get_concept_parents(collection, id, layer=0, limit=32)
  • traverse(collection, start_id, max_depth=2, max_nodes=256, layer=0, filter=None, filters=None)
  • find_semantic_clusters(collection, layer=0, min_cluster_size=3, max_clusters=32, max_nodes=10000)

Hyperbolic math utilities

from hyperspace import (
    mobius_add,
    exp_map,
    log_map,
    parallel_transport,
    riemannian_gradient,
    frechet_mean,
)
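As a reference for the semantics of `mobius_add`, here is a plain-Python version of the standard Möbius addition formula on the Poincaré ball (illustrative only; use the SDK utility in practice):

```python
def mobius_add(x, y):
    """Reference Möbius addition x ⊕ y on the Poincaré ball.
    The result always stays inside the unit ball."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    denom = 1.0 + 2.0 * dot + nx * ny
    cx = (1.0 + 2.0 * dot + ny) / denom
    cy = (1.0 - nx) / denom
    return [cx * a + cy * b for a, b in zip(x, y)]

z = mobius_add([0.3, 0.0], [0.0, 0.4])
assert sum(v * v for v in z) < 1.0                       # stays in the ball
assert mobius_add([0.3, 0.0], [0.0, 0.0]) == [0.3, 0.0]  # y = 0 is the identity
```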

Rust SDK

For low-latency applications, connect directly using the Rust SDK.

Installation

Add to your Cargo.toml:

[dependencies]
hyperspace-sdk = "2.2.1"
tokio = { version = "1", features = ["full"] }

Usage

use hyperspace_sdk::Client;
use std::collections::HashMap;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Connect (with optional API Key)
    let api_key = std::env::var("HYPERSPACE_API_KEY").ok();
    let mut client = Client::connect(
        "http://127.0.0.1:50051".into(),
        api_key,
        None
    ).await?;

    // --- Optional: Configure Embedder (Feature: "embedders") ---
    #[cfg(feature = "embedders")]
    {
        // Example: OpenAI
        use hyperspace_sdk::OpenAIEmbedder;
        let openai_key = std::env::var("OPENAI_API_KEY").unwrap();
        let embedder = OpenAIEmbedder::new(openai_key, "text-embedding-3-small".to_string());
        
        // Or: Voyage AI
        // use hyperspace_sdk::VoyageEmbedder;
        // let embedder = VoyageEmbedder::new(api_key, "voyage-large-2".to_string());

        client.set_embedder(Box::new(embedder));
        
        // Insert Document
        let mut meta = HashMap::new();
        meta.insert("tag".to_string(), "rust".to_string());
        client.insert_document(100, "Rust is blazing fast.", meta).await?;
        
        // Search Document
        let results = client.search_document("fast systems language", 5).await?;
        println!("Document Search Results: {:?}", results);
    }
    // -----------------------------------------------------------

    // 2. Insert with Vector (Low-Level)
    let vec = vec![0.1; 8];
    let mut meta = HashMap::new();
    meta.insert("name".to_string(), "item-42".to_string());
    
    client.insert(42, vec.clone(), meta, None).await?;

    // 3. Basic Search
    let results = client.search(vec.clone(), 5, None).await?;
    
    // 4. Advanced / Hybrid Search
    // e.g. Find semantically similar items that also mention "item"
    let hybrid = Some(("item".to_string(), 0.5)); 
    let results = client.search_advanced(vec, 5, vec![], hybrid, None).await?;
    
    for res in results {
        println!("Match: {} (dist: {})", res.id, res.distance);
    }
    
    Ok(())
}

Features

  • embedders: Enables set_embedder, insert_document, and search_document. Requires reqwest and serde.

Use search_batch or search_batch_f32 to reduce per-request overhead in high-concurrency workloads.

Graph Traversal API

Rust SDK exposes graph calls directly:

  • get_node
  • get_neighbors
  • get_concept_parents
  • traverse
  • find_semantic_clusters

Rebuild with Metadata Pruning

Use rebuild_index_with_filter to run vacuum/rebuild and prune vectors in one request:

client
    .rebuild_index_with_filter(
        "docs_rust".to_string(),
        "energy".to_string(),
        "lt".to_string(),
        0.1,
    )
    .await?;

Hyperbolic Math Utilities

use hyperspace_sdk::math::{
    mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean,
};

WebAssembly (WASM)

Integrations (LangChain & n8n)

Model Context Protocol (MCP)

API Reference

HyperspaceDB operates on a Dual-API architecture:

  1. gRPC (Data Plane): High-performance ingestion and search.
  2. HTTP (Control Plane): Management, monitoring, and dashboard integration.

📑 gRPC API (Data Plane)

Defined in hyperspace.proto. Used by SDKs (Python, Rust, Go).

Collection Management

CreateCollection

Creates a new independent vector index.

rpc CreateCollection (CreateCollectionRequest) returns (StatusResponse);

message CreateCollectionRequest {
  string name = 1;
  uint32 dimension = 2; // e.g. 1536, 1024, 64
  string metric = 3;    // "l2", "euclidean", "cosine", "poincare", "lorentz"
}

DeleteCollection

Drops a collection and all its data.

rpc DeleteCollection (DeleteCollectionRequest) returns (StatusResponse);

ListCollections

Retrieves all active collections for the current tenant, including their metadata.

rpc ListCollections (Empty) returns (ListCollectionsResponse);

message ListCollectionsResponse {
  repeated CollectionSummary collections = 1;
}

message CollectionSummary {
  string name = 1;
  uint64 count = 2;
  uint32 dimension = 3;
  string metric = 4;
}

GetCollectionStats

Returns real-time statistics for a single collection.

rpc GetCollectionStats (CollectionStatsRequest) returns (CollectionStatsResponse);

message CollectionStatsResponse {
  uint64 count = 1;
  uint32 dimension = 2;
  string metric = 3;
  uint64 indexing_queue = 4;
}

Vector Operations

Insert

Ingests a vector into a specific collection.

rpc Insert (InsertRequest) returns (InsertResponse);

message InsertRequest {
  string collection = 1;      // Collection name
  repeated double vector = 2; // Data point
  uint32 id = 3;              // External ID
  map<string, string> metadata = 4; // Metadata tags
  DurabilityLevel durability = 7; // Durability override
  map<string, MetadataValue> typed_metadata = 8; // Typed metadata (int/float/bool/string)
}

enum DurabilityLevel {
  DEFAULT_LEVEL = 0; // Use server config
  ASYNC = 1;         // Flush OS cache (Fastest)
  BATCH = 2;         // Background fsync (Balanced)
  STRICT = 3;        // Fsync every write (High Safety)
}

typed_metadata is the preferred metadata path for new clients. String metadata remains as a compatibility path.

Search

Finds nearest neighbors.

rpc Search (SearchRequest) returns (SearchResponse);

message SearchRequest {
  string collection = 1;
  repeated double vector = 2;
  uint32 top_k = 3;
  // Metadata string filter (e.g. "category:book")
  map<string, string> filter = 4;
  // Complex filter object
  repeated Filter filters = 5;
  // Hybrid search
  optional string hybrid_query = 6;
  optional float hybrid_alpha = 7;
  // Wasserstein 1D CDF O(N) distance
  optional bool use_wasserstein = 8;
}
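As background for the `use_wasserstein` flag: for two equal-length 1-D samples, the Wasserstein-1 (CDF) distance reduces to the mean absolute difference of their sorted values, a linear scan once the inputs are sorted (a plain-Python sketch, independent of the engine's kernel):

```python
def wasserstein_1d(a, b):
    """W1 between two equal-length 1-D samples: the optimal transport
    plan pairs order statistics, so the distance is the mean absolute
    difference of sorted values."""
    assert len(a) == len(b)
    sa, sb = sorted(a), sorted(b)
    return sum(abs(x - y) for x, y in zip(sa, sb)) / len(a)

assert wasserstein_1d([0.0, 1.0], [1.0, 0.0]) == 0.0  # same distribution
assert wasserstein_1d([0.0, 0.0], [1.0, 1.0]) == 1.0  # shifted by 1
```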

Geometric Filters (New in v3.0)

HyperspaceDB v3.0 introduces native spatial constraints. These run on the bitset level inside the engine and are significantly faster than application-level filtering.

```protobuf
message Filter {
  oneof condition {
    Match match = 1;
    Range range = 2;
    InCone in_cone = 3;
    InBox in_box = 4;
    InBall in_ball = 5;
  }
}

// 1. Proximity Filter
message InBall {
  repeated double center = 1;
  double radius = 2;
}

// 2. N-Dimensional Bounding Box
message InBox {
  repeated double min_bounds = 1;
  repeated double max_bounds = 2;
}

// 3. Angular Cone (for ConE-style embeddings)
message InCone {
  repeated double axes = 1;      // Vector direction
  repeated double apertures = 2; // Angular width (radians)
  double cen = 3;                // Centrality offset
}
```

`SearchResult` now includes both `metadata` and `typed_metadata`.
Range filters are evaluated with numeric semantics (`f64`) against typed metadata numeric values.
For gRPC clients, decimal thresholds are supported via `Range.gte_f64` / `Range.lte_f64` (`gte/lte` `int64` remains as compatibility path).

gRPC `Range` examples:

```protobuf
// Integer threshold (compatibility path)
Filter {
  range: {
    key: "depth",
    gte: 2,
    lte: 10
  }
}

// Decimal threshold (recommended for typed numeric metadata)
Filter {
  range: {
    key: "energy",
    gte_f64: 0.8,
    lte_f64: 1.0
  }
}
```

SearchBatch

Finds nearest neighbors for multiple queries in a single RPC call.

rpc SearchBatch (BatchSearchRequest) returns (BatchSearchResponse);

message BatchSearchRequest {
  repeated SearchRequest searches = 1;
}

message BatchSearchResponse {
  repeated SearchResponse responses = 1;
}

Recommended for high-concurrency clients and benchmarks to reduce per-request gRPC overhead.

SubscribeToEvents

Streams CDC events for post-insert/delete hooks.

rpc SubscribeToEvents (EventSubscriptionRequest) returns (stream EventMessage);

enum EventType {
  EVENT_UNKNOWN = 0;
  VECTOR_INSERTED = 1;
  VECTOR_DELETED = 2;
}

message EventSubscriptionRequest {
  repeated EventType types = 1;
  optional string collection = 2;
}

message EventMessage {
  EventType type = 1;
  oneof payload {
    VectorInsertedEvent vector_inserted = 2;
    VectorDeletedEvent vector_deleted = 3;
  }
}

Use this stream to build external pipelines (audit, Elasticsearch sync, graph projections, Neo4j updaters). SDKs (Python/TypeScript/Rust) expose convenience subscription methods for this stream.

Reliability note:

  • Stream consumers may lag under burst load; the server handles lagged broadcast reads without dropping the whole stream task.
  • Tune HS_EVENT_STREAM_BUFFER for higher event fan-out pressure.

Delete

Removes a single vector from a collection by its external ID.

rpc Delete (DeleteRequest) returns (DeleteResponse);

message DeleteRequest {
  string collection = 1;
  uint32 id = 2;
}

message DeleteResponse {
  bool success = 1;
}

πŸ” Delta Sync Protocol

Advanced synchronization for consistency verification and recovery.

SyncHandshake

Computes the difference between client and server states using Merkle-like bucket hashes.

rpc SyncHandshake (SyncHandshakeRequest) returns (SyncHandshakeResponse);

SyncPull

Streams missing vectors from the server based on differing buckets.

rpc SyncPull (SyncPullRequest) returns (stream SyncVectorData);

SyncPush

Streams client-side unique vectors to the server to achieve global consistency.

rpc SyncPush (stream SyncVectorData) returns (SyncPushResponse);

MetadataValue (Typed Metadata)

message MetadataValue {
  oneof kind {
    string string_value = 1;
    int64 int_value = 2;
    double double_value = 3;
    bool bool_value = 4;
  }
}

Graph Traversal API (v2.3)

rpc GetNode (GetNodeRequest) returns (GraphNode);
rpc GetNeighbors (GetNeighborsRequest) returns (GetNeighborsResponse);
rpc GetConceptParents (GetConceptParentsRequest) returns (GetConceptParentsResponse);
rpc Traverse (TraverseRequest) returns (TraverseResponse);
rpc FindSemanticClusters (FindSemanticClustersRequest) returns (FindSemanticClustersResponse);

Key safety guards:

  • GetNeighborsRequest.limit and offset for bounded pagination.
  • TraverseRequest.max_depth and max_nodes to prevent unbounded graph walks.
  • FindSemanticClustersRequest.max_clusters and max_nodes for bounded connected-component scans.

TraverseRequest is filter-aware and supports both:

  • filter (map<string,string>)
  • filters (Match / Range)

GetNeighborsResponse now includes edge_weights, where edge_weights[i] is the distance from source node to neighbors[i].

RebuildIndex with pruning filter (v2.2.1)

message RebuildIndexRequest {
  string name = 1;
  optional VacuumFilterQuery filter_query = 2;
}

message VacuumFilterQuery {
  string key = 1;
  string op = 2; // "lt" | "lte" | "gt" | "gte" | "eq" | "ne"
  double value = 3;
}

Use this API for pruning cycles when you need to rebuild an index and drop low-value vectors in one server-side operation.

TriggerReconsolidation (v3.0.1)

Trigger AI Sleep Mode (Riemannian SGD / Flow Matching) directly on the engine to algorithmically shift vectors.

rpc TriggerReconsolidation (ReconsolidationRequest) returns (StatusResponse);

message ReconsolidationRequest {
  string collection = 1;
  repeated double target_vector = 2;
  double learning_rate = 3;
}

InsertText (v3.0.1)

Inserts raw text to be embedded and stored on the server.

rpc InsertText (InsertTextRequest) returns (InsertResponse);

message InsertTextRequest {
  string collection = 1;
  string text = 2;
  uint32 id = 3;
  map<string, MetadataValue> typed_metadata = 4;
}

Vectorize (v3.0.1)

Converts text to a vector using the server's embedding engine.

rpc Vectorize (VectorizeRequest) returns (VectorizeResponse);

message VectorizeRequest {
  string text = 1;
  string metric = 2; // "l2", "cosine", "poincare", "lorentz"
}

message VectorizeResponse {
  repeated double vector = 1;
}

SearchText (v3.0.1)

Searches the collection using a text query.

rpc SearchText (SearchTextRequest) returns (SearchResponse);

message SearchTextRequest {
  string collection = 1;
  string text = 2;
  uint32 top_k = 3;
  repeated Filter filters = 4;
}

🌐 HTTP API (Control Plane)

Served on port 50050 (default). All endpoints under /api.

Authentication & Multi-Tenancy

Every request should include:

  • x-api-key: API Key (optional if disabled, but recommended)
  • x-hyperspace-user-id: Tenant Identifier (e.g. client_123). If omitted, defaults to default_admin.

Cluster Status

GET /api/cluster/status

Returns the node's identity and topology role.

{
  "node_id": "uuid...",
  "role": "Leader", // or "Follower"
  "upstream_peer": null,
  "downstream_peers": []
}

Swarm Peers (Gossip Protocol)

GET /api/swarm/peers

Returns active peers discovered via UDP multicast (Edge-to-Edge Sync).

{
  "gossip_enabled": true,
  "peer_count": 2,
  "peers": [...]
}

Node Status (Compatibility)

GET /api/status

Returns runtime status and node configuration. Dashboard uses this endpoint first, with fallback to /api/cluster/status.

System Metrics

GET /api/metrics

Real-time system resource usage.

{
    "cpu_usage_percent": 12,
    "ram_usage_mb": 512,
    "disk_usage_mb": 1024,
    "total_collections": 5,
    "total_vectors": 1000000
}

Admin / Billing (Since v2.0)

Requires user_id: admin

GET /api/admin/usage

Returns JSON map of user_id -> usage_stats:

{
  "tenant_A": {
    "collection_count": 2,
    "vector_count": 1500,
    "disk_usage_bytes": 1048576
  }
}

List Collections

GET /api/collections

Returns summary of all active collections.

[
  {
    "name": "my_docs",
    "count": 1500,
    "dimension": 1536,
    "metric": "l2"
  }
]

Collection Search (HTTP Playground)

POST /api/collections/{name}/search

Convenience endpoint for dashboard/manual testing.

{
  "vector": [0.1, 0.2, 0.3],
  "top_k": 5
}

Graph HTTP Endpoints (Dashboard / tooling)

  • GET /api/collections/{name}/graph/node?id={id}&layer={layer}
  • GET /api/collections/{name}/graph/neighbors?id={id}&layer={layer}&limit={limit}&offset={offset}
  • GET /api/collections/{name}/graph/parents?id={id}&layer={layer}&limit={limit}
  • POST /api/collections/{name}/graph/traverse
  • POST /api/collections/{name}/graph/clusters

User Guide

Server Configuration

HyperspaceDB is configured via environment variables or a .env file.

Core Settings

| Variable | Default | Description |
|---|---|---|
| RUST_LOG | info | Log level (debug, info, error) |
| HS_PORT | 50051 | gRPC listening port |
| HS_HTTP_PORT | 50050 | HTTP Dashboard port |
| HS_DATA_DIR | ./data | Path to store segments and WAL |
| HS_IDLE_TIMEOUT_SEC | 3600 | Inactivity time (seconds) before a collection unloads to disk |
| HS_DIMENSION | 1024 | Default vector dimensionality (8, 64, 768, 1024, 1536, 3072, 4096, 8192) |
| HS_METRIC | cosine | Distance metric (cosine, poincare, l2, euclidean, lorentz) |
| HS_QUANTIZATION_LEVEL | none | Compression (none, scalar (i8), binary (1-bit)) |
| HS_STORAGE_FLOAT32 | false | Store raw vectors as f32 (mode=none) and promote to f64 in distance kernels |
| HS_FAST_UPSERT_DELTA | 0.0 | Fast upsert L2 threshold. 0.0 disables; typical 0.001..0.05 for iterative updates; too high can keep stale graph links |
| HS_EVENT_STREAM_BUFFER | 1024 | Broadcast ring size for CDC and replication streams |
| HS_RERANK_ENABLED | false | Enable exact top-K re-ranking after ANN candidate retrieval |
| HS_RERANK_OVERSAMPLE | 4 | Candidate multiplier used before exact re-rank (top_k * factor) |
| HS_GPU_BATCH_ENABLED | false | Enable runtime auto-dispatch policy for batch metric kernels |
| HS_GPU_MIN_BATCH | 128 | Minimum batch size for GPU offload policy |
| HS_GPU_MIN_DIM | 1024 | Minimum vector dimension for GPU offload policy |
| HS_GPU_MIN_WORK | 262144 | Minimum workload (batch * dim) for GPU offload |
| HS_GPU_L2_ENABLED | true | Enable GPU dispatch for L2 batch kernel (requires gpu-runtime feature) |
| HS_GPU_COSINE_ENABLED | true | Enable GPU dispatch for cosine batch kernel (requires gpu-runtime feature) |
| HS_GPU_POINCARE_ENABLED | true | Enable GPU dispatch for Poincaré batch kernel (requires gpu-runtime feature) |
| HS_GPU_LORENTZ_ENABLED | true | Enable GPU dispatch for Lorentz float batch kernel (runtime path) |
| HS_SEARCH_BATCH_INNER_CONCURRENCY | 1 | Internal parallel fan-out in SearchBatch handler (bounded) |
| HS_SEARCH_CONCURRENCY | 0 | Global concurrent search-task limit per collection (0 = auto by CPU cores, clamped to CPU*4) |

Cloud Tiering (S3)

Enabled only when compiled with s3-tiering feature.

| Variable | Default | Description |
|---|---|---|
| HS_STORAGE_BACKEND | local | local (all chunks on disk) or s3 (offload cold chunks) |
| HS_MAX_LOCAL_CACHE_GB | 10 | Hard limit for local disk cache in Gigabytes |
| HS_S3_BUCKET | - | Target S3 bucket name |
| HS_S3_REGION | us-east-1 | AWS Region |
| HS_S3_ENDPOINT | - | Custom endpoint (e.g. http://minio:9000) |
| HS_S3_ACCESS_KEY | - | S3 Access Key ID |
| HS_S3_SECRET_KEY | - | S3 Secret Access Key |
| HS_S3_MAX_RETRIES | 5 | Retries for failed uploads/downloads |
| HS_S3_UPLOAD_CONCURRENCY | 4 | Semaphore-limited parallel uploads |
| HS_WAL_SEGMENT_SIZE_MB | 256 | Size before WAL rotation (influences chunk size) |
| HS_CHUNK_PROBE_K | 3 | Number of most relevant chunks to search per query |

HNSW Index Tuning

| Variable | Default | Description |
|---|---|---|
| HS_HNSW_M | 64 | Max connections per layer |
| HS_HNSW_EF_CONSTRUCT | 200 | Build quality (50-500). Higher = slower build, better recall. |
| HS_HNSW_EF_SEARCH | 100 | Search beam width (10-500). Higher = slower search, better recall. |
| HS_FILTER_BRUTEFORCE_THRESHOLD | 50000 | If the filtered candidate count is below this threshold, layer-0 uses exact brute-force instead of graph traversal |
| HS_INDEXER_CONCURRENCY | 1 | Check README for threading strategies (0=Auto, 1=Serial) |

Persistence & Durability

| Variable | Default | Description |
|---|---|---|
| HYPERSPACE_WAL_SYNC_MODE | batch | WAL sync strategy: strict (fsync), batch (100ms lag), async (OS cache) |
| HYPERSPACE_WAL_BATCH_INTERVAL | 100 | Batch interval in milliseconds |

Memory Management (Jemalloc)

HyperspaceDB uses Jemalloc for efficient memory allocation. Tune it via MALLOC_CONF:

  • Low RAM (Aggressive): MALLOC_CONF=background_thread:true,dirty_decay_ms:0,muzzy_decay_ms:0
  • Balanced (Default): MALLOC_CONF=background_thread:true,dirty_decay_ms:5000,muzzy_decay_ms:5000

Security

| Variable | Default | Description |
|---|---|---|
| HYPERSPACE_API_KEY | - | If set, requires the x-api-key header for all requests |

Multi-Tenancy

HyperspaceDB supports strict data isolation via the x-hyperspace-user-id header.

  • Isolation: Every request with an x-hyperspace-user-id header operates within that user's private namespace.
  • Internal Naming: Collections are stored internally as userid_collectionname.
  • Default Admin: If x-hyperspace-user-id is omitted but a valid x-api-key is provided, the user is treated as default_admin.
  • SaaS Integration: Gateways should inject this header after authenticating users.

Lorentz metric notes

When HS_METRIC=lorentz, vectors must satisfy hyperboloid constraints:

  • t > 0 (upper sheet)
  • -t^2 + x_1^2 + ... + x_n^2 = -1
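The second constraint determines t from the spatial coordinates, so a Euclidean vector can always be lifted onto the valid sheet before insertion (an illustrative helper, not part of the server):

```python
import math

def lift_to_hyperboloid(x):
    """Map a Euclidean vector onto the upper hyperboloid sheet by
    solving -t^2 + ||x||^2 = -1 for t > 0."""
    t = math.sqrt(1.0 + sum(v * v for v in x))
    return [t] + list(x)

p = lift_to_hyperboloid([0.6, 0.8])
t, *space = p
assert t > 0
assert abs(-t * t + sum(v * v for v in space) + 1.0) < 1e-12
```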

Web Dashboard

HyperspaceDB includes a comprehensive Web Dashboard at http://localhost:50050.

Features:

  • Cluster Status: View node role (Leader/Follower) and topology.
  • Collections: Create, delete, and inspect collection statistics.
  • Explorer: Search playground with filters and typed metadata visibility.
  • Graph Explorer: Query neighbors and concept-parent graph views from HNSW layers.
  • Metrics: Real-time RAM and CPU usage.

TUI Dashboard (Legacy)

For terminal-based monitoring:

./hyperspace-cli

Key Controls

  • TAB: Switch tabs.
  • [S]: Trigger snapshot.
  • [V]: Trigger vacuum.
  • [Q]: Quit.

Embedding Service

Advanced Features

🤝 Federated Clustering (v1.2)

HyperspaceDB v1.2 introduces a Federated Leader-Follower architecture. This goes beyond simple read-replication, introducing Node Identity, Logical Clocks, and Topology Awareness to support future Edge-Cloud synchronization scenarios.

Concepts

Node Identity

Every node in the cluster is assigned a persistent, unique UUID (node_id) upon first startup. This ID is used to track the origin of write operations in the replication log.

Roles

  • Leader (Coordinator):
    • Accepts Writes (Insert, Delete, CreateCollection).
    • Manages the Cluster Topology.
    • Streams WAL events to connected Followers.
  • Follower (Replica):
    • Read-Only.
    • Replicates state from the Leader in real-time.
    • Can be promoted to Leader if needed.
  • Edge Node (Planned v1.4):
    • Offline-first node that accumulates writes and syncs via Merkle Trees when online.

Configuration

Leader

Simply start the server. By default, it assumes the Leader role.

./hyperspace-server --port 50051

Follower

Start with --role follower and point to the leader's URL.

./hyperspace-server --port 50052 --role follower --leader http://127.0.0.1:50051

Monitoring Topology

You can inspect the cluster state via the HTTP API on the Dashboard port (default 50050).

Request:

curl http://localhost:50050/api/cluster/status

Response:

{
  "node_id": "e8b37fde-6c60-427f-8a09-47103c2da80e",
  "role": "Leader",
  "upstream_peer": null,
  "downstream_peers": [],
  "logical_clock": 1234
}

This JSON response tells you:

  • The node's unique ID.
  • Its current role.
  • Who it is following (if Follower).
  • Who is following it (if Leader).
  • The current logical timestamp of its database state.

Edge-to-Edge Gossip Swarm (v3.0)

Beyond centralized replication, v3.0 introduces a decentralized Peer-to-Peer UDP Swarm network. This feature is crucial for robotics and offline-first autonomous agents.

Features

  • Zero-Configuration Topology: Nodes broadcast heartbeat logs via UDP (tokio::net::UdpSocket).
  • Self-Healing: Unresponsive nodes (TTL > 30s) are automatically dropped from the registry.
  • Auto-Discovery: Swarm nodes discover each other and exchange Logical Clocks and Collection Digests for the Merkle Delta Sync.

Swarm Configuration

Add these variables to your environment or .env file to start joining the global Swarm:

# Enable the Gossip listener on the specified local port
HS_GOSSIP_PORT=7946

# Bootstrapping nodes to connect to
HS_GOSSIP_PEERS=192.168.1.10:7946,192.168.1.11:7946

Swarm State Monitoring

You can monitor the active mesh structure from the dashboard UI or standard HTTP:

Request:

curl http://localhost:50050/api/swarm/peers

Response:

{
  "gossip_enabled": true,
  "peer_count": 1,
  "peers": [
    {
      "node_id": "a92jfe...",
      "addr": "192.168.1.10:50050",
      "http_port": 50050,
      "role": "Leader",
      "logical_clock": 4200,
      "collections": [
        {
          "name": "vision_system",
          "state_hash": 6712399120,
          "vector_count": 500
        }
      ],
      "last_seen_secs": 1729384910,
      "healthy": true
    }
  ]
}

🧠 Hybrid Search

HyperspaceDB combines Hyperbolic Vector Search with Lexical (Keyword) Search to provide the best of both worlds.

This is powered by Reciprocal Rank Fusion (RRF), which normalizes scores from both engines and merges them.

Conceptual Flow

  1. Vector Search: Finds semantically similar items (e.g. "smartphone" finds "iPhone").
  2. Keyword Search: Finds exact token matches in metadata (e.g. "iphone" finds items with "iphone" in title).
  3. RRF Fusion: Score = 1/(k + rank_vec) + 1/(k + rank_lex).
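The RRF formula in step 3 can be implemented in a few lines (an illustrative sketch; the value of k and tie-breaking details in the engine are assumptions):

```python
def rrf_fuse(vector_ids, keyword_ids, k=60):
    """Reciprocal Rank Fusion over two ranked id lists.
    score(id) = 1/(k + rank_vec) + 1/(k + rank_lex); ranks are 1-based,
    and an id missing from one list simply contributes nothing there."""
    scores = {}
    for ranking in (vector_ids, keyword_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(vector_ids=[1, 2, 3], keyword_ids=[3, 1, 4])
assert fused[0] == 1  # ranked high by both engines wins
```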

API Usage

Python

results = client.search(
    vector=query_vector,
    top_k=10,
    hybrid_query="apple macbook",  # Lexical query
    hybrid_alpha=0.5               # Balance factor (default 60.0 in RRF usually, but exposed as alpha here)
)

Rust

let results = client.search_advanced(
    query_vector,
    10,
    vec![],
    Some(("apple macbook".to_string(), 0.5)) // (lexical query, alpha)
).await?;

Tokenization

Currently, all string metadata values are automatically tokenized (split by whitespace, lowercase, alphanumeric) and indexed in an inverted index.

πŸ“‰ Vector Quantization

HyperspaceDB supports multiple storage modes to balance Precision vs Memory vs Speed. All modes operate transparently; no SDK changes are required.


Quantization Modes

| Mode | Bits/dim | Compression | Recall@10 | Best For |
|------|----------|-------------|-----------|----------|
| None | 64 (f64) | 1× | 100% | Research, exact recall |
| ScalarI8 | 8 (i8) | 8× | ~98% | Production default |
| SQ8 Anisotropic | 8 (i8) | 8× | ~99%+ | Cosine / L2 (Sprint 6.2) |
| Binary | 1 (bit) | 64× | ~75–85% | Re-ranking, large datasets |
| Lorentz SQ8 | 8 (i8) + scale | ~8× | ~95–98% | Hyperboloid (Lorentz) metric |
| Zonal (MOND) | mixed | 30–40% less RAM | ~99% | Hyperbolic (core + boundary) |

1. ScalarI8 (Default)

The default mode. Coordinates are mapped from f64 to i8 ∈ [-127, 127] via:

q_i = round(x_i * 127)       // For Poincaré: x_i ∈ (-1, 1)

  • Compression: 8× vs f64
  • Recall: ~98% (@10 neighbors)
  • Distance: Dequantized at query time (q_i / 127.0)
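The round trip can be sketched in Python (illustrative only; the engine does this in Rust with SIMD):

```python
def quantize_i8(x):
    """Map Poincaré coordinates x_i in (-1, 1) to i8 in [-127, 127]."""
    return [max(-127, min(127, round(xi * 127))) for xi in x]

def dequantize_i8(q):
    """Recover approximate f64 coordinates at query time."""
    return [qi / 127.0 for qi in q]
```

The quantization error per coordinate is bounded by half a step, i.e. about 1/254 ≈ 0.004 in coordinate space.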

2. SQ8 Anisotropic (Sprint 6.2 / 7.1, ScaNN-Inspired)

Standard isotropic quantization applies uniform rounding to all dimensions, which distorts the direction (angle) of a vector. For Cosine/L2 metrics, angular error causes more recall degradation than magnitude error.

Anisotropic SQ8 penalizes orthogonal (directional) error far more than parallel (magnitude) error during the quantization refinement step.

Loss Function

$$L = |e_\parallel|^2 + t_w \cdot |e_\perp|^2$$

Where:

  • $e_\parallel = (e \cdot \hat{x}) \hat{x}$: projection of quantization error onto the original vector direction
  • $e_\perp$: component orthogonal to the original vector
  • $t_w = 10$ (anisotropy weight): penalizes directional error 10× more than magnitude error
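The error decomposition is easy to verify numerically. A minimal Python sketch of the loss (function name illustrative) shows that a purely directional error is penalized 10× more than a magnitude error of equal size:

```python
import math

def anisotropic_loss(x, x_hat, t_w=10.0):
    """L = |e_par|^2 + t_w * |e_perp|^2 for quantization error e = x_hat - x."""
    e = [a - b for a, b in zip(x_hat, x)]
    norm_x = math.sqrt(sum(v * v for v in x))
    unit = [v / norm_x for v in x]                   # direction of the original vector
    proj = sum(ei * ui for ei, ui in zip(e, unit))   # signed length of e along x
    e_par_sq = proj * proj                           # |e_parallel|^2
    e_perp_sq = sum(ei * ei for ei in e) - e_par_sq  # |e|^2 - |e_par|^2
    return e_par_sq + t_w * e_perp_sq
```

For x = (1, 0), an error of 0.1 along x costs 0.01, while the same 0.1 orthogonal to x costs 0.1.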

Coordinate Descent Refinement

After the initial isotropic quantization, each coordinate is refined by Β±1 step in i8-space and the one minimizing the anisotropic loss is selected:

// After the initial isotropic quantization, refine each coordinate in turn.
for i in 0..n {
    let mut best = q[i];
    let mut best_loss = f64::MAX;
    for delta in [-1i16, 0, 1] {
        let candidate = (q[i] as i16 + delta).clamp(-127, 127) as i8;
        // Recompute the error decomposition with `candidate` substituted for q[i]
        let loss = e_parallel_sq + t_weight * e_ortho_sq;
        if loss < best_loss {
            best_loss = loss;
            best = candidate;
        }
    }
    q[i] = best;
}

Results

| Metric | Mode | Recall@10 Gain |
|--------|------|----------------|
| Cosine | ScalarI8 → Anisotropic SQ8 | +5–8% |
| L2 | ScalarI8 → Anisotropic SQ8 | +3–5% |

Implementation

The anisotropic refinement is in QuantizedHyperVector::from_float() in crates/hyperspace-core/src/vector.rs.


3. Lorentz SQ8 (Dynamic-Range)

The Lorentz (hyperboloid) model has unbounded coordinates: the time component x[0] = cosh(r) grows exponentially. A fixed [-1, 1] mapping would saturate immediately.

Solution: Per-vector dynamic-range scaling:

scale = max(|x_i|)
q_i   = round(x_i / scale * 127)   // i8
α     = scale                        // stored in alpha field (f32)

Dequantization: x̃_i = (q_i / 127.0) * α
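The per-vector scaling can be sketched as follows (an illustrative Python model of the scheme; the actual encoder is `from_float_lorentz()` in Rust):

```python
def lorentz_sq8(x):
    """Per-vector dynamic-range SQ8: scale by max |x_i|, store alpha = scale."""
    scale = max(abs(xi) for xi in x)
    q = [round(xi / scale * 127) for xi in x]  # i8 in [-127, 127]
    return q, scale

def lorentz_dequant(q, scale):
    """x~_i = (q_i / 127) * alpha."""
    return [qi / 127.0 * scale for qi in q]
```

Because the scale is chosen per vector, the large time component x[0] = cosh(r) never saturates: it always maps to exactly ±127.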

See Lorentz SQ8 deep-dive for full details.


4. Binary (1-bit)

Each coordinate is compressed to its sign bit. Distance uses Hamming distance.

  • Compression: 64× vs f64
  • Recall: ~75–85% (metric-dependent)
  • Use case: First-pass re-ranking candidate retrieval over very large datasets
  • ⚠️ Not supported for Lorentz: sign destroys hierarchical depth information
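Sign-bit packing and Hamming distance can be sketched in Python (illustrative; production code packs into machine words and uses popcount):

```python
def to_sign_bits(x):
    """Pack sign bits into an int: bit i is 1 iff x_i >= 0."""
    bits = 0
    for i, xi in enumerate(x):
        if xi >= 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Hamming distance between two bit-packed vectors (XOR + popcount)."""
    return bin(a ^ b).count("1")
```

Two vectors that agree in sign on every coordinate have distance 0, regardless of magnitude, which is exactly why this mode loses hierarchical depth information in the Lorentz model.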

5. Zonal Quantization β€” MOND (Sprint 6.3)

Inspired by Modified Newtonian Dynamics: near the center of hyperbolic space the metric is smooth, but it explodes near the horizon.

#![allow(unused)]
fn main() {
pub enum ZonalVector {
    Core(Vec<i8>),       // ||x|| < 0.5: compress to i8 (~8x RAM saving)
    Boundary(Vec<f64>),  // ||x|| >= 0.5: keep full precision
}
}

Enabled by a separate env var (independent of HS_QUANTIZATION_LEVEL):

HS_ZONAL_QUANTIZATION=true   # Enable MOND zonal storage

When enabled, zonal_storage: DashMap<NodeId, ZonalVector> completely replaces the standard mmap-based vector store. All read (get_vector) and write (insert_to_storage) paths are routed through zonal_storage.

  • RAM reduction: ~30–40% for datasets where most vectors are near the origin (||x|| < 0.5)
  • No precision loss at the boundary (where the metric is most sensitive)
  • Compatible with all metrics, not just Poincaré
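The zone decision itself is a simple norm check. A Python sketch of the encoding rule (names illustrative; the real store is the `ZonalVector` enum shown above):

```python
import math

CORE_RADIUS = 0.5  # ||x|| threshold separating Core from Boundary

def zonal_encode(x):
    """Core vectors (near origin) compress to i8; boundary vectors keep f64."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm < CORE_RADIUS:
        return ("Core", [round(v * 127) for v in x])   # ~8x RAM saving
    return ("Boundary", list(x))                        # full precision
```

Since hyperbolic distance is most sensitive near the horizon, precision is spent exactly where the metric explodes.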

Configuration

Quantization mode is set via environment variable before creating a collection. The mode is saved in meta.json alongside each collection and applied on reload.

# Default (ScalarI8 with Anisotropic refinement)
HS_QUANTIZATION_LEVEL=scalar

# Binary (1-bit Hamming)
HS_QUANTIZATION_LEVEL=binary

# Full f64 precision (debugging / research)
HS_QUANTIZATION_LEVEL=none

⚠️ Note: The --mode CLI flag does not exist. Configuration is exclusively through HS_QUANTIZATION_LEVEL (env var or .env file). The mode is stored per-collection in <data_dir>/<collection>/meta.json at creation time.

Note: The Lorentz SQ8 path is selected automatically when a collection's metric is lorentz, regardless of HS_QUANTIZATION_LEVEL. The from_float_lorentz() encoder is dispatched by the index layer (hyperspace-index/src/lib.rs).


Choosing the Right Mode

Dataset characteristics
    │
    ├─ Full precision required (research)? ───────→ HS_QUANTIZATION_LEVEL=none
    │
    ├─ Lorentz/Hyperbolic metric? ────────────────→ Automatic (dynamic-range SQ8)
    │
    ├─ Memory-critical (>100M vectors)? ──────────→ HS_QUANTIZATION_LEVEL=binary
    │
    ├─ Cosine / L2, high recall needed? ──────────→ HS_QUANTIZATION_LEVEL=scalar (default)
    │                                                 → Anisotropic refinement applied
    └─ Hyperbolic, mixed density? ────────────────→ Zonal (MOND) via ZonalVector store

Lorentz SQ8 & GPU

πŸ”’ Security & Auth

HyperspaceDB includes built-in security features for production deployments.

API Authentication

We use a simple but effective API Key mechanism.

Enabling Auth

Set the HYPERSPACE_API_KEY environment variable when starting the server.

export HYPERSPACE_API_KEY="my-secret-key-123"
./hyperspace-server

If this variable is NOT set, authentication is disabled (dev mode).

Client Usage

Clients must pass the key in the x-api-key metadata header.

Python:

client = HyperspaceClient(
    host="localhost:50051", 
    api_key="my-secret-key-123",
    user_id="tenant_name"  # Optional: For multi-tenancy
)

Rust:

// Use the updated connect function
let client = Client::connect(
    "http://0.0.0.0:50051".to_string(),
    Some("my-secret-key-123".to_string()),
    Some("tenant_name".to_string())
).await?;

Multi-Tenancy Isolation

Use x-hyperspace-user-id header to isolate data per user.

  • Gateway Responsibility: Ensure your API Gateway validates user tokens and injects this header securely.
  • Internal Scope: Data created with a user_id is invisible to other users and the default admin scope.

Security Implementation

  • SHA-256 Hashing: The server computes SHA256(env_key) at startup and stores only the hash.
  • Constant-Time Comparison: Incoming keys are hashed and compared to prevent timing attacks.

S3 Cloud Tiering

Data Safety & Durability

HyperspaceDB Architecture Guide

HyperspaceDB is a specialized vector database designed for high-performance hyperbolic embedding search. This document details its internal architecture, storage format, and indexing strategies.


πŸ— System Overview

The system follows a strict Command-Query Separation (CQS) pattern, tailored for write-heavy ingestion and latency-sensitive search.

graph TD
    Client["Client (gRPC)"] -->|Insert| S[Server Service]
    Client -->|Search| S
    
    subgraph Persistence Layer
        S -->|1. Append| WAL[Write-Ahead Log]
        S -->|2. Append| VS[Vector Store]
    end
    
    subgraph Indexing Layer
        S -->|3. Send ID| Q["Async Queue (Channel)"]
        Q -->|Pop| W[Indexer Worker]
        W -->|Update| HNSW["HNSW Graph (RAM)"]
    end

    subgraph Embedding Layer
        S -->|InsertText| EE[Embedding Service]
        EE -->|Chunking| BE[Embedding Backends]
    end
    
    subgraph Background Tasks
        Snap[Snapshotter] -->|Serialize| Disk["Index Snapshot (.snap)"]
    end

πŸ’Ύ Storage Layer (hyperspace-store)

1. Vector Storage (data/)

Vectors are stored in a segmented, append-only format using Memory-Mapped Files (mmap).

  • Segments: Data is split into chunks of 65,536 vectors (2^16).
  • Files: chunk_0.hyp, chunk_1.hyp, etc.
  • Quantization: Vectors are optionally quantized (e.g., ScalarI8), reducing size from 64-bit float to 8-bit integer per dimension (8x compression).

2. Write-Ahead Log (wal.log)

Writes are durable. Every insert is immediately persisted to wal.log before being acknowledged. Upon restart, the WAL helps recover data that wasn't yet persisted in the Index Snapshot.


πŸ•Έ Indexing Layer (hyperspace-index)

Hyperbolic HNSW

We implement a modified Hierarchical Navigable Small World graph optimized for the Poincaré Ball model.

  • Distance Metric: Poincaré distance formula: $$ d(u, v) = \text{acosh}\left(1 + 2 \frac{||u-v||^2}{(1-||u||^2)(1-||v||^2)}\right) $$
  • Optimization: We compare $||u-v||^2$ and cached normalization factors $\alpha = 1/(1-||u||^2)$ to avoid expensive acosh calls during graph traversal.
  • Locking: The graph uses fine-grained RwLock per node layer, allowing concurrent searches and updates.

Dynamic Configuration

Parameters ef_search (search depth) and ef_construction (build quality) are stored in AtomicUsize global config, allowing runtime tuning without restarts.


⚑️ Performance Traits

  1. Async Indexing: Client receives OK as soon as data hits the WAL. Indexing happens in the background.
  2. Zero-Copy Read: Search uses mmap to read quantized vectors directly from OS cache without heap allocation.
  3. SIMD Acceleration: Distance calculations use std::simd (Portable SIMD) for 4-8x speedup on supported CPUs (AVX2, Neon).

πŸ”„ Lifecycle

  1. Startup:
    • Load index.snap (Rkyv zero-copy deserialization).
    • Replay wal.log for any missing vectors.
  2. Runtime:
    • Serve read/write requests.
    • Background worker consumes indexing queue.
    • Snapshotter periodically saves graph state.
  3. Shutdown:
    • Stop accepting writes.
    • Drain indexing queue.
    • Save final snapshot.
    • Close file handles.

Memory Management & Stability

Cold Storage Architecture

HyperspaceDB implements a "Cold Storage" mechanism to handle large numbers of collections efficiently:

  1. Lazy Loading: Collections are not loaded into RAM at startup. Instead, only metadata is scanned. The actual collection (vector index, storage) is instantiated from disk only upon the first get() request.
  2. Idle Eviction (Reaper): A background task runs every 60 seconds to scan for idle collections. Any collection not accessed for a configurable period (default: 1 hour) is automatically unloaded from memory to free up RAM.
  3. Graceful Shutdown: When a collection is evicted or deleted, its Drop implementation ensures that all associated background tasks (indexing, snapshotting) are immediately aborted, preventing resource leaks and panicked threads.

This architecture allows HyperspaceDB to support thousands of collections while keeping the active memory footprint low, scaling based on actual usage rather than total data.

Storage Format

HyperspaceDB uses a custom segmented file format designed for:

  1. Fast Appends (Zero seek time).
  2. Mmap Compatibility (OS manages caching).
  3. Space Efficiency (Quantization).

Segmentation

Data is split into "Chunks" of fixed size ($2^{16} = 65,536$ vectors). This avoids allocating one giant file and allows easier lifecycle management.

  • data/chunk_0.hyp
  • ...

LSM-Tree Segmentation

HyperspaceDB 3.0 adopts an LSM-Tree architecture. Data flows from hot memory to immutable on-disk segments:

  1. MemTable (Hot): New vectors are indexed in an in-memory HNSW.
  2. Immutable Chunks (Cold): When a WAL segment is rotated, the Flush Worker persists the MemTable into an immutable .hyp chunk. During this flush, the in-memory HNSW topology is re-written into a Spatial Navigable Graph (Vamana / DiskANN format) to minimize page faults when read via mmap from SSDs.
  3. Local vs Cloud: Chunks can live on local NVMe or be tiered to S3.

S3 Cloud Tiering (Optional)

Using the s3-tiering feature, HyperspaceDB can offload cold chunks to an S3-compatible object store.

  • LRU Cache: A byte-weighted cache (HS_MAX_LOCAL_CACHE_GB) manages how much data stays on local disk.
  • Lazy Load: Search queries automatically trigger a download if a required chunk is only present in the cloud.
  • Backpressure: Semaphore-limited concurrent downloads prevent IO/network saturation.

Directory Structure (Multi-Tenancy)

File Layout

Each .hyp file is a flat array of fixed-size records. No headers, no metadata. Metadata is stored in the Index Snapshot or recovered from layout.

Zonal Quantization (v3.0.1)

For hyperbolic collections, HyperspaceDB automatically applies Zonal Quantization (MOND theory) to vectors.

  • Vectors near the origin ($||x|| < 0.5$) are tightly compressed as i8 (Core).
  • Vectors near the infinite boundary ($||x|| \to 1$) are preserved in pure f64 (Boundary) to maintain the exact precision required for hierarchical routing.

Record Structure (ScalarI8)

When QuantizationMode::ScalarI8 is active (and vector is within the Core zone):

| Byte Offset | Content | Type |
|-------------|---------|------|
| 0..N | Quantized Coordinates | [i8; N] |
| N..N+4 | Pre-computed Alpha | f32 |

Total size per vector (for N=8): $8 + 4 = 12$ bytes. Without quantization (f64), it would be $8 \times 8 = 64$ bytes. Savings: ~81%.
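The record layout can be verified with a quick Python sketch using `struct` (illustrative; the real format is written by the Rust store):

```python
import struct

def pack_record(q, alpha):
    """[i8; N] quantized coordinates followed by a little-endian f32 alpha."""
    return struct.pack(f"<{len(q)}bf", *q, alpha)

def unpack_record(buf, n):
    """Inverse of pack_record for an N-dimensional record."""
    vals = struct.unpack(f"<{n}bf", buf)
    return list(vals[:n]), vals[n]
```

For N=8 the packed record is exactly 8 + 4 = 12 bytes, matching the table above.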

Optional raw f32 storage (v2.2.x)

For QuantizationMode::None, you can enable:

  • HS_STORAGE_FLOAT32=true

In this mode, raw vectors are stored as f32 in mmap and promoted to f64 in distance kernels.
This reduces raw-vector memory footprint by ~50% while preserving numerical behavior in hyperbolic math paths.

Write-Ahead Log (WAL)

Path: wal.log

The WAL ensures durability. Format:

  • id (u32)
  • vector ([f64; N])

It is only read during startup if the Index Snapshot is older than the last WAL entry.

RAM Backend (WASM)

For WebAssembly deployments (hyperspace-wasm), the storage backend automatically switches to RAMVectorStore.

  • Structure: Uses Vec<Arc<RwLock<Vec<u8>>>> (Heap Memory) instead of memory-mapped files.
  • Segmentation: The same chunking logic (64k vectors) is preserved. This allows the core HNSW index to use the same addressing logic (id >> 16, id & 0xFFFF) regardless of the backend.
  • Persistence: Persistence is achieved by serializing the "used" portion of segments into a Vec<u8> blob and storing it in the browser's IndexedDB.
  • Pre-allocation: Creating a DB instance pre-allocates the first chunk (64k * VectorSize bytes) to avoid frequent allocation calls during inserts.
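The shared addressing logic is just a bit split. A Python sketch of the scheme referenced above (function name illustrative):

```python
CHUNK_BITS = 16  # 2^16 = 65,536 vectors per chunk

def locate(vector_id):
    """Split a global vector ID into (chunk index, offset within chunk)."""
    return vector_id >> CHUNK_BITS, vector_id & 0xFFFF
```

Because both the mmap backend and the WASM RAM backend use the same split, the HNSW index never needs to know which backend it is talking to.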

The Hyperbolic Geometry

HyperspaceDB operates in the Poincaré ball and Lorentz (hyperboloid) models of hyperbolic geometry. This space is uniquely suited for hierarchical data (trees, graphs, taxonomies) because the amount of "space" available grows exponentially with the radius, similar to how the number of nodes in a tree grows with depth.

The Distance Formula

The distance $d(u, v)$ between two vectors $u, v$ in the Poincaré ball ($\mathbb{D}^n$) is defined as:

$$ d(u, v) = \text{arccosh}\left( 1 + 2 \frac{|u - v|^2}{(1 - |u|^2)(1 - |v|^2)} \right) $$

Where:

  • $|u|$ is the Euclidean norm of vector $u$.
  • The vectors must satisfy $|u| < 1$.

Optimization: The "Alpha" Trick

Calculating arccosh and divisions for every distance check in HNSW is expensive. HyperspaceDB optimizes this by pre-computing the curvature factors.

For every vector $x$, we store an additional scalar $\alpha_x$:

$$ \alpha_x = \frac{1}{1 - |x|^2} $$

This is stored alongside the quantized vector in our memory-mapped storage.

The Monotonicity Trick

Since $f(x) = \text{arccosh}(x)$ is a monotonically increasing function for $x \ge 1$, we do not need to compute the full arccosh during the Nearest Neighbor Search phase. We only need to compare the arguments:

$$ \delta(u, v) = |u - v|^2 \cdot \alpha_u \cdot \alpha_v $$

If $\delta(A) < \delta(B)$, then $d(A) < d(B)$.

HyperspaceDB performs all internal graph traversals using only $\delta$ (SIMD-optimized), and applies the heavy arccosh only when required by final ranking/output.
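Both tricks can be checked numerically. This Python sketch (illustrative; the engine runs the equivalent in SIMD Rust) shows that ordering by $\delta$ agrees with ordering by the full distance:

```python
import math

def norm_sq(x):
    return sum(v * v for v in x)

def alpha(x):
    """Cached curvature factor alpha_x = 1 / (1 - ||x||^2)."""
    return 1.0 / (1.0 - norm_sq(x))

def delta(u, v, alpha_u, alpha_v):
    """Cheap monotone surrogate: ||u - v||^2 * alpha_u * alpha_v."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) * alpha_u * alpha_v

def poincare_dist(u, v):
    """Full Poincaré distance, applied only for final ranking."""
    d_sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * d_sq / ((1 - norm_sq(u)) * (1 - norm_sq(v))))
```

Since $d = \operatorname{acosh}(1 + 2\delta)$ and acosh is increasing, comparing $\delta$ values is equivalent to comparing distances.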

Lorentz Model (Hyperboloid)

For Lorentz vectors x = (t, x1, ..., xn) and y = (s, y1, ..., yn):

$$ \langle x, y \rangle_L = -ts + \sum_i x_i y_i $$

Distance:

$$ d(x, y) = \operatorname{arcosh}\left(-\langle x, y \rangle_L\right) $$

Validation constraints:

  • upper sheet: t > 0
  • unit hyperboloid: -t^2 + x_1^2 + ... + x_n^2 = -1

Optimization: SQ8 Quantization

For the Lorentz model, HyperspaceDB implements a specialized 8-bit scalar quantization (SQ8) with dynamic range scaling and GPU/SIMD acceleration. See Lorentz Quantization Details.

SDK Hyperbolic Utilities (v2.2.1)

To keep core DB focused and still support geometry-heavy clients, SDKs include helpers:

  • Python: hyperspace.mobius_add, hyperspace.exp_map, hyperspace.log_map
  • Rust: hyperspace_sdk::math::{mobius_add, exp_map, log_map, parallel_transport, riemannian_gradient, frechet_mean}
  • TypeScript: HyperbolicMath.mobiusAdd/expMap/logMap/parallelTransport/riemannianGradient/frechetMean

Fréchet mean support is useful for reconsolidation workflows where multiple nearby hyperbolic embeddings should be merged into one robust centroid.

These functions are useful for L-system growth, manifold transforms, and pre-insert vector shaping pipelines.

Geometric Search (Spatial Filters)

HyperspaceDB v3.0 introduces native geometric predicates. Unlike metadata filters, these are based on the vector's position in the embedding space.

1. The Ball Filter (Proximity)

Mathematical definition: $\{ v \in \mathbb{D}^n \mid d(c, v) \le r \}$. Used for finding all entities within a semantic radius of a concept center $c$.

2. The Box Filter (Constraints)

Mathematical definition: $\{ v \in \mathbb{R}^n \mid \forall i, \min_i \le v_i \le \max_i \}$. Used for bounding reasoning to a specific workspace (e.g., "only consider nodes in the 1st quadrant").

3. The Cone Filter (Angular Logic)

Mathematical definition (angular distance): $\{ v \in \mathbb{R}^n \mid \text{angle}(\text{axis}, v) \le \text{aperture} \}$. Inspired by ConE (Zhang & Wang, 2021), this filter allows for modeling logical entailment and hierarchy-aware FOV. In HyperspaceDB, this is implemented as an $O(N)$ dot-product check against the aperture threshold.
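The dot-product check behind the cone filter can be sketched as follows (an illustrative Python model; the engine evaluates this in Rust during candidate selection):

```python
import math

def in_cone(v, axis, aperture_rad):
    """Keep v iff angle(axis, v) <= aperture, via a single dot product.

    cos is decreasing on [0, pi], so angle <= aperture is equivalent to
    cos(angle) >= cos(aperture); no arccos call is needed per candidate.
    """
    dot = sum(a * b for a, b in zip(axis, v))
    norm_a = math.sqrt(sum(a * a for a in axis))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_a * norm_v) >= math.cos(aperture_rad)
```

Comparing cosines instead of angles mirrors the monotonicity trick used for Poincaré distances: the expensive inverse function is never evaluated in the hot path.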

Performance: Sequential Bitset Pruning

To ensure these filters don't slow down the engine, geometric intersection is performed efficiently during the candidate selection phase. We use a Bitset Pruning pattern:

  1. Generate a bitset of candidates satisfying the geometric query.
  2. Intersect it with HNSW candidates (bitwise AND) during the search phase.
  3. This allows $O(1)$ rejection of candidates outside the region of interest.
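The steps above can be sketched with plain integers as bitsets (illustrative Python; the engine uses packed word arrays):

```python
def geometric_bitset(vectors, predicate):
    """Step 1: one bit per vector, set iff it satisfies the geometric filter."""
    bits = 0
    for i, v in enumerate(vectors):
        if predicate(v):
            bits |= 1 << i
    return bits

def allowed(bits, candidate_id):
    """Steps 2-3: O(1) membership test during HNSW traversal."""
    return (bits >> candidate_id) & 1 == 1
```

Building the bitset is a single $O(N)$ pass; every rejection during graph traversal afterwards is a single shift-and-mask.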

Zero-Copy Hyperbolic HNSW

Our implementation of Hierarchical Navigable Small Worlds is unique in two ways:

  1. Metric: It natively speaks hyperbolic geometry.
  2. Concurrency: It uses fine-grained locking (parking_lot::RwLock) on every node.

Graph Structure

The graph consists of Layers (0..max).

  • Layer 0: Contains ALL vectors. This is the base ground truth.
  • Layer N: Contains a random subset of vectors from Layer N-1.

This creates a skip-list-like structure for navigation.

The "Select Neighbors" Heuristic

When connecting a new node $U$ to neighbors in HNSW, we use a heuristic to ensure diversity.

Standard Euclidean HNSW checks:

  • Add neighbor $V$ if $dist(U, V)$ is minimal.
  • Skip $V$ if it is closer to an already selected neighbor than to $U$.

Hyperbolic Adaptation: We use the Poincaré distance for this check. Because the space expands exponentially, "diversity" is easier to achieve, but "closeness" is tricky because points near the boundary (norm $\approx$ 1) have massive distances even if they look close in Euclidean space.

Our heuristic strictly respects the Poincaré metric, preventing "short-circuiting" through the center of the ball unless mathematically valid.

Locking Strategy

We do not use a global lock.

  • Reading: Search traverses nodes acquiring brief Read Locks.
  • Writing: Indexer acquires Write Locks only on the specific adjacency lists (layers) it is modifying.

This allows insert and search to run in parallel with high throughput.

Batch Search Acceleration

For high-throughput batch search operations, HNSW can offload Minkowski distance computations to the GPU using WGSL compute shaders. This is particularly effective when combined with Lorentz SQ8 Quantization.

GPU Acceleration Roadmap