API Reference

HyperspaceDB operates on a Dual-API architecture:

  1. gRPC (Data Plane): High-performance ingestion and search.
  2. HTTP (Control Plane): Management, monitoring, and dashboard integration.

📡 gRPC API (Data Plane)

Defined in hyperspace.proto. Used by SDKs (Python, Rust, Go).

Collection Management

CreateCollection

Creates a new independent vector index.

rpc CreateCollection (CreateCollectionRequest) returns (StatusResponse);

message CreateCollectionRequest {
  string name = 1;
  uint32 dimension = 2; // e.g. 1536, 1024, 64
  string metric = 3;    // "l2", "euclidean", "cosine", "poincare", "lorentz"
}

DeleteCollection

Drops a collection and all its data.

rpc DeleteCollection (DeleteCollectionRequest) returns (StatusResponse);

ListCollections

Retrieves all active collections for the current tenant, including their metadata.

rpc ListCollections (Empty) returns (ListCollectionsResponse);

message ListCollectionsResponse {
  repeated CollectionSummary collections = 1;
}

message CollectionSummary {
  string name = 1;
  uint64 count = 2;
  uint32 dimension = 3;
  string metric = 4;
}

GetCollectionStats

Returns real-time statistics for a single collection.

rpc GetCollectionStats (CollectionStatsRequest) returns (CollectionStatsResponse);

message CollectionStatsResponse {
  uint64 count = 1;
  uint32 dimension = 2;
  string metric = 3;
  uint64 indexing_queue = 4;
}

Vector Operations

Insert

Ingests a vector into a specific collection.

rpc Insert (InsertRequest) returns (InsertResponse);

message InsertRequest {
  string collection = 1;      // Collection name
  repeated double vector = 2; // Data point
  uint32 id = 3;              // External ID
  map<string, string> metadata = 4; // Metadata tags
  DurabilityLevel durability = 7; // Durability override
  map<string, MetadataValue> typed_metadata = 8; // Typed metadata (int/float/bool/string)
}

enum DurabilityLevel {
  DEFAULT_LEVEL = 0; // Use server config
  ASYNC = 1;         // Flush to OS cache only (Fastest)
  BATCH = 2;         // Background fsync (Balanced)
  STRICT = 3;        // Fsync every write (High Safety)
}

typed_metadata is the preferred metadata path for new clients. The string-only metadata map remains as a compatibility path.

Search

Finds the nearest neighbors of a query vector in a collection.

rpc Search (SearchRequest) returns (SearchResponse);

message SearchRequest {
  string collection = 1;
  repeated double vector = 2;
  uint32 top_k = 3;
  // Metadata string filter (e.g. "category:book")
  map<string, string> filter = 4;
  // Complex filter object
  repeated Filter filters = 5;
  // Hybrid search
  optional string hybrid_query = 6;
  optional float hybrid_alpha = 7;
  // Wasserstein 1D CDF O(N) distance
  optional bool use_wasserstein = 8;
}

Geometric Filters (New in v3.0)

HyperspaceDB v3.0 introduces native spatial constraints. These run at the bitset level inside the engine and are significantly faster than application-level filtering.

```protobuf
message Filter {
  oneof condition {
    Match match = 1;
    Range range = 2;
    InCone in_cone = 3;
    InBox in_box = 4;
    InBall in_ball = 5;
  }
}

// 1. Proximity Filter
message InBall {
  repeated double center = 1;
  double radius = 2;
}

// 2. N-Dimensional Bounding Box
message InBox {
  repeated double min_bounds = 1;
  repeated double max_bounds = 2;
}

// 3. Angular Cone (for ConE-style embeddings)
message InCone {
  repeated double axes = 1;      // Vector direction
  repeated double apertures = 2; // Angular width (radians)
  double cen = 3;                // Centrality offset
}
```
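To make the semantics of InBall and InBox concrete, here is a minimal client-side sketch in Python. It assumes Euclidean distance for the ball and inclusive bounds for the box; the engine may instead follow the collection metric. The function names `in_ball` and `in_box` are hypothetical and not part of any SDK.

```python
import math
from typing import Sequence


def in_ball(point: Sequence[float], center: Sequence[float], radius: float) -> bool:
    """Return True if `point` lies within `radius` of `center`.

    Assumption: plain Euclidean distance, boundary included.
    """
    if len(point) != len(center):
        raise ValueError("point and center must have the same dimension")
    return math.dist(point, center) <= radius


def in_box(
    point: Sequence[float],
    min_bounds: Sequence[float],
    max_bounds: Sequence[float],
) -> bool:
    """Return True if every coordinate lies inside [min_bounds[i], max_bounds[i]]."""
    if not (len(point) == len(min_bounds) == len(max_bounds)):
        raise ValueError("point and bounds must have the same dimension")
    return all(lo <= x <= hi for x, lo, hi in zip(point, min_bounds, max_bounds))
```

A check like this is only useful for reasoning about expected results or for testing; in production the filter should be sent in the request so the engine can apply it.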

`SearchResult` now includes both `metadata` and `typed_metadata`.
Range filters are evaluated with numeric semantics (`f64`) against numeric values in typed metadata.
For gRPC clients, decimal thresholds are supported via `Range.gte_f64` / `Range.lte_f64` (the `int64` fields `gte` / `lte` remain as a compatibility path).

gRPC `Range` examples:

```protobuf
// Integer threshold (compatibility path)
Filter {
  range: {
    key: "depth",
    gte: 2,
    lte: 10
  }
}

// Decimal threshold (recommended for typed numeric metadata)
Filter {
  range: {
    key: "energy",
    gte_f64: 0.8,
    lte_f64: 1.0
  }
}
```
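The numeric semantics of a Range filter can be sketched in Python as follows. This is an illustration under stated assumptions, not the engine's code: bounds are treated as inclusive, a missing key or a non-numeric value fails the filter, and `range_matches` is a hypothetical helper name.

```python
from typing import Mapping, Optional, Union

Scalar = Union[str, int, float, bool]


def range_matches(
    typed_metadata: Mapping[str, Scalar],
    key: str,
    gte: Optional[float] = None,
    lte: Optional[float] = None,
) -> bool:
    """Compare a typed metadata value to optional bounds as 64-bit floats."""
    value = typed_metadata.get(key)
    # bool is a subclass of int in Python; treat it as non-numeric here.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    v = float(value)
    if gte is not None and v < float(gte):
        return False
    if lte is not None and v > float(lte):
        return False
    return True
```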

SearchBatch

Finds nearest neighbors for multiple queries in a single RPC call.

rpc SearchBatch (BatchSearchRequest) returns (BatchSearchResponse);

message BatchSearchRequest {
  repeated SearchRequest searches = 1;
}

message BatchSearchResponse {
  repeated SearchResponse responses = 1;
}

Recommended for high-concurrency clients and benchmarks to reduce per-request gRPC overhead.
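As a rough sketch of how a client might group queries before calling SearchBatch, the Python helper below splits a list of query vectors into batch-sized payloads. Plain dictionaries stand in for the generated protobuf classes, and both `chunk_searches` and `batch_size` are hypothetical; the document does not state a maximum batch size.

```python
from typing import Dict, Iterator, List, Sequence


def chunk_searches(
    collection: str,
    vectors: Sequence[Sequence[float]],
    top_k: int,
    batch_size: int,
) -> Iterator[Dict[str, List[dict]]]:
    """Yield dicts shaped like BatchSearchRequest, `batch_size` queries each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(vectors), batch_size):
        yield {
            "searches": [
                {"collection": collection, "vector": list(v), "top_k": top_k}
                for v in vectors[start:start + batch_size]
            ]
        }
```

Each yielded payload maps to one RPC, so 1,000 queries with a batch size of 100 become 10 calls instead of 1,000.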

SubscribeToEvents

Streams CDC events for post-insert/delete hooks.

rpc SubscribeToEvents (EventSubscriptionRequest) returns (stream EventMessage);

enum EventType {
  EVENT_UNKNOWN = 0;
  VECTOR_INSERTED = 1;
  VECTOR_DELETED = 2;
}

message EventSubscriptionRequest {
  repeated EventType types = 1;
  optional string collection = 2;
}

message EventMessage {
  EventType type = 1;
  oneof payload {
    VectorInsertedEvent vector_inserted = 2;
    VectorDeletedEvent vector_deleted = 3;
  }
}

Use this stream to build external pipelines (audit, Elasticsearch sync, graph projections, Neo4j updaters). SDKs (Python/TypeScript/Rust) expose convenience subscription methods for this stream.

Reliability note:

  • Stream consumers may lag under burst load. The server handles lagged broadcast reads without dropping the whole stream task.
  • Tune HS_EVENT_STREAM_BUFFER when event fan-out pressure is high.

Delete

Removes a single vector from a collection by its external ID.

rpc Delete (DeleteRequest) returns (DeleteResponse);

message DeleteRequest {
  string collection = 1;
  uint32 id = 2;
}

message DeleteResponse {
  bool success = 1;
}

🔁 Delta Sync Protocol

Advanced synchronization for consistency verification and recovery.

SyncHandshake

Computes the difference between client and server states using Merkle-like bucket hashes.

rpc SyncHandshake (SyncHandshakeRequest) returns (SyncHandshakeResponse);
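The idea of comparing bucket hashes can be illustrated with a small Python sketch. Everything specific here is an assumption made for illustration: the bucket count, assigning a vector to a bucket by `id % num_buckets`, and hashing with SHA-256. The actual bucketing and hash scheme used by HyperspaceDB is not described in this reference.

```python
import hashlib
import struct
from typing import Dict, List, Sequence

NUM_BUCKETS = 8  # assumed value; the real bucket count is not documented here


def bucket_hashes(
    vectors: Dict[int, Sequence[float]], num_buckets: int = NUM_BUCKETS
) -> List[str]:
    """Hash each bucket's (id, vector) pairs in ascending id order."""
    hashers = [hashlib.sha256() for _ in range(num_buckets)]
    for vec_id in sorted(vectors):
        h = hashers[vec_id % num_buckets]  # assumed bucket assignment
        h.update(struct.pack("<I", vec_id))  # uint32 id, little-endian
        h.update(struct.pack(f"<{len(vectors[vec_id])}d", *vectors[vec_id]))
    return [h.hexdigest() for h in hashers]


def differing_buckets(client: List[str], server: List[str]) -> List[int]:
    """Return the indices of buckets whose hashes disagree."""
    return [i for i, (c, s) in enumerate(zip(client, server)) if c != s]
```

Only the buckets returned by `differing_buckets` would need to be transferred, which is what keeps a sync cheap when the two sides are mostly identical.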

SyncPull

Streams missing vectors from the server based on differing buckets.

rpc SyncPull (SyncPullRequest) returns (stream SyncVectorData);

SyncPush

Streams client-side unique vectors to the server to achieve global consistency.

rpc SyncPush (stream SyncVectorData) returns (SyncPushResponse);

MetadataValue (Typed Metadata)

message MetadataValue {
  oneof kind {
    string string_value = 1;
    int64 int_value = 2;
    double double_value = 3;
    bool bool_value = 4;
  }
}
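When building typed metadata from native values, the client has to pick the right `oneof` field. The Python sketch below shows one way to do that mapping; `to_metadata_value` is a hypothetical helper that returns the field name and value rather than a generated protobuf object.

```python
from typing import Tuple, Union

Scalar = Union[str, int, float, bool]

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def to_metadata_value(value: Scalar) -> Tuple[str, Scalar]:
    """Map a Python scalar to the matching MetadataValue field name."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return ("bool_value", value)
    if isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            raise OverflowError("integer does not fit in int64")
        return ("int_value", value)
    if isinstance(value, float):
        return ("double_value", value)
    if isinstance(value, str):
        return ("string_value", value)
    raise TypeError(f"unsupported metadata type: {type(value).__name__}")
```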

Graph Traversal API (v2.3)

rpc GetNode (GetNodeRequest) returns (GraphNode);
rpc GetNeighbors (GetNeighborsRequest) returns (GetNeighborsResponse);
rpc GetConceptParents (GetConceptParentsRequest) returns (GetConceptParentsResponse);
rpc Traverse (TraverseRequest) returns (TraverseResponse);
rpc FindSemanticClusters (FindSemanticClustersRequest) returns (FindSemanticClustersResponse);

Key safety guards:

  • GetNeighborsRequest.limit and offset for bounded pagination.
  • TraverseRequest.max_depth and max_nodes to prevent unbounded graph walks.
  • FindSemanticClustersRequest.max_clusters and max_nodes for bounded connected-component scans.
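The effect of the max_depth and max_nodes guards can be sketched with a bounded breadth-first walk in Python. This is an illustrative model over an in-memory adjacency map, not the server's traversal code; `bounded_traverse` and the graph layout are hypothetical.

```python
from collections import deque
from typing import Dict, List


def bounded_traverse(
    graph: Dict[int, List[int]], start: int, max_depth: int, max_nodes: int
) -> List[int]:
    """Breadth-first walk that stops at `max_depth` hops or `max_nodes` nodes."""
    if max_nodes <= 0:
        return []
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand nodes at the depth limit
        for neighbor in graph.get(node, []):
            if neighbor in seen:
                continue
            if len(visited) >= max_nodes:
                return visited  # node budget exhausted
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append((neighbor, depth + 1))
    return visited
```

Both limits bound the work independently: max_depth caps how far the walk reaches, while max_nodes caps the response size even on densely connected graphs.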

TraverseRequest is filter-aware and supports both:

  • filter (map<string,string>)
  • filters (Match / Range)

GetNeighborsResponse now includes edge_weights, where edge_weights[i] is the distance from source node to neighbors[i].
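Because edge_weights is index-aligned with neighbors, a client can pair the two lists directly. The Python sketch below does this and sorts by distance; plain integers stand in for the neighbor entries, and `pair_neighbors` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def pair_neighbors(
    neighbors: Sequence[int], edge_weights: Sequence[float]
) -> List[Tuple[int, float]]:
    """Pair neighbors[i] with edge_weights[i], closest first."""
    if len(neighbors) != len(edge_weights):
        raise ValueError("neighbors and edge_weights must be the same length")
    return sorted(zip(neighbors, edge_weights), key=lambda pair: pair[1])
```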

RebuildIndex with pruning filter (v2.2.1)

message RebuildIndexRequest {
  string name = 1;
  optional VacuumFilterQuery filter_query = 2;
}

message VacuumFilterQuery {
  string key = 1;
  string op = 2; // "lt" | "lte" | "gt" | "gte" | "eq" | "ne"
  double value = 3;
}

Use this API for pruning cycles when you need to rebuild an index and drop low-value vectors in one server-side operation.
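The six `op` values map naturally onto standard comparison operators. The Python sketch below evaluates a VacuumFilterQuery-style condition against a single numeric metadata value. It is a model for reasoning about which vectors a filter selects; whether a match means the vector is dropped or kept is not specified in this reference, and `matches_filter` is a hypothetical name.

```python
import operator
from typing import Callable, Dict

_OPS: Dict[str, Callable[[float, float], bool]] = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
    "ne": operator.ne,
}


def matches_filter(metadata_value: float, op: str, value: float) -> bool:
    """Return True if `metadata_value <op> value` holds."""
    try:
        compare = _OPS[op]
    except KeyError:
        raise ValueError(f"unsupported op: {op!r}") from None
    return compare(float(metadata_value), float(value))
```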

TriggerReconsolidation (v3.0.1)

Triggers AI Sleep Mode (Riemannian SGD / Flow Matching) directly on the engine to algorithmically shift vectors.

rpc TriggerReconsolidation (ReconsolidationRequest) returns (StatusResponse);

message ReconsolidationRequest {
  string collection = 1;
  repeated double target_vector = 2;
  double learning_rate = 3;
}

InsertText (v3.0.1)

Inserts raw text to be embedded and stored on the server.

rpc InsertText (InsertTextRequest) returns (InsertResponse);

message InsertTextRequest {
  string collection = 1;
  string text = 2;
  uint32 id = 3;
  map<string, MetadataValue> typed_metadata = 4;
}

Vectorize (v3.0.1)

Converts text to a vector using the server's embedding engine.

rpc Vectorize (VectorizeRequest) returns (VectorizeResponse);

message VectorizeRequest {
  string text = 1;
  string metric = 2; // "l2", "cosine", "poincare", "lorentz"
}

message VectorizeResponse {
  repeated double vector = 1;
}

SearchText (v3.0.1)

Searches the collection using a text query.

rpc SearchText (SearchTextRequest) returns (SearchResponse);

message SearchTextRequest {
  string collection = 1;
  string text = 2;
  uint32 top_k = 3;
  repeated Filter filters = 4;
}

🌐 HTTP API (Control Plane)

Served on port 50050 (default). All endpoints under /api.

Authentication & Multi-Tenancy

Every request should include:

  • x-api-key: API Key (optional if disabled, but recommended)
  • x-hyperspace-user-id: Tenant Identifier (e.g. client_123). If omitted, defaults to default_admin.
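A minimal Python sketch of attaching these headers with the standard library follows. The host `localhost` and the helper names are assumptions; the port and the /api prefix come from the text above. The request is only constructed here, not sent.

```python
import urllib.request
from typing import Dict, Optional

BASE_URL = "http://localhost:50050/api"  # host is assumed; port and prefix as documented


def build_headers(
    user_id: Optional[str] = None, api_key: Optional[str] = None
) -> Dict[str, str]:
    """Return only the headers that were actually provided."""
    headers: Dict[str, str] = {}
    if api_key:
        headers["x-api-key"] = api_key
    if user_id:
        headers["x-hyperspace-user-id"] = user_id
    return headers


def build_get(
    path: str, user_id: Optional[str] = None, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a GET request for a control-plane endpoint."""
    return urllib.request.Request(
        BASE_URL + path, headers=build_headers(user_id, api_key), method="GET"
    )
```

Passing the result to `urllib.request.urlopen` would send it. Omitting `user_id` leaves the tenant header out, in which case the server applies its documented default.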

Cluster Status

GET /api/cluster/status

Returns the node's identity and topology role.

{
  "node_id": "uuid...",
  "role": "Leader", // or "Follower"
  "upstream_peer": null,
  "downstream_peers": []
}

Swarm Peers (Gossip Protocol)

GET /api/swarm/peers

Returns active peers discovered via UDP multicast (Edge-to-Edge Sync).

{
  "gossip_enabled": true,
  "peer_count": 2,
  "peers": [...]
}

Node Status (Compatibility)

GET /api/status

Returns runtime status and node configuration. The dashboard queries this endpoint first and falls back to /api/cluster/status.

System Metrics

GET /api/metrics

Real-time system resource usage.

{
    "cpu_usage_percent": 12,
    "ram_usage_mb": 512,
    "disk_usage_mb": 1024,
    "total_collections": 5,
    "total_vectors": 1000000
}

Admin / Billing (Since v2.0)

Requires user_id: admin

GET /api/admin/usage

Returns a JSON map of user_id -> usage_stats:

{
  "tenant_A": {
    "collection_count": 2,
    "vector_count": 1500,
    "disk_usage_bytes": 1048576
  }
}

List Collections

GET /api/collections

Returns a summary of all active collections.

[
  {
    "name": "my_docs",
    "count": 1500,
    "dimension": 1536,
    "metric": "l2"
  }
]

Collection Search (HTTP Playground)

POST /api/collections/{name}/search

Convenience endpoint for dashboard/manual testing.

{
  "vector": [0.1, 0.2, 0.3],
  "top_k": 5
}
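The request above could be assembled in Python roughly as follows. The base URL, the `Content-Type` header, and the helper name are assumptions; the path and body fields are taken from the example. The request is built but not sent.

```python
import json
import urllib.request
from typing import Sequence
from urllib.parse import quote


def build_search_request(
    base_url: str, collection: str, vector: Sequence[float], top_k: int
) -> urllib.request.Request:
    """Build a POST request for /api/collections/{name}/search."""
    body = json.dumps({"vector": list(vector), "top_k": top_k}).encode("utf-8")
    # Percent-encode the collection name so it is safe as a path segment.
    url = f"{base_url}/api/collections/{quote(collection, safe='')}/search"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed to be expected
        method="POST",
    )
```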

Graph HTTP Endpoints (Dashboard / tooling)

  • GET /api/collections/{name}/graph/node?id={id}&layer={layer}
  • GET /api/collections/{name}/graph/neighbors?id={id}&layer={layer}&limit={limit}&offset={offset}
  • GET /api/collections/{name}/graph/parents?id={id}&layer={layer}&limit={limit}
  • POST /api/collections/{name}/graph/traverse
  • POST /api/collections/{name}/graph/clusters
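For the GET endpoints, the query string can be assembled with the standard library. The Python sketch below builds a neighbors URL; the base URL and the helper name `neighbors_url` are assumptions, while the path and parameter names follow the list above.

```python
from urllib.parse import quote, urlencode


def neighbors_url(
    base_url: str, collection: str, node_id: int, layer: int, limit: int, offset: int
) -> str:
    """Build the URL for GET /api/collections/{name}/graph/neighbors."""
    query = urlencode(
        {"id": node_id, "layer": layer, "limit": limit, "offset": offset}
    )
    return (
        f"{base_url}/api/collections/{quote(collection, safe='')}"
        f"/graph/neighbors?{query}"
    )
```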