API Reference
HyperspaceDB operates on a Dual-API architecture:
- gRPC (Data Plane): High-performance ingestion and search.
- HTTP (Control Plane): Management, monitoring, and dashboard integration.
📡 gRPC API (Data Plane)
Defined in hyperspace.proto. Used by SDKs (Python, Rust, Go).
Collection Management
CreateCollection
Creates a new independent vector index.
rpc CreateCollection (CreateCollectionRequest) returns (StatusResponse);
message CreateCollectionRequest {
string name = 1;
uint32 dimension = 2; // e.g. 1536, 1024, 64
string metric = 3; // "l2", "euclidean", "cosine", "poincare", "lorentz"
}
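A client may want to check the request fields before sending them. The following is a minimal sketch, not SDK code: `build_create_collection_request` is a hypothetical helper, the metric names are copied from the comment on `metric`, and the non-empty name and non-zero dimension checks are assumptions rather than documented server rules.

```python
# Metric names copied from the comment on CreateCollectionRequest.metric.
SUPPORTED_METRICS = {"l2", "euclidean", "cosine", "poincare", "lorentz"}
UINT32_MAX = 2**32 - 1  # dimension is declared as uint32


def build_create_collection_request(name: str, dimension: int, metric: str) -> dict:
    """Return a dict mirroring CreateCollectionRequest, or raise ValueError."""
    if not name:
        # Assumption: the server rejects empty names.
        raise ValueError("name must be non-empty")
    if not 1 <= dimension <= UINT32_MAX:
        # Assumption: a zero-dimensional index is not meaningful.
        raise ValueError("dimension must be a non-zero uint32")
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    return {"name": name, "dimension": dimension, "metric": metric}


request = build_create_collection_request("my_docs", 1536, "cosine")
```

The resulting dict has the same field names as the protobuf message, so it could be passed to whatever message constructor the generated client exposes.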
DeleteCollection
Drops a collection and all its data.
rpc DeleteCollection (DeleteCollectionRequest) returns (StatusResponse);
ListCollections
Retrieves all active collections for the current tenant, including their metadata.
rpc ListCollections (Empty) returns (ListCollectionsResponse);
message ListCollectionsResponse {
repeated CollectionSummary collections = 1;
}
message CollectionSummary {
string name = 1;
uint64 count = 2;
uint32 dimension = 3;
string metric = 4;
}
GetCollectionStats
Returns real-time statistics for a single collection.
rpc GetCollectionStats (CollectionStatsRequest) returns (CollectionStatsResponse);
message CollectionStatsResponse {
uint64 count = 1;
uint32 dimension = 2;
string metric = 3;
uint64 indexing_queue = 4;
}
Vector Operations
Insert
Ingests a vector into a specific collection.
rpc Insert (InsertRequest) returns (InsertResponse);
message InsertRequest {
string collection = 1; // Collection name
repeated double vector = 2; // Data point
uint32 id = 3; // External ID
map<string, string> metadata = 4; // Metadata tags
DurabilityLevel durability = 7; // Durability override
map<string, MetadataValue> typed_metadata = 8; // Typed metadata (int/float/bool/string)
}
enum DurabilityLevel {
DEFAULT_LEVEL = 0; // Use server config
ASYNC = 1; // Flush OS cache (Fastest)
BATCH = 2; // Background fsync (Balanced)
STRICT = 3; // Fsync every write (High Safety)
}
typed_metadata is the preferred metadata path for new clients. String metadata remains as a compatibility path.
Search
Finds nearest neighbors.
rpc Search (SearchRequest) returns (SearchResponse);
message SearchRequest {
string collection = 1;
repeated double vector = 2;
uint32 top_k = 3;
// Metadata string filter (e.g. "category:book")
map<string, string> filter = 4;
// Complex filter object
repeated Filter filters = 5;
// Hybrid search
optional string hybrid_query = 6;
optional float hybrid_alpha = 7;
// Wasserstein 1D CDF O(N) distance
optional bool use_wasserstein = 8;
}
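The comment on `use_wasserstein` refers to a 1D, CDF-based, O(N) distance. As an illustration only, the textbook form of that computation is sketched below: both vectors are treated as histograms on the same unit-spaced grid, normalized to total mass 1, and the distance is the sum of absolute differences between their cumulative sums. The function name and the normalization step are assumptions; the engine's actual implementation may differ.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    """W1 distance between two histograms on the same unit-spaced 1D grid."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Running sums of the normalized histograms are their CDFs.
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    # Single pass over N bins: O(N) time.
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


distance = wasserstein_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

In this example all mass moves two bins to the right, so the distance is 2.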
#### Geometric Filters (New in v3.0)
HyperspaceDB v3.0 introduces native spatial constraints. These run at the bitset level inside the engine and are significantly faster than application-level filtering.
```protobuf
message Filter {
oneof condition {
Match match = 1;
Range range = 2;
InCone in_cone = 3;
InBox in_box = 4;
InBall in_ball = 5;
}
}
// 1. Proximity Filter
message InBall {
repeated double center = 1;
double radius = 2;
}
// 2. N-Dimensional Bounding Box
message InBox {
repeated double min_bounds = 1;
repeated double max_bounds = 2;
}
// 3. Angular Cone (for ConE-style embeddings)
message InCone {
repeated double axes = 1; // Vector direction
repeated double apertures = 2; // Angular width (radians)
double cen = 3; // Centrality offset
}
```
`SearchResult` now includes both `metadata` and `typed_metadata`.
Range filters are evaluated with numeric semantics (`f64`) against typed metadata numeric values.
For gRPC clients, decimal thresholds are supported via `Range.gte_f64` / `Range.lte_f64`; the `int64` fields `gte` / `lte` remain as a compatibility path.
gRPC `Range` examples:
```protobuf
// Integer threshold (compatibility path)
Filter {
range: {
key: "depth",
gte: 2,
lte: 10
}
}
// Decimal threshold (recommended for typed numeric metadata)
Filter {
range: {
key: "energy",
gte_f64: 0.8,
lte_f64: 1.0
}
}
```
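To make the filter semantics concrete, here is a client-side sketch of what `InBall`, `InBox`, and `Range` plausibly test for a single record. It is not the engine's code: Euclidean distance for `InBall`, inclusive bounds, and "non-numeric values never match a range" are all assumptions, and the function names are hypothetical.

```python
import math


def in_ball(point, center, radius):
    """InBall: point lies within `radius` of `center` (Euclidean distance assumed)."""
    return math.dist(point, center) <= radius


def in_box(point, min_bounds, max_bounds):
    """InBox: point lies inside the box on every axis (inclusive bounds assumed)."""
    if not (len(point) == len(min_bounds) == len(max_bounds)):
        raise ValueError("point and bounds must have the same dimension")
    return all(lo <= x <= hi for x, lo, hi in zip(point, min_bounds, max_bounds))


def range_matches(value, gte=None, lte=None):
    """Range: compare a typed metadata value as a 64-bit float."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False  # assumption: strings and booleans never match
    x = float(value)
    if gte is not None and x < float(gte):
        return False
    if lte is not None and x > float(lte):
        return False
    return True


depth_ok = range_matches(5, gte=2, lte=10)           # integer thresholds
energy_ok = range_matches(0.75, gte=0.8, lte=1.0)    # decimal thresholds
```

Converting both sides to `float` mirrors the `f64` evaluation described above, so integer and decimal thresholds behave the same way.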
SearchBatch
Finds nearest neighbors for multiple queries in a single RPC call.
rpc SearchBatch (BatchSearchRequest) returns (BatchSearchResponse);
message BatchSearchRequest {
repeated SearchRequest searches = 1;
}
message BatchSearchResponse {
repeated SearchResponse responses = 1;
}
Recommended for high-concurrency clients and benchmarks to reduce per-request gRPC overhead.
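As a rough sketch of how a client might group queries before calling `SearchBatch`, the helper below splits a list of query vectors into fixed-size groups and shapes each group like a `BatchSearchRequest`. The helper names and the `batch_size` parameter are hypothetical; a suitable batch size depends on the deployment and is not specified here.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_requests(collection, vectors, top_k, batch_size):
    """Return dicts shaped like BatchSearchRequest, one per group of queries."""
    return [
        {
            "searches": [
                {"collection": collection, "vector": vector, "top_k": top_k}
                for vector in chunk
            ]
        }
        for chunk in chunked(vectors, batch_size)
    ]


queries = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
batches = build_batch_requests("my_docs", queries, top_k=5, batch_size=2)
```

Five queries with a batch size of 2 produce three RPC payloads instead of five individual `Search` calls.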
SubscribeToEvents
Streams change data capture (CDC) events for post-insert and post-delete hooks.
rpc SubscribeToEvents (EventSubscriptionRequest) returns (stream EventMessage);
enum EventType {
EVENT_UNKNOWN = 0;
VECTOR_INSERTED = 1;
VECTOR_DELETED = 2;
}
message EventSubscriptionRequest {
repeated EventType types = 1;
optional string collection = 2;
}
message EventMessage {
EventType type = 1;
oneof payload {
VectorInsertedEvent vector_inserted = 2;
VectorDeletedEvent vector_deleted = 3;
}
}
Use this stream to build external pipelines (audit, Elasticsearch sync, graph projections, Neo4j updaters). SDKs (Python/TypeScript/Rust) expose convenience subscription methods for this stream.
Reliability note:
- Stream consumers may lag under burst load; the server now handles lagged broadcast reads without dropping the whole stream task.
- Tune `HS_EVENT_STREAM_BUFFER` for higher event fan-out pressure.
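The lag behaviour in the reliability note can be pictured with a bounded broadcast buffer. The sketch below is a conceptual model, not the server's implementation: `BroadcastBuffer` is a hypothetical class, and `capacity` stands in for the role that `HS_EVENT_STREAM_BUFFER` presumably plays. A reader that falls behind skips to the oldest retained event and learns how many events it missed, rather than failing.

```python
from collections import deque


class BroadcastBuffer:
    """Bounded event log; slow readers skip ahead instead of failing."""

    def __init__(self, capacity):
        self.events = deque(maxlen=capacity)  # oldest events are evicted first
        self.next_seq = 0  # sequence number of the next event to be published

    def publish(self, event):
        self.events.append(event)
        self.next_seq += 1

    def read(self, cursor):
        """Return (events, new_cursor, lagged) for a reader positioned at `cursor`."""
        oldest = self.next_seq - len(self.events)  # sequence number of events[0]
        lagged = max(0, oldest - cursor)           # events evicted before being read
        start = max(cursor, oldest)
        batch = list(self.events)[start - oldest:]
        return batch, self.next_seq, lagged


buffer = BroadcastBuffer(capacity=3)
for i in range(5):
    buffer.publish(i)

batch, cursor, lagged = buffer.read(0)  # reader never consumed anything
buffer.publish(5)
next_batch, next_cursor, next_lagged = buffer.read(cursor)
```

A larger capacity lets slow consumers fall further behind before events are skipped, at the cost of more memory per stream.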
Delete
Removes a single vector from a collection by its external ID.
rpc Delete (DeleteRequest) returns (DeleteResponse);
message DeleteRequest {
string collection = 1;
uint32 id = 2;
}
message DeleteResponse {
bool success = 1;
}
🔁 Delta Sync Protocol
Advanced synchronization for consistency verification and recovery.
SyncHandshake
Computes the difference between client and server states using Merkle-like bucket hashes.
rpc SyncHandshake (SyncHandshakeRequest) returns (SyncHandshakeResponse);
SyncPull
Streams missing vectors from the server based on differing buckets.
rpc SyncPull (SyncPullRequest) returns (stream SyncVectorData);
SyncPush
Streams client-side unique vectors to the server to achieve global consistency.
rpc SyncPush (stream SyncVectorData) returns (SyncPushResponse);
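The bucket-hash comparison behind `SyncHandshake` can be illustrated as follows. This is a sketch under stated assumptions: assigning vectors to buckets by `id % num_buckets`, hashing with SHA-256, and the byte layout are all hypothetical choices, since the actual scheme is not described here.

```python
import hashlib
import struct


def bucket_hashes(vectors, num_buckets=16):
    """Hash each bucket's (id, vector) pairs; bucket = id % num_buckets (assumed)."""
    hashers = [hashlib.sha256() for _ in range(num_buckets)]
    for vec_id in sorted(vectors):  # sorted so both sides hash in the same order
        hasher = hashers[vec_id % num_buckets]
        hasher.update(struct.pack("<I", vec_id))  # uint32 id
        hasher.update(struct.pack(f"<{len(vectors[vec_id])}d", *vectors[vec_id]))
    return [hasher.hexdigest() for hasher in hashers]


def differing_buckets(client_hashes, server_hashes):
    """Return the indices of buckets whose hashes disagree."""
    return [
        index
        for index, (client, server) in enumerate(zip(client_hashes, server_hashes))
        if client != server
    ]


client_state = {1: [0.1, 0.2], 2: [0.3, 0.4]}
server_state = {1: [0.1, 0.2], 2: [0.3, 0.4], 18: [0.5, 0.6]}  # 18 % 16 == 2
to_pull = differing_buckets(bucket_hashes(client_state), bucket_hashes(server_state))
```

Only the buckets that differ need to be transferred, which is what keeps a pull or push proportional to the amount of drift rather than to the collection size.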
MetadataValue (Typed Metadata)
message MetadataValue {
oneof kind {
string string_value = 1;
int64 int_value = 2;
double double_value = 3;
bool bool_value = 4;
}
}
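A client needs to pick exactly one branch of the `kind` oneof for each value. The sketch below maps native Python values to dicts keyed by the field names above; `to_metadata_value` is a hypothetical helper, and a generated protobuf client would set the corresponding message field instead of building a dict.

```python
def to_metadata_value(value):
    """Map a Python value to the matching MetadataValue oneof field."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"bool_value": value}
    if isinstance(value, int):
        return {"int_value": value}
    if isinstance(value, float):
        return {"double_value": value}
    if isinstance(value, str):
        return {"string_value": value}
    raise TypeError(f"unsupported metadata type: {type(value).__name__}")


raw = {"depth": 3, "energy": 0.92, "published": True, "category": "book"}
typed_metadata = {key: to_metadata_value(value) for key, value in raw.items()}
```

Checking `bool` before `int` matters: otherwise `True` would be stored as the integer 1 and lose its boolean type.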
Graph Traversal API (v2.3)
rpc GetNode (GetNodeRequest) returns (GraphNode);
rpc GetNeighbors (GetNeighborsRequest) returns (GetNeighborsResponse);
rpc GetConceptParents (GetConceptParentsRequest) returns (GetConceptParentsResponse);
rpc Traverse (TraverseRequest) returns (TraverseResponse);
rpc FindSemanticClusters (FindSemanticClustersRequest) returns (FindSemanticClustersResponse);
Key safety guards:
- `GetNeighborsRequest.limit` and `offset` for bounded pagination.
- `TraverseRequest.max_depth` and `max_nodes` to prevent unbounded graph walks.
- `FindSemanticClustersRequest.max_clusters` and `max_nodes` for bounded connected-component scans.
TraverseRequest is filter-aware and supports both:
- `filter` (`map<string,string>`)
- `filters` (`Match` / `Range`)
GetNeighborsResponse now includes edge_weights, where edge_weights[i] is the distance from source node to neighbors[i].
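The effect of the `max_depth` and `max_nodes` guards can be sketched with a bounded breadth-first walk. This is an illustration only: breadth-first order, the adjacency-dict input, and counting the start node toward `max_nodes` are assumptions, not documented server behaviour.

```python
from collections import deque


def bounded_traverse(adjacency, start, max_depth, max_nodes):
    """Breadth-first walk that stops at max_depth hops or max_nodes visited nodes."""
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor in seen:
                continue
            if len(visited) >= max_nodes:
                return visited  # node budget exhausted
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append((neighbor, depth + 1))
    return visited


graph = {1: [2, 3], 2: [4], 3: [4], 4: [5]}
within_two_hops = bounded_traverse(graph, start=1, max_depth=2, max_nodes=10)
capped = bounded_traverse(graph, start=1, max_depth=2, max_nodes=3)
```

Both limits bound the work per request: `max_depth` caps how far the walk spreads, and `max_nodes` caps the response size even on densely connected graphs.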
RebuildIndex with pruning filter (v2.2.1)
message RebuildIndexRequest {
string name = 1;
optional VacuumFilterQuery filter_query = 2;
}
message VacuumFilterQuery {
string key = 1;
string op = 2; // "lt" | "lte" | "gt" | "gte" | "eq" | "ne"
double value = 3;
}
Use this API for pruning cycles when you need to rebuild an index and drop low-value vectors in one server-side operation.
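The six `op` values map directly onto standard comparisons. The sketch below evaluates a `VacuumFilterQuery`-shaped predicate against one record's metadata; `vacuum_matches` is a hypothetical helper, and treating missing or non-numeric fields as non-matching is an assumption. Whether matching vectors are the ones kept or the ones dropped during the rebuild is not stated here, so the sketch only computes the match.

```python
import operator

# Operator names copied from the comment on VacuumFilterQuery.op.
OPS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
    "ne": operator.ne,
}


def vacuum_matches(metadata, key, op, value):
    """Return True if metadata[key] satisfies `op value` under float comparison."""
    if op not in OPS:
        raise ValueError(f"unknown op: {op!r}")
    field = metadata.get(key)
    # Assumption: missing, boolean, and non-numeric fields never match.
    if isinstance(field, bool) or not isinstance(field, (int, float)):
        return False
    return OPS[op](float(field), float(value))


records = {1: {"energy": 0.2}, 2: {"energy": 0.9}, 3: {"category": "book"}}
low_energy_ids = [
    record_id
    for record_id, metadata in records.items()
    if vacuum_matches(metadata, "energy", "lt", 0.5)
]
```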
TriggerReconsolidation (v3.0.1)
Triggers AI Sleep Mode (Riemannian SGD / Flow Matching) directly on the engine to algorithmically shift vectors.
rpc TriggerReconsolidation (ReconsolidationRequest) returns (StatusResponse);
message ReconsolidationRequest {
string collection = 1;
repeated double target_vector = 2;
double learning_rate = 3;
}
InsertText (v3.0.1)
Inserts raw text to be embedded and stored on the server.
rpc InsertText (InsertTextRequest) returns (InsertResponse);
message InsertTextRequest {
string collection = 1;
string text = 2;
uint32 id = 3;
map<string, MetadataValue> typed_metadata = 4;
}
Vectorize (v3.0.1)
Converts text to a vector using the server's embedding engine.
rpc Vectorize (VectorizeRequest) returns (VectorizeResponse);
message VectorizeRequest {
string text = 1;
string metric = 2; // "l2", "cosine", "poincare", "lorentz"
}
message VectorizeResponse {
repeated double vector = 1;
}
SearchText (v3.0.1)
Searches the collection using a text query.
rpc SearchText (SearchTextRequest) returns (SearchResponse);
message SearchTextRequest {
string collection = 1;
string text = 2;
uint32 top_k = 3;
repeated Filter filters = 4;
}
🌐 HTTP API (Control Plane)
Served on port 50050 by default. All endpoints are under /api.
Authentication & Multi-Tenancy
Every request should include:
- `x-api-key`: API Key (optional if disabled, but recommended)
- `x-hyperspace-user-id`: Tenant Identifier (e.g. `client_123`). If omitted, defaults to `default_admin`.
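A minimal sketch of attaching these headers with Python's standard library is shown below. The `http://localhost` host and scheme are assumptions (only the default port 50050 and the `/api` prefix come from this page), and `build_request` is a hypothetical helper. The request is built but not sent, so the snippet runs without a server.

```python
import urllib.request

# Assumption: a local node over plain HTTP; 50050 is the documented default port.
BASE_URL = "http://localhost:50050/api"


def build_request(path, api_key=None, user_id=None):
    """Build a GET request carrying the authentication and tenant headers."""
    headers = {}
    if api_key is not None:
        headers["x-api-key"] = api_key
    if user_id is not None:
        # When this header is omitted, the server falls back to default_admin.
        headers["x-hyperspace-user-id"] = user_id
    return urllib.request.Request(BASE_URL + path, headers=headers, method="GET")


request = build_request("/collections", api_key="secret", user_id="client_123")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Note that `urllib` normalizes header capitalization internally (`X-api-key`); HTTP header names are case-insensitive, so this does not change what the server sees.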
Cluster Status
GET /api/cluster/status
Returns the node's identity and topology role.
{
"node_id": "uuid...",
"role": "Leader", // or "Follower"
"upstream_peer": null,
"downstream_peers": []
}
Swarm Peers (Gossip Protocol)
GET /api/swarm/peers
Returns active peers discovered via UDP multicast (Edge-to-Edge Sync).
{
"gossip_enabled": true,
"peer_count": 2,
"peers": [...]
}
Node Status (Compatibility)
GET /api/status
Returns runtime status and node configuration. The dashboard queries this endpoint first and falls back to /api/cluster/status.
System Metrics
GET /api/metrics
Real-time system resource usage.
{
"cpu_usage_percent": 12,
"ram_usage_mb": 512,
"disk_usage_mb": 1024,
"total_collections": 5,
"total_vectors": 1000000
}
Admin / Billing (Since v2.0)
Requires user_id: admin
GET /api/admin/usage
Returns a JSON map of user_id -> usage_stats:
{
"tenant_A": {
"collection_count": 2,
"vector_count": 1500,
"disk_usage_bytes": 1048576
}
}
List Collections
GET /api/collections
Returns a summary of all active collections.
[
{
"name": "my_docs",
"count": 1500,
"dimension": 1536,
"metric": "l2"
}
]
Collection Search (HTTP Playground)
POST /api/collections/{name}/search
Convenience endpoint for dashboard/manual testing.
{
"vector": [0.1, 0.2, 0.3],
"top_k": 5
}
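The request body above could be posted with the standard library roughly as follows. This is a sketch: the base URL and the `Content-Type: application/json` header are assumptions, `build_search_request` is a hypothetical helper, and the request is only constructed, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url, collection, vector, top_k):
    """Build a POST request for /api/collections/{name}/search."""
    name = urllib.parse.quote(collection, safe="")  # escape the path segment
    body = json.dumps({"vector": vector, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/collections/{name}/search",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


search_request = build_search_request(
    "http://localhost:50050", "my_docs", [0.1, 0.2, 0.3], top_k=5
)
```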
Graph HTTP Endpoints (Dashboard / tooling)
- `GET /api/collections/{name}/graph/node?id={id}&layer={layer}`
- `GET /api/collections/{name}/graph/neighbors?id={id}&layer={layer}&limit={limit}&offset={offset}`
- `GET /api/collections/{name}/graph/parents?id={id}&layer={layer}&limit={limit}`
- `POST /api/collections/{name}/graph/traverse`
- `POST /api/collections/{name}/graph/clusters`
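For tooling that calls the graph endpoints, the query string is best built with a URL encoder rather than by string concatenation. The sketch below assembles the neighbors URL; the base URL and the helper name are assumptions, while the path and parameter names come from the endpoint list.

```python
from urllib.parse import quote, urlencode


def graph_neighbors_url(base_url, collection, node_id, layer, limit, offset):
    """Build the URL for GET /api/collections/{name}/graph/neighbors."""
    query = urlencode({"id": node_id, "layer": layer, "limit": limit, "offset": offset})
    name = quote(collection, safe="")  # escape the collection name for use in a path
    return f"{base_url}/api/collections/{name}/graph/neighbors?{query}"


url = graph_neighbors_url(
    "http://localhost:50050", "my_docs", node_id=42, layer=0, limit=10, offset=0
)
```

Paging through a large neighborhood then amounts to repeating the call with `offset` increased by `limit` each time.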