
Search reference

API reference for GenSX Cloud search components. Search is powered by turbopuffer; its documentation for query and write operations is a useful companion to this reference.

Installation

npm install @gensx/storage

useSearch

Hook that provides access to vector search for a specific namespace.

Import

import { useSearch } from "@gensx/storage";

Signature

function useSearch(
  name: string,
  options?: SearchStorageOptions,
): Promise<Namespace>;

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | Required | The namespace name to access |
| options | SearchStorageOptions | {} | Optional configuration properties |

Returns

Returns a namespace object with methods to interact with vector search.

Example

// Simple usage
const namespace = await useSearch("documents");
const results = await namespace.query({
  rankBy: ["vector", "ANN", queryEmbedding],
  includeAttributes: true,
});

// With explicit configuration
const configuredNamespace = await useSearch("documents", {
  project: "my-project",
  environment: "production",
});

Namespace methods

The namespace object returned by useSearch provides these methods:

write

Inserts, updates, or deletes vectors in the namespace.

async write(options: WriteParams): Promise<{ message: string; rowsAffected: number }>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| upsertColumns | UpsertColumns | none | Column-based format for upserting documents |
| upsertRows | UpsertRows | none | Row-based format for upserting documents |
| patchColumns | PatchColumns | none | Column-based format for patching documents |
| patchRows | PatchRows | none | Row-based format for patching documents |
| deletes | ID[] | none | Array of document IDs to delete |
| deleteByFilter | Filter | none | Filter to match documents for deletion |
| distanceMetric | DistanceMetric | none | Distance metric for similarity calculations |
| schema | Schema | none | Optional schema definition for attributes |

Example

// Upsert documents in column-based format
const result = await namespace.write({
  upsertColumns: {
    id: ["doc-1", "doc-2"],
    vector: [
      [0.1, 0.2, 0.3],
      [0.4, 0.5, 0.6],
    ],
    text: ["Document 1", "Document 2"],
    category: ["article", "blog"],
  },
  distanceMetric: "cosine_distance",
  schema: {
    text: { type: "string" },
    category: { type: "string" },
  },
});
console.log(result); // { message: "Successfully wrote 2 rows", rowsAffected: 2 }

// Upsert documents in row-based format
await namespace.write({
  upsertRows: [
    {
      id: "doc-1",
      vector: [0.1, 0.2, 0.3],
      text: "Document 1",
      category: "article",
    },
    {
      id: "doc-2",
      vector: [0.4, 0.5, 0.6],
      text: "Document 2",
      category: "blog",
    },
  ],
  distanceMetric: "cosine_distance",
});

// Delete documents by ID
await namespace.write({
  deletes: ["doc-1", "doc-2"],
});

// Delete documents by filter
await namespace.write({
  deleteByFilter: [
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Lt", "2023-01-01"],
    ],
  ],
});

// Patch documents (update specific fields)
await namespace.write({
  patchRows: [
    {
      id: "doc-1",
      category: "updated-category",
    },
  ],
});

Return value

Returns an object with a success message and the number of rows affected by the operation:

{ message: "Successfully wrote 2 rows", rowsAffected: 2 }
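The row-based and column-based upsert formats carry the same data in different layouts. The sketch below shows how one maps to the other. `rowsToColumns` is a hypothetical helper, not part of @gensx/storage, and filling missing attributes with null is an assumption rather than documented behavior.

```typescript
// Hypothetical helper (not part of @gensx/storage): pivots row-shaped
// documents into the column-shaped layout shown for upsertColumns.
type DocRow = Record<string, unknown>;
type DocColumns = Record<string, unknown[]>;

function rowsToColumns(rows: DocRow[]): DocColumns {
  // Collect every attribute name that appears in any row, in first-seen order.
  const keys: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (keys.indexOf(key) === -1) keys.push(key);
    }
  }

  // Build one array per attribute, aligned by row index.
  // Assumption: attributes missing from a row are filled with null.
  const columns: DocColumns = {};
  for (const key of keys) {
    columns[key] = rows.map((row) => (key in row ? row[key] : null));
  }
  return columns;
}

const docColumns = rowsToColumns([
  { id: "doc-1", vector: [0.1, 0.2, 0.3], text: "Document 1", category: "article" },
  { id: "doc-2", vector: [0.4, 0.5, 0.6], text: "Document 2", category: "blog" },
]);
console.log(JSON.stringify(docColumns.id));
```

The resulting object has the same shape as the upsertColumns value in the example above, so it could be passed to write in place of upsertRows.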

query

Searches for similar vectors based on a query vector or other ranking criteria.

async query(options: QueryOptions): Promise<QueryResults>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| rankBy | RankBy | none | Vector, text, or attribute-based ranking |
| topK | number | none | Number of results to return |
| includeAttributes | boolean \| string[] | ['id'] | Include all attributes (true) or only the listed ones |
| filters | Filter | none | Metadata filters |
| aggregateBy | AggregateBy | none | Aggregate results by specified fields |
| consistency | Consistency | none | Consistency level for reads |

Example

const results = await namespace.query({
  topK: 10, // Number of results to return
  includeAttributes: true, // Include all attributes or specific ones
  filters: [
    // Optional metadata filters
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Gte", "2023-01-01"],
    ],
  ],
  // Provide exactly one rankBy; the alternatives are commented out
  rankBy: ["vector", "ANN", [0.1, 0.2, 0.3, ...]], // Vector similarity search
  // rankBy: ["text", "BM25", "search query"], // Text search
  // rankBy: ["importance", "desc"], // Attribute-based ranking
});

Return value

Returns a QueryResults object with an array of matched documents:

{
  rows: [
    {
      id: "doc-1", // Document ID
      $dist: 0.13, // Distance score (lower is more similar for most metrics)
      vector: number[], // If specified in includeAttributes
      text: "Document content", // Other attributes specified in includeAttributes
      category: "article",
      createdAt: "2023-07-15"
    },
    // ...more results
  ],
  aggregations: {
    // Aggregation results (if aggregateBy was specified)
    "numberOfDocuments": 100,
  }
}
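Because each row carries a $dist score, results can be post-filtered on the client. The following is a minimal sketch that assumes the row shape shown above and a metric where lower distances mean closer matches. `ResultRow` and `closestRows` are illustrative names, not library exports.

```typescript
// Assumed row shape, based on the return value shown above.
interface ResultRow {
  id: string | number;
  $dist?: number;
  [attribute: string]: unknown;
}

// Hypothetical helper: keep rows whose distance is at most maxDist,
// ordered from most to least similar. Rows without $dist are dropped.
function closestRows(rows: ResultRow[], maxDist: number): ResultRow[] {
  return rows
    .filter((row) => typeof row.$dist === "number" && row.$dist <= maxDist)
    .sort((a, b) => (a.$dist as number) - (b.$dist as number));
}

const sampleRows: ResultRow[] = [
  { id: "doc-1", $dist: 0.4 },
  { id: "doc-2", $dist: 0.1 },
  { id: "doc-3", $dist: 0.9 },
];
console.log(closestRows(sampleRows, 0.5).map((row) => row.id));
```

A sensible threshold depends on the distance metric and the embedding model, so it is best chosen empirically.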

getSchema

Retrieves the current schema for the namespace.

async getSchema(): Promise<Schema>

Example

const schema = await namespace.getSchema();
console.log(schema);
// {
//   text: "string",
//   category: "string",
//   createdAt: "string"
// }

updateSchema

Updates the schema for the namespace.

async updateSchema(options: { schema: Schema }): Promise<Schema>

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| schema | Schema | New schema definition |

Example

const updatedSchema = await namespace.updateSchema({
  schema: {
    text: "string",
    category: "string",
    createdAt: "string",
    newField: "number", // Add new field
    tags: "string[]", // Add array field
  },
});

Return value

Returns the updated schema.

SearchClient

The SearchClient class provides a way to interact with GenSX vector search capabilities outside of the GenSX workflow context, such as from regular Node.js applications or server endpoints.

Import

import { SearchClient } from "@gensx/storage";

Constructor

constructor(options?: SearchStorageOptions);

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| options | SearchStorageOptions | {} | Optional configuration properties |

Example

// Default client
const searchClient = new SearchClient();

// With configuration
const configuredClient = new SearchClient({
  project: "my-project",
  environment: "production",
});

Methods

getNamespace

Get a namespace instance and ensure it exists first.

async getNamespace(name: string): Promise<Namespace>

Example

const namespace = await searchClient.getNamespace("products");

// Then use the namespace to upsert or query vectors
await namespace.write({
  upsertRows: [
    {
      id: "product-1",
      vector: [0.1, 0.2, 0.3, ...],
      name: "Product 1",
      category: "electronics",
    },
  ],
  distanceMetric: "cosine_distance",
});

ensureNamespace

Create a namespace if it doesn’t exist.

async ensureNamespace(name: string): Promise<EnsureNamespaceResult>

Example

const { created } = await searchClient.ensureNamespace("products");
if (created) {
  console.log("Namespace was created");
}

listNamespaces

List all namespaces.

async listNamespaces(options?: {
  prefix?: string;
  limit?: number;
  cursor?: string;
}): Promise<{
  namespaces: { name: string; createdAt: Date }[];
  nextCursor?: string;
}>

Example

const { namespaces, nextCursor } = await searchClient.listNamespaces();
console.log(
  "Available namespaces:",
  namespaces.map((ns) => ns.name),
); // ["products", "customers", "orders"]
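When there are more namespaces than fit in one response, the cursor option and the nextCursor field suggest a standard pagination loop. The sketch below assumes that a missing nextCursor marks the final page, which this reference does not state explicitly. `NamespaceLister` and `listAllNamespaceNames` are hypothetical names; the interface only mirrors the listNamespaces signature above so the helper can be exercised without a live client.

```typescript
// Shapes copied from the listNamespaces signature above.
interface NamespacePage {
  namespaces: { name: string; createdAt: Date }[];
  nextCursor?: string;
}

interface NamespaceLister {
  listNamespaces(options?: {
    prefix?: string;
    limit?: number;
    cursor?: string;
  }): Promise<NamespacePage>;
}

// Hypothetical helper: request pages until no cursor is returned.
// Assumption: a missing nextCursor means there are no more pages.
async function listAllNamespaceNames(
  client: NamespaceLister,
  prefix?: string,
): Promise<string[]> {
  const names: string[] = [];
  let cursor: string | undefined;
  do {
    const page = await client.listNamespaces({ prefix, cursor });
    for (const ns of page.namespaces) names.push(ns.name);
    cursor = page.nextCursor;
  } while (cursor);
  return names;
}
```

A SearchClient instance should satisfy this interface structurally if its listNamespaces method matches the signature above, so it could be passed in directly as `listAllNamespaceNames(searchClient)`.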

deleteNamespace

Delete a namespace.

async deleteNamespace(name: string): Promise<DeleteNamespaceResult>

Example

const { deleted } = await searchClient.deleteNamespace("temp-namespace");
if (deleted) {
  console.log("Namespace was removed");
}

namespaceExists

Check if a namespace exists.

async namespaceExists(name: string): Promise<boolean>

Example

if (await searchClient.namespaceExists("products")) {
  console.log("Products namespace exists");
} else {
  console.log("Products namespace doesn't exist yet");
}

Usage in applications

The SearchClient is particularly useful when you need to access vector search functionality from:

  • Regular Express.js or Next.js API routes
  • Background jobs or workers
  • Custom scripts or tools
  • Any Node.js application outside the GenSX workflow context

// Example: Using SearchClient in an Express handler
import express from "express";
import { SearchClient } from "@gensx/storage";
import { OpenAI } from "openai";

const app = express();
app.use(express.json()); // Parse JSON request bodies so req.body is populated

const searchClient = new SearchClient();
const openai = new OpenAI();

app.post("/api/search", async (req, res) => {
  try {
    const { query } = req.body;

    // Generate embedding for the query
    const embedding = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: query,
    });

    // Search for similar documents
    const namespace = await searchClient.getNamespace("documents");
    const results = await namespace.query({
      rankBy: ["vector", "ANN", embedding.data[0].embedding],
      topK: 5,
      includeAttributes: true,
    });

    res.json(results);
  } catch (error) {
    console.error("Search error:", error);
    res.status(500).json({ error: "Search error" });
  }
});

app.listen(3000, () => {
  console.log("Server running on port 3000");
});

Filter operators

Filters use a structured array format with the following pattern:

// Basic filter structure
[
  "Operation", // And, Or, Not
  [
    // Array of conditions
    ["field", "Operator", value],
  ],
];

Available operators:

| Operator | Description | Example |
| --- | --- | --- |
| Eq | Equals | ["field", "Eq", "value"] |
| Ne | Not equals | ["field", "Ne", "value"] |
| Gt | Greater than | ["field", "Gt", 10] |
| Gte | Greater than or equal | ["field", "Gte", 10] |
| Lt | Less than | ["field", "Lt", 10] |
| Lte | Less than or equal | ["field", "Lte", 10] |
| In | In array | ["field", "In", ["a", "b"]] |
| Nin | Not in array | ["field", "Nin", ["a", "b"]] |
| Contains | String contains | ["field", "Contains", "text"] |
| ContainsAny | Contains any of values | ["tags", "ContainsAny", ["news", "tech"]] |
| ContainsAll | Contains all values | ["tags", "ContainsAll", ["imp", "urgent"]] |
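Nested arrays are easy to mistype, so a few small typed helpers can make filters harder to get wrong. This is a sketch only: the type and function names (`FilterExpr`, `cond`, `allOf`, `anyOf`) are hypothetical and not exported by @gensx/storage, Not is left out because its exact shape is not shown here, and nesting one group inside another is an assumption based on turbopuffer's filter syntax.

```typescript
// Illustrative types mirroring the array format described above.
type FilterValue = string | number | boolean | (string | number)[];
type ComparisonOp =
  | "Eq" | "Ne" | "Gt" | "Gte" | "Lt" | "Lte"
  | "In" | "Nin" | "Contains" | "ContainsAny" | "ContainsAll";
type FilterCondition = [string, ComparisonOp, FilterValue];
type FilterGroup = ["And" | "Or", FilterExpr[]];
type FilterExpr = FilterCondition | FilterGroup;

// A single ["field", "Operator", value] condition.
const cond = (field: string, op: ComparisonOp, value: FilterValue): FilterCondition =>
  [field, op, value];

// Group expressions under And / Or.
const allOf = (...exprs: FilterExpr[]): FilterGroup => ["And", exprs];
const anyOf = (...exprs: FilterExpr[]): FilterGroup => ["Or", exprs];

// Articles that are either recent or tagged as news/tech.
const articleFilter = allOf(
  cond("category", "Eq", "article"),
  anyOf(
    cond("createdAt", "Gte", "2023-01-01"),
    cond("tags", "ContainsAny", ["news", "tech"]),
  ),
);
console.log(JSON.stringify(articleFilter));
```

The helpers return plain arrays, so the result can be used wherever a filter is accepted, such as the filters option of query or the deleteByFilter option of write.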

RankBy options

Vector ranking takes the form ["vector", "ANN", queryVector], as in the query example above. Beyond that, the rankBy parameter supports two other forms:

Attribute-based ranking

Sorts by a field in ascending or descending order:

// Sort by the createdAt attribute in ascending order
rankBy: ["createdAt", "asc"];

// Sort by price in descending order (highest first)
rankBy: ["price", "desc"];

Text-based ranking

For full-text search relevance scoring:

// Basic BM25 text ranking
rankBy: ["text", "BM25", "search query"];

// BM25 with multiple search terms
rankBy: ["text", "BM25", ["term1", "term2"]];

// Combined text ranking strategies
rankBy: [
  "Sum",
  [
    ["text", "BM25", "search query"],
    ["text", "BM25", "another term"],
  ],
];

// Weighted text ranking (multiply BM25 score by 0.5)
rankBy: ["Product", [["text", "BM25", "search query"], 0.5]];

// Alternative syntax for weighted ranking
rankBy: ["Product", [0.5, ["text", "BM25", "search query"]]];

Use these options to fine-tune the relevance and ordering of your search results.
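The Sum and Product forms above can be combined to weight several text fields against the same query. The sketch below builds such an expression. It assumes that Product clauses may be nested inside Sum, which follows turbopuffer's ranking syntax but is not shown explicitly in this reference. `weightedBm25` and the "title" field are hypothetical.

```typescript
// Illustrative types for the rankBy shapes shown above.
type Bm25Clause = [string, "BM25", string];
type WeightedClause = ["Product", [number, Bm25Clause]];
type SumRankBy = ["Sum", WeightedClause[]];

// Hypothetical helper: score each field with BM25, scale it by its
// weight, and sum the weighted scores.
function weightedBm25(
  query: string,
  fields: { field: string; weight: number }[],
): SumRankBy {
  const clauses = fields.map(
    ({ field, weight }): WeightedClause => ["Product", [weight, [field, "BM25", query]]],
  );
  return ["Sum", clauses];
}

// Weight matches in a (hypothetical) title field four times as heavily as body text.
const textRankBy = weightedBm25("search query", [
  { field: "title", weight: 2 },
  { field: "text", weight: 0.5 },
]);
console.log(JSON.stringify(textRankBy));
```

The returned array would be passed as the rankBy option of query.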

SearchStorageOptions

Configuration properties for search operations.

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| project | string | Auto-detected | Project to use for cloud storage. If you don’t set this, it’ll first check your GENSX_PROJECT environment variable, then look for the project name in your local gensx.yaml file. |
| environment | string | Auto-detected | Environment to use for cloud storage. If you don’t set this, it’ll first check your GENSX_ENV environment variable, then use whatever environment you’ve selected in the CLI with gensx env select. |
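The lookup order for project can be read as a simple fallback chain. The function below only illustrates that order and is not the library's actual implementation. `resolveProjectName` is a hypothetical name, and treating an empty string as unset is an assumption.

```typescript
// Hypothetical illustration of the documented precedence:
// explicit option, then GENSX_PROJECT, then the name from gensx.yaml.
function resolveProjectName(
  explicit: string | undefined,
  env: Record<string, string | undefined>,
  yamlProjectName: string | undefined,
): string | undefined {
  // Assumption: empty strings are treated the same as unset values.
  return explicit || env["GENSX_PROJECT"] || yamlProjectName;
}

console.log(resolveProjectName(undefined, { GENSX_PROJECT: "from-env" }, "from-yaml"));
```

In practice this means an explicit project passed to useSearch or SearchClient always wins, which is useful when one process needs to reach more than one project.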