
Search reference

API reference for GenSX Cloud search components. Search is powered by turbopuffer; its documentation for query and upsert operations is a useful companion to this reference.

Installation

npm install @gensx/storage

useSearch

Hook that provides access to vector search for a specific namespace.

Import

import { useSearch } from "@gensx/storage";

Signature

function useSearch(
  name: string,
  options?: SearchStorageOptions,
): Promise<Namespace>;

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | Required | The namespace name to access |
| options | SearchStorageOptions | {} | Optional configuration properties |

Returns

Returns a namespace object with methods to interact with vector search.

Example

// Simple usage
const namespace = await useSearch("documents");
const results = await namespace.query({
  vector: queryEmbedding,
  includeAttributes: true,
});

// With configuration (a separate name avoids redeclaring `namespace`)
const configuredNamespace = await useSearch("documents", {
  project: "my-project",
  environment: "production",
});

Namespace methods

The namespace object returned by useSearch provides these methods:

write

Inserts, updates, or deletes vectors in the namespace.

async write(options: WriteParams): Promise<number>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| upsertColumns | UpsertColumns | undefined | Column-based format for upserting documents |
| upsertRows | UpsertRows | undefined | Row-based format for upserting documents |
| patchColumns | PatchColumns | undefined | Column-based format for patching documents |
| patchRows | PatchRows | undefined | Row-based format for patching documents |
| deletes | Id[] | undefined | Array of document IDs to delete |
| deleteByFilter | Filters | undefined | Filter to match documents for deletion |
| distanceMetric | DistanceMetric | undefined | Distance metric for similarity calculations |
| schema | Schema | undefined | Optional schema definition for attributes |

Example

// Upsert documents in column-based format
await namespace.write({
  upsertColumns: {
    id: ["doc-1", "doc-2"],
    vector: [
      [0.1, 0.2, 0.3],
      [0.4, 0.5, 0.6],
    ],
    text: ["Document 1", "Document 2"],
    category: ["article", "blog"],
  },
  distanceMetric: "cosine_distance",
  schema: {
    text: { type: "string" },
    category: { type: "string" },
  },
});

// Upsert documents in row-based format
await namespace.write({
  upsertRows: [
    {
      id: "doc-1",
      vector: [0.1, 0.2, 0.3],
      text: "Document 1",
      category: "article",
    },
    {
      id: "doc-2",
      vector: [0.4, 0.5, 0.6],
      text: "Document 2",
      category: "blog",
    },
  ],
  distanceMetric: "cosine_distance",
});

// Delete documents by ID
await namespace.write({
  deletes: ["doc-1", "doc-2"],
});

// Delete documents by filter
await namespace.write({
  deleteByFilter: [
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Lt", "2023-01-01"],
    ],
  ],
});

// Patch documents (update specific fields)
await namespace.write({
  patchRows: [
    {
      id: "doc-1",
      category: "updated-category",
    },
  ],
});

Return value

Returns the number of rows affected by the operation.
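
The row-based and column-based formats carry the same data in different layouts. The sketch below pivots rows into columns to show how the two relate. `rowsToColumns` is a hypothetical helper, not part of @gensx/storage, and filling missing attributes with null is an assumption about what the API accepts.

```typescript
// Hypothetical helper (not part of @gensx/storage): pivot row objects into
// the column layout used by the upsertColumns example.
type Row = { id: string | number; [attribute: string]: unknown };
type Columns = { id: Array<string | number>; [attribute: string]: unknown[] };

function rowsToColumns(rows: Row[]): Columns {
  // Collect every attribute name that appears in any row, in first-seen order.
  const keys = new Set<string>();
  for (const row of rows) {
    for (const key of Object.keys(row)) keys.add(key);
  }

  const columns: Columns = { id: [] };
  for (const key of keys) {
    // Assumption: rows that lack an attribute contribute null, so every
    // column has exactly one entry per row.
    columns[key] = rows.map((row) => (key in row ? row[key] : null));
  }
  return columns;
}

const exampleColumns = rowsToColumns([
  { id: "doc-1", vector: [0.1, 0.2, 0.3], category: "article" },
  { id: "doc-2", vector: [0.4, 0.5, 0.6] },
]);
console.log(exampleColumns.id); // the two document IDs, in row order
```

Column-based payloads tend to be more compact for large batches, while row-based payloads are easier to build one document at a time.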

query

Searches for similar vectors based on a query vector.

async query(options: QueryOptions): Promise<QueryResults>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| vector | number[] | Required | Query vector for similarity search |
| topK | number | 10 | Number of results to return |
| includeVectors | boolean | false | Whether to include vectors in results |
| includeAttributes | boolean \| string[] | true | Include all attributes or specified ones |
| filters | Filters | undefined | Metadata filters |
| rankBy | RankBy | undefined | Attribute-based ranking or text ranking |
| consistency | string | undefined | Consistency level for reads |

Example

const results = await namespace.query({
  vector: [0.1, 0.2, 0.3, ...], // Query vector
  topK: 10, // Number of results to return
  includeVectors: false, // Whether to include vectors in results
  includeAttributes: true, // Include all attributes or specific ones
  filters: [
    // Optional metadata filters
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Gte", "2023-01-01"],
    ],
  ],
  rankBy: ["attributes.importance", "asc"], // Optional attribute-based ranking
});

Return value

Returns an array of matched documents with similarity scores:

[
  {
    id: "doc-1", // Document ID
    score: 0.87, // Similarity score (0-1)
    vector?: number[], // Original vector (if includeVectors=true)
    attributes?: { // Metadata (if includeAttributes=true)
      text: "Document content",
      category: "article",
      createdAt: "2023-07-15"
    }
  },
  // ...more results
]
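
A minimal sketch of consuming results in this shape. The `QueryResult` interface is inferred from the documented return value rather than taken from the library's exported types, and `pickAttribute` is a hypothetical helper. It assumes a higher score means a closer match, as the "similarity score" description suggests.

```typescript
// Inferred from the documented return value; the exact field types are assumptions.
interface QueryResult {
  id: string | number;
  score: number;
  vector?: number[];
  attributes?: Record<string, unknown>;
}

// Hypothetical helper: keep results at or above a score threshold and pull
// out one string attribute from each for display.
function pickAttribute(
  results: QueryResult[],
  attribute: string,
  minScore = 0,
): Array<{ id: string | number; score: number; value: string }> {
  const picked: Array<{ id: string | number; score: number; value: string }> = [];
  for (const result of results) {
    if (result.score < minScore) continue;
    const value = result.attributes?.[attribute];
    // Skip results where the attribute is absent or not a string.
    if (typeof value === "string") {
      picked.push({ id: result.id, score: result.score, value });
    }
  }
  return picked;
}

const sampleResults: QueryResult[] = [
  { id: "doc-1", score: 0.87, attributes: { text: "Document content" } },
  { id: "doc-2", score: 0.41, attributes: { text: "Weaker match" } },
];
console.log(pickAttribute(sampleResults, "text", 0.5));
```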

getSchema

Retrieves the current schema for the namespace.

async getSchema(): Promise<Schema>

Example

const schema = await namespace.getSchema();
console.log(schema);
// {
//   text: "string",
//   category: "string",
//   createdAt: "string"
// }

updateSchema

Updates the schema for the namespace.

async updateSchema(options: { schema: Schema }): Promise<Schema>

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| schema | Schema | New schema definition |

Example

const updatedSchema = await namespace.updateSchema({
  schema: {
    text: "string",
    category: "string",
    createdAt: "string",
    newField: "number", // Add new field
    tags: "string[]", // Add array field
  },
});

Return value

Returns the updated schema.

getMetadata

Retrieves metadata about the namespace.

async getMetadata(): Promise<NamespaceMetadata>

Example

const metadata = await namespace.getMetadata();
console.log(metadata);
// {
//   vectorCount: 1250,
//   dimensions: 1536,
//   distanceMetric: "cosine_distance",
//   created: "2023-07-15T12:34:56Z"
// }
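
One possible use of the metadata is to check a query vector's length before searching. The sketch below assumes the field names shown in the example output; `checkDimensions` is a hypothetical helper, not a library function.

```typescript
// Field names taken from the getMetadata example output; treat them as assumptions.
interface NamespaceMetadata {
  vectorCount: number;
  dimensions: number;
  distanceMetric: string;
  created: string;
}

// Hypothetical guard: fail fast when a vector does not match the namespace's
// dimensionality instead of waiting for the search request to be rejected.
function checkDimensions(
  vector: number[],
  metadata: Pick<NamespaceMetadata, "dimensions">,
): void {
  if (vector.length !== metadata.dimensions) {
    throw new Error(
      `Expected a ${metadata.dimensions}-dimensional vector, got ${vector.length}`,
    );
  }
}

checkDimensions([0.1, 0.2, 0.3], { dimensions: 3 }); // passes silently
```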

SearchClient

The SearchClient class provides a way to interact with GenSX vector search capabilities outside of the GenSX workflow context, such as from regular Node.js applications or server endpoints.

Import

import { SearchClient } from "@gensx/storage";

Constructor

constructor(options?: SearchStorageOptions);

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| options | SearchStorageOptions | {} | Optional configuration properties |

Example

// Default client
const searchClient = new SearchClient();

// With configuration (a separate name avoids redeclaring `searchClient`)
const configuredClient = new SearchClient({
  project: "my-project",
  environment: "production",
});

Methods

getNamespace

Returns a namespace instance, first ensuring that the namespace exists.

async getNamespace(name: string): Promise<Namespace>

Example

const namespace = await searchClient.getNamespace("products");

// Then use the namespace to upsert or query vectors
await namespace.write({
  upsertRows: [
    {
      id: "product-1",
      vector: [0.1, 0.2, 0.3, ...],
      name: "Product 1",
      category: "electronics",
    },
  ],
  distanceMetric: "cosine_distance",
});

ensureNamespace

Create a namespace if it doesn’t exist.

async ensureNamespace(name: string): Promise<EnsureNamespaceResult>

Example

const { created } = await searchClient.ensureNamespace("products");
if (created) {
  console.log("Namespace was created");
}

listNamespaces

List all namespaces.

async listNamespaces(options?: {
  prefix?: string;
  limit?: number;
  cursor?: string;
}): Promise<{
  namespaces: string[];
  nextCursor?: string;
}>

Example

const { namespaces, nextCursor } = await searchClient.listNamespaces();
console.log("Available namespaces:", namespaces); // ["products", "customers", "orders"]
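
Because the result includes `nextCursor`, listing every namespace likely requires a loop. The sketch below assumes that a missing `nextCursor` marks the last page, which the signature suggests but this reference does not state. `listAllNamespaces` and `makeFakeLister` are hypothetical names; the fake stands in for a SearchClient so the loop can run without network access.

```typescript
// Structural type matching the documented listNamespaces signature.
interface NamespaceLister {
  listNamespaces(options?: {
    prefix?: string;
    limit?: number;
    cursor?: string;
  }): Promise<{ namespaces: string[]; nextCursor?: string }>;
}

// Hypothetical helper: follow nextCursor until no further page is reported.
async function listAllNamespaces(
  client: NamespaceLister,
  prefix?: string,
): Promise<string[]> {
  const all: string[] = [];
  let cursor: string | undefined;
  do {
    const page = await client.listNamespaces({ prefix, cursor });
    all.push(...page.namespaces);
    cursor = page.nextCursor;
  } while (cursor);
  return all;
}

// In-memory stand-in for a SearchClient. The cursor here is just an offset
// encoded as a string; the real cursor format is opaque.
function makeFakeLister(names: string[], pageSize: number): NamespaceLister {
  return {
    async listNamespaces(options = {}) {
      const matching = names.filter((n) => n.startsWith(options.prefix ?? ""));
      const start = options.cursor ? Number(options.cursor) : 0;
      const end = start + (options.limit ?? pageSize);
      return {
        namespaces: matching.slice(start, end),
        nextCursor: end < matching.length ? String(end) : undefined,
      };
    },
  };
}

listAllNamespaces(makeFakeLister(["products", "customers", "orders"], 2)).then(
  (names) => console.log("All namespaces:", names),
);
```

A real SearchClient instance should satisfy `NamespaceLister` structurally if its method matches the signature above.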

deleteNamespace

Delete a namespace.

async deleteNamespace(name: string): Promise<DeleteNamespaceResult>

Example

const { deleted } = await searchClient.deleteNamespace("temp-namespace");
if (deleted) {
  console.log("Namespace was removed");
}

namespaceExists

Check if a namespace exists.

async namespaceExists(name: string): Promise<boolean>

Example

if (await searchClient.namespaceExists("products")) {
  console.log("Products namespace exists");
} else {
  console.log("Products namespace doesn't exist yet");
}

Usage in applications

The SearchClient is particularly useful when you need to access vector search functionality from:

  • Regular Express.js or Next.js API routes
  • Background jobs or workers
  • Custom scripts or tools
  • Any Node.js application outside the GenSX workflow context

// Example: Using SearchClient in an Express handler
import express from "express";
import { SearchClient } from "@gensx/storage";
import { OpenAI } from "openai";

const app = express();
const searchClient = new SearchClient();
const openai = new OpenAI();

// Parse JSON request bodies so that req.body is populated
app.use(express.json());

app.post("/api/search", async (req, res) => {
  try {
    const { query } = req.body;

    // Generate embedding for the query
    const embedding = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: query,
    });

    // Search for similar documents
    const namespace = await searchClient.getNamespace("documents");
    const results = await namespace.query({
      vector: embedding.data[0].embedding,
      topK: 5,
      includeAttributes: true,
    });

    res.json(results);
  } catch (error) {
    console.error("Search error:", error);
    res.status(500).json({ error: "Search error" });
  }
});

app.listen(3000, () => {
  console.log("Server running on port 3000");
});

Filter operators

Filters use a structured array format with the following pattern:

// Basic filter structure
[
  "Operation", // And, Or, Not
  [
    // Array of conditions
    ["field", "Operator", value],
  ],
];

Available operators:

| Operator | Description | Example |
| --- | --- | --- |
| Eq | Equals | ["field", "Eq", "value"] |
| Ne | Not equals | ["field", "Ne", "value"] |
| Gt | Greater than | ["field", "Gt", 10] |
| Gte | Greater than or equal | ["field", "Gte", 10] |
| Lt | Less than | ["field", "Lt", 10] |
| Lte | Less than or equal | ["field", "Lte", 10] |
| In | In array | ["field", "In", ["a", "b"]] |
| Nin | Not in array | ["field", "Nin", ["a", "b"]] |
| Contains | String contains | ["field", "Contains", "text"] |
| ContainsAny | Contains any of values | ["tags", "ContainsAny", ["news", "tech"]] |
| ContainsAll | Contains all values | ["tags", "ContainsAll", ["imp", "urgent"]] |
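
To make the operator semantics concrete, here is a small in-memory evaluator for this filter format. It is an illustration only: the service evaluates filters server-side and its edge-case behavior may differ. The `Filter` types and the `matches` function are hypothetical and not exported by @gensx/storage. `Not` is left out because this reference does not show its exact shape.

```typescript
type Scalar = string | number | boolean;
type Condition = [field: string, op: string, value: Scalar | Scalar[]];
type Group = ["And" | "Or", Filter[]];
type Filter = Condition | Group;

// Groups have two elements (operation, conditions); conditions have three.
function isGroup(filter: Filter): filter is Group {
  return filter.length === 2;
}

// Returns NaN for incomparable types, so Gt/Gte/Lt/Lte all evaluate to false.
function compare(a: unknown, b: unknown): number {
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") {
    return a < b ? -1 : a > b ? 1 : 0;
  }
  return NaN;
}

function matches(doc: Record<string, unknown>, filter: Filter): boolean {
  if (isGroup(filter)) {
    const [operation, conditions] = filter;
    return operation === "And"
      ? conditions.every((c) => matches(doc, c))
      : conditions.some((c) => matches(doc, c));
  }
  const [field, op, value] = filter;
  const actual = doc[field];
  switch (op) {
    case "Eq":
      return actual === value;
    case "Ne":
      return actual !== value;
    case "Gt":
      return compare(actual, value) > 0;
    case "Gte":
      return compare(actual, value) >= 0;
    case "Lt":
      return compare(actual, value) < 0;
    case "Lte":
      return compare(actual, value) <= 0;
    case "In":
      return Array.isArray(value) && value.includes(actual as Scalar);
    case "Nin":
      return Array.isArray(value) && !value.includes(actual as Scalar);
    case "Contains":
      return (
        typeof actual === "string" &&
        typeof value === "string" &&
        actual.includes(value)
      );
    case "ContainsAny":
      return (
        Array.isArray(actual) &&
        Array.isArray(value) &&
        value.some((v) => actual.includes(v))
      );
    case "ContainsAll":
      return (
        Array.isArray(actual) &&
        Array.isArray(value) &&
        value.every((v) => actual.includes(v))
      );
    default:
      throw new Error(`Unsupported operator: ${op}`);
  }
}

const exampleFilter: Filter = [
  "And",
  [
    ["category", "Eq", "article"],
    ["createdAt", "Lt", "2023-01-01"],
  ],
];
console.log(matches({ category: "article", createdAt: "2022-06-01" }, exampleFilter));
```

Date strings compare correctly here only because ISO 8601 dates sort lexicographically.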

RankBy options

The rankBy parameter can be used in two primary ways:

Attribute-based ranking

Sorts by a field in ascending or descending order:

// Sort by the createdAt attribute in ascending order
rankBy: ["createdAt", "asc"];

// Sort by price in descending order (highest first)
rankBy: ["price", "desc"];

Text-based ranking

For full-text search relevance scoring:

// Basic BM25 text ranking
rankBy: ["text", "BM25", "search query"];

// BM25 with multiple search terms
rankBy: ["text", "BM25", ["term1", "term2"]];

// Combined text ranking strategies
rankBy: [
  "Sum",
  [
    ["text", "BM25", "search query"],
    ["text", "BM25", "another term"],
  ],
];

// Weighted text ranking (multiply BM25 score by 0.5)
rankBy: ["Product", [["text", "BM25", "search query"], 0.5]];

// Alternative syntax for weighted ranking
rankBy: ["Product", [0.5, ["text", "BM25", "search query"]]];

Use these options to fine-tune the relevance and ordering of your search results.
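
When several fields need different weights, the expression can be built programmatically. The sketch below nests `Product` terms inside a `Sum`. This reference shows the two forms separately, so accepting them nested is an assumption to verify against the turbopuffer documentation. `weightedBm25` and the tuple types are hypothetical names.

```typescript
type Bm25 = [field: string, method: "BM25", query: string];
type Weighted = ["Product", [number, Bm25]];
type SumExpr = ["Sum", Array<Bm25 | Weighted>];

// Hypothetical helper: score one query against several fields, multiplying
// each field's BM25 score by its weight. A weight of 1 emits a plain BM25 term.
function weightedBm25(query: string, weights: Record<string, number>): SumExpr {
  const terms = Object.entries(weights).map(
    ([field, weight]): Bm25 | Weighted => {
      const bm25: Bm25 = [field, "BM25", query];
      return weight === 1 ? bm25 : ["Product", [weight, bm25]];
    },
  );
  return ["Sum", terms];
}

// Weight matches in a (hypothetical) title field twice as heavily as body text.
const rankBy = weightedBm25("search query", { title: 2, text: 1 });
console.log(JSON.stringify(rankBy));
```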

SearchStorageOptions

Configuration properties for search operations.

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| project | string | Auto-detected | Project to use for cloud storage. If you don’t set this, it’ll first check your GENSX_PROJECT environment variable, then look for the project name in your local gensx.yaml file. |
| environment | string | Auto-detected | Environment to use for cloud storage. If you don’t set this, it’ll first check your GENSX_ENV environment variable, then use whatever environment you’ve selected in the CLI with gensx env select. |
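
The lookup order described above can be sketched as a pure function. This models only the documented precedence (explicit option, then environment variable, then local fallback). It is not the library's implementation, and `resolveOptions` and its `fallbacks` parameter are hypothetical.

```typescript
interface SearchStorageOptions {
  project?: string;
  environment?: string;
}

// Hypothetical model of the documented precedence. `fallbacks` stands in for
// the project name from gensx.yaml and the environment selected in the CLI.
function resolveOptions(
  options: SearchStorageOptions,
  env: Record<string, string | undefined>,
  fallbacks: { yamlProject?: string; cliEnvironment?: string },
): SearchStorageOptions {
  // Assumption: an empty string counts as "not set", hence || rather than ??.
  return {
    project: options.project || env.GENSX_PROJECT || fallbacks.yamlProject,
    environment:
      options.environment || env.GENSX_ENV || fallbacks.cliEnvironment,
  };
}

const resolved = resolveOptions(
  { project: "my-project" },
  { GENSX_PROJECT: "env-project", GENSX_ENV: "staging" },
  { yamlProject: "yaml-project", cliEnvironment: "default" },
);
console.log(resolved); // project comes from the option, environment from GENSX_ENV
```

In a Node.js application, `process.env` could be passed as the `env` argument.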