Search reference

API reference for GenSX Cloud search components. Search is powered by turbopuffer, and their documentation for query and upsert operations is a useful reference to augment this document.

Installation

npm install @gensx/storage

SearchProvider

Provides vector search capabilities to its child components.

Import

import { SearchProvider } from "@gensx/storage";

Example

import * as gensx from "@gensx/core";
import { SearchProvider } from "@gensx/storage";

const Workflow = gensx.Component("SearchWorkflow", ({ input }) => (
  <SearchProvider>
    <YourComponent input={input} />
  </SearchProvider>
));

useSearch

Hook that provides access to vector search for a specific namespace.

Import

import { useSearch } from "@gensx/storage";

Signature

function useSearch(name: string): Namespace;

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | string | The namespace name to access |

Returns

Returns a namespace object with methods to interact with vector search.

Example

const namespace = await useSearch("documents");
const results = await namespace.query({
  vector: queryEmbedding,
  includeAttributes: true,
});

Namespace methods

The namespace object returned by useSearch provides these methods:

write

Inserts, updates, or deletes vectors in the namespace.

async write(options: WriteParams): Promise<number>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| upsertColumns | UpsertColumns | undefined | Column-based format for upserting documents |
| upsertRows | UpsertRows | undefined | Row-based format for upserting documents |
| patchColumns | PatchColumns | undefined | Column-based format for patching documents |
| patchRows | PatchRows | undefined | Row-based format for patching documents |
| deletes | Id[] | undefined | Array of document IDs to delete |
| deleteByFilter | Filters | undefined | Filter to match documents for deletion |
| distanceMetric | DistanceMetric | undefined | Distance metric for similarity calculations |
| schema | Schema | undefined | Optional schema definition for attributes |

Example

// Upsert documents in column-based format
await namespace.write({
  upsertColumns: {
    id: ["doc-1", "doc-2"],
    vector: [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    text: ["Document 1", "Document 2"],
    category: ["article", "blog"]
  },
  distanceMetric: "cosine_distance",
  schema: {
    text: { type: "string" },
    category: { type: "string" }
  }
});

// Upsert documents in row-based format
await namespace.write({
  upsertRows: [
    { id: "doc-1", vector: [0.1, 0.2, 0.3], text: "Document 1", category: "article" },
    { id: "doc-2", vector: [0.4, 0.5, 0.6], text: "Document 2", category: "blog" }
  ],
  distanceMetric: "cosine_distance"
});

// Delete documents by ID
await namespace.write({
  deletes: ["doc-1", "doc-2"]
});

// Delete documents by filter
await namespace.write({
  deleteByFilter: [
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Lt", "2023-01-01"]
    ]
  ]
});

// Patch documents (update specific fields)
await namespace.write({
  patchRows: [
    { id: "doc-1", category: "updated-category" }
  ]
});

Return value

Returns the number of rows affected by the operation.
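
The row-based and column-based upsert formats carry the same data in different layouts. The sketch below shows one way to convert rows into columns before passing them as upsertColumns. The DocRow and DocColumns types and the rowsToColumns helper are hypothetical names introduced for illustration and are not part of @gensx/storage. Filling missing attributes with null is an assumption, so check it against the turbopuffer upsert documentation before relying on it.

```typescript
// Hypothetical local types; the real UpsertRows / UpsertColumns types
// are defined by the library and may be stricter than these.
type DocRow = { id: string | number; vector: number[] } & Record<string, unknown>;
type DocColumns = Record<string, unknown[]>;

// Convert row-based documents into a column-based layout.
// Attributes missing from a row are filled with null (an assumption)
// so that every column has one entry per document.
function rowsToColumns(rows: DocRow[]): DocColumns {
  // Collect every key that appears in any row, in first-seen order.
  const keys: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!keys.includes(key)) keys.push(key);
    }
  }

  const columns: DocColumns = {};
  for (const key of keys) {
    columns[key] = rows.map((row) => (key in row ? row[key] : null));
  }
  return columns;
}
```

The result could then be passed as `namespace.write({ upsertColumns: rowsToColumns(rows), ... })`, assuming the produced shape satisfies the library's UpsertColumns type.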

query

Searches for similar vectors based on a query vector.

async query(options: QueryOptions): Promise<QueryResults>

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| vector | number[] | Required | Query vector for similarity search |
| topK | number | 10 | Number of results to return |
| includeVectors | boolean | false | Whether to include vectors in results |
| includeAttributes | boolean \| string[] | true | Include all attributes or specified ones |
| filters | Filters | undefined | Metadata filters |
| rankBy | RankBy | undefined | Attribute-based ranking or text ranking |
| consistency | string | undefined | Consistency level for reads |

Example

const results = await namespace.query({
  vector: [0.1, 0.2, 0.3, ...], // Query vector
  topK: 10, // Number of results to return
  includeVectors: false, // Whether to include vectors in results
  includeAttributes: true, // Include all attributes or specific ones
  filters: [ // Optional metadata filters
    "And",
    [
      ["category", "Eq", "article"],
      ["createdAt", "Gte", "2023-01-01"]
    ]
  ],
  rankBy: ["attributes.importance", "asc"], // Optional attribute-based ranking
});

Return value

Returns an array of matched documents with similarity scores:

[
  {
    id: "doc-1", // Document ID
    score: 0.87, // Similarity score (0-1)
    vector?: number[], // Original vector (if includeVectors=true)
    attributes?: { // Metadata (if includeAttributes=true)
      text: "Document content",
      category: "article",
      createdAt: "2023-07-15"
    }
  },
  // ...more results
]
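
Because attributes is optional on each result, code that consumes query results usually needs to guard against missing fields. The sketch below pulls one string attribute out of each result, for example to assemble context for a prompt. The SearchResult interface mirrors the shape documented above and the collectText helper is a hypothetical name; check both against the QueryResults type exported by the library.

```typescript
// Local mirror of the documented result shape (an assumption, not the library type).
interface SearchResult {
  id: string | number;
  score: number;
  vector?: number[];
  attributes?: Record<string, unknown>;
}

// Collect the string value of one attribute (default "text") from each result,
// skipping results where the attribute is missing or is not a string.
function collectText(results: SearchResult[], attribute = "text"): string[] {
  const texts: string[] = [];
  for (const result of results) {
    const value = result.attributes?.[attribute];
    if (typeof value === "string") texts.push(value);
  }
  return texts;
}
```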

getSchema

Retrieves the current schema for the namespace.

async getSchema(): Promise<Schema>

Example

const schema = await namespace.getSchema();
console.log(schema);
// {
//   text: "string",
//   category: "string",
//   createdAt: "string"
// }

updateSchema

Updates the schema for the namespace.

async updateSchema(options: { schema: Schema }): Promise<Schema>

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| schema | Schema | New schema definition |

Example

const updatedSchema = await namespace.updateSchema({
  schema: {
    text: "string",
    category: "string",
    createdAt: "string",
    newField: "number", // Add new field
    tags: "string[]", // Add array field
  },
});

Return value

Returns the updated schema.

getMetadata

Retrieves metadata about the namespace.

async getMetadata(): Promise<NamespaceMetadata>

Example

const metadata = await namespace.getMetadata();
console.log(metadata);
// {
//   vectorCount: 1250,
//   dimensions: 1536,
//   distanceMetric: "cosine_distance",
//   created: "2023-07-15T12:34:56Z"
// }
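
One possible use of the metadata is to validate a query vector locally before sending it. The sketch below assumes the metadata exposes a numeric dimensions field, as in the sample output above. The HasDimensions interface and the checkVector helper are hypothetical and are not part of @gensx/storage.

```typescript
// Only the field this helper needs; the full NamespaceMetadata type comes from the library.
interface HasDimensions {
  dimensions: number;
}

// Throw a descriptive error if the vector length does not match the
// namespace dimensions or if any component is NaN or infinite.
function checkVector(vector: number[], metadata: HasDimensions): void {
  if (vector.length !== metadata.dimensions) {
    throw new Error(
      `expected ${metadata.dimensions} dimensions, got ${vector.length}`,
    );
  }
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error("vector contains a non-finite value");
  }
}
```

A caller might run `checkVector(queryEmbedding, await namespace.getMetadata())` before `namespace.query(...)`, at the cost of one extra round trip.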

Namespace management

Higher-level operations for managing namespaces (these are accessed directly from the search object, not via useSearch):

import { SearchClient } from "@gensx/storage";

const search = new SearchClient();

// List all namespaces
const { namespaces } = await search.listNamespaces();

// Check if namespace exists
const exists = await search.namespaceExists("my-namespace");

// Create namespace if it doesn't exist
const { created } = await search.ensureNamespace("my-namespace");

// Delete a namespace
const { deleted } = await search.deleteNamespace("old-namespace");

// Get a namespace directly for vector operations
// (SearchClient.getNamespace returns a Promise, so it must be awaited)
const namespace = await search.getNamespace("products");

// Write vectors using the namespace
await namespace.write({
  upsertRows: [
    {
      id: "product-1",
      vector: [0.1, 0.3, 0.5, ...], // embedding vector
      name: "Ergonomic Chair",
      category: "furniture",
      price: 299.99
    },
    {
      id: "product-2",
      vector: [0.2, 0.4, 0.6, ...],
      name: "Standing Desk",
      category: "furniture",
      price: 499.99
    }
  ],
  distanceMetric: "cosine_distance",
  schema: {
    name: { type: "string" },
    category: { type: "string" },
    price: { type: "number" }
  }
});

// Query vectors directly with the namespace
const searchResults = await namespace.query({
  vector: [0.15, 0.35, 0.55, ...], // query vector
  topK: 5,
  includeAttributes: true,
  filters: [
    "And",
    [
      ["category", "Eq", "furniture"],
      ["price", "Lt", 400]
    ]
  ]
});

The SearchClient is a standard TypeScript library and can also be used outside of GenSX workflows, in your normal application code.

useSearchStorage

Hook that provides direct access to the search storage instance, which includes higher-level namespace management functions.

Import

import { useSearchStorage } from "@gensx/storage";

Signature

function useSearchStorage(): SearchStorage;

Example

const searchStorage = useSearchStorage();

The search storage object provides these management methods:

getNamespace

Get a namespace object for direct interaction.

// Get a namespace directly (without calling useSearch)
const searchStorage = useSearchStorage();
const namespace = searchStorage.getNamespace("documents");

// Usage example
await namespace.write({
  upsertRows: [...],
  distanceMetric: "cosine_distance"
});

listNamespaces

List namespaces in your project.

const searchStorage = useSearchStorage();
const { namespaces, nextCursor } = await searchStorage.listNamespaces();
console.log("Namespaces:", namespaces); // ["docs", "products"]

The method accepts an options object with these properties:

| Option | Type | Description |
| --- | --- | --- |
| prefix | string | Optional prefix to filter namespace names by |
| limit | number | Maximum number of results to return per page |
| cursor | string | Cursor for pagination from previous response |

Returns an object with:

  • namespaces: Array of namespace names
  • nextCursor: Cursor for the next page, or undefined if no more results
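
The cursor and nextCursor fields described above can be combined into a loop that walks every page. The sketch below is written against a minimal structural interface that copies the documented listNamespaces signature, so it should accept either the search storage object or a SearchClient. The NamespaceLister and NamespacePage types, the listAllNamespaces helper, and the default page size of 100 are assumptions made for illustration.

```typescript
// One page of results, as described in the "Returns an object with" list.
type NamespacePage = { namespaces: string[]; nextCursor?: string };

// Minimal structural type matching the documented listNamespaces signature.
interface NamespaceLister {
  listNamespaces(options?: {
    prefix?: string;
    limit?: number;
    cursor?: string;
  }): Promise<NamespacePage>;
}

// Follow nextCursor until it is undefined and collect every namespace name.
async function listAllNamespaces(
  lister: NamespaceLister,
  prefix?: string,
  pageSize = 100, // arbitrary default chosen for this sketch
): Promise<string[]> {
  const all: string[] = [];
  let cursor: string | undefined = undefined;
  do {
    const page: NamespacePage = await lister.listNamespaces({
      prefix,
      limit: pageSize,
      cursor,
    });
    all.push(...page.namespaces);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return all;
}
```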

ensureNamespace

Create a namespace if it doesn’t exist.

const searchStorage = useSearchStorage();
const { created } = await searchStorage.ensureNamespace("documents");
if (created) {
  console.log("Namespace was created");
} else {
  console.log("Namespace already existed");
}

deleteNamespace

Delete a namespace and all its data.

const searchStorage = useSearchStorage();
const { deleted } = await searchStorage.deleteNamespace("old-namespace");
if (deleted) {
  console.log("Namespace was deleted");
} else {
  console.log("Namespace could not be deleted");
}

namespaceExists

Check if a namespace exists.

const searchStorage = useSearchStorage();
const exists = await searchStorage.namespaceExists("documents");
if (exists) {
  console.log("Namespace exists");
} else {
  console.log("Namespace does not exist");
}

hasEnsuredNamespace

Check if a namespace has been ensured in the current session.

const searchStorage = useSearchStorage();
const isEnsured = searchStorage.hasEnsuredNamespace("documents");
if (isEnsured) {
  console.log("Namespace has been ensured in this session");
} else {
  console.log("Namespace has not been ensured yet");
}

SearchClient

The SearchClient class provides a way to interact with GenSX vector search capabilities outside of the GenSX workflow context, such as from regular Node.js applications or server endpoints.

Import

import { SearchClient } from "@gensx/storage";

Constructor

constructor()

Example

const searchClient = new SearchClient();

Methods

getNamespace

Get a namespace instance and ensure it exists first.

async getNamespace(name: string): Promise<Namespace>

Example

const namespace = await searchClient.getNamespace("products");

// Then use the namespace to upsert or query vectors
await namespace.write({
  upsertRows: [
    {
      id: "product-1",
      vector: [0.1, 0.2, 0.3, ...],
      name: "Product 1",
      category: "electronics"
    }
  ],
  distanceMetric: "cosine_distance"
});

ensureNamespace

Create a namespace if it doesn’t exist.

async ensureNamespace(name: string): Promise<EnsureNamespaceResult>

Example

const { created } = await searchClient.ensureNamespace("products");
if (created) {
  console.log("Namespace was created");
}

listNamespaces

List all namespaces.

async listNamespaces(options?: {
  prefix?: string;
  limit?: number;
  cursor?: string;
}): Promise<{
  namespaces: string[];
  nextCursor?: string;
}>

Example

const { namespaces, nextCursor } = await searchClient.listNamespaces();
console.log("Available namespaces:", namespaces); // ["products", "customers", "orders"]

deleteNamespace

Delete a namespace.

async deleteNamespace(name: string): Promise<DeleteNamespaceResult>

Example

const { deleted } = await searchClient.deleteNamespace("temp-namespace");
if (deleted) {
  console.log("Namespace was removed");
}

namespaceExists

Check if a namespace exists.

async namespaceExists(name: string): Promise<boolean>

Example

if (await searchClient.namespaceExists("products")) {
  console.log("Products namespace exists");
} else {
  console.log("Products namespace doesn't exist yet");
}

Usage in applications

The SearchClient is particularly useful when you need to access vector search functionality from:

  • Regular Express.js or Next.js API routes
  • Background jobs or workers
  • Custom scripts or tools
  • Any Node.js application outside the GenSX workflow context

// Example: Using SearchClient in an Express handler
import express from 'express';
import { SearchClient } from '@gensx/storage';
import { OpenAI } from 'openai';

const app = express();
// Parse JSON request bodies so that req.body is populated
app.use(express.json());

const searchClient = new SearchClient();
const openai = new OpenAI();

app.post('/api/search', async (req, res) => {
  try {
    const { query } = req.body;

    // Generate embedding for the query
    const embedding = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: query
    });

    // Search for similar documents
    const namespace = await searchClient.getNamespace('documents');
    const results = await namespace.query({
      vector: embedding.data[0].embedding,
      topK: 5,
      includeAttributes: true
    });

    res.json(results);
  } catch (error) {
    console.error('Search error:', error);
    res.status(500).json({ error: 'Search error' });
  }
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});

Filter operators

Filters use a structured array format with the following pattern:

// Basic filter structure
[
  "Operation", // And, Or, Not
  [ // Array of conditions
    ["field", "Operator", value]
  ]
]

Available operators:

| Operator | Description | Example |
| --- | --- | --- |
| Eq | Equals | ["field", "Eq", "value"] |
| Ne | Not equals | ["field", "Ne", "value"] |
| Gt | Greater than | ["field", "Gt", 10] |
| Gte | Greater than or equal | ["field", "Gte", 10] |
| Lt | Less than | ["field", "Lt", 10] |
| Lte | Less than or equal | ["field", "Lte", 10] |
| In | In array | ["field", "In", ["a", "b"]] |
| Nin | Not in array | ["field", "Nin", ["a", "b"]] |
| Contains | String contains | ["field", "Contains", "text"] |
| ContainsAny | Contains any of values | ["tags", "ContainsAny", ["news", "tech"]] |
| ContainsAll | Contains all values | ["tags", "ContainsAll", ["imp", "urgent"]] |
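
Nested array filters are easy to mistype by hand, so small builder functions can help. The sketch below defines hypothetical local types and helpers (ComparisonOp, FilterCondition, SearchFilter, where, and, or) that produce the array format described above. They are not part of @gensx/storage, and the Not operation is left out because its exact argument shape is not shown here.

```typescript
// Hypothetical local types that mirror the documented array format.
type ComparisonOp =
  | "Eq" | "Ne" | "Gt" | "Gte" | "Lt" | "Lte"
  | "In" | "Nin" | "Contains" | "ContainsAny" | "ContainsAll";
type FilterCondition = [string, ComparisonOp, unknown];
type SearchFilter = FilterCondition | ["And" | "Or", SearchFilter[]];

// A single condition: ["field", "Operator", value]
const where = (field: string, op: ComparisonOp, value: unknown): FilterCondition =>
  [field, op, value];

// Combine filters: ["And", [...]] or ["Or", [...]]
const and = (...filters: SearchFilter[]): SearchFilter => ["And", filters];
const or = (...filters: SearchFilter[]): SearchFilter => ["Or", filters];

// Articles or blog posts created on or after 2023-01-01.
const recentPosts = and(
  or(where("category", "Eq", "article"), where("category", "Eq", "blog")),
  where("createdAt", "Gte", "2023-01-01"),
);
```

A value such as recentPosts could then be passed as filters in a query or as deleteByFilter in a write, assuming it is assignable to the library's Filters type.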

RankBy options

The rankBy parameter can be used in two primary ways:

Attribute-based ranking

Sorts by a field in ascending or descending order:

// Sort by the createdAt attribute in ascending order
rankBy: ["createdAt", "asc"]

// Sort by price in descending order (highest first)
rankBy: ["price", "desc"]

Text-based ranking

For full-text search relevance scoring:

// Basic BM25 text ranking
rankBy: ["text", "BM25", "search query"]

// BM25 with multiple search terms
rankBy: ["text", "BM25", ["term1", "term2"]]

// Combined text ranking strategies
rankBy: ["Sum", [
  ["text", "BM25", "search query"],
  ["text", "BM25", "another term"]
]]

// Weighted text ranking (multiply BM25 score by 0.5)
rankBy: ["Product", [["text", "BM25", "search query"], 0.5]]

// Alternative syntax for weighted ranking
rankBy: ["Product", [0.5, ["text", "BM25", "search query"]]]

Use these options to fine-tune the relevance and ordering of your search results.
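
As an illustration of the Sum form, the sketch below builds a rankBy expression that scores one query against several text fields and adds the BM25 scores together. The Bm25Rank and SumRank types and the bm25AcrossFields helper are hypothetical names. The helper assumes each listed field is set up for full-text search, which this page does not cover.

```typescript
// Hypothetical local types mirroring the documented rankBy forms.
type Bm25Rank = [string, "BM25", string];
type SumRank = ["Sum", Bm25Rank[]];

// Build a rankBy expression that sums BM25 scores for the same query
// across several fields. A single field is returned without a Sum wrapper.
function bm25AcrossFields(fields: string[], query: string): Bm25Rank | SumRank {
  const terms = fields.map((field): Bm25Rank => [field, "BM25", query]);
  const first = terms[0];
  if (first === undefined) {
    throw new Error("at least one field is required");
  }
  return terms.length === 1 ? first : ["Sum", terms];
}
```

For example, `bm25AcrossFields(["title", "text"], "search query")` yields `["Sum", [["title", "BM25", "search query"], ["text", "BM25", "search query"]]]`.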
