# @logan/libsql-search
Semantic vector search library for LibSQL and Turso databases. Add natural language search to your applications with automatic embedding generation and cosine similarity ranking.
## Features

- **Semantic search**: Natural language queries with vector embeddings
- **Multiple embedding providers**: Local (onnxruntime), Gemini, or OpenAI
- **LibSQL/Turso optimized**: Built specifically for LibSQL databases
- **TypeScript**: Full type safety with TypeScript definitions
- **Zero config**: Sensible defaults, works out of the box
- **Lightweight**: Minimal dependencies, small bundle size
## Installation

### npm

```bash
npm install @logan/libsql-search @libsql/client
```

### pnpm

```bash
pnpm add @logan/libsql-search @libsql/client
```

### JSR

```bash
npx jsr add @logan/libsql-search
```
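The local embedding provider additionally depends on onnxruntime-node (see Troubleshooting below):

```bash
pnpm add onnxruntime-node
```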
## Quick Start

### 1. Create Database Schema

```ts
import { createClient } from '@libsql/client';
import { createTable } from '@logan/libsql-search';

const client = createClient({
  url: 'file:local.db'
});

// Create articles table with 768-dimension embeddings
await createTable(client, 'articles', 768);
```
### 2. Index Content

Each article needs a unique slug (see the Article interface in the API Reference).

```ts
import { indexContent } from '@logan/libsql-search';

const articles = [
  {
    slug: 'getting-started',
    title: 'Getting Started',
    content: 'Learn how to use libsql-search...',
    tags: ['tutorial', 'basics'],
  },
  {
    slug: 'advanced-features',
    title: 'Advanced Features',
    content: 'Explore advanced search capabilities...',
    tags: ['advanced', 'features'],
  },
];

// Generate embeddings and insert into database
await indexContent(
  client,
  'articles',
  articles,
  'local', // embedding provider
  768 // embedding dimension
);
```
### 3. Search Content

```ts
import { searchArticles } from '@logan/libsql-search';

const results = await searchArticles(
  client,
  'articles',
  'how to get started',
  'local',
  10 // limit
);

console.log(results);
// [
//   {
//     title: 'Getting Started',
//     content: 'Learn how to use libsql-search...',
//     similarity: 0.89,
//     ...
//   },
//   ...
// ]
```
## API Reference

### createTable

Create a table with vector search support.

```ts
function createTable(
  client: Client,
  tableName: string,
  dimension: number
): Promise<void>
```

**Parameters:**

- `client`: LibSQL client instance
- `tableName`: Name of the table to create
- `dimension`: Embedding dimension (768 for most models)

**Example:**

```ts
await createTable(client, 'articles', 768);
```

**Generated Schema:**

```sql
CREATE TABLE IF NOT EXISTS articles (
  slug TEXT PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  folder TEXT,
  tags TEXT,
  embedding F32_BLOB(768)
);

CREATE INDEX IF NOT EXISTS idx_articles_vector
ON articles (libsql_vector_idx(embedding));
```
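To double-check what createTable produced, the generated DDL can be read back from sqlite_master:

```ts
// Print the stored DDL for the articles table and its vector index
const schema = await client.execute(
  "SELECT sql FROM sqlite_master WHERE tbl_name = 'articles'"
);
console.log(schema.rows.map((row) => row.sql).join('\n'));
```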
### indexContent

Index content with automatic embedding generation.

```ts
function indexContent(
  client: Client,
  tableName: string,
  articles: Article[],
  embeddingProvider: 'local' | 'gemini' | 'openai',
  dimension: number
): Promise<void>
```

**Parameters:**

- `client`: LibSQL client instance
- `tableName`: Table name to insert into
- `articles`: Array of articles to index
- `embeddingProvider`: Embedding provider to use
- `dimension`: Embedding dimension (768)

**Article Interface:**

```ts
interface Article {
  slug: string;
  title: string;
  content: string;
  folder?: string;
  tags?: string[];
}
```

**Example:**

```ts
await indexContent(
  client,
  'articles',
  [
    {
      slug: 'getting-started',
      title: 'Getting Started',
      content: 'Tutorial content...',
      folder: 'tutorials',
      tags: ['tutorial', 'basics'],
    }
  ],
  'local',
  768
);
```
### searchArticles

Perform semantic search on indexed content.

```ts
function searchArticles(
  client: Client,
  tableName: string,
  query: string,
  embeddingProvider: 'local' | 'gemini' | 'openai',
  limit?: number
): Promise<SearchResult[]>
```

**Parameters:**

- `client`: LibSQL client instance
- `tableName`: Table name to search
- `query`: Natural language search query
- `embeddingProvider`: Embedding provider to use (must match the provider used at indexing time)
- `limit`: Maximum number of results (default: 10)

**Returns:**

```ts
interface SearchResult {
  slug: string;
  title: string;
  content: string;
  folder?: string;
  tags?: string;
  similarity: number; // 0-1, higher is better
}
```

**Example:**

```ts
const results = await searchArticles(
  client,
  'articles',
  'semantic search tutorial',
  'local',
  5
);

results.forEach(result => {
  console.log(`${result.title} (${(result.similarity * 100).toFixed(1)}%)`);
});
```
### getAllArticles

Retrieve all articles from the database.

```ts
function getAllArticles(
  client: Client,
  tableName: string
): Promise<Article[]>
```

**Example:**

```ts
const allArticles = await getAllArticles(client, 'articles');
console.log(`Total articles: ${allArticles.length}`);
```

### getArticleBySlug

Get a single article by its slug.

```ts
function getArticleBySlug(
  client: Client,
  tableName: string,
  slug: string
): Promise<Article | null>
```

**Example:**

```ts
const article = await getArticleBySlug(client, 'articles', 'getting-started');

if (article) {
  console.log(article.title);
}
```

### getArticlesByFolder

Get all articles in a specific folder.

```ts
function getArticlesByFolder(
  client: Client,
  tableName: string,
  folder: string
): Promise<Article[]>
```

**Example:**

```ts
const tutorials = await getArticlesByFolder(client, 'articles', 'tutorials');
console.log(`${tutorials.length} tutorials found`);
```

### getFolders

Get list of all unique folders.

```ts
function getFolders(
  client: Client,
  tableName: string
): Promise<string[]>
```

**Example:**

```ts
const folders = await getFolders(client, 'articles');
console.log('Folders:', folders);
// ['tutorials', 'guides', 'reference']
```
## Embedding Providers

### Local (onnxruntime)

Uses onnxruntime-node to run embeddings locally on CPU/GPU.

**Pros:**

- No API keys required
- No API rate limits
- Free forever
- Works offline
- Privacy-friendly (data stays local)

**Cons:**

- Slower than cloud providers
- Requires CPU/GPU resources
- Larger dependency size

**Setup:**

```ts
import { indexContent } from '@logan/libsql-search';

await indexContent(
  client,
  'articles',
  articles,
  'local', // provider
  768
);
```
### Gemini

Uses Google's Gemini API for embeddings.

**Pros:**

- Fast embedding generation
- High-quality embeddings
- Free tier available (1500 requests/day)

**Cons:**

- Requires API key
- Rate limits
- Network latency
- Data sent to Google

**Setup:**

```bash
# Set API key
export GEMINI_API_KEY=your_key_here
```

```ts
await indexContent(
  client,
  'articles',
  articles,
  'gemini',
  768
);
```
### OpenAI

Uses OpenAI's text-embedding-3-small model.

**Pros:**

- Fast embedding generation
- High-quality embeddings
- Reliable API

**Cons:**

- Requires API key (paid)
- Rate limits
- Network latency
- Data sent to OpenAI

**Setup:**

```bash
# Set API key
export OPENAI_API_KEY=your_key_here
```

```ts
await indexContent(
  client,
  'articles',
  articles,
  'openai',
  768
);
```
## Using with Turso

### Setup Turso Database

```bash
# Install Turso CLI
brew install tursodatabase/tap/turso

# Create database
turso db create my-search-db

# Get connection URL
turso db show my-search-db

# Create auth token
turso db tokens create my-search-db
```
### Connect to Turso

```ts
import { createClient } from '@libsql/client';

const client = createClient({
  url: process.env.TURSO_DB_URL,
  authToken: process.env.TURSO_AUTH_TOKEN,
});
```

### Index to Turso

```ts
import { createTable, indexContent } from '@logan/libsql-search';

// Create schema
await createTable(client, 'articles', 768);

// Index content
await indexContent(client, 'articles', articles, 'local', 768);
```

### Search on Turso

```ts
const results = await searchArticles(
  client,
  'articles',
  'search query',
  'local',
  10
);
```
## Performance

### Indexing Speed

| Provider | Articles/sec | Notes |
| --- | --- | --- |
| Local | ~5-10 | CPU dependent |
| Gemini | ~20-50 | Network dependent |
| OpenAI | ~20-50 | Network dependent |

### Search Speed

| Database | Query time | Notes |
| --- | --- | --- |
| Local libSQL | ~10-50ms | File system speed |
| Turso | ~50-200ms | Network latency |
### Optimization Tips

1. Batch indexing:

```ts
// Index in batches of 100
const batchSize = 100;

for (let i = 0; i < articles.length; i += batchSize) {
  const batch = articles.slice(i, i + batchSize);
  await indexContent(client, 'articles', batch, 'local', 768);
}
```
2. Cache embeddings (see the usage sketch below):

```ts
const embeddingCache = new Map<string, number[]>();

// Reuse embeddings for repeated texts instead of recomputing them
function getCachedEmbedding(text: string): number[] | undefined {
  return embeddingCache.get(text);
}

function setCachedEmbedding(text: string, embedding: number[]): void {
  embeddingCache.set(text, embedding);
}
```
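The cache only pays off once it is wired into whatever call actually produces embeddings. A minimal sketch, assuming a hypothetical embed() function standing in for your provider call (the documented API does not expose one directly):

```ts
// Hypothetical stand-in for your embedding provider call
declare function embed(text: string): Promise<number[]>;

async function getQueryEmbedding(text: string): Promise<number[]> {
  const cached = getCachedEmbedding(text);
  if (cached) return cached;

  const embedding = await embed(text);
  setCachedEmbedding(text, embedding);
  return embedding;
}
```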
3. Use indexes (see the top-k sketch below):

```sql
-- Already created by createTable()
CREATE INDEX idx_articles_vector
ON articles (libsql_vector_idx(embedding));
```
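A plain ORDER BY over vector_distance_cos scans every row even when the index exists; libSQL's vector_top_k table-valued function answers top-k queries through the index instead. A hedged sketch, assuming the index name generated by createTable above and a queryEmbedding value in the same format the Advanced Usage examples use:

```ts
// Approximate top-k lookup via the vector index, then a join back to
// the articles table for row data (vector_top_k yields rowids as `id`)
const topK = await client.execute({
  sql: `
    SELECT a.slug, a.title,
           vector_distance_cos(a.embedding, vector(?)) AS distance
    FROM vector_top_k('idx_articles_vector', vector(?), 10) AS v
    JOIN articles AS a ON a.rowid = v.id
  `,
  args: [queryEmbedding, queryEmbedding],
});
```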
## Advanced Usage

### Custom Table Schema

```ts
// Create table with additional columns
await client.execute(`
  CREATE TABLE IF NOT EXISTS articles (
    slug TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    folder TEXT,
    tags TEXT,
    author TEXT,
    created_at INTEGER,
    embedding F32_BLOB(768)
  )
`);
```
### Filtered Search

```ts
// Search with a SQL WHERE clause. vector_distance_cos returns a cosine
// distance (lower is more similar), so ascending order ranks best first.
const results = await client.execute({
  sql: `
    SELECT
      slug,
      title,
      content,
      vector_distance_cos(embedding, vector(?)) AS distance
    FROM articles
    WHERE folder = ?
    ORDER BY distance ASC
    LIMIT ?
  `,
  args: [queryEmbedding, 'tutorials', 10],
});
```
### Hybrid Search

Combine semantic ranking with keyword (LIKE) filtering:

```ts
// SQLite does not allow column aliases in WHERE, so the LIKE filter
// is written out directly; matches are then ranked by cosine distance.
const results = await client.execute({
  sql: `
    SELECT
      slug,
      title,
      content,
      vector_distance_cos(embedding, vector(?)) AS distance
    FROM articles
    WHERE title LIKE ? OR content LIKE ?
    ORDER BY distance ASC
    LIMIT ?
  `,
  args: [queryEmbedding, `%${query}%`, `%${query}%`, 10],
});
```
## TypeScript Types

```ts
import type {
  Article,
  SearchResult,
  EmbeddingProvider,
} from '@logan/libsql-search';

interface Article {
  slug: string;
  title: string;
  content: string;
  folder?: string;
  tags?: string[];
}

interface SearchResult extends Article {
  similarity: number;
}

type EmbeddingProvider = 'local' | 'gemini' | 'openai';
```
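The exported types compose cleanly with your own wrappers. For example, a small hypothetical helper that returns only the best match:

```ts
import type { Client } from '@libsql/client';
import { searchArticles } from '@logan/libsql-search';
import type { EmbeddingProvider, SearchResult } from '@logan/libsql-search';

// Return the single most similar article, or null when nothing matches
async function topResult(
  client: Client,
  table: string,
  query: string,
  provider: EmbeddingProvider = 'local'
): Promise<SearchResult | null> {
  const results = await searchArticles(client, table, query, provider, 1);
  return results[0] ?? null;
}
```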
## Examples

### Blog Search

```ts
import { createClient } from '@libsql/client';
import {
  createTable,
  indexContent,
  searchArticles,
} from '@logan/libsql-search';

// Setup
const client = createClient({ url: 'file:blog.db' });
await createTable(client, 'posts', 768);

// Index blog posts
const posts = [
  {
    slug: 'first-post',
    title: 'My First Post',
    content: 'Welcome to my blog...',
    tags: ['intro', 'welcome'],
  },
  // ... more posts
];

await indexContent(client, 'posts', posts, 'local', 768);

// Search
const results = await searchArticles(
  client,
  'posts',
  'getting started with blogging',
  'local'
);
```
### Documentation Site

```ts
import fs from 'node:fs';
import path from 'node:path';

// Index documentation pages (recursive readdir needs Node 18.17+)
const files = (await fs.promises.readdir('./docs', { recursive: true }))
  .filter((file) => file.endsWith('.md'));

const articles = await Promise.all(
  files.map(async (file) => {
    const content = await fs.promises.readFile(path.join('./docs', file), 'utf-8');
    return {
      slug: file.replace(/\.md$/, ''),
      title: extractTitle(content),
      content,
      folder: path.dirname(file), // '.' for top-level files
    };
  })
);

await indexContent(client, 'docs', articles, 'local', 768);
```
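extractTitle is left to the application; a minimal sketch that uses the first Markdown heading:

```ts
// Hypothetical helper: take the first "# " heading as the title,
// falling back to a placeholder when none exists
function extractTitle(markdown: string): string {
  const match = markdown.match(/^#\s+(.+)$/m);
  return match ? match[1].trim() : 'Untitled';
}
```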
### Product Catalog

```ts
interface Product {
  slug: string;
  name: string;
  description: string;
  category: string;
  price: number;
}

const products: Product[] = [
  {
    slug: 'laptop-x1',
    name: 'Laptop X1',
    description: 'High-performance laptop with...',
    category: 'electronics',
    price: 999,
  },
  // ... more products
];

// Convert to Article format (fields like price are not part of the
// Article shape, so they are not stored in the search table)
const articles = products.map(p => ({
  slug: p.slug,
  title: p.name,
  content: p.description,
  folder: p.category,
}));

await indexContent(client, 'products', articles, 'local', 768);

// Search products
const results = await searchArticles(
  client,
  'products',
  'laptop for programming',
  'local'
);
```
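Because only the Article fields are indexed, anything dropped in the conversion (such as price) has to be re-joined from the source data after searching. One way to do it:

```ts
// Map search hits back to the full Product records by slug
const productsBySlug = new Map(products.map((p) => [p.slug, p]));

const enriched = results.map((result) => ({
  ...result,
  price: productsBySlug.get(result.slug)?.price,
}));
```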
## Troubleshooting

### "Module not found: onnxruntime-node"

Install onnxruntime-node for local embeddings:

```bash
pnpm add onnxruntime-node
```

### "API key not found"

Set the environment variable for your cloud provider:

```bash
export GEMINI_API_KEY=your_key
# or
export OPENAI_API_KEY=your_key
```
"No results found"
Check that:
- Content is indexed:
await getAllArticles(client, 'articles')
- Table exists:
await createTable(client, 'articles', 768)
- Embedding provider matches between indexing and search
### Slow search performance

- Ensure the vector index is created
- Use Turso for edge-optimized searches
- Limit the result count
- Cache frequently searched queries
## Resources
- NPM: @logan/libsql-search
- JSR: @logan/libsql-search
- GitHub: Create issues and contribute
- Turso Docs: docs.turso.tech
- LibSQL: github.com/tursodatabase/libsql