Semantic Search - AI-Powered Search
Semantic search uses AI to understand the meaning of queries and documents, not just keywords. It's powered by vector embeddings that represent text as points in high-dimensional space, enabling searches based on conceptual similarity.
How Semantic Search Works
Vector Embeddings
Text is converted to high-dimensional vectors (arrays of numbers):
// Text to embedding
"javascript async programming" → [0.2, 0.8, 0.1, ..., 0.4] // 768 dimensions
// Similar concepts have similar vectors
"javascript async programming" → [0.2, 0.8, 0.1, ..., 0.4]
"js asynchronous code" → [0.3, 0.7, 0.2, ..., 0.5] // Close in vector space
// Different concepts are far apart
"javascript async programming" → [0.2, 0.8, 0.1, ..., 0.4]
"cooking pasta recipes" → [0.9, 0.1, 0.8, ..., 0.2] // Far in vector space
Cosine Similarity
Cosine similarity measures how closely two vectors point in the same direction, regardless of their length:
// Calculate similarity (-1 to 1, higher is more similar)
function cosineSimilarity(vecA, vecB) {
const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
return dotProduct / (magnitudeA * magnitudeB);
}
// Example (values computed with the function above)
const queryVector = [0.2, 0.8, 0.1];
const doc1Vector = [0.3, 0.7, 0.2]; // similarity: ≈ 0.98 (very similar)
const doc2Vector = [0.9, 0.1, 0.8]; // similarity: ≈ 0.34 (not similar)
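When embeddings are scaled to unit length first (which is what the `normalize: true` pipeline option does), cosine similarity reduces to a plain dot product, because both magnitudes in the denominator equal 1. The sketch below illustrates this with the same three toy vectors; `normalize` and `dot` are helper names introduced here for illustration, not part of any library.

```typescript
// Scale a vector to unit length (L2 norm = 1).
function normalize(vec: number[]): number[] {
  const magnitude = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  if (magnitude === 0) throw new Error("cannot normalize a zero vector");
  return vec.map((x) => x / magnitude);
}

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

const unitQuery = normalize([0.2, 0.8, 0.1]);
const unitDoc1 = normalize([0.3, 0.7, 0.2]);
const unitDoc2 = normalize([0.9, 0.1, 0.8]);

// For unit vectors the dot product is the cosine similarity.
const simDoc1 = dot(unitQuery, unitDoc1); // ≈ 0.98
const simDoc2 = dot(unitQuery, unitDoc2); // ≈ 0.34
```

Normalizing once at indexing time means each comparison at query time needs only multiplications and additions, with no square roots.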
Search Process
- Indexing (done once per document, and again whenever its content changes):
// Convert documents to embeddings
const documents = [
"How to deploy a JavaScript application",
"Deploying apps to production servers",
"Best practices for async JavaScript"
];
const embeddings = await Promise.all(
documents.map(doc => generateEmbedding(doc))
);
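// Store every document in a table with a vector column.
// libSQL's vector() parses a text literal such as '[0.1, 0.2, ...]',
// so each embedding is serialized to JSON first.
for (let i = 0; i < documents.length; i++) {
  await db.execute(`
    INSERT INTO articles (content, embedding)
    VALUES (?, vector(?))
  `, [documents[i], JSON.stringify(Array.from(embeddings[i]))]);
}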
- Searching (realtime):
// User query
const query = "how do I push my app to production?";
// Convert query to embedding
const queryEmbedding = await generateEmbedding(query);
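// Find similar documents.
// vector_distance_cos returns a distance (0 = same direction, 2 = opposite),
// so similarity is computed as 1 - distance before sorting.
const results = await db.execute(`
SELECT
content,
1 - vector_distance_cos(embedding, vector(?)) AS similarity
FROM articles
ORDER BY similarity DESC
LIMIT 10
`, [JSON.stringify(Array.from(queryEmbedding))]);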
// Results ranked by semantic similarity:
// 1. "Deploying apps to production servers" (0.89)
// 2. "How to deploy a JavaScript application" (0.82)
// 3. "Best practices for async JavaScript" (0.45)
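The same index-then-search flow can be tried without a database. The following is a minimal brute-force sketch that scores every stored vector against the query; the 3-dimensional embeddings are invented for illustration (a real model would produce hundreds of dimensions), and `rankBySimilarity` is a hypothetical helper name.

```typescript
interface IndexedDoc {
  content: string;
  embedding: number[];
}

interface SearchHit {
  content: string;
  similarity: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the `limit` best matches.
function rankBySimilarity(
  queryEmbedding: number[],
  docs: IndexedDoc[],
  limit: number
): SearchHit[] {
  return docs
    .map((doc) => ({
      content: doc.content,
      similarity: cosineSimilarity(queryEmbedding, doc.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

// Toy vectors, made up for this example only.
const toyIndex: IndexedDoc[] = [
  { content: "How to deploy a JavaScript application", embedding: [0.7, 0.6, 0.1] },
  { content: "Deploying apps to production servers", embedding: [0.9, 0.2, 0.1] },
  { content: "Best practices for async JavaScript", embedding: [0.1, 0.9, 0.4] },
];
const toyQueryEmbedding = [0.95, 0.15, 0.05];

const hits = rankBySimilarity(toyQueryEmbedding, toyIndex, 2);
// hits[0] is the "Deploying apps..." document, hits[1] the "How to deploy..." one.
```

A full scan like this is linear in the number of documents, which is why larger collections rely on a vector index in the database instead.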
Embedding Models
Sentence Transformers
Pre-trained models that convert text to vectors:
import { pipeline } from '@xenova/transformers';
// Load embedding model
const embedder = await pipeline(
'feature-extraction',
'Xenova/all-MiniLM-L6-v2'
);
// Generate embedding
const text = "javascript async programming";
const embedding = await embedder(text, {
pooling: 'mean',
normalize: true
});
// Result: a Tensor whose .data is a Float32Array of 384 values
console.log(embedding.data); // [0.123, -0.456, 0.789, ...]
Popular Models
Model | Dimensions | Speed | Quality |
---|---|---|---|
all-MiniLM-L6-v2 | 384 | Fast | Good |
all-mpnet-base-v2 | 768 | Medium | Better |
text-embedding-3-small (OpenAI) | 1536 | API | Excellent |
text-embedding-ada-002 (OpenAI) | 1536 | API | Excellent |
textembedding-gecko (Google Vertex AI) | 768 | API | Excellent |
Implementation in Astro Vault
Indexing Content
// scripts/index-content.ts
import { indexContent } from '@logan/libsql-search';
import { getTursoClient } from './lib/turso';
const client = getTursoClient();
const articles = [
{
slug: 'javascript-async',
title: 'JavaScript Async Programming',
content: 'Learn how to use async/await...',
tags: ['javascript', 'async'],
},
// ... more articles
];
// Generate embeddings and store with 768-dimensional vectors
await indexContent(
client,
'articles',
articles,
'local', // Use local embedding model
768 // Embedding dimensions
);
Searching
// src/pages/api/search.json.ts
import { searchArticles } from '@logan/libsql-search';
import { getTursoClient } from '../../lib/turso';
export async function GET({ request }) {
const url = new URL(request.url);
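const query = (url.searchParams.get('q') || '').trim();
// Skip the embedding and database round trip for empty queries
if (!query) {
  return new Response(JSON.stringify({ results: [] }), {
    headers: { 'Content-Type': 'application/json' },
  });
}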
const client = getTursoClient();
// Semantic search
const results = await searchArticles(
client,
'articles',
query,
'local',
10
);
return new Response(JSON.stringify({ results }), {
headers: { 'Content-Type': 'application/json' },
});
}
Semantic vs Full-Text Search
Example Queries
Query: "how do I deploy my app?"
Full-Text Search:
-- Looks for keywords: "deploy", "app"
-- (LIKE is a simplified stand-in here; a production setup would use a full-text index)
SELECT * FROM articles
WHERE content LIKE '%deploy%' AND content LIKE '%app%';
-- Results (keyword matching):
1. "Deploy your application to production"
2. "App deployment best practices"
3. "Deploying Docker apps"
Semantic Search:
// Understands: user wants to publish/release software
const results = await searchArticles(client, 'articles',
"how do I deploy my app?", 'local', 10);
// Results (meaning-based):
1. "Pushing to production servers" (0.91 similarity)
2. "Publishing your application" (0.88)
3. "CI/CD deployment pipelines" (0.85)
4. "Deploy your application to production" (0.83)
Feature Comparison
Feature | Full-Text | Semantic |
---|---|---|
Speed | 10-20ms | 50-100ms |
Setup | Built-in DB | Requires embeddings |
Synonyms | Manual dictionary | Automatic |
Typos | Poor | Better |
Concept match | No | Yes |
Natural queries | Poor | Excellent |
Resource usage | Low | Medium |
Index size | Small | Large |
Benefits of Semantic Search
1. Understanding Synonyms
// Query: "car"
// Full-text: Only finds "car"
// Semantic: Finds "car", "automobile", "vehicle", "auto"
2. Natural Language
// Query: "how to make my site faster?"
// Full-text: Matches the literal words "how", "to", "make", "my", "site", "faster"
// Semantic: Understands user wants performance optimization
// Finds "speed up website", "optimize performance", etc.
3. Conceptual Understanding
// Query: "best laptop for coding"
// Full-text: Finds documents with those exact words
// Semantic: Understands "laptop for coding" = "developer laptop",
// "programming computer", "development machine"
4. Typo Tolerance
// Query: "javascrpt async" (typo)
// Full-text: No results unless fuzzy matching is configured (exact terms only)
// Semantic: Often still finds JavaScript async content (the misspelled query usually lands near the correct one in vector space)
5. Cross-Language Concepts
// Query in English: "error handling"
// Finds the same concept even when it is worded differently:
// "exception management", "dealing with failures", etc.
// (Matching across human languages additionally requires a multilingual embedding model)
Limitations
1. Slower Than Full-Text
Full-text search: 10-20ms
Semantic search: 50-100ms
Reason: The query must be embedded before searching, and vector distance calculations cost more than keyword index lookups
2. Requires Vector Index
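-- libSQL (Turso's fork of SQLite; stock SQLite does not provide libsql_vector_idx)
CREATE INDEX idx_embedding ON articles(libsql_vector_idx(embedding));
-- Vector storage for 10,000 documents with 768-dim float32 embeddings:
-- 10,000 × 768 × 4 bytes ≈ 30 MB (vs 5 MB for full-text index)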
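The 30 MB figure can be reproduced with simple arithmetic. This is a rough estimate that assumes each dimension is stored as a 4-byte float32 and counts only the raw vectors; any overhead added by the index structure itself is not modeled. `rawVectorBytes` is a name introduced here for illustration.

```typescript
const BYTES_PER_FLOAT32 = 4;

// Raw storage needed for the embeddings alone (no index overhead).
function rawVectorBytes(documentCount: number, dimensions: number): number {
  return documentCount * dimensions * BYTES_PER_FLOAT32;
}

const bytesFor768 = rawVectorBytes(10000, 768); // 30,720,000 bytes ≈ 30.7 MB
const bytesFor384 = rawVectorBytes(10000, 384); // 15,360,000 bytes ≈ 15.4 MB
```

Halving the dimension count halves the storage, which is one reason smaller models such as all-MiniLM-L6-v2 are attractive when the quality trade-off is acceptable.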
3. Cold Start Problem
// First query in session
const embedding = await generateEmbedding(query);
// Takes 200-500ms to initialize model
// Subsequent queries
const embedding2 = await generateEmbedding(query2);
// Takes 20-50ms (model cached)
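One common way to confine the cold start to a single request is to load the model lazily and reuse the same promise afterwards. The sketch below is one possible shape, not how any particular library does it; `createLazyEmbedder` and `loadModel` are hypothetical names, and the loader is injected so the pattern works with any provider.

```typescript
type Embedder = (text: string) => Promise<number[]>;

// Wraps a slow model loader so it runs at most once.
// Concurrent first calls share the same in-flight load.
function createLazyEmbedder(loadModel: () => Promise<Embedder>): Embedder {
  let modelPromise: Promise<Embedder> | null = null;

  return async (text: string): Promise<number[]> => {
    if (modelPromise === null) {
      modelPromise = loadModel().catch((err) => {
        // Forget a failed load so the next call can retry.
        modelPromise = null;
        throw err;
      });
    }
    const model = await modelPromise;
    return model(text);
  };
}
```

With Transformers.js, `loadModel` could wrap the `pipeline('feature-extraction', ...)` call shown earlier and return a function that runs it with `pooling: 'mean'`. Calling the embedder once at server startup moves the 200-500ms initialization out of the first user request.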
4. Context Window Limits
// Most models have token limits
const maxTokens = 512; // ~400 words
// Long documents need chunking
const chunks = splitIntoChunks(longDocument, maxTokens);
const embeddings = await Promise.all(
chunks.map(chunk => generateEmbedding(chunk))
);
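`splitIntoChunks` is not defined in the snippet above. A minimal sketch follows, assuming whitespace-separated words are an acceptable stand-in for tokens at roughly 0.75 words per token (in line with the "512 tokens ≈ 400 words" estimate); a real implementation would count tokens with the model's own tokenizer. The optional `overlapWords` parameter is an addition for illustration, so that sentences cut at a boundary appear in both neighboring chunks.

```typescript
// Split text into word-based chunks that should fit a model's token limit.
// Assumption: about 0.75 words per token, so 512 tokens is roughly 384 words.
function splitIntoChunks(
  text: string,
  maxTokens: number,
  overlapWords: number = 0
): string[] {
  const wordsPerChunk = Math.max(1, Math.floor(maxTokens * 0.75));
  if (overlapWords < 0 || overlapWords >= wordsPerChunk) {
    throw new Error("overlapWords must be between 0 and wordsPerChunk - 1");
  }

  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = wordsPerChunk - overlapWords;
  const chunks: string[] = [];

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + wordsPerChunk).join(" "));
    // Stop once the current chunk has reached the end of the text.
    if (start + wordsPerChunk >= words.length) break;
  }
  return chunks;
}
```

Each chunk is then embedded and stored as its own row, usually with a reference back to the parent document so search results can be grouped.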
5. Exact Match Can Be Worse
// Query: "React.useState"
// Full-text: Finds exact "React.useState"
// Semantic: Might return general React state management docs
// (less precise for exact API names)
Hybrid Search (Best of Both)
Combine full-text and semantic search:
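async function hybridSearch(query: string) {
  // Quote each word so punctuation in the query cannot break MATCH syntax
  const ftsQuery = (query.match(/\w+/g) || []).map(w => `"${w}"`).join(' OR ');
  if (!ftsQuery) return [];
  // Full-text search with SQLite FTS5.
  // Assumes an FTS5 table named articles_fts whose rowid matches articles.id.
  // bm25() is lower for better matches, so it is negated to make higher = better.
  const fulltextResults = await db.execute(`
    SELECT rowid AS id, -bm25(articles_fts) AS score
    FROM articles_fts
    WHERE articles_fts MATCH ?
    ORDER BY score DESC
    LIMIT 20
  `, [ftsQuery]);
  // Semantic search (1 - cosine distance, so higher = more similar)
  const queryEmbedding = await generateEmbedding(query);
  const semanticResults = await db.execute(`
    SELECT id, 1 - vector_distance_cos(embedding, vector(?)) AS score
    FROM articles
    ORDER BY score DESC
    LIMIT 20
  `, [JSON.stringify(Array.from(queryEmbedding))]);
  // Merge and re-rank (relative weighting of the two lists is left to mergeResults)
  const combined = mergeResults(fulltextResults, semanticResults);
  return combined.sort((a, b) => b.totalScore - a.totalScore);
}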
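`mergeResults` is left undefined above. One simple option, sketched here, is weighted score fusion: rescale each list to the range 0 to 1 with min-max normalization (full-text scores and cosine similarities are on different scales), then add the weighted scores per document. This is only one of several possible strategies. The sketch assumes each argument is a plain array of `{ id, score }` rows, so with a client that wraps rows in a result object its rows array would be passed instead; the interfaces and default weights are illustrative assumptions.

```typescript
interface ScoredRow {
  id: number;
  score: number;
}

interface MergedRow {
  id: number;
  fulltextScore: number;
  semanticScore: number;
  totalScore: number;
}

// Rescale scores to [0, 1] so the two result lists are comparable.
function minMaxNormalize(rows: ScoredRow[]): ScoredRow[] {
  if (rows.length === 0) return [];
  const scores = rows.map((r) => r.score);
  const min = Math.min(...scores);
  const range = Math.max(...scores) - min;
  return rows.map((r) => ({
    id: r.id,
    score: range === 0 ? 1 : (r.score - min) / range,
  }));
}

// Combine both lists; a document missing from one list scores 0 there.
function mergeResults(
  fulltext: ScoredRow[],
  semantic: ScoredRow[],
  fulltextWeight: number = 0.5,
  semanticWeight: number = 0.5
): MergedRow[] {
  const merged: Record<number, MergedRow> = {};
  const getOrCreate = (id: number): MergedRow => {
    if (!(id in merged)) {
      merged[id] = { id, fulltextScore: 0, semanticScore: 0, totalScore: 0 };
    }
    return merged[id];
  };

  for (const row of minMaxNormalize(fulltext)) {
    getOrCreate(row.id).fulltextScore = row.score;
  }
  for (const row of minMaxNormalize(semantic)) {
    getOrCreate(row.id).semanticScore = row.score;
  }

  return Object.keys(merged).map((key) => {
    const row = merged[Number(key)];
    row.totalScore =
      fulltextWeight * row.fulltextScore + semanticWeight * row.semanticScore;
    return row;
  });
}
```

Raising `fulltextWeight` favors exact keyword hits such as API names, while raising `semanticWeight` favors conceptual matches. Note that min-max normalization gives the weakest result in each list a score of 0, the same as a document that did not appear in that list at all.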
When to Use Hybrid
- Technical documentation: Exact API names (full-text) + concepts (semantic)
- E-commerce: Product codes (full-text) + descriptions (semantic)
- Code search: Function names (full-text) + purpose (semantic)
Embedding Providers
Local (Xenova Transformers)
// Pros: Free, private, no API limits
// Cons: Slower, uses CPU/GPU
import { pipeline } from '@xenova/transformers';
const embedder = await pipeline('feature-extraction',
'Xenova/all-MiniLM-L6-v2');
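// Mean pooling collapses the per-token vectors into one sentence embedding
const embedding = await embedder(text, { pooling: 'mean', normalize: true });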
OpenAI
// Pros: High quality, fast
// Cons: Costs money, rate limits
import { OpenAI } from 'openai';
const openai = new OpenAI();
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: text,
});
const embedding = response.data[0].embedding;
Gemini
// Pros: High quality, generous free tier
// Cons: Rate limits
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'embedding-001' });
const result = await model.embedContent(text);
const embedding = result.embedding.values;
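The three providers return vectors of different sizes, and vectors from different models are not comparable, so switching providers means re-indexing all content. A small guard such as the following can catch a mismatch before a vector is written or compared. This is a sketch only: the provider keys and the `checkEmbedding` helper are hypothetical, and the dimensions listed are the default output sizes of the models used in the examples above.

```typescript
interface ProviderInfo {
  model: string;
  dimensions: number;
}

// Default output sizes of the models from the provider examples.
// The keys are labels made up for this sketch.
const PROVIDERS: Record<string, ProviderInfo> = {
  xenova: { model: "Xenova/all-MiniLM-L6-v2", dimensions: 384 },
  openai: { model: "text-embedding-3-small", dimensions: 1536 },
  gemini: { model: "embedding-001", dimensions: 768 },
};

// Throws if the embedding does not match the provider's expected size.
function checkEmbedding(provider: string, embedding: ArrayLike<number>): void {
  const info = PROVIDERS[provider];
  if (info === undefined) {
    throw new Error(`unknown provider: ${provider}`);
  }
  if (embedding.length !== info.dimensions) {
    throw new Error(
      `${info.model} should return ${info.dimensions} dimensions, got ${embedding.length}`
    );
  }
}
```

Accepting `ArrayLike<number>` lets the same check work for plain arrays and for a `Float32Array` such as the one Transformers.js returns.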
Use Cases
✅ Excellent For
- Documentation search: Natural language queries
- Customer support: Find relevant help articles
- Content discovery: "More like this" recommendations
- Question answering: Match questions to answers
- Knowledge bases: Conceptual search across docs
⚠️ Consider Alternatives
- Exact code search: Use full-text or grep
- Product SKUs: Use full-text or database queries
- Date/numeric filtering: Use traditional indexes
- Very large scale: Specialized vector databases (Pinecone, Weaviate)
Resources
- Xenova Transformers: huggingface.co/docs/transformers.js
- Sentence Transformers: sbert.net
- OpenAI Embeddings: platform.openai.com/docs/guides/embeddings
- Vector Search Explained: pinecone.io/learn/vector-database