Knowledge Package
The @toolpack-sdk/knowledge package provides Retrieval-Augmented Generation (RAG) capabilities for your AI agents. Build knowledge bases from documentation, code, or any text source and enable semantic search within your agent conversations.
Installation
npm install @toolpack-sdk/knowledge
Quick Start
Development Setup (Memory Provider)
Suited to prototyping and development. Nothing is persisted, and the only external dependency is a running Ollama server for embeddings:
import { Knowledge, MemoryProvider, MarkdownSource, OllamaEmbedder } from '@toolpack-sdk/knowledge';
const kb = await Knowledge.create({
provider: new MemoryProvider(),
sources: [new MarkdownSource('./docs/**/*.md')],
embedder: new OllamaEmbedder({ model: 'nomic-embed-text' }),
description: 'Documentation for search queries',
});
// Search your knowledge base
const results = await kb.query('how to configure authentication');
console.log(results[0].chunk.content);
Production Setup (Persistent Provider)
For CLI tools and production applications with persistent storage:
import { Knowledge, PersistentKnowledgeProvider, MarkdownSource, OpenAIEmbedder } from '@toolpack-sdk/knowledge';
const kb = await Knowledge.create({
provider: new PersistentKnowledgeProvider({
namespace: 'my-app',
reSync: false, // Use existing index if available
}),
sources: [new MarkdownSource('./docs/**/*.md')],
embedder: new OpenAIEmbedder({
model: 'text-embedding-3-small',
apiKey: process.env.OPENAI_API_KEY!,
}),
description: 'My application documentation',
onEmbeddingProgress: (event) => {
console.log(`Embedding: ${event.percent}% (${event.current}/${event.total})`);
},
});
Providers
Providers handle vector storage and similarity search. Choose based on your use case:
MemoryProvider
In-memory storage ideal for development:
import { MemoryProvider } from '@toolpack-sdk/knowledge';
const provider = new MemoryProvider({
maxChunks: 10000, // Optional: limit memory usage
});
Best for: Development, prototyping, short-lived processes
Limitations: Data lost on process exit, memory constraints
PersistentKnowledgeProvider
SQLite-backed persistent storage:
import { PersistentKnowledgeProvider } from '@toolpack-sdk/knowledge';
const provider = new PersistentKnowledgeProvider({
namespace: 'my-app', // Creates ~/.toolpack/knowledge/my-app.db
storagePath: './custom/path', // Optional: override default location
reSync: false, // Skip re-indexing if DB exists
});
Best for: CLI tools, desktop apps, production workloads
Features: WAL mode, transactions, metadata filtering
Sources
Sources extract and chunk content from various formats:
MarkdownSource
Chunks Markdown files by heading hierarchy:
import { MarkdownSource } from '@toolpack-sdk/knowledge';
const source = new MarkdownSource('./docs/**/*.md', {
maxChunkSize: 2000, // Max tokens per chunk
chunkOverlap: 200, // Overlap between chunks
minChunkSize: 100, // Merge small sections
namespace: 'docs', // Prefix for chunk IDs
metadata: { type: 'documentation' }, // Added to all chunks
});
Features:
- Heading-based chunking (preserves document structure)
- YAML frontmatter extraction
- Code block detection (hasCode: true metadata)
- Deterministic chunk IDs for deduplication
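To make heading-based chunking concrete, here is a minimal sketch of the idea. It is not the package's implementation: `chunkByHeadings`, `SketchChunk`, and `FENCE` are hypothetical names, and size limits, overlap, frontmatter, and chunk IDs are left out.

```typescript
// Hypothetical sketch of heading-based chunking; not the MarkdownSource internals.
const FENCE = '`'.repeat(3); // code fence marker, built this way to keep the example readable

interface SketchChunk {
  headingPath: string[]; // e.g. ['Guide', 'Install']
  content: string;
  hasCode: boolean;
}

function chunkByHeadings(markdown: string): SketchChunk[] {
  const chunks: SketchChunk[] = [];
  const path: string[] = [];
  let body: string[] = [];
  let inFence = false;

  // Emit the text collected under the current heading path, if any.
  const flush = (): void => {
    const content = body.join('\n').trim();
    if (content.length > 0) {
      chunks.push({ headingPath: [...path], content, hasCode: content.includes(FENCE) });
    }
    body = [];
  };

  for (const line of markdown.split('\n')) {
    if (line.trim().startsWith(FENCE)) inFence = !inFence;
    // Lines inside a code fence are never treated as headings.
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      const level = match[1].length;
      path.splice(level - 1); // drop headings at the same or a deeper level
      path.push(match[2].trim());
    } else {
      body.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each chunk keeps the chain of headings above it, which is one way a chunker can preserve document structure for later filtering or display.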
WebUrlSource
Crawl and index websites automatically:
import { WebUrlSource } from '@toolpack-sdk/knowledge';
const source = new WebUrlSource('https://example.com/docs', {
maxDepth: 3, // Crawl up to 3 levels deep
maxPages: 100, // Limit to 100 pages
allowedDomains: ['example.com'], // Only crawl these domains
delayMs: 1000, // Respectful crawling delay
userAgent: 'MyApp/1.0', // Custom user agent
blockedPaths: ['/admin', '/private'], // Skip these paths
followExternalLinks: false, // Stay within allowed domains
});
Features:
- Recursive website crawling with depth control
- Domain and path filtering
- Content extraction (removes scripts, styles, navigation)
- Link discovery and following
- Rate limiting and respectful crawling
- Metadata preservation (title, URL, crawl date, links)
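The domain and path filtering described above can be pictured as a small predicate applied to every discovered link. The sketch below is illustrative only: `shouldCrawl` and `parseUrl` are hypothetical helpers, and treating subdomains of an allowed domain as allowed is an assumption, not documented behavior.

```typescript
// Hypothetical link filter; the real WebUrlSource rules may differ.
function parseUrl(link: string): URL | null {
  try {
    return new URL(link);
  } catch {
    return null; // not an absolute URL
  }
}

function shouldCrawl(link: string, allowedDomains: string[], blockedPaths: string[]): boolean {
  const url = parseUrl(link);
  if (url === null) return false;
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const host = url.hostname.toLowerCase();
  // Assumption: subdomains of an allowed domain are also allowed.
  const domainAllowed = allowedDomains.some((d) => host === d || host.endsWith(`.${d}`));
  if (!domainAllowed) return false;

  // Block a path and everything beneath it, but not siblings sharing the prefix.
  const blocked = blockedPaths.some((p) => url.pathname === p || url.pathname.startsWith(`${p}/`));
  return !blocked;
}
```

Matching on whole path segments keeps `/administrator` crawlable when only `/admin` is blocked.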
ApiSource
Index data from REST APIs with pagination support:
import { ApiSource } from '@toolpack-sdk/knowledge';
const source = new ApiSource('https://api.example.com', '/posts', {
method: 'GET', // HTTP method
headers: { 'Accept': 'application/json' },
auth: {
type: 'bearer', // 'bearer' | 'basic' | 'api-key'
token: process.env.API_TOKEN,
},
pagination: {
type: 'cursor', // 'offset' | 'cursor' | 'page'
cursorParam: 'after',
nextCursorPath: 'data.next_cursor',
resultsPath: 'data.posts',
},
rateLimit: {
requestsPerSecond: 2, // Rate limiting
},
transformResponse: (data, url) => {
return data.posts.map((post: any) => ({
content: `${post.title}\n\n${post.content}`,
metadata: { id: post.id, author: post.author, tags: post.tags },
}));
},
});
Features:
- REST API data ingestion with authentication
- Multiple pagination strategies (offset, cursor, page-based)
- Rate limiting and request throttling
- Custom response transformation
- Error handling and retries
- Support for all HTTP methods
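For readers unfamiliar with cursor pagination, the loop below sketches how options like `nextCursorPath` and `resultsPath` could be interpreted. It is an assumption about the general technique, not ApiSource's code: `getByPath`, `collectAllPages`, and the `maxPages` cap are hypothetical, and the fetcher is injected so the example runs without a network.

```typescript
// Hypothetical cursor-pagination loop; ApiSource may behave differently.
type JsonObject = Record<string, unknown>;

// Resolve a dotted path such as 'data.next_cursor' against a parsed JSON body.
function getByPath(body: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>(
    (node, key) => (node !== null && typeof node === 'object' ? (node as JsonObject)[key] : undefined),
    body,
  );
}

interface CursorConfig {
  resultsPath: string;
  nextCursorPath: string;
  maxPages: number; // safety cap, an assumption of this sketch
}

async function collectAllPages(
  fetchPage: (cursor: string | undefined) => Promise<unknown>,
  config: CursorConfig,
): Promise<unknown[]> {
  const items: unknown[] = [];
  let cursor: string | undefined;

  for (let page = 0; page < config.maxPages; page++) {
    const body = await fetchPage(cursor);
    const results = getByPath(body, config.resultsPath);
    if (Array.isArray(results)) items.push(...results);

    // Stop when the API no longer returns a usable cursor.
    const next = getByPath(body, config.nextCursorPath);
    if (typeof next !== 'string' || next.length === 0) break;
    cursor = next;
  }
  return items;
}
```

In a real source, `fetchPage` would send the cursor as the query parameter named by `cursorParam` (`after` in the example above).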
Embedders
Embedders convert text to vector embeddings for semantic search:
OllamaEmbedder
Local embeddings using Ollama (zero API cost):
import { OllamaEmbedder } from '@toolpack-sdk/knowledge';
const embedder = new OllamaEmbedder({
model: 'nomic-embed-text', // or 'mxbai-embed-large', 'all-minilm'
baseUrl: 'http://localhost:11434', // default
retries: 3,
retryDelay: 1000,
});
Supported models:
- nomic-embed-text (768 dimensions)
- mxbai-embed-large (1024 dimensions)
- all-minilm (384 dimensions)
- Custom models with dimensions override
OpenAIEmbedder
OpenAI text-embedding models:
import { OpenAIEmbedder } from '@toolpack-sdk/knowledge';
const embedder = new OpenAIEmbedder({
model: 'text-embedding-3-small', // or 'text-embedding-3-large'
apiKey: process.env.OPENAI_API_KEY!,
retries: 3,
retryDelay: 1000,
timeout: 30000,
});
Supported models:
- text-embedding-3-small (1536 dimensions)
- text-embedding-3-large (3072 dimensions)
- text-embedding-ada-002 (1536 dimensions)
Integration with Toolpack SDK
Connect your knowledge base to Toolpack SDK agents:
import { Toolpack } from 'toolpack-sdk';
import { Knowledge, MemoryProvider, MarkdownSource, OllamaEmbedder } from '@toolpack-sdk/knowledge';
const kb = await Knowledge.create({
provider: new MemoryProvider(),
sources: [new MarkdownSource('./docs/**/*.md')],
embedder: new OllamaEmbedder({ model: 'nomic-embed-text' }),
description: 'Search this when users ask about setup, configuration, or API usage.',
});
const toolpack = await Toolpack.init({
provider: 'anthropic',
knowledge: kb, // Automatically registers knowledge_search tool
});
// The agent can now search your knowledge base
const response = await toolpack.chat('How do I configure authentication?');
Enhanced Knowledge Tools
The knowledge base provides a tool with advanced search capabilities:
const tool = kb.toTool();
// Semantic search (default)
const semanticResults = await tool.execute({
query: 'machine learning',
limit: 5,
});
// Hybrid search (semantic + keyword)
const hybridResults = await tool.execute({
query: 'machine learning algorithms',
searchType: 'hybrid',
keywordWeight: 0.4,
semanticWeight: 0.6,
filter: { category: 'tutorial' },
});
Tool parameters:
- query: Search query string
- searchType: 'semantic' or 'hybrid' (default: 'semantic')
- keywordWeight: Weight for keyword matching (0-1, default: 0.3)
- semanticWeight: Weight for semantic matching (0-1, default: 0.7)
- limit: Maximum results (default: 10)
- threshold: Minimum similarity score (default: 0.7)
- filter: Metadata filters
Querying
Basic Semantic Search
const results = await kb.query('authentication setup');
// Returns: Array of { chunk, score, distance }
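As background for the `score` and `distance` fields, the sketch below shows a common way vector search ranks chunks: cosine similarity, with distance taken as `1 - score`. That relationship, and the names `cosineSimilarity`, `topMatches`, and `StoredVector`, are assumptions for illustration; the providers may compute these values differently.

```typescript
// Hypothetical brute-force ranking; not the provider implementation.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredVector {
  id: string;
  vector: number[];
}

// Score every stored vector, drop those below the threshold, keep the best `limit`.
function topMatches(queryVector: number[], stored: StoredVector[], limit = 10, threshold = 0.7) {
  return stored
    .map((s) => {
      const score = cosineSimilarity(queryVector, s.vector);
      return { id: s.id, score, distance: 1 - score };
    })
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

The defaults mirror the documented query defaults (limit 10, threshold 0.7).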
Hybrid Search (Semantic + Keyword)
Combine semantic similarity with keyword matching for better results:
const results = await kb.hybridQuery('machine learning algorithms', {
keywordWeight: 0.3, // 30% keyword relevance
semanticWeight: 0.7, // 70% semantic relevance
keywordFields: ['content', 'title'], // Fields to search
limit: 10,
threshold: 0.7,
});
Hybrid search advantages:
- Better precision for technical terms and proper nouns
- Improved ranking for exact matches
- Balanced semantic understanding with keyword accuracy
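One plausible way the two weights combine is a normalized weighted sum of a semantic score and a keyword score. The sketch below assumes exactly that; `keywordScore` and `hybridScore` are hypothetical helpers, and the term-overlap keyword score is a stand-in for whatever keyword ranking the package actually uses.

```typescript
// Hypothetical hybrid scoring; hybridQuery's actual formula is not documented here.

// Fraction of query terms that appear in the text (case-insensitive substring match).
function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((t) => haystack.includes(t)).length;
  return hits / terms.length;
}

// Weighted average of the two scores; weights are normalized so they need not sum to 1.
function hybridScore(semantic: number, keyword: number, semanticWeight = 0.7, keywordWeight = 0.3): number {
  const totalWeight = semanticWeight + keywordWeight;
  if (totalWeight <= 0) throw new Error('Weights must sum to a positive number');
  return (semanticWeight * semantic + keywordWeight * keyword) / totalWeight;
}
```

Under this model, raising `keywordWeight` lets a chunk with every query term outrank a chunk that is only semantically close, which matches the advice for technical terms and proper nouns.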
Advanced Query Options
const results = await kb.query('authentication setup', {
limit: 5, // Max results (default: 10)
threshold: 0.8, // Minimum similarity 0-1 (default: 0.7)
filter: { // Metadata filters
hasCode: true,
category: { $in: ['api', 'guide'] },
source: 'web', // Filter by source type
},
includeMetadata: true, // Include chunk metadata (default: true)
includeVectors: false, // Include embedding vectors (default: false)
});
Metadata Filters
// Exact match
{ category: 'api' }
// In array
{ category: { $in: ['api', 'guide', 'tutorial'] } }
// Numeric comparisons
{ priority: { $gt: 5 } }
{ priority: { $lt: 10 } }
// Source-specific filters
{ source: 'web', url: { $in: ['https://docs.example.com'] } }
{ source: 'api', statusCode: 200 }
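The filter shapes above can be read as a conjunction of per-key conditions. The evaluator below is a sketch of those semantics under that assumption; `matchesFilter` and its types are hypothetical, and only the operators shown in this section (`$in`, `$gt`, `$lt`) are covered.

```typescript
// Hypothetical filter evaluator; the provider's matching rules may differ.
type Scalar = string | number | boolean;
type Condition = Scalar | { $in?: Scalar[]; $gt?: number; $lt?: number };
type MetadataFilter = Record<string, Condition>;

function matchesFilter(metadata: Record<string, unknown>, filter: MetadataFilter): boolean {
  // Every key in the filter must match (logical AND).
  return Object.entries(filter).every(([key, condition]) => {
    const value = metadata[key];
    if (typeof condition !== 'object') return value === condition; // exact match

    if (condition.$in !== undefined && !condition.$in.includes(value as Scalar)) return false;
    if (condition.$gt !== undefined && !(typeof value === 'number' && value > condition.$gt)) return false;
    if (condition.$lt !== undefined && !(typeof value === 'number' && value < condition.$lt)) return false;
    return true;
  });
}
```

A chunk with no value for a filtered key fails the match, so filters also act as presence checks in this sketch.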
Streaming Ingestion
Process large datasets with real-time progress tracking:
// One-shot sync (the returned promise resolves when ingestion completes)
await kb.sync();
// Streaming sync (progress updates)
for await (const progress of kb.syncStream()) {
switch (progress.type) {
case 'count':
console.log(`📊 Total chunks to process: ${progress.total}`);
break;
case 'progress':
console.log(`⏳ Processed ${progress.processed}/${progress.total} chunks (${progress.percent}%)`);
break;
case 'complete':
console.log(`✅ Sync complete! Processed ${progress.total} chunks`);
break;
case 'error':
console.error('❌ Sync failed:', progress.error);
break;
}
}
Streaming benefits:
- Real-time progress feedback
- Memory-efficient batch processing (100 chunks per batch)
- Non-blocking UI updates
- Early error detection
- Better user experience for large datasets
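The batch-and-report pattern behind these benefits can be sketched with an async generator. This is an illustration of the technique rather than the body of syncStream(): `processInBatches` and `SketchProgress` are hypothetical, and the batch size of 100 is taken from the list above.

```typescript
// Hypothetical batching generator; mirrors the documented event types except 'error'.
type SketchProgress =
  | { type: 'count'; total: number }
  | { type: 'progress'; processed: number; total: number; percent: number }
  | { type: 'complete'; total: number };

async function* processInBatches<T>(
  items: T[],
  handleBatch: (batch: T[]) => Promise<void>,
  batchSize = 100,
): AsyncGenerator<SketchProgress> {
  const total = items.length;
  yield { type: 'count', total };

  for (let start = 0; start < total; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    await handleBatch(batch); // e.g. embed and store this batch
    const processed = start + batch.length;
    yield { type: 'progress', processed, total, percent: Math.round((processed / total) * 100) };
  }

  yield { type: 'complete', total };
}
```

Because the generator only advances when the consumer asks for the next event, a UI can render each update before the next batch begins.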
Error Handling
Handle embedding failures gracefully:
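// Assumption: the error classes listed below are exported from the package root
import { Knowledge, EmbeddingError } from '@toolpack-sdk/knowledge';
// provider, sources, and embedder are constructed as in the earlier sections
const kb = await Knowledge.create({
provider,
sources,
embedder,
description: 'Knowledge base',
onError: (error, context) => {
console.error(`Failed: ${context.file}: ${error.message}`);
if (error instanceof EmbeddingError) {
return 'skip'; // Skip this chunk, continue with others
}
return 'abort'; // Stop the entire process
},
});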
Error Types:
- KnowledgeError: Base error class
- EmbeddingError: Embedding API failures
- IngestionError: Source parsing failures
- DimensionMismatchError: Vector dimension mismatch
- KnowledgeProviderError: Provider operation failures
- ChunkTooLargeError: Chunk exceeds max size
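To show what the 'skip' and 'abort' return values imply for the ingestion loop, here is a small self-contained sketch. It assumes the handler is consulted once per failed chunk; `embedAll`, `Decision`, and the synchronous `embed` callback are hypothetical simplifications.

```typescript
// Hypothetical ingestion loop illustrating 'skip' versus 'abort'.
type Decision = 'skip' | 'abort';

interface EmbedOutcome {
  embedded: Array<{ chunk: string; vector: number[] }>;
  skipped: string[];
}

function embedAll(
  chunks: string[],
  embed: (chunk: string) => number[],
  onError: (error: Error, context: { chunk: string }) => Decision,
): EmbedOutcome {
  const outcome: EmbedOutcome = { embedded: [], skipped: [] };

  for (const chunk of chunks) {
    try {
      outcome.embedded.push({ chunk, vector: embed(chunk) });
    } catch (caught) {
      const error = caught instanceof Error ? caught : new Error(String(caught));
      // 'abort' rethrows and stops everything; 'skip' records the chunk and moves on.
      if (onError(error, { chunk }) === 'abort') throw error;
      outcome.skipped.push(chunk);
    }
  }
  return outcome;
}
```

Tracking skipped chunks makes it possible to report or retry them after the run, rather than losing them silently.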
API Reference
Knowledge.create()
interface KnowledgeOptions {
provider: KnowledgeProvider;
sources: KnowledgeSource[];
embedder: Embedder;
description: string; // Required: used as tool description
reSync?: boolean; // default: true
onError?: ErrorHandler;
onSync?: SyncEventHandler;
onEmbeddingProgress?: EmbeddingProgressHandler;
}
Knowledge Methods
query(text, options?)
Standard semantic search using vector similarity.
async query(text: string, options?: QueryOptions): Promise<QueryResult[]>
hybridQuery(text, options?)
Advanced search combining semantic similarity with keyword matching.
async hybridQuery(text: string, options?: HybridQueryOptions): Promise<QueryResult[]>
interface HybridQueryOptions extends QueryOptions {
keywordWeight?: number; // Weight for keyword matching (default: 0.3)
semanticWeight?: number; // Weight for semantic matching (default: 0.7)
keywordFields?: string[]; // Fields to search (default: ['content'])
}
sync()
Ingests all sources in one call. The returned promise resolves once ingestion has finished.
async sync(): Promise<void>
syncStream()
Streaming ingestion with progress updates.
async *syncStream(): AsyncIterable<SyncProgress>
interface SyncProgress {
type: 'count' | 'progress' | 'complete' | 'error';
total?: number;
processed?: number;
percent?: number;
error?: Error;
}
toTool()
Convert knowledge base to a Toolpack SDK tool.
toTool(): KnowledgeTool
Sync Events
onSync: (event) => {
// event.type: 'start' | 'file' | 'chunk' | 'complete' | 'error'
// event.file?: string
// event.chunksAffected?: number
// event.error?: Error
}
Embedding Progress
onEmbeddingProgress: (event) => {
// event.source: string
// event.current: number
// event.total: number
// event.percent: number
}
Best Practices
- Choose the right provider:
  - Development: MemoryProvider
  - Production CLI: PersistentKnowledgeProvider
- Use appropriate chunk sizes:
  - Small docs: 1000-1500 tokens
  - Large docs: 2000-3000 tokens
  - Code: 1500-2000 tokens (with hasCode metadata)
  - Web content: 1500-2000 tokens (after HTML cleaning)
  - API data: 1000-1500 tokens (depends on content structure)
- Handle embedding failures:
  - Always provide onError for production
  - Use skip for transient failures
  - Use abort for critical errors
- Leverage metadata filtering:
  - Tag chunks with category, hasCode, version
  - Use source-specific metadata (source, url, statusCode)
  - Filter by relevance in queries
- Monitor progress:
  - Use onEmbeddingProgress for large knowledge bases
  - Use syncStream() for real-time progress in UIs
  - Show loading indicators in CLI apps
- Web crawling best practices:
  - Set appropriate delayMs (1000ms+ for respectful crawling)
  - Use allowedDomains to stay within your site
  - Limit maxDepth and maxPages to avoid excessive crawling
  - Check robots.txt compliance manually
- API data ingestion:
  - Implement rate limiting to respect API limits
  - Use transformResponse for complex data structures
  - Handle pagination correctly for large datasets
  - Cache API responses when possible
- Hybrid search optimization:
  - Use hybrid search for technical content with proper nouns
  - Adjust keywordWeight higher for exact term matching
  - Use semanticWeight higher for conceptual searches
  - Experiment with keywordFields for different content types
Troubleshooting
Common Issues
"Dimension mismatch" error:
// Ensure embedder dimensions match provider
// OllamaEmbedder with nomic-embed-text = 768 dimensions
// PersistentKnowledgeProvider persists dimensions in DB
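A quick pre-flight check can catch this before any embedding work starts. The sketch below uses the dimension figures listed in the Embedders section; `KNOWN_DIMENSIONS` and `checkDimensions` are hypothetical helpers, not part of the package.

```typescript
// Hypothetical pre-flight check using the dimensions documented above.
const KNOWN_DIMENSIONS: Record<string, number> = {
  'nomic-embed-text': 768,
  'mxbai-embed-large': 1024,
  'all-minilm': 384,
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
  'text-embedding-ada-002': 1536,
};

// Returns a problem description, or null when the model fits the existing index.
function checkDimensions(model: string, indexDimensions: number, dimensionsOverride?: number): string | null {
  const expected = dimensionsOverride ?? KNOWN_DIMENSIONS[model];
  if (expected === undefined) {
    return `Unknown model "${model}": pass its dimensions explicitly`;
  }
  if (expected !== indexDimensions) {
    return `Model "${model}" produces ${expected} dimensions but the index stores ${indexDimensions}`;
  }
  return null;
}
```

A typical trigger is switching from nomic-embed-text (768) to text-embedding-3-small (1536) while reusing a persistent index built with the old model.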
"No files found" with MarkdownSource:
// Check glob pattern - use forward slashes even on Windows
new MarkdownSource('./docs/**/*.md') // ✓
new MarkdownSource('.\\docs\\**\\*.md') // ✗
Slow embedding:
// Use embedBatch() when possible
// OllamaEmbedder.embedBatch is optimized
// OpenAIEmbedder.embedBatch makes a single API call
Web crawling blocked:
// Check robots.txt and respect site policies
// Add user agent and delay between requests
// Use allowedDomains to limit crawling scope
const source = new WebUrlSource(url, {
userAgent: 'MyApp/1.0 (contact@example.com)',
delayMs: 2000,
allowedDomains: ['example.com'],
});
API rate limiting:
// Implement rate limiting in ApiSource
const source = new ApiSource(baseUrl, endpoint, {
rateLimit: {
requestsPerSecond: 1,
requestsPerMinute: 60,
},
});
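When both limits are set, a reasonable reading is that the stricter one wins. The sketch below works out the resulting minimum gap between requests under that assumption; `minIntervalMs` and `plannedOffsets` are hypothetical, and ApiSource may throttle differently (for example with a token bucket).

```typescript
// Hypothetical interval calculation; assumes evenly spaced requests.
interface RateLimitSketch {
  requestsPerSecond?: number;
  requestsPerMinute?: number;
}

// Minimum delay between request starts that satisfies every configured limit.
function minIntervalMs(limit: RateLimitSketch): number {
  const perSecond = limit.requestsPerSecond ? 1000 / limit.requestsPerSecond : 0;
  const perMinute = limit.requestsPerMinute ? 60000 / limit.requestsPerMinute : 0;
  return Math.max(perSecond, perMinute);
}

// Start offsets (in ms from now) for `count` evenly spaced requests.
function plannedOffsets(count: number, limit: RateLimitSketch): number[] {
  const interval = minIntervalMs(limit);
  return Array.from({ length: count }, (_, i) => i * interval);
}
```

With the configuration above (1 per second, 60 per minute), both limits imply the same 1000 ms spacing.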
Hybrid search not returning expected results:
// Adjust weights based on content type
// Technical docs: higher keywordWeight
// General content: higher semanticWeight
const results = await kb.hybridQuery(query, {
keywordWeight: 0.4, // Increase for technical terms
semanticWeight: 0.6, // Increase for concepts
keywordFields: ['content', 'title', 'metadata.tags'],
});
Streaming sync memory issues:
// Streaming processes in batches automatically
// For very large datasets, consider:
// - Smaller batch sizes (not currently configurable)
// - More frequent progress updates
// - Error handling for partial failures