Architecture Overview
KnowledgeFlowDB is built with a layered architecture designed for performance, scalability, and flexibility.
Layer Architecture
┌─────────────────────────────────────┐
│              kfdb CLI               │
│          (User Interface)           │
└─────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────┐
│             kfdb-query              │
│    (KQL Parser & Query Executor)    │
└─────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────┐
│             kfdb-graph              │
│     (Graph Storage & Traversal)     │
└─────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────┐
│             kfdb-vector             │
│        (HNSW Vector Search)         │
└─────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────┐
│            kfdb-storage             │
│     (Storage Engine Interface)      │
└─────────────────────────────────────┘
                   ↓
┌───────────┬────────────┬────────────┐
│  Memory   │  RocksDB   │  ScyllaDB  │
│ (Testing) │  (Local)   │(Production)│
└───────────┴────────────┴────────────┘
Core Components
kfdb-core
Foundation types and traits used across all crates:
- NodeId, EdgeId: Type-safe identifiers
- Value: Comprehensive value types
- Embedding: Semantic vectors (1024-1536 dimensions)
- Timestamp: Time-aware operations
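As a rough illustration of what such foundation types could look like, here is a minimal sketch. The field layouts, the `Value` variants, the millisecond resolution of `Timestamp`, and the choice to reject embeddings outside the 1024-1536 range are all assumptions made for this example, not the actual kfdb-core definitions.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Newtype wrappers stop node ids and edge ids from being mixed up at compile time.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct NodeId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct EdgeId(pub u64);

/// Hypothetical subset of a property value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    Text(String),
}

/// Embedding wrapper; enforcing the dimension range here is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct Embedding(Vec<f32>);

impl Embedding {
    pub const MIN_DIM: usize = 1024;
    pub const MAX_DIM: usize = 1536;

    pub fn new(values: Vec<f32>) -> Result<Self, String> {
        let dim = values.len();
        if (Self::MIN_DIM..=Self::MAX_DIM).contains(&dim) {
            Ok(Embedding(values))
        } else {
            Err(format!(
                "dimension {} outside {}..={}",
                dim,
                Self::MIN_DIM,
                Self::MAX_DIM
            ))
        }
    }

    pub fn dim(&self) -> usize {
        self.0.len()
    }
}

/// Milliseconds since the Unix epoch (assumed resolution).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Timestamp(pub u64);

impl Timestamp {
    pub fn now() -> Self {
        let ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before 1970")
            .as_millis();
        Timestamp(ms as u64)
    }
}

fn main() {
    let node = NodeId(1);
    let edge = EdgeId(7);
    let label = Value::Text("person".to_string());
    let embedding = Embedding::new(vec![0.0; 1024]).unwrap();
    println!(
        "{:?} {:?} {:?} dim={} at {:?}",
        node,
        edge,
        label,
        embedding.dim(),
        Timestamp::now()
    );
}
```

With the newtype pattern, passing an `EdgeId` where a `NodeId` is expected is a compile error rather than a runtime bug.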
kfdb-storage
Abstract storage layer with three implementations:
- Memory: In-memory BTreeMap (fastest, testing)
- RocksDB: Embedded LSM-tree (local, persistent)
- ScyllaDB: Distributed NoSQL (production, scalable)
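The idea of one interface with interchangeable backends can be sketched as a trait plus a `BTreeMap`-backed implementation. The trait name `StorageEngine`, its four methods, and the byte-slice key/value shape are hypothetical choices for this example; the real kfdb-storage interface may look different.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage interface; not the actual kfdb-storage trait.
pub trait StorageEngine {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// Returns true if the key existed.
    fn delete(&mut self, key: &[u8]) -> bool;
    /// Returns all pairs whose key starts with `prefix`, in key order.
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory backend built on an ordered map.
#[derive(Default)]
pub struct MemoryEngine {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl StorageEngine for MemoryEngine {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.data.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }

    fn delete(&mut self, key: &[u8]) -> bool {
        self.data.remove(key).is_some()
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Keys are sorted, so start at the prefix and stop at the first mismatch.
        self.data
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

fn main() {
    // A trait object lets the caller swap backends without changing this code.
    let mut engine: Box<dyn StorageEngine> = Box::new(MemoryEngine::default());
    engine.put(b"node:1", b"alice");
    engine.put(b"node:2", b"bob");
    engine.put(b"edge:1", b"knows");
    for (k, v) in engine.scan_prefix(b"node:") {
        println!(
            "{} = {}",
            String::from_utf8_lossy(&k),
            String::from_utf8_lossy(&v)
        );
    }
}
```

An ordered map makes prefix scans cheap, which is the same access pattern an LSM-tree backend such as RocksDB is good at.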
kfdb-graph
Graph data structures and algorithms:
- AdjacencyList: Efficient in-memory graph
- Traversal: BFS, DFS, shortest path
- Multi-hop: N-hop neighborhoods, path finding
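The traversal features above can be illustrated with a small directed adjacency list and breadth-first search. This is a sketch only: it uses plain `u64` ids instead of a typed identifier, and the method names `n_hop` and `shortest_path` are invented for the example rather than taken from kfdb-graph.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Directed graph stored as node id -> outgoing neighbor ids.
#[derive(Default)]
pub struct AdjacencyList {
    out: HashMap<u64, Vec<u64>>,
}

impl AdjacencyList {
    pub fn add_edge(&mut self, from: u64, to: u64) {
        self.out.entry(from).or_default().push(to);
        // Make sure the target exists as a node even if it has no out-edges.
        self.out.entry(to).or_default();
    }

    /// Nodes reachable from `start` within `max_hops` edges, mapped to their hop
    /// count. The start node is included with hop count 0.
    pub fn n_hop(&self, start: u64, max_hops: usize) -> HashMap<u64, usize> {
        let mut dist = HashMap::new();
        let mut queue = VecDeque::new();
        dist.insert(start, 0usize);
        queue.push_back(start);
        while let Some(node) = queue.pop_front() {
            let d = dist[&node];
            if d == max_hops {
                continue; // do not expand past the hop limit
            }
            if let Some(neighbors) = self.out.get(&node) {
                for &next in neighbors {
                    if !dist.contains_key(&next) {
                        dist.insert(next, d + 1);
                        queue.push_back(next);
                    }
                }
            }
        }
        dist
    }

    /// Shortest path by edge count, found with BFS and parent pointers.
    pub fn shortest_path(&self, from: u64, to: u64) -> Option<Vec<u64>> {
        let mut parent: HashMap<u64, u64> = HashMap::new();
        let mut seen = HashSet::new();
        let mut queue = VecDeque::new();
        seen.insert(from);
        queue.push_back(from);
        while let Some(node) = queue.pop_front() {
            if node == to {
                // Walk parent pointers back to the start, then reverse.
                let mut path = vec![to];
                let mut cur = to;
                while let Some(&p) = parent.get(&cur) {
                    path.push(p);
                    cur = p;
                }
                path.reverse();
                return Some(path);
            }
            if let Some(neighbors) = self.out.get(&node) {
                for &next in neighbors {
                    if seen.insert(next) {
                        parent.insert(next, node);
                        queue.push_back(next);
                    }
                }
            }
        }
        None
    }
}

fn main() {
    let mut g = AdjacencyList::default();
    for &(a, b) in &[(1, 2), (2, 3), (3, 4), (1, 5)] {
        g.add_edge(a, b);
    }
    println!("2-hop from 1: {:?}", g.n_hop(1, 2));
    println!("path 1 -> 4: {:?}", g.shortest_path(1, 4));
}
```

BFS visits nodes in order of hop distance, so the same loop serves both the N-hop neighborhood and the unweighted shortest path.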
kfdb-vector
Vector similarity search:
- HNSW Index: Fast approximate nearest neighbor search
- Multiple Metrics: Euclidean, Cosine, Manhattan, Dot Product
- Configurable: Tune M, efConstruction, efSearch
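To make the four metrics concrete, the sketch below implements each one as a distance (smaller means closer) and pairs it with an exact brute-force search. This is not HNSW: it is the exhaustive baseline that an HNSW index approximates. The `Metric` enum, the `nearest` function, and the convention of negating the dot product are assumptions for this example.

```rust
#[derive(Debug, Clone, Copy)]
pub enum Metric {
    Euclidean,
    Cosine,
    Manhattan,
    DotProduct,
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

impl Metric {
    /// Distance between two equal-length vectors; smaller means more similar.
    pub fn distance(self, a: &[f32], b: &[f32]) -> f32 {
        assert_eq!(a.len(), b.len(), "vectors must have equal dimension");
        match self {
            Metric::Euclidean => a
                .iter()
                .zip(b)
                .map(|(x, y)| (x - y).powi(2))
                .sum::<f32>()
                .sqrt(),
            Metric::Manhattan => a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum(),
            // Larger dot product means more similar, so negate it.
            Metric::DotProduct => -dot(a, b),
            Metric::Cosine => {
                let norm = (dot(a, a) * dot(b, b)).sqrt();
                if norm == 0.0 {
                    1.0 // treat a zero vector as maximally dissimilar
                } else {
                    1.0 - dot(a, b) / norm
                }
            }
        }
    }
}

/// Exact k-nearest-neighbor search by scanning every item.
/// Returns the indices of the `k` closest items, nearest first.
pub fn nearest(query: &[f32], items: &[Vec<f32>], k: usize, metric: Metric) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = items
        .iter()
        .enumerate()
        .map(|(i, v)| (i, metric.distance(query, v)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let items = vec![vec![5.0, 5.0], vec![1.0, 0.0], vec![2.0, 2.0]];
    let query = [0.0, 0.0];
    println!("{:?}", nearest(&query, &items, 2, Metric::Euclidean));
}
```

A brute-force scan like this costs time linear in the number of vectors per query, which is the cost an approximate index trades a little recall to avoid.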
kfdb-query
Query parsing and execution:
- KQL Parser: Pest-based grammar
- Query Executor: Match, filter, project, sort
- Optimizer: Filter pushdown, limit pushdown
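Filter and limit pushdown can be pictured as rewrites on a small logical plan tree. The `Plan` enum below, its variants, and the string-valued predicate are hypothetical simplifications for illustration; the actual kfdb-query plan representation is not shown in this document.

```rust
/// Hypothetical logical plan. A `Scan` applies its filter first, then its limit.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Scan {
        label: String,
        filter: Option<String>,
        limit: Option<usize>,
    },
    Filter { input: Box<Plan>, predicate: String },
    Project { input: Box<Plan>, columns: Vec<String> },
    Limit { input: Box<Plan>, n: usize },
}

pub fn optimize(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { input, predicate } => match optimize(*input) {
            // Filter pushdown: only safe when the scan has no limit yet,
            // because filtering after a limit gives a different result.
            Plan::Scan { label, filter: None, limit: None } => Plan::Scan {
                label,
                filter: Some(predicate),
                limit: None,
            },
            other => Plan::Filter { input: Box::new(other), predicate },
        },
        Plan::Limit { input, n } => match optimize(*input) {
            // Projection keeps row count and order, so the limit can move below it.
            Plan::Project { input, columns } => Plan::Project {
                input: Box::new(optimize(Plan::Limit { input, n })),
                columns,
            },
            // Limit pushdown: the scan stops early after `n` matching rows.
            Plan::Scan { label, filter, limit } => Plan::Scan {
                label,
                filter,
                limit: Some(limit.map_or(n, |m| m.min(n))),
            },
            other => Plan::Limit { input: Box::new(other), n },
        },
        Plan::Project { input, columns } => Plan::Project {
            input: Box::new(optimize(*input)),
            columns,
        },
        scan => scan,
    }
}

fn main() {
    let plan = Plan::Limit {
        n: 10,
        input: Box::new(Plan::Project {
            columns: vec!["name".to_string()],
            input: Box::new(Plan::Filter {
                predicate: "age > 30".to_string(),
                input: Box::new(Plan::Scan {
                    label: "Person".to_string(),
                    filter: None,
                    limit: None,
                }),
            }),
        }),
    };
    println!("{:#?}", optimize(plan));
}
```

After optimization the filter and the limit both live inside the scan, so fewer rows leave the storage layer in the first place.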
kfdb (CLI)
User-facing command-line interface:
- Interactive REPL with history
- Direct query execution
- Runtime backend selection
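Runtime backend selection usually comes down to parsing a name into an enum and branching on it. The sketch below assumes the backend is named by the first command-line argument and that the accepted names are `memory`, `rocksdb`, and `scylladb`; neither the argument position nor these spellings are confirmed by the kfdb CLI.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Memory,
    RocksDb,
    ScyllaDb,
}

impl FromStr for Backend {
    type Err = String;

    /// Case-insensitive parse of a backend name (names are assumed, not official).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "memory" => Ok(Backend::Memory),
            "rocksdb" => Ok(Backend::RocksDb),
            "scylladb" => Ok(Backend::ScyllaDb),
            other => Err(format!("unknown backend: {}", other)),
        }
    }
}

fn main() {
    // Hypothetical convention: first argument names the backend, default is memory.
    let name = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "memory".to_string());
    match name.parse::<Backend>() {
        Ok(backend) => println!("using backend {:?}", backend),
        Err(err) => eprintln!("{}", err),
    }
}
```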
Design Principles
- Modularity: Each crate has a single responsibility
- Performance: Microsecond latencies for core operations
- Scalability: Linear scaling with cluster nodes (ScyllaDB backend)
- Flexibility: Runtime backend selection
- Testing: Comprehensive test coverage (264 tests)
Data Flow
Query Execution
User Query (KQL)
↓
Parser (pest) → AST
↓
Query Executor → Match patterns
↓
Graph Traversal → Find nodes/edges
↓
Filter → Apply WHERE clause
↓
Project → SELECT columns
↓
Sort & Paginate → ORDER BY, LIMIT
↓
Result Set
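
The later stages of this flow (filter, project, sort, paginate) can be sketched over rows that have already been matched. Parsing and graph traversal are left out, and the `Row` alias, the `Query` struct, and the integer-only column values are simplifications invented for this example.

```rust
use std::collections::BTreeMap;

/// A matched row: column name -> value (integers only, to keep the sketch short).
pub type Row = BTreeMap<String, i64>;

/// Hypothetical, already-parsed query.
pub struct Query {
    pub filter: fn(&Row) -> bool, // WHERE
    pub columns: Vec<String>,     // SELECT
    pub order_by: String,         // ORDER BY (ascending)
    pub limit: usize,             // LIMIT
}

fn project(row: &Row, columns: &[String]) -> Row {
    columns
        .iter()
        .filter_map(|c| row.get(c).map(|v| (c.clone(), *v)))
        .collect()
}

/// Runs the stages in the order shown above: filter, project, sort, paginate.
pub fn execute(matched: &[Row], q: &Query) -> Vec<Row> {
    let mut rows: Vec<Row> = matched
        .iter()
        .filter(|&r| (q.filter)(r))
        .map(|r| project(r, &q.columns))
        .collect();
    // Rows missing the sort column sort first, since None < Some(_).
    rows.sort_by_key(|r| r.get(&q.order_by).copied());
    rows.truncate(q.limit);
    rows
}

/// Helper to build a row from (column, value) pairs.
pub fn row(pairs: &[(&str, i64)]) -> Row {
    pairs.iter().map(|&(k, v)| (k.to_string(), v)).collect()
}

/// Example predicate: keep rows whose `age` is above 30.
pub fn older_than_30(r: &Row) -> bool {
    r.get("age").map_or(false, |&a| a > 30)
}

fn main() {
    let matched = vec![
        row(&[("id", 1), ("age", 25), ("score", 9)]),
        row(&[("id", 2), ("age", 41), ("score", 4)]),
        row(&[("id", 3), ("age", 33), ("score", 7)]),
        row(&[("id", 4), ("age", 52), ("score", 1)]),
    ];
    let q = Query {
        filter: older_than_30,
        columns: vec!["id".to_string(), "age".to_string()],
        order_by: "age".to_string(),
        limit: 2,
    };
    for r in execute(&matched, &q) {
        println!("{:?}", r);
    }
}
```

Because projection runs before sorting in this ordering, the sort column has to be among the selected columns; an engine that sorts first would not have that restriction.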