
Storage Architecture

KnowledgeFlowDB supports three storage backends with a unified StorageEngine trait.

Storage Engine Trait

#[async_trait]
pub trait StorageEngine: Send + Sync {
    async fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>>;
    async fn put(&self, key: &[u8], value: &[u8]) -> Result<()>;
    async fn delete(&self, key: &[u8]) -> Result<()>;
    async fn scan(&self, start: &[u8], end: &[u8]) -> Result<Vec<(Vec<u8>, Vec<u8>)>>;
    async fn batch_put(&self, pairs: &[(&[u8], &[u8])]) -> Result<()>;
}
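To make the contract of the five methods concrete, here is a minimal synchronous sketch backed by a sorted map from the standard library. It is not the KnowledgeFlowDB implementation: the names KvStore and SketchStore are hypothetical, the methods take &mut self instead of &self, errors are omitted, and the scan range is assumed to be half-open, [start, end), which the trait signature itself does not specify.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included};

/// Synchronous, infallible stand-in for the async `StorageEngine` trait.
/// (Hypothetical name; the real trait is async and returns `Result`.)
pub trait KvStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn delete(&mut self, key: &[u8]);
    fn scan(&self, start: &[u8], end: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
    fn batch_put(&mut self, pairs: &[(&[u8], &[u8])]);
}

/// Toy store: a sorted map keeps keys in byte order, which makes range scans cheap.
#[derive(Default)]
pub struct SketchStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for SketchStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }

    fn scan(&self, start: &[u8], end: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Assumption: half-open range [start, end). An inverted or empty
        // range yields no rows (BTreeMap::range would panic on start > end).
        if start >= end {
            return Vec::new();
        }
        self.map
            .range::<[u8], _>((Included(start), Excluded(end)))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }

    fn batch_put(&mut self, pairs: &[(&[u8], &[u8])]) {
        // No atomicity is modelled here; pairs are applied in order.
        for (key, value) in pairs {
            self.put(key, value);
        }
    }
}
```

Because every backend is reached through the same five methods, code written against the trait should not need to change when the backend is swapped; only construction differs.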

Three Implementations

1. Memory Storage

Use Case: Testing, CI/CD

let storage = MemoryStorage::new();

Performance:

  • GET: 32 ns
  • PUT: 45 ns
  • Throughput: 20M+ QPS

Pros: Fastest, deterministic, no I/O

Cons: No persistence, limited by RAM

2. RocksDB Storage

Use Case: Local development, single-node production

let config = RocksDBConfig::default();
let storage = RocksDBStorage::open(&path, config)?;

Performance:

  • GET: 6 µs
  • PUT: 8 µs
  • Throughput: 100K QPS

Pros: Persistent, embedded, battle-tested

Cons: Single-node only, no replication

3. ScyllaDB Storage

Use Case: Production clusters, horizontal scaling

let config = ScyllaDBConfig {
    nodes: vec!["127.0.0.1:9042".to_string()],
    keyspace: "knowledgeflow".to_string(),
    replication_factor: 3,
    create_keyspace: true,
};
let storage = ScyllaDBStorage::connect(config).await?;

Performance:

  • GET: 1 ms (p50), 5 ms (p99)
  • PUT: 2 ms (p50), 6 ms (p99)
  • Throughput: 100K QPS per node (linear scaling)

Pros: Horizontally scalable, HA, multi-datacenter

Cons: Higher latency, operational complexity

Performance Comparison

Operation     Memory      RocksDB     ScyllaDB
GET           32 ns       6 µs        1-5 ms
PUT           45 ns       8 µs        2-6 ms
Max QPS       20M+        100K        100K/node
Persistence   ❌ None     ✅ Local    ✅ Replicated
Scaling       Vertical    Vertical    Horizontal
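The latency and throughput rows can be related with simple arithmetic. The sketch below is an illustration, not a benchmark: it assumes the quoted latencies are typical per-request times and that requests are otherwise independent. The function names are hypothetical. The first function gives the ceiling for strictly serial requests; the second applies Little's law (in-flight requests = throughput × latency).

```rust
const NANOS_PER_SEC: u64 = 1_000_000_000;

/// Upper bound on operations per second when requests run strictly one
/// after another. Panics if `latency_ns` is zero.
pub fn serial_ceiling_qps(latency_ns: u64) -> u64 {
    assert!(latency_ns > 0, "latency must be positive");
    NANOS_PER_SEC / latency_ns
}

/// Little's law: requests that must be in flight at once to sustain
/// `target_qps` at the given per-request latency, rounded up.
pub fn required_in_flight(target_qps: u64, latency_ns: u64) -> u64 {
    (target_qps * latency_ns + NANOS_PER_SEC - 1) / NANOS_PER_SEC
}
```

With the figures from the table, a 32 ns GET allows at most 31,250,000 serial operations per second and a 6 µs GET about 166,666. A 1 ms ScyllaDB GET allows only 1,000 serial operations per second, so sustaining 100K QPS on one node requires roughly 100 requests in flight concurrently.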

Choosing a Backend

Need persistence?
├─ No → Use Memory (testing)
└─ Yes
   Need horizontal scaling?
   ├─ No → Use RocksDB (single-node)
   └─ Yes → Use ScyllaDB (cluster)
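The decision tree above can be written as a small function, for example to pick a backend from deployment settings. This is a sketch only: the Backend enum and choose_backend function are hypothetical names and not part of the KnowledgeFlowDB API.

```rust
/// The three backends described in this page (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Memory,
    RocksDb,
    ScyllaDb,
}

/// Mirrors the decision tree: persistence is checked first, then scaling.
pub fn choose_backend(need_persistence: bool, need_horizontal_scaling: bool) -> Backend {
    match (need_persistence, need_horizontal_scaling) {
        // Without persistence, the scaling question is never asked.
        (false, _) => Backend::Memory,
        (true, false) => Backend::RocksDb,
        (true, true) => Backend::ScyllaDb,
    }
}
```

Note that the tree only asks about horizontal scaling once persistence is required, so a non-persistent workload always maps to the Memory backend.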

Next Steps