llkv: sql on kv stores
database
Comments
Essentially DataFusion, but for KV stores.
Suppose the underlying KV store has poor range scan performance; would the columnar mapping still be viable? It seems possible that the translation layer could introduce more latency than a native columnar format would.
Context matters here. With the rise of SIMD-optimized engines like rust-simd-r-drive, the real question is whether the bottleneck is the storage I/O or the compute overhead of the Arrow translation.
Is this just a clever wrapper for an existing engine? Specifically, how does the query planner optimize for the KV layout without causing a massive number of random reads?
This is a win for anyone managing huge audit logs. Being able to query a specific column without pulling 10 MB JSON blobs into memory for every row is a legitimate time saver.
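To make the column-projection point concrete, here is a minimal sketch of how a columnar key layout could sit on top of an ordered KV store. It is not llkv's actual layout or API: `OrderedKV`, `chunk_key`, `CHUNK_ROWS`, and the `col/<name>/<chunk id>` key scheme are all hypothetical, the store is an in-memory stand-in, and JSON replaces whatever real encoding (Arrow or otherwise) an engine would use. The idea is only that chunks of one column share a key prefix, so reading that column is a single prefix scan that never touches the large payload values.

```python
import bisect
import json


class OrderedKV:
    """Toy in-memory ordered KV store (hypothetical); counts bytes returned by scans."""

    def __init__(self):
        self._keys = []       # keys kept in sorted order
        self._data = {}       # key -> value bytes
        self.bytes_read = 0   # total size of values returned by scans

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def scan_prefix(self, prefix: bytes):
        """Yield (key, value) for every key starting with prefix, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            value = self._data[key]
            self.bytes_read += len(value)
            yield key, value
            i += 1


CHUNK_ROWS = 2  # assumed chunk size, kept tiny for the demo


def chunk_key(column: str, chunk_id: int) -> bytes:
    # A big-endian chunk id keeps the chunks of one column adjacent and ordered.
    return b"col/" + column.encode() + b"/" + chunk_id.to_bytes(4, "big")


def write_columnar(kv: OrderedKV, rows: list) -> None:
    """Store each column as a series of fixed-size chunks under its own prefix."""
    columns = list(rows[0].keys())
    for start in range(0, len(rows), CHUNK_ROWS):
        batch = rows[start:start + CHUNK_ROWS]
        for col in columns:
            values = [row[col] for row in batch]
            kv.put(chunk_key(col, start // CHUNK_ROWS), json.dumps(values).encode())


def write_rowwise(kv: OrderedKV, rows: list) -> None:
    """Store each row as one JSON blob, the layout the comment above complains about."""
    for i, row in enumerate(rows):
        kv.put(b"row/" + i.to_bytes(4, "big"), json.dumps(row).encode())


def read_column(kv: OrderedKV, column: str) -> list:
    """Read one column with a single prefix scan over its chunks."""
    out = []
    for _, value in kv.scan_prefix(b"col/" + column.encode() + b"/"):
        out.extend(json.loads(value))
    return out


def read_column_rowwise(kv: OrderedKV, column: str) -> list:
    """Read one column from row blobs: every full row has to be fetched and parsed."""
    return [json.loads(value)[column] for _, value in kv.scan_prefix(b"row/")]


rows = [
    {"user": "alice", "action": "login", "payload": "x" * 1000},
    {"user": "bob", "action": "delete", "payload": "y" * 1000},
    {"user": "carol", "action": "login", "payload": "z" * 1000},
]
col_kv, row_kv = OrderedKV(), OrderedKV()
write_columnar(col_kv, rows)
write_rowwise(row_kv, rows)

users_col = read_column(col_kv, "user")
users_row = read_column_rowwise(row_kv, "user")
print(users_col, col_kv.bytes_read)
print(users_row, row_kv.bytes_read)
```

Both reads return the same values, but the columnar scan only pays for the `user` chunks. Whether that holds up in practice depends on the store actually serving a prefix scan as sequential I/O, which is exactly the range-scan concern raised earlier in the thread.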
Does this mean we could query live telemetry streams, and maybe even bypass the usual ETL pipeline entirely, if the KV store is updated in real time?