Concryptor: Disk I/O saturation via io_uring and triple-buffering
Performance
Comments
We saw similar claims about zero-stall pipelines with the early SPDK Rust wrappers. Most of those ended up being bottlenecked by the kernel's memory management under sustained load.
The use of hardware-accelerated AEADs should mitigate some of that overhead. Offloading the encryption to specialized CPU instructions keeps the pipeline lean and predictable.
It is one thing to saturate a high-end NVMe in a lab, but most of my sites are still running on older SATA SSDs. I wonder if this complexity even pays off when the physical hardware is the actual ceiling.
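To put rough numbers on the "hardware is the ceiling" point: a staged pipeline cannot sustain more than its slowest stage. Here is a minimal sketch of that arithmetic. The `Stage` type and `bottleneck` helper are hypothetical names, and the MB/s figures in the usage note are illustrative assumptions, not measurements from Concryptor. The one firm number is that SATA III tops out at about 600 MB/s at the interface.

```rust
/// One stage of a read -> encrypt -> write pipeline.
/// `mbps` is an assumed sustained throughput in MB/s, not a measured value.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stage {
    name: &'static str,
    mbps: u64,
}

/// Returns the slowest stage, which caps the throughput of the whole pipeline.
/// Returns `None` for an empty pipeline. On a tie the earliest stage wins.
fn bottleneck(stages: &[Stage]) -> Option<Stage> {
    stages.iter().copied().min_by_key(|s| s.mbps)
}
```

With assumed figures of 550 MB/s read, 3000 MB/s for the AEAD step, and 500 MB/s write, `bottleneck` returns the write stage at 500 MB/s. Under those assumptions the disk is the limit no matter how the submission path is tuned, which is the concern raised above.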
If kernel memory management is the real killer, does the lock-free nature of this pipeline actually solve it? Or does it just move the bottleneck to a different part of the stack?
The implementation relies on the IORING_SETUP_SQPOLL flag to actually hit those numbers. Without a dedicated kernel thread polling the submission queue, every batch of submissions still needs an io_uring_enter syscall, and that overhead usually eats into the gains from triple-buffering.
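For anyone unfamiliar with the triple-buffering part, here is a rough sketch of the buffer rotation only. It is not Concryptor's code and it does not use io_uring or SQPOLL: plain threads and `std::sync::mpsc` channels stand in for the submission and completion queues, the input and output are in-memory vectors, and `transform` is a placeholder XOR, not an AEAD. `run_pipeline` and `BUFFERS` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// Number of buffers in rotation: one being filled, one being
/// transformed, one being drained.
const BUFFERS: usize = 3;

/// Placeholder for the AEAD step. This is NOT encryption; it only
/// makes the stage's effect visible in the output.
fn transform(chunk: &mut [u8]) {
    for b in chunk.iter_mut() {
        *b ^= 0xAA;
    }
}

/// Pushes `input` through read -> transform -> write stages that
/// share exactly `BUFFERS` reusable buffers.
fn run_pipeline(input: Vec<u8>, chunk_size: usize) -> Vec<u8> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let (free_tx, free_rx) = mpsc::channel::<Vec<u8>>();
    let (filled_tx, filled_rx) = mpsc::channel::<Vec<u8>>();
    let (ready_tx, ready_rx) = mpsc::channel::<Vec<u8>>();

    // Seed the free list. No other buffers are ever allocated, so at
    // most three chunks are in flight at any time.
    for _ in 0..BUFFERS {
        free_tx.send(Vec::with_capacity(chunk_size)).unwrap();
    }

    // Stage 1 ("read"): take a free buffer, fill it, pass it on.
    let reader = thread::spawn(move || {
        for chunk in input.chunks(chunk_size) {
            let mut buf = free_rx.recv().unwrap();
            buf.clear();
            buf.extend_from_slice(chunk);
            filled_tx.send(buf).unwrap();
        }
        // `filled_tx` is dropped here, which ends stage 2's loop.
    });

    // Stage 2 ("encrypt"): transform each filled buffer in place.
    let worker = thread::spawn(move || {
        for mut buf in filled_rx {
            transform(&mut buf);
            ready_tx.send(buf).unwrap();
        }
    });

    // Stage 3 ("write"): drain each buffer, then recycle it.
    let mut output = Vec::new();
    for buf in ready_rx {
        output.extend_from_slice(&buf);
        // The reader may already have exited; a failed send is fine.
        let _ = free_tx.send(buf);
    }

    reader.join().unwrap();
    worker.join().unwrap();
    output
}
```

Calling `run_pipeline(data, 4096)` from `main` returns the transformed bytes in their original order. The reader blocks whenever all three buffers are busy downstream, so the slowest stage sets the pace. In the real design the channel handoffs would be ring submissions and completions, which is where the syscall cost discussed above comes in.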