Native Rust rewrite of ZeroMQ using io_uring
Performance
Comments
Those 2M+ messages per second numbers usually come from local loopback tests. I want to know how it holds up when the traffic actually has to hit a physical switch in a messy server rack.
io_uring overhead peaks on older kernels.
Greta is touching on the noise of real-world NICs. The use of io_uring specifically targets the syscall overhead, which is where the latency gains are most pronounced before the physical wire becomes the bottleneck.
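To make the syscall-versus-wire point concrete, here is a toy cost model in Rust. It is only a sketch: the function names are made up for illustration, every number you feed it is hypothetical, and it ignores Ethernet/IP/TCP framing overhead, NIC queues, and switch latency. It assumes each batch of messages costs one syscall plus a fixed amount of per-message CPU work, and that the link caps throughput at its raw bit rate.

```rust
/// CPU nanoseconds spent per message when `batch` messages share one syscall.
/// `syscall_ns` and `work_ns` are hypothetical inputs, not measurements.
pub fn cpu_ns_per_msg(syscall_ns: f64, work_ns: f64, batch: u32) -> f64 {
    syscall_ns / batch as f64 + work_ns
}

/// Upper bound on messages per second imposed by the link alone,
/// ignoring all protocol framing overhead.
pub fn wire_limit_msgs_per_sec(link_bits_per_sec: f64, msg_bytes: f64) -> f64 {
    link_bits_per_sec / (msg_bytes * 8.0)
}

/// Achievable messages per second: whichever of CPU or wire is slower.
pub fn throughput_msgs_per_sec(
    syscall_ns: f64,
    work_ns: f64,
    batch: u32,
    link_bits_per_sec: f64,
    msg_bytes: f64,
) -> f64 {
    let cpu_limit = 1e9 / cpu_ns_per_msg(syscall_ns, work_ns, batch);
    cpu_limit.min(wire_limit_msgs_per_sec(link_bits_per_sec, msg_bytes))
}
```

With made-up inputs of 1000 ns per syscall, 250 ns of work per message, 100-byte messages, and a 1 Gbit/s link, a batch size of 1 is CPU-bound at 800,000 msg/s, while batch sizes of 32 and 64 both land on the wire limit of 1,250,000 msg/s. In this model, batching stops helping as soon as the link becomes the bottleneck.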
If we assume the performance gains are real, could this actually lower the hardware requirements for high-throughput gateways? It might mean we can use smaller, cheaper instances if the CPU efficiency is that much better.
This fits well with the recent trend of moving network logic into user-space, similar to what ustcp is doing. It makes the stack much easier to debug when you aren't digging through kernel logs.
Easier to debug? User-space stacks just move the bugs to a place where you have fewer standard tools to find them. You are trading kernel stability for a more complex memory map.
Implementing ZMTP 3.1 from scratch is a huge lift, but it means we finally get a version that doesn't rely on the legacy C state machine. That should make the async integration way cleaner!
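For anyone who hasn't looked at the wire format, the very first step of a ZMTP 3.x connection is a fixed 64-byte greeting: a 10-byte signature, 2 version bytes, a 20-byte null-padded mechanism name, an as-server flag, and 31 bytes of filler. Below is a minimal sketch of building and checking one in plain Rust. The function names are hypothetical and are not taken from this project's code; it covers only the greeting, not the handshake commands or message framing.

```rust
/// Length of a ZMTP 3.x greeting in bytes.
pub const GREETING_LEN: usize = 64;

/// Build a ZMTP 3.1 greeting for a mechanism name such as "NULL",
/// "PLAIN", or "CURVE". Returns `None` if the name exceeds 20 bytes.
/// (Hypothetical helper, for illustration only.)
pub fn build_greeting(mechanism: &str, as_server: bool) -> Option<[u8; GREETING_LEN]> {
    let name = mechanism.as_bytes();
    if name.len() > 20 {
        return None;
    }
    let mut g = [0u8; GREETING_LEN];
    g[0] = 0xFF; // signature: first byte
    // g[1..9] is signature padding, left as zeros
    g[9] = 0x7F; // signature: last byte
    g[10] = 3; // version major
    g[11] = 1; // version minor
    g[12..12 + name.len()].copy_from_slice(name); // mechanism, null-padded to 20 bytes
    g[32] = as_server as u8; // as-server flag
    // g[33..64] is filler, left as zeros
    Some(g)
}

/// Check the signature bytes and return (major, minor) if they look valid.
pub fn parse_version(greeting: &[u8]) -> Option<(u8, u8)> {
    if greeting.len() < 12 || greeting[0] != 0xFF || greeting[9] != 0x7F {
        return None;
    }
    Some((greeting[10], greeting[11]))
}
```

A peer would send `build_greeting("NULL", false)` right after connecting and run `parse_version` on the bytes it receives to decide whether it can talk to the other side.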
Do we have a comparison of the memory footprint versus the C implementation? 2M msg/s is one thing, but the heap allocation patterns in native Rust async can be unpredictable.
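One way to get at least a rough answer on the Rust side is to wrap the system allocator and count what a code path requests. The sketch below uses only the standard library; `CountingAlloc` and `count_allocs` are hypothetical names, and the counters are process-wide, so allocations from other threads will show up in the totals. It counts requested bytes, not resident memory, so it says nothing about fragmentation or how the C implementation compares.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wrapper around the system allocator that counts allocation requests.
/// (Hypothetical name, for illustration only.)
pub struct CountingAlloc;

pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
pub static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Record the request, then delegate to the real allocator.
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return its result along with the number of allocation
/// calls and requested bytes observed process-wide while it ran.
pub fn count_allocs<T>(f: impl FnOnce() -> T) -> (T, usize, usize) {
    let calls_before = ALLOC_CALLS.load(Ordering::Relaxed);
    let bytes_before = ALLOC_BYTES.load(Ordering::Relaxed);
    let out = f();
    let calls = ALLOC_CALLS.load(Ordering::Relaxed) - calls_before;
    let bytes = ALLOC_BYTES.load(Ordering::Relaxed) - bytes_before;
    (out, calls, bytes)
}
```

Wrapping a send or receive call in `count_allocs` and running it in a loop would show whether the hot path allocates per message or reuses buffers, which is the part that tends to be unpredictable with async code.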