- Sandook software coordinates multiple SSDs to avoid slowdowns due to garbage collection
- Two-tier control system redirects workloads across pooled disks in real time
- Performance gains approach theoretical limits but depend on large clustered storage environments
Researchers at MIT and Tufts University have built a storage management system called Sandook that pushes bundled SSDs closer to their theoretical limits. The project targets a long-standing problem within large storage clusters where identical drives rarely perform the same.
SSDs slow down for several reasons, including internal garbage collection cycles and the slower nature of write operations compared to reads. These slowdowns can impact workloads when multiple applications share the same storage pool.
Rather than letting each SSD handle performance issues on its own, the system divides control tasks between two coordinated layers that manage activity across the entire disk pool.
As Blocks and Files reports, a central controller collects performance telemetry from each SSD and revisits its scheduling decisions roughly five times per second.
Local agents inside storage servers transmit performance signals and congestion warnings as workloads change.
When a drive begins maintenance tasks such as garbage collection, the system lowers its priority and shifts traffic to healthier drives in the pool. This rerouting is carried out without requiring modifications to the applications accessing the storage.
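The two-tier idea above can be sketched in a few lines of Python. This is an illustrative model, not code from the Sandook paper: the class names, the telemetry fields, and the simple zero-weight policy for drives in garbage collection are all assumptions made for the example.

```python
class DriveAgent:
    """Local agent (hypothetical): reports per-drive telemetry."""
    def __init__(self, drive_id):
        self.drive_id = drive_id
        self.in_gc = False  # is the drive busy with garbage collection?

    def telemetry(self):
        return {"id": self.drive_id, "in_gc": self.in_gc}


class CentralController:
    """Central tier (hypothetical): periodically re-weights traffic
    across the pool based on the agents' telemetry."""
    def __init__(self, agents):
        self.agents = agents

    def rebalance(self):
        # Deprioritize drives doing maintenance; traffic shifts to
        # the healthy drives without any application changes.
        weights = {a.drive_id: (0.0 if a.telemetry()["in_gc"] else 1.0)
                   for a in self.agents}
        total = sum(weights.values()) or 1.0
        return {d: w / total for d, w in weights.items()}


agents = [DriveAgent(i) for i in range(4)]
agents[2].in_gc = True  # drive 2 enters a garbage-collection cycle
shares = CentralController(agents).rebalance()
# Drive 2 gets no new traffic; the other three split the load evenly.
```

In a real deployment this rebalance loop would run continuously (the article puts it at about five times per second) and would weigh latency and congestion signals, not just a GC flag.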
The method builds on techniques already used in enterprise storage, including block replication for log-structured reads and writes that can land on any available device.
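A minimal sketch of why replication helps here, under assumed data structures (the block-to-replica map and per-drive queue depths are invented for the example): when a block has copies on several drives, a read can be served from whichever replica currently sits on the least-congested drive.

```python
# Hypothetical layout: each block is replicated on two drives,
# and each drive has some number of pending I/Os queued.
replicas = {"block_a": [0, 2], "block_b": [1, 3]}  # block -> drive ids
queue_depth = {0: 12, 1: 3, 2: 1, 3: 7}            # pending I/Os per drive

def route_read(block):
    # Serve the read from the replica with the shortest queue,
    # steering around a drive that is slow (e.g. mid-GC).
    return min(replicas[block], key=queue_depth.__getitem__)

print(route_read("block_a"))  # drive 2 (queue 1 beats queue 12)
print(route_read("block_b"))  # drive 1 (queue 3 beats queue 7)
```

Log-structured writes give the same freedom on the write path: new data can land on any available device, so the scheduler is never forced to wait on a single busy drive.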
The trials covered database processing, neural network training, large-scale image compression, and latency-critical storage services; the system reportedly delivered 30 to 82 percent higher raw I/O throughput than prior approaches that target single bottlenecks.
Across multi-tenant workloads, application performance gains ranged from 12% to 94%, with latency reductions of up to 88%. In some cases, storage throughput reached roughly 1.7 times previous levels.
The gains come entirely from software, meaning commercially available SSDs remain unchanged. The CPU and memory overhead required to monitor dozens of disks per server was described as minimal.
The research paper, titled “Unlocking the potential of data center SSDs by harnessing performance variability,” is available here.
Despite the big numbers, it’s not something most consumers could use at home. The design depends on large groups of SSDs working together, as well as Linux-based infrastructure and enterprise networking configurations common in data centers.
This pooling effect is responsible for most of the performance improvement. Without spare disks to transfer workloads to, a single disk system would provide little benefit.
Blocks and Files notes that the work will be presented at USENIX NSDI 2026 in May, where the researchers plan to show how coordinated scheduling tames the unpredictable behavior of SSDs in large clusters.