Using Computational Storage to Help Reduce Data Movement for Real-Time Analytic Solutions

One way to reduce the impact of data movement on real-time analytic solutions is to eliminate or significantly reduce the amount of data being moved. If the amount of data that must be moved in the example from our previous blog can be reduced by 95% (from 1 PB to 50 TB), several significant things happen. The 32-server cluster can now load the entire data set in roughly 26 seconds. Alternatively, the 32-server cluster can be reduced to an 8-server cluster and still load the entire data set in under 2 minutes. This would also reduce the cost of the cluster to under $1M, shrink the physical footprint to part of a single server rack, and cut the cluster's power consumption to under 40 kW.
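The load-time figures above can be checked with a little arithmetic. The sketch below takes the 32-server, 26-second figure as its baseline and assumes that load time scales linearly with data size and inversely with server count. That assumption, and the function and constant names, are ours for illustration and are not taken from the original cluster model.

```python
# Baseline taken from the text: 50 TB loads in roughly 26 s on 32 servers.
BASELINE_TB = 50.0
BASELINE_SERVERS = 32
BASELINE_SECONDS = 26.0

# Implied ingest rate per server, in TB per second.
# Assumption: every server contributes the same rate (linear scaling).
TB_PER_SERVER_SECOND = BASELINE_TB / (BASELINE_SERVERS * BASELINE_SECONDS)


def reduced_tb(total_tb: float, reduction: float) -> float:
    """Data left to move after a fractional reduction (0.95 means 95%)."""
    return total_tb * (1.0 - reduction)


def load_time_seconds(data_tb: float, servers: int) -> float:
    """Estimated time to load data_tb across the given number of servers."""
    return data_tb / (servers * TB_PER_SERVER_SECOND)


if __name__ == "__main__":
    data_tb = reduced_tb(1000.0, 0.95)  # 1 PB reduced by 95%
    for servers in (32, 8):
        seconds = load_time_seconds(data_tb, servers)
        print(f"{servers} servers: about {seconds:.0f} s")
```

Under these assumptions the 8-server cluster needs about 104 seconds, which matches the "under 2 minutes" figure.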

Sounds great, right? What if we could match this performance with only a couple of servers? With computational storage, you can achieve this sort of reduction in data movement. By embedding processing capabilities within the storage device, computational storage enables parallel operations such as sorting, indexing, and searching without the data needing to leave the storage device. More complex operations, such as TensorFlow-based AI processing and encryption, can also be performed on the data while it is in the storage device. Another benefit of computational storage is that processing capacity scales with storage capacity. With a drive capacity of 32 TB (U.2 form factor SSD), one petabyte of data can be stored in 32 drives, which can easily fit into four servers.
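To make the idea concrete, here is a toy simulation of searching inside the drives. It is only a sketch: each "drive" is an ordinary Python list, and device_side_filter is a hypothetical stand-in for work that would run on the drive's embedded processor. It is not a real computational storage API. The point it illustrates is that every drive filters its own data in parallel, and only the matching records travel to the host.

```python
from concurrent.futures import ThreadPoolExecutor


def device_side_filter(records, predicate):
    """Hypothetical stand-in for a search running inside one drive.

    Only the matching records are returned, so only they cross the bus.
    """
    return [r for r in records if predicate(r)]


def query_all_drives(drives, predicate):
    """Run the filter on every drive in parallel and merge results on the host."""
    with ThreadPoolExecutor(max_workers=len(drives)) as pool:
        partials = pool.map(lambda d: device_side_filter(d, predicate), drives)
        return [r for part in partials for r in part]


# Toy data: 32 simulated drives, each holding 1,000 integer records.
drives = [list(range(i * 1000, (i + 1) * 1000)) for i in range(32)]

# A predicate that keeps one record in twenty, i.e. a 95% reduction.
matches = query_all_drives(drives, lambda r: r % 20 == 0)

total_records = sum(len(d) for d in drives)
moved_fraction = len(matches) / total_records

if __name__ == "__main__":
    print(f"records stored: {total_records}, records moved: {len(matches)}")
    print(f"fraction moved to the host: {moved_fraction:.2%}")
```

Because each drive brings its own filter, adding drives adds both capacity and search throughput in this model, which is the scaling property described above.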

What does computational storage mean for the bottom line when it comes to real-time analytic solutions? Using our 95% data movement reduction figure and a four-server cluster, we can now reduce power consumption for the solution to under 20 kW, use less than half of a rack of physical space, and reduce the capital expense for the cluster to under $500K. The reduced data set can be transferred from the SSDs to server memory in under 4 minutes. If the data movement can be reduced by 99% (we have seen numbers as good as or better than this in our benchmarks), then the load time would be under a minute. More importantly, the reduction in the size and cost of the solution means that it can be applied to a variety of new applications. If you want to find out how computational storage can help your real-time analytics application, contact us at