Relevant storage metrics for petabyte-scale datasets

Scott Shadley, VP of Marketing (July 9, 2018) – Classical storage vendors have worried about three factors: capacity, performance, and reliability. Cost, typically measured as cost per terabyte, was largely a function of the market and those three factors. The emergence of “cold storage” (storage for data that is infrequently accessed) has added power as a new factor, usually expressed as watts per terabyte.
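As a quick illustration of these device-level metrics, here is a small Python sketch that computes cost per terabyte and watts per terabyte for two hypothetical drives. The drive names, prices, and power figures are made-up examples, not vendor specifications.

```python
# Back-of-envelope device-level metrics for two hypothetical drives.
# All numbers are illustrative, not real product specifications.

drives = {
    "hot_tier_ssd":  {"capacity_tb": 16, "price_usd": 2400, "watts": 12},
    "cold_tier_hdd": {"capacity_tb": 20, "price_usd": 450,  "watts": 5},
}

for name, d in drives.items():
    cost_per_tb = d["price_usd"] / d["capacity_tb"]   # $/TB
    watts_per_tb = d["watts"] / d["capacity_tb"]       # W/TB
    print(f"{name}: ${cost_per_tb:.0f}/TB, {watts_per_tb:.2f} W/TB")
```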

While these metrics are all useful for characterising storage at the device level, they miss the impact of storage on overall system cost and performance. For instance, buying the fastest SSDs you can get does little for you if most of an application's time is spent moving data from storage devices to memory. This is exactly the problem that data scientists face when attempting to perform real-time analysis on petabyte-scale datasets. This blog series will focus on this problem and on how computational storage can effectively address it. For a preview, take a glance at the introductory white paper.
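To see why data movement dominates at this scale, here is a rough back-of-envelope sketch in Python. The bandwidth figures are hypothetical assumptions chosen only to show the order of magnitude; the point is that even fast devices leave hours of transfer time before any analysis of a petabyte can begin.

```python
# Rough estimate of the time needed just to move a petabyte-scale dataset
# from storage into host memory, under assumed aggregate read bandwidths.

DATASET_BYTES = 1.0e15  # 1 PB

# Hypothetical aggregate read bandwidths (bytes/second) for a storage array.
scenarios = {
    "24x SATA SSD (~0.5 GB/s each)": 24 * 0.5e9,
    "24x NVMe SSD (~3 GB/s each)":   24 * 3.0e9,
}

for label, bandwidth in scenarios.items():
    hours = DATASET_BYTES / bandwidth / 3600
    print(f"{label}: ~{hours:.1f} hours just to read the data once")
```

Even with the faster (assumed) NVMe configuration, a single pass over the data takes on the order of hours, which is why moving computation closer to the data is attractive for real-time analysis.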
