Of course, every memory-based analytics workflow must start by moving data from storage or from I/O streams into server memory. For most meaningful problem sets, this requires that the dataset either be broken into pieces and loaded into memory sequentially or be spread across multiple servers. A server with a 32-lane PCIe Gen4 connection can load a 6TB memory complex (the largest size typically found in today’s servers) in a little under 100 seconds. While that sounds slow, a petabyte-scale dataset (roughly 170 times the 6TB memory footprint) poses even bigger problems. Today, this is typically handled by using multiple servers in a cluster.
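To make the single-server figure concrete, here is a rough back-of-the-envelope sketch of the load-time arithmetic. It assumes the bus is 32 PCIe Gen4 lanes wide, that each lane delivers about 1.97 GB/s (16 GT/s with 128b/130b encoding), that units are decimal, and that there is no protocol or storage-side overhead. The function and constant names are illustrative only.

```python
# Rough estimate of the time needed to fill server memory over PCIe.
# Assumptions (not stated in the text): 32 PCIe Gen4 lanes, 16 GT/s per lane
# with 128b/130b encoding, decimal units, and zero protocol/storage overhead.

PCIE_GEN4_GB_PER_S_PER_LANE = 16 * (128 / 130) / 8  # about 1.97 GB/s per lane
LANES = 32


def load_time_seconds(memory_tb, lanes=LANES,
                      gb_per_s_per_lane=PCIE_GEN4_GB_PER_S_PER_LANE):
    """Return the seconds needed to move memory_tb terabytes across the bus."""
    bandwidth_gb_per_s = lanes * gb_per_s_per_lane
    return memory_tb * 1000 / bandwidth_gb_per_s


# A 6TB memory complex comes out a little under 100 seconds.
print(f"6TB load time: {load_time_seconds(6):.1f} s")
```

Real systems will land somewhat above this ideal figure, since storage throughput and protocol overhead are ignored here.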
However, even if the petabyte problem were spread over a cluster of 32 servers, it would still be a stretch to call the solution capable of real-time performance. Each server would have to be loaded with data six times (the last load would be only a partial data set), for a total load time of over eight and a half minutes (515 seconds). The physical and power footprint of such a cluster would also be significant. Some representative metrics:
- Physical Footprint: Each server would be 4 rack units high, for a total of 128 rack units (this would take nearly four complete racks, including networking and power conditioning equipment).
- Power and Cooling Footprint: Each server would consume 4800 watts of power, for a total of nearly 154 kW for the cluster.
- CapEx Footprint: Each server with this much RAM would cost nearly $100K; the 32-server cluster’s total cost would be over $3M.
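The cluster figures above follow from simple multiplication, which the short sketch below reproduces. The per-load time of 97 seconds is an assumption chosen to match "a little under 100 seconds", and the $100K server price is treated as an upper bound; the function and key names are hypothetical.

```python
import math


def cluster_footprint(
    servers=32,
    dataset_in_memory_loads=170,  # 1 PB expressed as multiples of the 6TB footprint
    seconds_per_full_load=97.0,   # assumption: "a little under 100 seconds"
    rack_units_per_server=4,
    watts_per_server=4800,
    cost_per_server_usd=100_000,  # upper bound: the text says "nearly $100K"
):
    """Return rough load-time, space, power, and cost totals for the cluster."""
    loads_per_server = dataset_in_memory_loads / servers
    return {
        "loads_per_server": loads_per_server,
        "load_passes": math.ceil(loads_per_server),  # last pass is partial
        "total_load_seconds": loads_per_server * seconds_per_full_load,
        "rack_units": servers * rack_units_per_server,
        "power_kw": servers * watts_per_server / 1000,
        "capex_usd": servers * cost_per_server_usd,
    }


for name, value in cluster_footprint().items():
    print(f"{name}: {value:,.1f}")
```

Changing `servers` shows the trade-off directly: more servers shorten the total load time but scale rack space, power, and cost linearly.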
This is the calculus that application engineers must confront when deploying real-time applications and designing the compute clusters that support them. Our next blog post will examine some alternatives to this approach that can both increase performance and reduce the cost, power, and physical footprint of the solution.