The Gap Between Storage Capacity, the CFI Bus, and PCI Express

To understand the performance and capacity gap that exists between flash devices and servers, it helps to look at the internal architecture of a typical NVMe SSD. The figure below shows an NVMe SSD with one eight-channel controller and eight NAND chips (for simplicity, the DRAM chips that are also normally part of the SSD design are not shown here). The interface between the controller and each NAND chip is known as the Common Flash Memory Interface (CFI) channel. Some attributes of CFI include:

  • A CFI channel can be 8 or 16 bits wide.
  • Each CFI channel supports a transfer rate of up to 800MT/s.
  • Each flash channel can therefore support a bandwidth of 800MB/s (8 bits wide) or 1.6GB/s (16 bits wide).
  • For an 8-channel flash controller, this means an aggregate of 6.4GT/s, which is 6.4GB/s for 8-bit wide channels or up to 12.8GB/s for 16-bit wide channels.
  • The newest NGD Systems flash controller (our Newport platform) can support 16 CFI channels (with power consumption comparable to an 8-channel flash controller!).
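
The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation of peak theoretical bandwidth (transfer rate times bus width), ignoring command and protocol overhead; the function names are illustrative and not part of any NGD Systems tool.

```python
def channel_bandwidth_mb_s(transfer_rate_mt_s: float, width_bits: int) -> float:
    """Peak bandwidth of one flash channel in MB/s.

    Each transfer moves `width_bits` bits, so MB/s = MT/s * (bits / 8).
    """
    return transfer_rate_mt_s * width_bits / 8


def controller_bandwidth_gb_s(channels: int, transfer_rate_mt_s: float,
                              width_bits: int) -> float:
    """Aggregate peak bandwidth into the flash controller in GB/s."""
    return channels * channel_bandwidth_mb_s(transfer_rate_mt_s, width_bits) / 1000


if __name__ == "__main__":
    RATE_MT_S = 800  # per-channel transfer rate quoted in the list above
    for channels in (8, 16):
        for width in (8, 16):
            total = controller_bandwidth_gb_s(channels, RATE_MT_S, width)
            print(f"{channels:2d} channels x {width:2d} bits: {total:5.1f} GB/s")
```

With the values from the list above, an 8-channel controller with 16-bit channels comes out at 12.8GB/s, and a 16-channel controller with 16-bit channels at twice that.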

By comparison, U.2 SSDs have a 4-lane wide (“x4”) PCI Express (PCIe) interface. The bandwidth of an x4 PCIe Gen3 interface is 3.94GB/s, which is less than a third of the total bandwidth going into an 8-channel flash controller with 16-bit wide channels (12.8GB/s), and less than a sixth of what a 16-channel flash controller can support. Also, while PCIe speeds will double with each of the upcoming generations (a 4-lane Gen4 PCIe interface will be roughly 7.88GB/s; a 4-lane Gen5 PCIe interface will be roughly 15.75GB/s), the clock rate of CFI interfaces will also double in each generation and will roughly follow the timing of new PCIe generations. The bandwidth gap between the CFI channels into a flash controller and the PCIe interface out of the controller will therefore continue to pose a problem both for storage architects and for the architects of the big data applications that use the latest generation of flash storage devices. In our next blog, we will explore how computational storage can reduce the impact of this gap on application performance.
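
For readers who want to check the ratios quoted above, here is a rough sketch of the comparison. It assumes the standard per-lane signalling rates for PCIe Gen3 to Gen5 (8, 16 and 32GT/s) with 128b/130b encoding, and takes the CFI totals for 16-bit wide channels at 800MT/s; the names are illustrative, and the results are peak figures that ignore packet and protocol overhead.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
PCIE_GT_S = {3: 8.0, 4: 16.0, 5: 32.0}
# Gen3 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130

# Aggregate CFI bandwidth (GB/s) for 16-bit channels at 800MT/s,
# keyed by the number of flash channels.
CFI_GB_S = {8: 12.8, 16: 25.6}


def pcie_bandwidth_gb_s(gen: int, lanes: int = 4) -> float:
    """Peak one-direction PCIe bandwidth in GB/s for the given generation."""
    return PCIE_GT_S[gen] * ENCODING_EFFICIENCY / 8 * lanes


def pcie_to_cfi_ratio(gen: int, channels: int, lanes: int = 4) -> float:
    """Fraction of the CFI-side bandwidth that the PCIe link can carry."""
    return pcie_bandwidth_gb_s(gen, lanes) / CFI_GB_S[channels]


if __name__ == "__main__":
    for gen in PCIE_GT_S:
        print(f"PCIe Gen{gen} x4: {pcie_bandwidth_gb_s(gen):.2f} GB/s")
    for channels in CFI_GB_S:
        ratio = pcie_to_cfi_ratio(3, channels)
        print(f"Gen3 x4 vs {channels}-channel CFI: {ratio:.1%}")
```

Note that the ratio only stays constant across generations if CFI rates keep doubling in step with PCIe, as described above; the table of CFI totals here is fixed at today's 800MT/s.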