Parallel file systems such as Lustre, GPFS, or BeeGFS have served as reliable backbones for HPC clusters for more than two decades, but the arrival of data-intensive applications on supercomputers has prompted a rethinking of HPC storage architectures. Magnetic disks, long the storage medium of these clusters, have technological limitations that have led to the introduction of new storage technologies such as non-volatile RAM (NVRAM), which easily outperform magnetic disks by an order of magnitude and show little difference between random and sequential I/O.
Most notably, the bandwidth of node-local NVRAM SSDs typically exceeds the peak bandwidth of the attached parallel file system, while the maximum number of I/O operations per second can even be more than 10,000 times higher than that of the parallel file system.
GekkoFS is a file system that aims to exploit this performance advantage to accelerate the I/O of scientific applications. The basic design idea behind GekkoFS is to construct a distributed storage space from node-local storage devices that has the same ephemeral life cycle as a batch-submitted job. By aggressively distributing data and metadata among cluster nodes, GekkoFS aggregates the I/O performance of node-local storage efficiently and scalably. While it has been designed for N-to-N striped I/O, it also supports N-to-1 data distributions.
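To make the idea of distributing data and metadata among nodes more concrete, the following is a minimal sketch in Python. It is not GekkoFS code. The text above does not say how placement is decided, so the sketch assumes a simple hash-based scheme. The chunk size and the names `node_for`, `metadata_node`, and `chunk_nodes` are hypothetical and chosen only for illustration.

```python
import hashlib

# Hypothetical chunk size; chosen only for this illustration.
CHUNK_SIZE = 512 * 1024


def node_for(key: str, num_nodes: int) -> int:
    """Map a string key to a node index with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def metadata_node(path: str, num_nodes: int) -> int:
    """Node that would hold the metadata entry for `path` in this sketch."""
    return node_for(path, num_nodes)


def chunk_nodes(path: str, offset: int, length: int, num_nodes: int) -> dict:
    """Return {chunk_id: node_index} for the byte range [offset, offset + length)."""
    if length <= 0:
        return {}
    first = offset // CHUNK_SIZE
    last = (offset + length - 1) // CHUNK_SIZE
    # Each chunk is placed independently, so one large file spreads over many nodes.
    return {c: node_for(f"{path}:{c}", num_nodes) for c in range(first, last + 1)}
```

Because every client can compute the same mapping from the path and chunk number alone, a scheme like this needs no central lookup service. That is one plausible way to spread both N-to-N and N-to-1 workloads across all node-local devices.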
Work on GekkoFS is primarily funded through European Commission research projects and through national projects of the Spanish Ministry of Science and Innovation (MICINN) and the German Research Foundation (DFG).
GekkoFS is made possible by the gracious collaboration of the following current and past partners: