3. Set up the necessary environment variables so that they point to where the compiled direct GekkoFS dependencies will be installed (we assume the path `/home/foo/gekkofs_deps/install` in the following).
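As a minimal sketch of this step, the dynamic linker search path could be extended as shown below. Which variables are actually required depends on your build; the `lib` and `lib64` subdirectories and the `GKFS_DEPS_INSTALL` name are assumptions made for illustration only.

```bash
#!/usr/bin/env bash
# Install prefix of the GekkoFS dependencies (path assumed in the text above).
GKFS_DEPS_INSTALL="/home/foo/gekkofs_deps/install"

# Assumption: shared libraries live in lib/ and lib64/ below the prefix.
# Append both to the dynamic linker search path, keeping existing entries.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${GKFS_DEPS_INSTALL}/lib:${GKFS_DEPS_INSTALL}/lib64"

echo "${LD_LIBRARY_PATH}"
```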
Run the GekkoFS daemon on each node, specifying:
1. its locally used directory where the file system data and metadata are stored (`-r/--rootdir <fs_data_path>`), e.g., the node-local SSD;
2. the pseudo mount directory used by clients to access GekkoFS (`-m/--mountdir <pseudo_gkfs_mount_dir_path>`); and
3. the hostsfile path (`-H/--hosts-file <hostfile_path>`).
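The three arguments above can be combined as in the following sketch. All paths are hypothetical placeholders, and the script only prints the resulting command line (a dry run) so that it can be inspected before it is executed on each node.

```bash
#!/usr/bin/env bash
# Hypothetical example paths; adjust them to your system.
rootdir="/tmp/gkfs_rootdir"          # node-local data/metadata directory (-r)
mountdir="/tmp/gkfs_mountdir"        # pseudo mount directory used by clients (-m)
hostsfile="${PWD}/gkfs_hosts.txt"    # hostsfile shared by daemons and clients (-H)

# Assemble the daemon command line as an array to keep the arguments intact.
daemon_cmd=(bin/gkfs_daemon -r "${rootdir}" -m "${mountdir}" -H "${hostsfile}")

# Dry run: print the command instead of executing it.
echo "${daemon_cmd[*]}"

# To actually start the daemon, run: "${daemon_cmd[@]}"
```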
Further options:
```bash
Usage: bin/gkfs_daemon [OPTIONS]
Options:
  -h,--help                    Print this help message and exit
  -m,--mountdir TEXT REQUIRED  Virtual mounting directory where GekkoFS is available.
  -r,--rootdir TEXT REQUIRED   Local data directory where GekkoFS data for this daemon is stored.
  -i,--metadir TEXT            Metadata directory where GekkoFS RocksDB data directory is located. If not set, rootdir is used.
  -l,--listen TEXT             Address or interface to bind the daemon to. Default: local hostname.
                               When used with ofi+verbs the FI_VERBS_IFACE environment variable is set accordingly which associates the verbs device with the network interface. In case FI_VERBS_IFACE is already defined, the argument is ignored. Default 'ib'.
  -H,--hosts-file TEXT         Shared file used by daemons to register their endpoints. (default './gkfs_hosts.txt')
  -P,--rpc-protocol TEXT       Used RPC protocol for inter-node communication.
                               Available: {ofi+sockets, ofi+verbs, ofi+psm2} for TCP, Infiniband, and Omni-Path, respectively. (Default ofi+sockets)
```
Run the application with the preload library: `LD_PRELOAD=<path>/build/lib/libgkfs_intercept.so ./application`.
Clients read the hostsfile to determine which daemons are part of the GekkoFS instance.
Because the client is an interposition library that is loaded within the context of the application, this information is passed via the environment variable `LIBGKFS_HOSTS_FILE` pointing to the hostsfile path.
The client library itself is loaded for each application process via the `LD_PRELOAD` environment variable and intercepts file system related calls.
If the paths of these calls are within (or hierarchically under) the GekkoFS mount directory, they are processed in the library; otherwise, they are passed to the kernel.
Note that if `LD_PRELOAD` does not point to the library and, hence, the client is not loaded, the mount directory appears to be empty.
For MPI applications, the `LD_PRELOAD` variable can be passed with the `-x` argument of `mpirun`/`mpiexec`.
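A minimal sketch of launching a client follows. The hostsfile and library locations are hypothetical placeholders, and `GKFS_BUILD_DIR` and the `gkfs_client_cmd` helper are names introduced only for this illustration. The helper merely prints the command line instead of running it.

```bash
#!/usr/bin/env bash
# Hypothetical locations; adjust them to your installation.
GKFS_BUILD_DIR="${GKFS_BUILD_DIR:-/home/foo/gekkofs/build}"
hosts_file="${PWD}/gkfs_hosts.txt"
preload_lib="${GKFS_BUILD_DIR}/lib/libgkfs_intercept.so"

# Print the command line that would run "$@" as a GekkoFS client:
# LIBGKFS_HOSTS_FILE tells the client where the hostsfile is, and
# LD_PRELOAD loads the interposition library into the process.
gkfs_client_cmd() {
    echo "env LIBGKFS_HOSTS_FILE=${hosts_file} LD_PRELOAD=${preload_lib} $*"
}

gkfs_client_cmd ls /tmp/gkfs_mountdir
```

For an MPI job, the same two variables would be set for all ranks; with Open MPI, for instance, this could look like `mpirun -x LIBGKFS_HOSTS_FILE=<hostfile_path> -x LD_PRELOAD=<path>/build/lib/libgkfs_intercept.so ./application`. Passing `LD_PRELOAD` through `-x` rather than exporting it in the calling shell avoids preloading the library into the launcher itself.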
### Logging
The following environment variables can be used to enable logging in the client
can be provided to set the path to the log file, and the log module can be
selected with the `GKFS_LOG_LEVEL={off,critical,err,warn,info,debug,trace}`
environment variable.
# Miscellaneous
## External functions
GekkoFS allows the use of external functions in your client code via `LD_PRELOAD`.
The source code needs to be compiled with `-fPIC`. We include a substitution for the IO500 `pfind`,
`examples/gfind/gfind.cpp`, and a non-MPI version, `examples/gfind/sfind.cpp`.
## Data distributors
The data distribution can be selected at compilation time. Two distributors are available:
### Simple Hash (Default)
Chunks are distributed randomly to the different GekkoFS servers.
### Guided Distributor

#### General

The guided distributor allows defining a specific distribution of data on a per-directory or per-file basis.
The distribution configurations are defined within a shared file (called `guided_config.txt` henceforth) with the following format:
`<path> <chunk_number> <host>`

To enable the distributor, the following compilation flags are required:
* `GKFS_USE_GUIDED_DISTRIBUTION` `ON`
* `GKFS_USE_GUIDED_DISTRIBUTION_PATH` `<full path to guided_config.txt>`

To use a custom distribution, a path needs to have the prefix `#` (e.g., `#/mdt-hard 0 0`), in which case all the data of all files in that directory goes to the same place as the metadata.
The server itself is still chosen randomly (`0 0` has no meaning yet).
Note that a chunk/host configuration is automatically inherited by all child files, even if they do not use the prefix.
In this example, `/mdt-hard/file1` therefore uses the same distribution as the `/mdt-hard` directory.
Specifically defined paths (without `#`) take priority.
If no prefix is used, the Simple Hash distributor is used.
Chunks that are not specified are also distributed using the Simple Hash distributor.

#### Guided configuration file

Generating such a file requires a first execution of the application with the `trace_reads` log option.
The `trace_reads` module enables a `TRACE_READS` level log at the clients, which writes the I/O information of each client. This output is used as the input for a script that creates the guided distributor setting.
Note that capturing the necessary trace records can degrade performance.
In this stage, each node should generate a separate file.
Within a SLURM environment, the I/O of each client can be captured by enabling the `trace_reads` module and printing its output to a user-defined path.

Finally, modify `guided_config.txt` to your distribution requirements.
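To illustrate the file format, the sketch below writes a small `guided_config.txt`-style file and reads it back field by field. The second entry (`/data/file1 0 1`) and the temporary file location are hypothetical, and treating `<host>` as a numeric identifier is an assumption rather than documented behaviour.

```bash
#!/usr/bin/env bash
# Write a sample configuration to a temporary file (hypothetical contents).
config_file="$(mktemp)"
cat > "${config_file}" <<'EOF'
#/mdt-hard 0 0
/data/file1 0 1
EOF

# Read each "<path> <chunk_number> <host>" entry and report how it is classified.
while read -r path chunk host; do
    [ -z "${path}" ] && continue            # skip empty lines
    if [ "${path#\#}" != "${path}" ]; then
        # Prefix '#': data of this path is placed together with its metadata.
        echo "prefix entry: ${path#\#}"
    else
        echo "explicit entry: ${path} chunk=${chunk} host=${host}"
    fi
done < "${config_file}" > "${config_file}.parsed"

cat "${config_file}.parsed"
```

The script only classifies entries; how GekkoFS itself resolves them is described in the text above.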
### Acknowledgment
This software was partially supported by the EC H2020 funded NEXTGenIO project (Project ID: 671951, www.nextgenio.eu).
This software was partially supported by the ADA-FS project under the SPPEXA project (http://www.sppexa.de/) funded by the DFG.
This software is partially supported by the FIDIUM project funded by the DFG.
This software is partially supported by the ADMIRE project (https://www.admire-eurohpc.eu/) funded by the European Union’s Horizon 2020 JTI-EuroHPC Research and Innovation Programme (Grant 956748).