Verified Commit c58da6a6 authored by Marc Vef

Finalizing Spack support and documentation

parent dfa057c2
+68 −46
@@ -16,12 +16,15 @@ to I/O, which reduces interferences and improves performance.
- Miscellaneous: Libtool, Libconfig

### Debian/Ubuntu

GekkoFS base dependencies: `apt install git curl cmake autoconf automake libtool libconfig-dev`

GekkoFS testing support: `apt install python3-dev python3 python3-venv`

With testing

### CentOS/Red Hat

GekkoFS base dependencies: `yum install gcc-c++ git curl cmake autoconf automake libtool libconfig`

GekkoFS testing support: `python38-devel` (**>Python-3.6 required**)
@@ -42,15 +45,22 @@ GekkoFS testing support: `python38-devel` (**>Python-3.6 required**)
5. Compile GekkoFS and run optional tests
    - Create build directory: `mkdir gekkofs/build && cd gekkofs/build`
    - Configure GekkoFS: `cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=/home/foo/gekkofs_deps/install ..`
        - add `-DCMAKE_INSTALL_PREFIX=<install_path>` where the GekkoFS client library and server executable should be
          available
        - add `-DGKFS_BUILD_TESTS=ON` if tests should be built
    - Build and install GekkoFS: `make -j8 install`
    - Run tests: `make test`
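
The configure step above can also be assembled in a small build script. The following is a minimal sketch; the helper name `cmake_configure_args` is hypothetical, and it only uses the CMake options named in the steps above.

```python
def cmake_configure_args(deps_prefix, install_prefix=None, build_tests=False):
    """Return the argv list for configuring GekkoFS from inside gekkofs/build."""
    args = [
        "cmake",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCMAKE_PREFIX_PATH={deps_prefix}",
    ]
    # Optional: where the client library and the daemon should be installed.
    if install_prefix is not None:
        args.append(f"-DCMAKE_INSTALL_PREFIX={install_prefix}")
    # Optional: also build the tests.
    if build_tests:
        args.append("-DGKFS_BUILD_TESTS=ON")
    # The source tree is the parent of the build directory.
    args.append("..")
    return args
```

The returned list can be passed to `subprocess.run(..., check=True)` with the build directory as the working directory.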

GekkoFS is now available at:

- GekkoFS daemon (server): `<install_path>/bin/gkfs_daemon`
- GekkoFS client interception library: `<install_path>/lib64/libgkfs_intercept.so`

## Use Spack to install GekkoFS (alternative)

The Spack tool can be used to easily install GekkoFS and its dependencies. Refer to the
following [README](scripts/spack/README.md) for details.

# Run GekkoFS

## General
@@ -70,9 +80,11 @@ The `-P` argument is used for setting another RPC protocol. See below.

## The GekkoFS hostsfile

Each GekkoFS daemon needs to register itself in a shared file (*hostsfile*) which needs to be accessible to _all_
GekkoFS clients and daemons.
The hostsfile therefore describes a GekkoFS file system instance and the nodes that belong to it.
In a typical cluster environment this hostsfile should be placed within a POSIX-compliant parallel file system, such as
GPFS or Lustre.

*Note: NFS is not strongly consistent and cannot be used for the hostsfile!*

@@ -80,7 +92,9 @@ In a typical cluster environment this hostsfile should be placed within a POSIX-

tl;dr example: `<install_path>/bin/gkfs_daemon -r <fs_data_path> -m <pseudo_gkfs_mount_dir_path> -H <hostsfile_path>`

Run the GekkoFS daemon on each node, specifying:

1. the local directory where the file system data and metadata are stored (`-r/--rootdir <fs_data_path>`), e.g., the
   node-local SSD;
2. the pseudo mount directory used by clients to access GekkoFS (`-m/--mountdir <pseudo_gkfs_mount_dir_path>`); and
3. the hostsfile path (`-H/--hostsfile <hostsfile_path>`).
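
When the daemon is launched from a job script, the three arguments above can be put together as follows. This is a minimal sketch; the helper name `gkfs_daemon_cmd` is hypothetical, and only the `-r`, `-m`, and `-H` options described above are used.

```python
import shlex


def gkfs_daemon_cmd(install_path, rootdir, mountdir, hostsfile):
    """Return the argv list that starts one GekkoFS daemon on the local node."""
    return [
        f"{install_path}/bin/gkfs_daemon",
        "-r", rootdir,    # node-local directory for data and metadata
        "-m", mountdir,   # pseudo mount directory used by clients
        "-H", hostsfile,  # shared hostsfile, visible to all clients and daemons
    ]


def gkfs_daemon_cmdline(install_path, rootdir, mountdir, hostsfile):
    """Return the same command as a single shell-quoted string."""
    return shlex.join(gkfs_daemon_cmd(install_path, rootdir, mountdir, hostsfile))
```

Keeping the command as a list avoids quoting problems when a path contains spaces; `shlex.join` is only needed when the command has to be embedded in a shell line.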

@@ -235,22 +249,27 @@ Source code needs to be compiled with -fPIC. We include a pfind io500 substituti
`examples/gfind/gfind.cpp` and a non-mpi version `examples/gfind/sfind.cpp`

## Data distributors

The data distribution is selected at compilation time. Two distributors are available:

### Simple Hash (Default)

Chunks are distributed randomly to the different GekkoFS servers.

### Guided Distributor

The guided distributor allows defining a specific distribution of data on a per directory or file basis.
The distribution configurations are defined within a shared file (called `guided_config.txt` henceforth) with the
following format:
`<path> <chunk_number> <host>`

To enable the distributor, the following CMake compilation flags are required:

* `GKFS_USE_GUIDED_DISTRIBUTION` ON
* `GKFS_USE_GUIDED_DISTRIBUTION_PATH` `<path_guided_config.txt>`

To use a custom distribution, a path needs to have the prefix `#` (e.g., `#/mdt-hard 0 0`), in which all the data of all
files in that directory goes to the same place as the metadata.
Note that a chunk/host configuration is automatically inherited by all child files, even if they do not use the prefix.
In this example, `/mdt-hard/file1` is therefore also using the same distribution as the `/mdt-hard` directory.
If no prefix is used, the Simple Hash distributor is used.
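
To illustrate the `<path> <chunk_number> <host>` format, the following is a minimal parsing sketch. It assumes whitespace-separated fields, paths without spaces, and integer chunk and host values as in the `#/mdt-hard 0 0` example; the names `GuidedEntry` and `parse_guided_line` are hypothetical and not part of GekkoFS.

```python
from typing import NamedTuple


class GuidedEntry(NamedTuple):
    path: str     # path without the leading `#`
    chunk: int    # chunk number
    host: int     # host the chunk is assigned to
    custom: bool  # True if the path carried the `#` prefix


def parse_guided_line(line):
    """Parse one `<path> <chunk_number> <host>` line of guided_config.txt."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    raw_path, chunk, host = fields
    custom = raw_path.startswith("#")
    path = raw_path[1:] if custom else raw_path
    return GuidedEntry(path, int(chunk), int(host), custom)
```

For example, `parse_guided_line("#/mdt-hard 0 0")` yields an entry for `/mdt-hard` with chunk 0, host 0, and the custom flag set.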
@@ -260,12 +279,15 @@ If no prefix is used, the Simple Hash distributor is used.
Creating a guided configuration file is based on an I/O trace file of a previous execution of the application.
For this the `trace_reads` tracing module is used (see above).

The `trace_reads` module enables a `TRACE_READS`-level log at the clients. It records each client's I/O information,
which serves as the input for a script that creates the guided distributor configuration.
Note that capturing the necessary trace records can involve performance degradation.
To capture the I/O of each client within a SLURM environment, i.e., to enable the `trace_reads` module and print its
output to a user-defined path, the following example can be used:
`srun -N 10 -n 320 --export="ALL" /bin/bash -c "export LIBGKFS_LOG=trace_reads LIBGKFS_LOG_OUTPUT=${HOME}/test/GLOBAL.txt; LD_PRELOAD=${GKFS_PRLD} <app>"`

Then, the `examples/distributors/guided/generate.py` script is used to create the guided distributor configuration file:

* `python examples/distributors/guided/generate.py ~/test/GLOBAL.txt >> guided_config.txt`

Finally, modify `guided_config.txt` to your distribution requirements.
+53 −19
## Spack

### Spack
Spack is a package manager for supercomputers and Linux. It makes it easy to install scientific software for regular
users.
Spack is another method to install GekkoFS where Spack handles all the dependencies and setting up the environment.

You can use Spack to install GekkoFS and let it handle all the dependencies. First, you will need to install Spack:
### Install Spack

```
First, install Spack. You can find the instructions here: https://spack.readthedocs.io/en/latest/getting_started.html

```bash
git clone https://github.com/spack/spack.git
. spack/share/spack/setup-env.sh
```

Once Spack is installed and available in your path, add gekkofs to the Spack namespace.
Note that the second line needs to be executed every time you open a new terminal. It sets up the environment for Spack
and the corresponding environment variables, e.g., $PATH.

```
### Install GekkoFS with Spack

To install GekkoFS with Spack, the GekkoFS repository needs to be added to Spack as it is not part of the official Spack
repository.

```bash
spack repo add gekkofs/scripts/spack
```

You can then check that Spack can find GekkoFS by typing:
When added, the GekkoFS package is available. Its installation variants and options can be checked via:

```
```bash
spack info gekkofs
```

Finally, just install GekkoFS. You can also install variants (tests, forwarding mode, AGIOS scheduling).
Then install GekkoFS with Spack:

```bash
spack install gekkofs
# for installing tests dependencies and running tests
spack install -v --test=root gekkofs +tests
spack install -v --test=root gekkofs
```

Remember to load GekkoFS to run:
Finally, GekkoFS is loaded into the currently used environment:

```
```bash
spack load gekkofs
```

If you want to enable the forwarding mode:
This installs the latest release version including its required Git submodules. The installation directory is
`$SPACK_ROOT/opt/spack/linux-<arch>/<compiler>/<version>/gekkofs-<version>`. The GekkoFS daemon (`gkfs_daemon`) is
located in the `bin` directory and the GekkoFS client (`libgkfs_intercept.so`) is located in the `lib` directory.

```
spack install gekkofs +forwarding
```
Note that loading the environment adds the GekkoFS daemon to the `$PATH` environment variable. Therefore, the GekkoFS
daemon is started by running `gkfs_daemon`. Loading GekkoFS in Spack further provides the `$GKFS_CLIENT` environment
variable pointing to the interception library.

If you want to enable the AGIOS scheduling library for the forwarding mode:
Therefore, the following commands can be run to use GekkoFS:

```bash
# Consult `-h` or the Readme for further options
gkfs_daemon -r /tmp/gkfs_rootdir -m /tmp/gkfs_mountdir &
LD_PRELOAD=$GKFS_CLIENT ls -l /tmp/gkfs_mountdir
LD_PRELOAD=$GKFS_CLIENT touch /tmp/gkfs_mountdir/foo
LD_PRELOAD=$GKFS_CLIENT ls -l /tmp/gkfs_mountdir
```
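
The same preloading can be done from a script. The following is a minimal sketch that assumes `spack load gekkofs` has already set `$GKFS_CLIENT`; the helper names `gkfs_client_env` and `run_on_gkfs` are hypothetical.

```python
import os
import subprocess


def gkfs_client_env(client_lib, base_env=None):
    """Return a copy of the environment with the interception library preloaded."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = client_lib
    return env


def run_on_gkfs(argv, client_lib=None):
    """Run argv with LD_PRELOAD set to the GekkoFS client library."""
    # Falls back to $GKFS_CLIENT, which Spack sets when GekkoFS is loaded.
    lib = client_lib if client_lib is not None else os.environ["GKFS_CLIENT"]
    return subprocess.run(argv, env=gkfs_client_env(lib), check=True)
```

With a daemon running as above, `run_on_gkfs(["ls", "-l", "/tmp/gkfs_mountdir"])` corresponds to the `LD_PRELOAD=$GKFS_CLIENT ls -l /tmp/gkfs_mountdir` line. Setting `LD_PRELOAD` only in the child's environment keeps the calling script itself from being intercepted.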
spack install gekkofs +forwarding +agios

When done using GekkoFS, unload it from the environment:

```bash
spack unload gekkofs
```

If you want to use the latest developer branch of GekkoFS:
### Alternative deployment (on many nodes)

`gekkofs/scripts/run/gkfs` provides a script to deploy GekkoFS in a single command on several nodes by using `srun`.
Consult the main [README](../../README.md) or GekkoFS documentation for details.

### Miscellaneous

Use GekkoFS's latest version (master branch) with Spack:

```
spack install gekkofs@latest
```

The default is version 0.9.1, the latest stable release.
Use a specific compiler on your system, e.g., gcc-11.2.0:

```bash
spack install gekkofs@latest%gcc@11.2.0
```
+34 −34
# Copyright 2013-2021 Lawrence Livermore National Security, LLC and other
# Copyright 2013-2023 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)

# from spack import *
from spack.build_systems.cmake import *
from spack.directives import *
from spack.multimethod import when
from spack.util.executable import which
from spack import *

class Gekkofs(CMakePackage):
    """GekkoFS is a file system capable of aggregating the local I/O capacity and performance of each compute node
in a HPC cluster to produce a high-performance storage space that can be accessed in a distributed manner.
This storage space allows HPC applications and simulations to run in isolation from each other with regards
to I/O, which reduces interferences and improves performance."""

# for Clion. Comment out when using spack
# from spack.build_systems.cmake import *
# from spack.directives import *
# from spack.multimethod import when
# from spack.util.executable import which
# from spack.package import PackageBase

class Gekkofs(CMakePackage):
    """GekkoFS is a distributed burst buffer file system in user space. It is capable of aggregating the local I/O
    capacity and performance of each compute node in a HPC cluster to produce a high-performance storage space that
    can be accessed in a distributed manner. This storage space allows HPC applications and simulations to run in
    isolation from each other with regards to I/O, which reduces interferences and improves performance."""
    homepage = "https://storage.bsc.es/gitlab/hpc/gekkofs"
    git = "https://storage.bsc.es/gitlab/hpc/gekkofs.git"
    url = "https://storage.bsc.es/projects/gekkofs/releases/gekkofs-v0.9.1.tar.gz"

    maintainers = ['jeanbez', 'marcvef']
    maintainers = ['marc_vef', 'ramon_nou']
    # set various versions
    version('latest', branch='master', submodules=True)
    version('0.8.0', sha256='106c032d8cdab88173ab116c213201aa5aaad8d7dfc7b5087c94db329e7090e3')
    version('0.9.0', sha256='f6f7ec9735417d71d68553b6a4832e2c23f3e406d8d14ffb293855b8aeec4c3a')
    version('0.9.0', sha256='f6f7ec9735417d71d68553b6a4832e2c23f3e406d8d14ffb293855b8aeec4c3a', deprecated=True)
    version('0.9.1', sha256='1772b8a9d4777eca895f88cea6a1b4db2fda62e382ec9f73508e38e9d205d5f7')
    # apply patches
    patch('date-tz.patch')
    patch('daemon.patch', when='@0.8')
    # set arguments
    variant('build_type',
            default='Release',
@@ -35,38 +36,33 @@ to I/O, which reduces interferences and improves performance."""
            values=('Debug', 'Release', 'RelWithDebInfo')
            )

    variant('tests', default=False, description='Builds and runs GekkoFS tests.')
    variant('forwarding', default=False, description='Enables the GekkoFS I/O forwarding mode.')
    variant('agios', default=False, description='Enables the AGIOS scheduler for the forwarding mode.')
    variant('guided_distributor', default=False, description='Enables the guided distributor.')
    variant('prometheus', default=False, description='Enables Prometheus support for statistics.')
    variant('parallax', default=False, description='Enables Parallax key-value database.')
    variant('rename', default=False, description='Enables experimental rename support.')
    variant('parallax', default=False, description='Enables Parallax key-value database.', when='@latest')
    variant('rename', default=False, description='Enables experimental rename support.', when='@latest')
    variant('dedicated_psm2', default=False, description='Use dedicated _non-system_ opa-psm2 version 11.2.185.')
    variant('compile', default='x86', multi=False, values=('x86','powerpc','arm'), description='Architecture to compile syscall intercept.')
    variant('compile', default='x86', multi=False, values=('x86', 'powerpc', 'arm'),
            description='Architecture to compile syscall intercept.')
    # general dependencies
    depends_on('cmake@3.6.0:', type='build')
    depends_on('lz4', when='@0.8:')
    depends_on('lz4')
    depends_on('argobots')
    depends_on('syscall-intercept@arm', when='compile=arm')
    depends_on('syscall-intercept@powerpc', when='compile=powerpc')
    depends_on('syscall-intercept@x86', when='compile=x86')
    depends_on('date cxxstd=14 +shared +tz tzdb=system')
    depends_on('opa-psm2@11.2.185', when='+dedicated_psm2')
    # 0.8.0 specific
    depends_on('libfabric@1.8.1', when='@0.8')
    depends_on('bzip2', when='@0.8')
    depends_on('zstd', when='@0.8')
    depends_on('uuid', when='@0.8')
    depends_on('bmi', when='@0.8')
    depends_on('mercury@2.0.0 +debug +ofi +mpi +sm +shared +boostsys -checksum', when='@0.8')
    depends_on('margo', when='@0.8')
    depends_on('rocksdb@6.11.4 -shared +static +lz4 +snappy +zlib +rtti', when='@0.8')
    # 0.9.0 specific
    depends_on('libfabric@1.13.2', when='@0.9:,latest')
    depends_on('mercury@2.1.0 -debug +ofi -mpi -bmi +sm +shared +boostsys -checksum', when='@0.9:,latest')
    depends_on('mochi-margo@0.9.6', when='@0.9:,latest')
    depends_on('rocksdb@6.20.3 -shared +static +lz4 -snappy -zlib -zstd -bz2 +rtti', when='@0.9:,latest')
    depends_on('rocksdb@6.20.3 -shared +static +lz4 -snappy -zlib -zstd -bz2 +rtti', when='@0.9.0')
    # 0.9.1 specific
    depends_on('rocksdb@6.26.1 -shared +static +lz4 -snappy -zlib -zstd -bz2 +rtti', when='@0.9.1:,latest')

    # Additional features
    # Agios I/O forwarding
    depends_on('agios@1.0', when='@0.8: +agios')
    depends_on('agios@latest', when='@master +agios')
@@ -74,16 +70,20 @@ to I/O, which reduces interferences and improves performance."""
    depends_on('prometheus-cpp', when='@0.9:,latest +prometheus')
    depends_on('parallax', when='@0.9:,latest +parallax')

    # known incompatibilities
    conflicts('%gcc@11:', when='@:0.9.1')

    def cmake_args(self):
        """Set up GekkoFS CMake arguments"""
        args = [
            self.define_from_variant('GKFS_BUILD_TESTS', 'tests'),
            self.define('CMAKE_INSTALL_LIBDIR', self.prefix.lib),
            self.define('GKFS_BUILD_TESTS', self.run_tests),
            self.define_from_variant('GKFS_ENABLE_FORWARDING', 'forwarding'),
            self.define_from_variant('GKFS_ENABLE_AGIOS', 'agios'),
            self.define_from_variant('GKFS_USE_GUIDED_DISTRIBUTION', 'guided_distributor'),
            self.define_from_variant('GKFS_ENABLE_PROMETHEUS', 'prometheus'),
            self.define_from_variant('GKFS_USE_PARALLAX', 'parallax'),
            self.define_from_variant('RENAME_SUPPORT', 'rename'),
            self.define_from_variant('GKFS_RENAME_SUPPORT', 'rename'),
        ]
        return args

@@ -93,4 +93,4 @@ to I/O, which reduces interferences and improves performance."""
            make('test', parallel=False)

    def setup_run_environment(self, env):
        env.set('GKFS_PRLD', join_path(self.prefix.lib, 'libgkfs_intercept.so'))
        env.set('GKFS_CLIENT', join_path(self.prefix.lib, 'libgkfs_intercept.so'))
+3 −2
@@ -14,6 +14,7 @@ class Rocksdb(MakefilePackage):
    git      = 'https://github.com/facebook/rocksdb.git'

    version('master', git=git, branch='master', submodules=True)
    version('6.26.1', sha256='5aeb94677bdd4ead46eb4cefc3dbb5943141fb3ce0ba627cfd8cbabeed6475e7')
    version('6.20.3', sha256='c6502c7aae641b7e20fafa6c2b92273d935d2b7b2707135ebd9a67b092169dca')
    version('6.19.3',  sha256='5c19ffefea2bbe4c275d0c60194220865f508f371c64f42e802b4a85f065af5b')
    version('6.11.4',  sha256='6793ef000a933af4a834b59b0cd45d3a03a3aac452a68ae669fb916ddd270532')