  1. Jul 22, 2024
    • Merge branch 'setup-v0.9.4' into 'master' · e7751d13
      Marc Vef authored
      Setup v0.9.4
      
      Setup dependencies, docker, version numbers, spack.
      
      See merge request !200
      e7751d13
    • fede0359
    • Spack: Support v0.9.3 · 386f4511
      Marc Vef authored
      386f4511
    • Merge branch 'release-0.9.3' into 'master' · 05874f69
      Marc Vef authored
      Release 0.9.3
      
      
      See merge request !199
      v0.9.3
      05874f69
    • Bump Versions and update changelog for release · c9e64c5e
      Marc Vef authored
      c9e64c5e
    • Restructuring of Readme · 5fbc9f50
      Marc Vef authored
      5fbc9f50
    • Merge branch 'marc/62-shared-file-metadata-congestion-2' into 'master' · c300a359
      Marc Vef authored
      Resolve "Shared file metadata congestion"
      
      During write operations, the client must update the file size on the responsible metadata daemon. The write size cache
      can reduce the metadata load on the daemon and the number of RPCs during write operations, especially for many
      small I/O operations. In the past, we have observed that a daemon can become network-congested, especially with single shared files, many processes, and small I/O operations, which bottlenecks the overall I/O throughput. Beyond that case, the cache benefits small I/O operations in general because one RPC for updating the size is removed, which already improves small-file I/O on a single node.
      
      Note that this cache may impact file size consistency: stat operations may not reflect the actual file size
      until the file is closed. The cache does not impact the consistency of the file data itself. We did not observe any issues with the cache for HPC applications and benchmarks, but it technically breaks POSIX semantics. For now, I therefore suggest keeping it experimental and opt-in.
      
      - `LIBGKFS_WRITE_SIZE_CACHE` - Enable caching the write size of files (default: OFF).
      - `LIBGKFS_WRITE_SIZE_CACHE_THRESHOLD` - Set the number of write operations after which the file size is synchronized
        with the corresponding daemon (default: 1000). The file size is further synchronized when the file is `close()`d or
        when `fsync()` is called.
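      
      As a rough illustration of the two variables above, the following sketch sets them for a shell session and estimates how many size updates a client would send. The write count is a made-up example value, and the estimate assumes one synchronization per full threshold of writes plus one on `close()`; the actual client behavior may differ.
      
      ```bash
      # Opt in to the experimental write size cache for this shell session.
      export LIBGKFS_WRITE_SIZE_CACHE=ON
      # Synchronize the file size with the daemon every 1000 writes (the stated default).
      export LIBGKFS_WRITE_SIZE_CACHE_THRESHOLD=1000
      
      # Hypothetical workload: 100000 small writes to a single file.
      writes=100000
      # Assumption: one size update per full threshold of writes, plus one on close().
      size_updates=$(( writes / LIBGKFS_WRITE_SIZE_CACHE_THRESHOLD + 1 ))
      echo "size updates sent: at most ${size_updates} instead of ${writes}"
      ```
      
      Lowering the threshold keeps the size reported by stat operations closer to the actual size at the cost of more RPCs.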
      
      Depends on https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/194
      
      Closes #62
      
      See merge request !193
      c300a359
    • Readme, changelog, and changed defaults · 680fe6b5
      Marc Vef authored
      680fe6b5
    • Adding write_size cache. New envs: LIBGKFS_WRITE_SIZE_CACHE=ON · ef4620f7
      Marc Vef authored
      Enable via config and disable via `LIBGKFS_WRITE_SIZE_CACHE=OFF`. Flushing happens on close/fsync. The flush threshold can be changed via config or `LIBGKFS_WRITE_SIZE_CACHE_THRESHOLD=100`.
      ef4620f7
    • Merge branch 'marc/298-tests-fail-when-symlink-support-is-disabled' into 'master' · 950ba459
      Marc Vef authored
      Resolve "Tests fail when symlink support is disabled"
      
      This MR does several things:
      1. `SUPPORT_SYMLINKS` is now disabled by default. It did little in the first place and only affects incomplete code. The corresponding README entry has been removed. The only thing it supports is accessing GekkoFS from a foreign namespace via a symbolic link; however, only `stat` seems to work.
      2. `GKFS_FOLLOW_EXTERNAL_SYMLINKS` was also disabled by default in this [MR](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/183). This caused `test_symlink` to fail because it tests external symlinks into GekkoFS, not actual symlinks within GekkoFS. Following external symlinks can only be done via `lstat()` for each component of the path, which is a performance risk under certain circumstances. Overall, this is the relevant CMake variable for the test.
      3. Unify code formatting for CMake files.
      
      Closes #298
      
      See merge request !198
      950ba459
  2. Jul 19, 2024
  3. Jul 16, 2024
    • Merge branch 'marc/292-add-dentry-cache' into 'master' · 8812ccdf
      Marc Vef authored
      Resolve "Add dentry cache"
      
      This MR adds a directory entry cache for the client to avoid a huge number of stat calls after readdir, e.g., for `ls -l` type operations. It is experimental and thus disabled by default. It can be enabled via `include/config.hpp` or with the env variable `LIBGKFS_DENTRY_CACHE=ON/OFF`.
      
      It works by using the `extended_dir_entry` RPC to receive some metadata along with the dentries from the daemons. This metadata is then placed into the cache and retrieved during a stat operation (on a cache miss, an RPC is sent with vanilla functionality). The cache is discarded upon close, but this behavior can be changed via `include/config.hpp`. Note that this may cause semantic issues (removed files will remain in the cache forever).
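      
      To give a feel for the call pattern the cache targets, the sketch below enables the cache via the environment variable and counts the entries of a directory. The directory is a local temporary stand-in rather than a GekkoFS mount, and the file count of 50 is arbitrary; it only illustrates that an `ls -l` style listing needs roughly one stat per entry after readdir.
      
      ```bash
      # Opt in to the experimental dentry cache for this shell session.
      export LIBGKFS_DENTRY_CACHE=ON
      
      # Local stand-in directory (not a GekkoFS mount) with 50 empty files.
      dir="$(mktemp -d)"
      for i in $(seq 1 50); do
          : > "${dir}/file_${i}"
      done
      
      # Each listed entry is stat-ed once by an `ls -l` style listing; with the
      # cache, those stats are meant to be answered from the metadata fetched at readdir.
      entries=$(( $(ls "$dir" | wc -l) ))
      echo "entries to stat after readdir: ${entries}"
      rm -rf "$dir"
      ```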
      
      The performance improvements are already noticeable locally for a few thousand files.
      
      Depends on https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/195
      
      Closes #292
      
      See merge request !194
      8812ccdf
  4. Jul 15, 2024
  5. Jul 12, 2024
    • active following symlinks for integration test · e2644038
      Julius Athenstaedt authored and Marc Vef committed
      e2644038
    • catch edgecase of relative paths, Changelog · 9997db6a
      Julius Athenstaedt authored and Marc Vef committed
      9997db6a
    • path unit test · 8113718a
      Julius Athenstaedt authored and Marc Vef committed
      8113718a
    • userlib for tests · 6a960853
      Julius Athenstaedt authored and Marc Vef committed
      6a960853
    • add BUILD flags and register new resolve fn · 34efceff
      Julius Athenstaedt authored and Marc Vef committed
      34efceff
    • refactor resolve path · c009d4b0
      Julius Athenstaedt authored and Marc Vef committed
      c009d4b0
    • Merge branch 'marc/294-file-system-expansion-during-runtime' into 'master' · cddedd6f
      Marc Vef authored
      Resolve "File system expansion during runtime"
      
      # Description
      
      GekkoFS supports extending the current daemon configuration to additional compute nodes. This includes redistribution
      of the existing data and metadata and therefore scales file system performance and capacity for existing data. Note
      that it is the user's responsibility not to access the GekkoFS file system during redistribution. A corresponding feature
      that is transparent to the user is planned. Note also that if GekkoFS proxies are used, they need to be manually restarted after expansion.
      
      To enable this feature, the following CMake compilation flag is required to build the `gkfs_malleability` tool: `-DGKFS_BUILD_TOOLS=ON`.
      The `gkfs_malleability` tool is then available in the `build/tools` directory. Please consult `-h` for its arguments.
      While the tool can be used manually to expand the file system, the `scripts/run/gkfs` script, which invokes the `gkfs_malleability` tool, should be used instead.
      
      The only requirement for extending the file system is a hostfile containing the hostnames/IPs of the new nodes (one line per host).
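      
      A minimal sketch of preparing such a hostfile follows. The hostnames `node05` to `node12` and the temporary file location are placeholders; the node counts mirror the example in this description (4 initial nodes, 8 added).
      
      ```bash
      # Write one hypothetical hostname per line into a temporary hostfile.
      hostfile="$(mktemp)"
      for i in $(seq 5 12); do
          printf 'node%02d\n' "$i"
      done > "$hostfile"
      
      # One host per line, so the line count equals the number of new nodes.
      initial_nodes=4
      new_nodes=$(( $(wc -l < "$hostfile") ))
      total_nodes=$(( initial_nodes + new_nodes ))
      echo "Expanding from ${initial_nodes} to ${total_nodes} nodes using ${hostfile}"
      ```
      
      The resulting file is what would be passed to the `gkfs` script via `-e <hostfile>`.
      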
      Example of starting the file system. The `DAEMON_NODELIST` in the `gkfs.conf` is set to a hostfile containing the initial set of file system nodes:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf start
      * [gkfs] Starting GekkoFS daemons (4 nodes) ...
      * [gkfs] GekkoFS daemons running
      * [gkfs] Startup time: 10.853 seconds
      ```
      ... Some computation ...
      
      Expanding the file system, using `-e <hostfile>` to specify the new nodes. Redistribution is done automatically and shows a progress bar.
      When finished, the file system is ready to use in the new configuration:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf -e ~/hostfile_expand expand
      * [gkfs] Starting GekkoFS daemons (8 nodes) ...
      * [gkfs] GekkoFS daemons running
      * [gkfs] Startup time: 1.058 seconds
      Expansion process from 4 nodes to 12 nodes launched...
      * [gkfs] Expansion progress:
      [####################] 0/4 left
      * [gkfs] Redistribution process done. Finalizing ...
      * [gkfs] Expansion done.
      ```
      Stop the file system:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf stop
      * [gkfs] Stopping daemon with pid 16462
      srun: sending Ctrl-C to StepId=282378.1
      * [gkfs] Stopping daemon with pid 16761
      srun: sending Ctrl-C to StepId=282378.2
      * [gkfs] Shutdown time: 1.032 seconds
      ```
      
      # Results
      IOR results for writing/reading 768 GiB sequentially (192 procs) before and after expansion
      
      ![image](/uploads/57bd8f3a07a56c496b1ae0b096da24ef/image.png)
      
      MDTest results for creating, stat-ing, and removing 19200000 files (192 procs) before and after expansion
      
      ![image](/uploads/7e2f58d864789e657140ced3e9e9716e/image.png)
      
      Closes #294
      
      See merge request !196
      v0.9.3_rc1
      cddedd6f
    • Cleanup, Readme, changelog. · 49263be8
      Marc Vef authored
      49263be8
    • Fix protocol for daemon RPC client · 0f42da53
      Marc Vef authored
      0f42da53
    • Rudimentary Proxy support for extended file systems. · 963b9e4f
      Marc Vef authored
      Proxy must be restarted to know about the file system extension.
      963b9e4f
  6. Jul 11, 2024