  1. Jul 12, 2024
    • Merge branch 'marc/294-file-system-expansion-during-runtime' into 'master' · cddedd6f
      Marc Vef authored
      Resolve "File system expansion during runtime"
      
      # Description
      
      GekkoFS supports extending the current daemon configuration to additional compute nodes. This includes redistributing
      the existing data and metadata, so file system performance and capacity also scale for data that already exists. Note
      that it is the user's responsibility not to access the GekkoFS file system during redistribution. A corresponding feature
      that is transparent to the user is planned. Note also that if GekkoFS proxies are used, they must be restarted manually after expansion.
      
      To enable this feature, the `gkfs_malleability` tool must be built, which requires the CMake flag `-DGKFS_BUILD_TOOLS=ON`.
      The `gkfs_malleability` tool is then available in the `build/tools` directory. Please consult `-h` for its arguments.
      While the tool can be used manually to expand the file system, the `scripts/run/gkfs` script should be used instead; it invokes the `gkfs_malleability` tool internally.
      
      The only requirement for extending the file system is a hostfile containing the hostnames/IPs of the new nodes (one line per host).
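
      A minimal sketch of creating such a hostfile is shown below. The helper `write_hostfile`, the node names, and the output path are placeholders for illustration and are not part of GekkoFS:

      ```bash
      # write_hostfile is a hypothetical helper, not part of GekkoFS.
      # Usage: write_hostfile <hostfile> <host> [<host> ...]
      write_hostfile() {
          # Require an output path and at least one host.
          [ "$#" -ge 2 ] || { echo "usage: write_hostfile <file> <host>..." >&2; return 1; }
          out="$1"
          shift
          # Write one hostname or IP per line.
          printf '%s\n' "$@" > "$out"
      }

      # Placeholder node names; replace them with the nodes to be added.
      write_hostfile "${TMPDIR:-/tmp}/hostfile_expand" node05 node06 node07 node08
      ```
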
      Example of starting the file system. The `DAEMON_NODELIST` in `gkfs.conf` is set to a hostfile containing the initial set of file system nodes:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf start
      * [gkfs] Starting GekkoFS daemons (4 nodes) ...
      * [gkfs] GekkoFS daemons running
      * [gkfs] Startup time: 10.853 seconds
      ```
      ... Some computation ...
      
      Expanding the file system, using `-e <hostfile>` to specify the new nodes. Redistribution runs automatically and shows a progress bar.
      When it finishes, the file system is ready to use in the new configuration:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf -e ~/hostfile_expand expand
      * [gkfs] Starting GekkoFS daemons (8 nodes) ...
      * [gkfs] GekkoFS daemons running
      * [gkfs] Startup time: 1.058 seconds
      Expansion process from 4 nodes to 12 nodes launched...
      * [gkfs] Expansion progress:
      [####################] 0/4 left
      * [gkfs] Redistribution process done. Finalizing ...
      * [gkfs] Expansion done.
      ```
      Stop the file system:
      ```bash
      ~/gekkofs/scripts/run/gkfs -c ~/run/gkfs_verbs_expandtest.conf stop
      * [gkfs] Stopping daemon with pid 16462
      srun: sending Ctrl-C to StepId=282378.1
      * [gkfs] Stopping daemon with pid 16761
      srun: sending Ctrl-C to StepId=282378.2
      * [gkfs] Shutdown time: 1.032 seconds
      ```
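
      The three phases above can be collected in a small wrapper. The sketch below is a dry run that only prints the command lines; `gkfs_cmd`, `GKFS_SCRIPT`, and `GKFS_CONF` are hypothetical names, and the default paths simply mirror the examples above:

      ```bash
      # Paths mirror the examples above; adjust them to the local installation.
      GKFS_SCRIPT="${GKFS_SCRIPT:-$HOME/gekkofs/scripts/run/gkfs}"
      GKFS_CONF="${GKFS_CONF:-$HOME/run/gkfs_verbs_expandtest.conf}"

      # gkfs_cmd is a hypothetical helper: it prints the command line for one phase.
      gkfs_cmd() {
          case "$1" in
              start|stop)
                  echo "$GKFS_SCRIPT -c $GKFS_CONF $1"
                  ;;
              expand)
                  # The expand phase needs the hostfile with the new nodes.
                  [ -n "${2:-}" ] || { echo "expand needs a hostfile" >&2; return 1; }
                  echo "$GKFS_SCRIPT -c $GKFS_CONF -e $2 expand"
                  ;;
              *)
                  echo "unknown phase: $1" >&2
                  return 1
                  ;;
          esac
      }

      # Dry run: print the three phases in order instead of executing them.
      gkfs_cmd start
      gkfs_cmd expand "$HOME/hostfile_expand"
      gkfs_cmd stop
      ```

      Printing instead of executing keeps the sketch safe to try on a machine without GekkoFS. Remember that the file system must not be accessed while the expand phase is redistributing data.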
      
      # Results
      IOR results for writing/reading 768 GiB sequentially (192 procs) before and after expansion
      
      ![image](/uploads/57bd8f3a07a56c496b1ae0b096da24ef/image.png)
      
      MDTest results for create, stat, and remove operations on 19200000 items (192 procs) before and after expansion
      
      ![image](/uploads/7e2f58d864789e657140ced3e9e9716e/image.png)
      
      Closes #294
      
      See merge request !196
      v0.9.3_rc1
      cddedd6f
    • Cleanup, Readme, changelog. · 49263be8
      Marc Vef authored
      49263be8
    • Fix protocol for daemon RPC client · 0f42da53
      Marc Vef authored
      0f42da53
    • Rudimentary Proxy support for extended file systems. · 963b9e4f
      Marc Vef authored
      Proxy must be restarted to know about the file system extension.
      963b9e4f
  2. Jul 11, 2024
  3. Jul 04, 2024
  4. Jul 03, 2024
  5. Jun 28, 2024
    • Merge branch 'marc/293-remove-superfluous-rpc-during-remove' into 'master' · 477005cd
      Marc Vef authored
      Resolve "Remove superfluous RPC during remove"
      
      This MR improves remove() performance by avoiding one RPC. 
      
      Before the change, the metadata of a path was fetched to check whether it is a directory or file. This is important because `rmdir()` and `unlink()` should not delete the wrong object. However, this meant that two RPCs were done per remove operation.
      
      This update changes this behavior: the server now checks during every remove operation whether the target is a directory or a file. For this, an additional RPC field was added that communicates whether the intent is to remove a directory. Overall, the semantics stay the same. The special case for `rename()` still requires the metadata check beforehand and is unchanged.
      
      IO500 has shown that this optimization doubles throughput for latency-sensitive operations, i.e., zero-byte files or small files.
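
      The difference between the two flows can be illustrated with a toy model. This is not GekkoFS code: a local directory stands in for the server-side metadata, a counter stands in for RPCs, and all function names are made up for the sketch. The return codes 20 and 21 mirror the Linux values of `ENOTDIR` and `EISDIR`:

      ```bash
      # Toy model only; none of these names exist in GekkoFS.
      rpc_count=0

      # "Server": separate metadata lookup used by the old flow.
      server_stat() {
          rpc_count=$((rpc_count + 1))
          if [ -d "$1" ]; then stat_result=dir; else stat_result=file; fi
      }

      # "Server": type check and removal in a single request.
      # $1 = path, $2 = 1 if the caller intends to remove a directory, else 0.
      server_remove() {
          rpc_count=$((rpc_count + 1))
          if [ "$2" -eq 1 ]; then
              [ -d "$1" ] || return 20    # rmdir() on a non-directory
              rmdir "$1"
          else
              [ ! -d "$1" ] || return 21  # unlink() on a directory
              rm "$1"
          fi
      }

      # Old client flow: fetch the metadata first, then remove (two RPCs).
      client_unlink_old() {
          server_stat "$1"
          [ "$stat_result" = file ] || return 21
          rpc_count=$((rpc_count + 1))
          rm "$1"
      }

      # New client flow: send the intent and let the server decide (one RPC).
      client_unlink_new() { server_remove "$1" 0; }
      client_rmdir_new()  { server_remove "$1" 1; }

      work="$(mktemp -d)"
      touch "$work/a" "$work/b"
      client_unlink_old "$work/a"; echo "old flow RPCs: $rpc_count"
      rpc_count=0
      client_unlink_new "$work/b"; echo "new flow RPCs: $rpc_count"
      rm -rf "$work"
      ```

      In this model the old flow counts two requests per removed file and the new flow counts one, while a mismatched intent (for example `client_rmdir_new` on a regular file) is still rejected.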
      
      Depends on https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/191
      
      Closes #293
      
      See merge request !195
      477005cd
    • Fix GCC warning bug for <= gcc-12 via #pragma · b0ba5789
      Marc Vef authored
      b0ba5789
    • Added Changelog · 86ec77a2
      Marc Vef authored
      86ec77a2
    • Proxy: Remove optimization added · 9349e366
      Marc Vef authored
      9349e366
    • 7e7e789e
    • Merge branch 'marc/proxy_dev' into 'master' · 9dd6ebef
      Marc Vef authored
      New feature: GekkoFS Proxy
      
      The GekkoFS proxy is an additional (alternative) component that runs on each client and acts as a gateway between the
      client and daemons. It can improve network stability, e.g., for opa-psm2, and provides a basis for future asynchronous
      I/O as well as client caching techniques to control file system semantics.

      The `gkfs` script fully supports the GekkoFS proxy, and an example can be found in `scripts/run`. When using the proxy
      manually, additional arguments are required on the daemon side to specify which network interface and protocol should be
      used:
      
      ```bash
      <daemon args> --proxy-listen eno1 --proxy-protocol ofi+sockets
      ```
      
      The proxy is started thereafter:
      
      ```bash
      ./gkfs_proxy -H ./gkfs_hostfile --pid-path ./vef_gkfs_proxy.pid -p ofi+sockets
      ```
      
      The shared hostfile is generated by the daemons, whereas the PID file given by `--pid-path` is local to the machine and is
      detected by clients. `--pid-path` defaults to `/tmp/gkfs_proxy.pid`.

      Under default operation, clients detect automatically whether to use the proxy. If a different PID file path is used, the
      environment variable `LIBGKFS_PROXY_PID_FILE` can be set for the clients to point to it.
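
      A wrapper script could check for the PID file in the same order, as sketched below. `proxy_pid_file` is a hypothetical helper; the client's actual detection logic is not shown here, so this only mirrors the documented default and environment variable:

      ```bash
      # proxy_pid_file is a hypothetical helper that mirrors the documented lookup:
      # LIBGKFS_PROXY_PID_FILE if set, otherwise the default /tmp/gkfs_proxy.pid.
      proxy_pid_file() {
          echo "${LIBGKFS_PROXY_PID_FILE:-/tmp/gkfs_proxy.pid}"
      }

      # Report whether a proxy PID file is present on this node.
      if [ -f "$(proxy_pid_file)" ]; then
          echo "proxy PID file found: $(proxy_pid_file)"
      else
          echo "no proxy PID file at $(proxy_pid_file)"
      fi
      ```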
      
      Alternatively, the `gkfs` script automatically sets all required arguments:
      
      ```bash
      scripts/run/gkfs -c scripts/run/gkfs.conf -f start --proxy
      * [gkfs] Starting GekkoFS daemons (1 nodes) ...
      * [gkfs] GekkoFS daemons running
      * [gkfs] Startup time: 2.013 seconds
      * [gkfs] Starting GekkoFS proxies (1 nodes) ...
      * [gkfs] GekkoFS proxies running
      * [gkfs] Startup time: 5.002 seconds
      Press 'q' to exit
      ```
      
      Please consult `include/config.hpp` for additional configuration options. Note that the GekkoFS proxy does not support
      replication.
      
      Closes https://storage.bsc.es/gitlab/hpc/gekkofs/-/issues/114
      
      Closes #114
      
      See merge request !191
      9dd6ebef
  6. Jun 27, 2024