Commit c300a359 authored by Marc Vef

Merge branch 'marc/62-shared-file-metadata-congestion-2' into 'master'

Resolve "Shared file metadata congestion"

During write operations, the client must update the file size on the responsible metadata daemon. The write size cache
reduces the metadata load on the daemon and the number of RPCs issued during write operations, especially for many
small I/O operations. In the past, we have observed that a daemon can become network-congested, especially when many processes issue small I/O operations to a single shared file, which bottlenecks the overall I/O throughput. Beyond the shared-file case, the cache benefits small I/O operations in general: it removes the size-update RPC from most writes, which already improves small-file I/O on a single node.

Note that this cache may impact file size consistency: stat operations may not reflect the actual file size
until the file is closed. The cache does not impact the consistency of the file data itself. We did not observe any issues with the cache for HPC applications and benchmarks, but it technically breaks POSIX semantics. For now, I therefore suggest keeping it experimental and opt-in.

- `LIBGKFS_WRITE_SIZE_CACHE` - Enable caching the write size of files (default: OFF).
- `LIBGKFS_WRITE_SIZE_CACHE_THRESHOLD` - Set the number of write operations after which the file size is synchronized
  with the corresponding daemon (default: 1000). The file size is further synchronized when the file is `close()`d or
  when `fsync()` is called.
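
The following is a minimal Python sketch of the caching idea described above, not the actual GekkoFS client code. `FakeDaemon`, `WriteSizeCache`, `record_write`, and `flush` are hypothetical names introduced for illustration. It assumes the cache counts writes per file, remembers the largest end offset seen, and sends one size update once the threshold is reached or when the file is closed or synced.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDaemon:
    """Hypothetical stand-in for the metadata daemon; counts size-update RPCs."""
    sizes: dict = field(default_factory=dict)
    rpc_count: int = 0

    def update_size(self, path: str, size: int) -> None:
        self.rpc_count += 1
        # Keep the largest size reported so far for this path.
        self.sizes[path] = max(self.sizes.get(path, 0), size)

    def stat_size(self, path: str) -> int:
        return self.sizes.get(path, 0)


class WriteSizeCache:
    """Hypothetical client-side cache that defers size updates."""

    def __init__(self, daemon: FakeDaemon, threshold: int = 1000) -> None:
        self.daemon = daemon
        self.threshold = threshold
        # path -> [writes since last sync, largest end offset seen]
        self._pending: dict = {}

    def record_write(self, path: str, offset: int, count: int) -> None:
        entry = self._pending.setdefault(path, [0, 0])
        entry[0] += 1
        entry[1] = max(entry[1], offset + count)
        if entry[0] >= self.threshold:
            self.flush(path)

    def flush(self, path: str) -> None:
        """Send the cached size; assumed to run on close() and fsync() as well."""
        entry = self._pending.pop(path, None)
        if entry is not None:
            self.daemon.update_size(path, entry[1])


daemon = FakeDaemon()
cache = WriteSizeCache(daemon, threshold=1000)

# 2500 sequential 4 KiB writes to one file.
for i in range(2500):
    cache.record_write("/shared.dat", offset=i * 4096, count=4096)

size_before_close = daemon.stat_size("/shared.dat")  # stale: last sync was at write 2000
cache.flush("/shared.dat")                            # what close() or fsync() would trigger
size_after_close = daemon.stat_size("/shared.dat")
```

In this sketch, 2500 writes produce three size-update RPCs instead of 2500, and the size the daemon reports before the final flush lags behind the data actually written. This is the consistency trade-off noted above.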

Depends on https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/194

Closes #62

See merge request !193
parents 950ba459 680fe6b5
Pipeline #4754 passed with stages in 19 minutes and 1 second