Commit e4245996 authored by Marc Vef

Merge branch 'jathenst/281-refactor-path-resolution' into 'master'

Resolve "Refactor path resolution"

Refactors the path resolution mechanism and adds two new CMake options:

- `GKFS_USE_LEGACY_PATH_RESOLVE` - Use the legacy implementation of the resolve function, deprecated (default: OFF)
- `GKFS_FOLLOW_EXTERNAL_SYMLINKS` - Enable support for following external links for resolving the path (default: OFF)
    - This is automatically enabled in the deprecated version and causes an `lstat()` system call on each individual path component. This has been an issue in the past: performance dropped considerably when the mountpath was placed within the parallel file system.
    - It is now disabled by default to improve performance. If this causes issues, we can re-enable it.
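
As a rough illustration of how the two options described above might be passed at configure time, the sketch below assembles the corresponding `-D` flags and prints the resulting `cmake` command lines instead of running them. The helper name `gkfs_resolve_flags` and the use of `..` as the source directory are assumptions made for this example; they are not part of GekkoFS.

```shell
# Hypothetical helper (not part of GekkoFS): prints the CMake flags for the
# two options introduced by this merge request.
#   $1 = ON|OFF for GKFS_USE_LEGACY_PATH_RESOLVE   (default: OFF)
#   $2 = ON|OFF for GKFS_FOLLOW_EXTERNAL_SYMLINKS  (default: OFF)
gkfs_resolve_flags() {
    gkfs_legacy="${1:-OFF}"
    gkfs_follow="${2:-OFF}"
    printf '%s %s\n' \
        "-DGKFS_USE_LEGACY_PATH_RESOLVE=${gkfs_legacy}" \
        "-DGKFS_FOLLOW_EXTERNAL_SYMLINKS=${gkfs_follow}"
}

# Default configuration: refactored resolver, no lstat() per path component.
echo "cmake $(gkfs_resolve_flags) .."

# Refactored resolver, but following external symlinks again.
echo "cmake $(gkfs_resolve_flags OFF ON) .."

# Deprecated legacy resolver.
echo "cmake $(gkfs_resolve_flags ON) .."
```

Printing the commands keeps the sketch independent of any particular build tree; in a real build directory the printed line would be run directly.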


Closes #281


See merge request !183
parents cddedd6f c6f5522a
+3 −0
@@ -58,6 +58,9 @@ replicas ([!166](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/141)
  - Improves RPC stability
  - Removes manual updates to Mercury public IDs from Hermes-Mercury to Margo
- Updated Spack to support the latest version ([!190](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_request/190)).
- Rewrite of the resolve path function to improve performance by making the
    use of syscall for following symlinks optional
    ([!183](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/183)).


### Removed
+16 −0
@@ -234,6 +234,14 @@ gkfs_define_option(
    DEFAULT_VALUE OFF
)


# use old resolve function
gkfs_define_option(
  GKFS_USE_LEGACY_PATH_RESOLVE
  HELP_TEXT "Use the old implementation of the resolve function"
  DEFAULT_VALUE OFF
)

cmake_dependent_option(GKFS_INSTALL_TESTS "Install GekkoFS self tests" OFF "GKFS_BUILD_TESTS" OFF)


@@ -265,6 +273,14 @@ gkfs_define_option(
  DESCRIPTION "Compile with support for rename ops (experimental)"
)

## external link support
gkfs_define_option(
  GKFS_FOLLOW_EXTERNAL_SYMLINKS
  HELP_TEXT "Enable support for following external links for resolving the path"
  DEFAULT_VALUE OFF
  DESCRIPTION "Compile with lstat usage in path resolve"
)


################################################################################
# Options and variables that control how GekkoFS behaves internally
+8 −0
@@ -272,6 +272,14 @@ if (GKFS_SYMLINK_SUPPORT)
    add_definitions(-DHAS_SYMLINKS)
endif ()

if (GKFS_USE_LEGACY_PATH_RESOLVE)
    add_definitions(-DGKFS_USE_LEGACY_PATH_RESOLVE)
endif ()

if (GKFS_FOLLOW_EXTERNAL_SYMLINKS)
    add_definitions(-DGKFS_FOLLOW_EXTERNAL_SYMLINKS)
endif ()

if (GKFS_RENAME_SUPPORT)
    # Rename depends on symlink support
    add_definitions(-DHAS_SYMLINKS)
+1 −0
@@ -77,6 +77,7 @@
        "GKFS_CHUNK_STATS": true,
        "GKFS_ENABLE_PROMETHEUS": true,
        "GKFS_RENAME_SUPPORT": true,
        "GKFS_FOLLOW_EXTERNAL_SYMLINKS": true,
        "GKFS_MAX_OPEN_FDS": "10000",
        "GKFS_MAX_INTERNAL_FDS": "1024"
      }
+58 −0
@@ -469,6 +469,64 @@ srun: sending Ctrl-C to StepId=282378.2
* [gkfs] Shutdown time: 1.032 seconds
```

## All CMake options

#### Core
- `GKFS_BUILD_TOOLS` - Build tools (default: OFF)
- `GKFS_BUILD_TESTS` - Build tests (default: OFF)
- `GKFS_CREATE_CHECK_PARENTS` - Enable checking parent directory for existence before creating children (default: ON)
- `GKFS_MAX_INTERNAL_FDS` - Number of file descriptors reserved for internal use (default: 256)
- `GKFS_MAX_OPEN_FDS` - Maximum number of open file descriptors supported (default: 1024)
- `GKFS_RENAME_SUPPORT` - Enable support for rename (default: OFF)
- `GKFS_SYMLINK_SUPPORT` - Enable support for symlinks (default: ON)
- `GKFS_FOLLOW_EXTERNAL_SYMLINKS` - Enable support for following external links for resolving the path (default: OFF)
- `GKFS_USE_LEGACY_PATH_RESOLVE` - Use the legacy implementation of the resolve function, deprecated (default: OFF)
- `GKFS_USE_GUIDED_DISTRIBUTION` - Use guided data distributor (default: OFF)
- `GKFS_USE_GUIDED_DISTRIBUTION_PATH` - File Path for guided distributor (default: /tmp/guided.txt)

#### Logging
- `GKFS_ENABLE_CLIENT_LOG` - Enable logging messages in clients (default: ON)
- `GKFS_CLIENT_LOG_MESSAGE_SIZE` - Maximum size of a log message in the client library (default: 1024)

#### Statistics
- `GKFS_ENABLE_PROMETHEUS` - Enable pushing daemon statistics to a Prometheus Gateway (default: OFF)
- `GKFS_ENABLE_CLIENT_METRICS` - Enable client metrics via MSGPack (default: OFF)

#### Backends
- `GKFS_ENABLE_ROCKSDB` - Enable RocksDB metadata backend (default: ON)
- `GKFS_ENABLE_PARALLAX` - Enable Parallax metadata support (default: OFF)

## All environment variables
The GekkoFS daemon, client, and proxy support a number of environment variables to augment their functionality:

### Client
#### Core
- `LIBGKFS_HOSTS_FILE` - Path to the hostsfile (created by the daemon and mandatory for the client).
#### Logging
- `LIBGKFS_LOG` - Log module of the client. 
Available modules are: `none`, `syscalls`, `syscalls_at_entry`, `info`, `critical`, `errors`, `warnings`, `mercury`, `debug`, `most`, `all`, `trace_reads`, `help`.
- `LIBGKFS_LOG_OUTPUT` - Path to the log file of the client.
- `LIBGKFS_LOG_PER_PROCESS` - Write separate logs per client process.
- `LIBGKFS_LOG_SYSCALL_FILTER` - Filter out specific system calls from log messages.
- `LIBGKFS_LOG_OUTPUT_TRUNC` - Truncate the file used for logging.
#### Client-metrics
Client-metrics require the CMake argument `-DGKFS_ENABLE_CLIENT_METRICS=ON` (see above).
- `LIBGKFS_ENABLE_METRICS` - Enable capturing client-side metrics.
- `LIBGKFS_METRICS_FLUSH_INTERVAL` - Set the flush interval for client metrics.
- `LIBGKFS_METRICS_PATH` - Path to flush client metrics.
- `LIBGKFS_METRICS_IP_PORT` - Enable flushing to a set ZeroMQ server (replaces `LIBGKFS_METRICS_PATH`).
- `LIBGKFS_PROXY_PID_FILE` - Path to the proxy pid file (when using the GekkoFS proxy).
- `LIBGKFS_NUM_REPL` - Number of replicas for data.

### Daemon
#### Logging
- `GKFS_DAEMON_LOG_PATH` - Path to the log file of the daemon.
- `GKFS_DAEMON_LOG_LEVEL` - Log level of the daemon. Available levels are: `off`, `critical`, `err`, `warn`, `info`, `debug`, `trace`.
### Proxy
#### Logging
- `GKFS_PROXY_LOG_PATH` - Path to the log file of the proxy.
- `GKFS_PROXY_LOG_LEVEL` - Log level of the proxy. Available levels are: `off`, `critical`, `err`, `warn`, `info`, `debug`, `trace`.
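
A minimal sketch of how some of the variables listed above could be set in a job script before the GekkoFS components are started. All paths, including the `GKFS_RUN_DIR` helper variable, are hypothetical placeholders chosen for this example, and the selected log values are only examples taken from the lists above.

```shell
# Hypothetical working directory for this example; adjust to your deployment.
GKFS_RUN_DIR="${TMPDIR:-/tmp}/gkfs_example"

# Client: the hosts file is created by the daemon and is mandatory.
export LIBGKFS_HOSTS_FILE="${GKFS_RUN_DIR}/gkfs_hosts.txt"
export LIBGKFS_LOG="errors"
export LIBGKFS_LOG_OUTPUT="${GKFS_RUN_DIR}/gkfs_client.log"

# Daemon logging.
export GKFS_DAEMON_LOG_PATH="${GKFS_RUN_DIR}/gkfs_daemon.log"
export GKFS_DAEMON_LOG_LEVEL="info"

# Proxy logging.
export GKFS_PROXY_LOG_PATH="${GKFS_RUN_DIR}/gkfs_proxy.log"
export GKFS_PROXY_LOG_LEVEL="info"

# Show what the GekkoFS processes would inherit from this environment.
env | grep -E '^(LIBGKFS|GKFS)_' | sort
```

The sketch only sets and lists the variables; how the daemon, client, and proxy are launched is described elsewhere in the README.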

## Acknowledgment

This software was partially supported by the EC H2020 funded NEXTGenIO project (Project ID: 671951, www.nextgenio.eu).