Commit 2ec5ea0e authored by Ramon Nou

Merge branch 'rnou/thallium_support' into 'master'

Migration to full Thallium

This MR migrates to Thallium, removing the Hermes dependency.
- Increased performance
We also included some optimizations:
- DIRENTS compression
- SFIND optimization and server-side computing with --server-side (still sends the list of files; can be further optimized with -C)
- INLINE DATA (i.e., 4 KB) stored in the database (rocksdb) for small files
- READ PREFETCH of the inline data on open
- WRITE optimization for small files (from 3 RPCs to 1 RPC)
- DIRENTS pagination and retry
New:
- Some of the variables available in config.hpp are also available as environment variables.
- New tests to increase coverage
- New 0.9.6 dependencies (to cover Thallium)

See merge request !273
parents 09befe84 79551419
+16 −0
@@ -13,8 +13,24 @@ to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
    - Compress directory data with zstd.
    - Make a new config.hpp option for controlling the compression
    - If directory buffer is not enough it will reattempt with the exact size
  - Metadata server can store small data ([!271](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/271))
    - Using config.hpp use_inline_data = true; and inline_data_size = 4096;
    - Data is stored in base64, as we use string to send the small data content (not bulk transfer)
  - Thallium support ([!273](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/273))
    - Migrated from Margo to Thallium for RPC communication.
    - Updated CMakeLists.txt and dependencies.
  - Inline data support and performance optimizations.
    - Enable inline data for small files (`LIBGKFS_USE_INLINE_DATA`).
    - Create write optimization (`LIBGKFS_CREATE_WRITE_OPTIMIZATION`).
    - Read inline prefetch (`LIBGKFS_READ_INLINE_PREFETCH`).
    - Dirents compression (`LIBGKFS_USE_DIRENTS_COMPRESSION` and `GKFS_DAEMON_USE_DIRENTS_COMPRESSION`).
    - Dirents buffer size control (`LIBGKFS_DIRENTS_BUFF_SIZE`).
    - New sfind filtering on the server side.
  - Added new tests (and re-enabled previously failing ones) to increase coverage
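
The changelog notes that inline data is stored in base64 because it travels as a string rather than as a bulk transfer. As a rough illustration of the size overhead this implies, the sketch below encodes a 4096-byte payload (the `inline_data_size` value quoted above) and compares the result with the usual base64 length formula. Whether `inline_data_size` limits the raw or the encoded size is not stated in this commit, so treat the numbers as an illustration only; the variable names are hypothetical.

```shell
# Hypothetical illustration: base64 overhead for a payload of inline_data_size bytes.
raw_size=4096

# base64 emits 4 output characters for every 3 input bytes, padding the last group.
formula_len=$(( (raw_size + 2) / 3 * 4 ))

# Cross-check with the base64 tool; newlines are stripped because the tool
# wraps its output, and the arithmetic expansion trims whitespace from wc.
encoded_len=$(head -c "$raw_size" /dev/zero | base64 | tr -d '\n' | wc -c)
encoded_len=$((encoded_len))

echo "raw=${raw_size} encoded=${encoded_len} formula=${formula_len}"
```

Under these assumptions, a 4096-byte payload grows by roughly one third once encoded.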


### Changed 
  - Disabled at_parent/at_fork/at_child, as they seem unneeded now

### Fixed
  - SYS_lstat does not exist on some architectures, changed to newfstatat ([!269](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/269))
+4 −2
@@ -154,8 +154,10 @@ message(STATUS "[${PROJECT_NAME}] Checking for Argobots")
find_package(Argobots 1.1 REQUIRED)

### Margo
message(STATUS "[${PROJECT_NAME}] Checking for Margo")
find_package(Margo 0.14.0 REQUIRED)
# message(STATUS "[${PROJECT_NAME}] Checking for Margo")
# find_package(Margo 0.14.0 REQUIRED)
message(STATUS "[${PROJECT_NAME}] Checking for Thallium")
find_package(Thallium REQUIRED)

### syscall-intercept
message(STATUS "[${PROJECT_NAME}] Checking for syscall_intercept")
+15 −2
@@ -510,6 +510,10 @@ Note, that a chunk/host configuration is inherited to all children files automat
In this example, `/mdt-hard/file1` is therefore also using the same distribution as the `/mdt-hard` directory.
If no prefix is used, the Simple Hash distributor is used.

## Small Data Store
Small files can be stored on the metadata server. This is controlled with the `config.hpp` options
`use_inline_data = true` and `inline_data_size`.

#### Guided configuration file

Creating a guided configuration file is based on an I/O trace file of a previous execution of the application.
@@ -587,8 +591,12 @@ Client-metrics require the CMake argument `-DGKFS_ENABLE_CLIENT_METRICS=ON` (see
- `LIBGKFS_METRICS_IP_PORT` - Enable flushing to a set ZeroMQ server (replaces `LIBGKFS_METRICS_PATH`).
- `LIBGKFS_PROXY_PID_FILE` - Path to the proxy pid file (when using the GekkoFS proxy).
- `LIBGKFS_NUM_REPL` - Number of replicas for data.
#### Directory optimizations
Set the variable `use_dirents_compression` in `include/config.hpp` to `true` to transfer directories compressed with zstd.
#### Optimization
- `LIBGKFS_USE_INLINE_DATA` - Enable inline data storage for small files (default: ON).
- `LIBGKFS_CREATE_WRITE_OPTIMIZATION` - Optimization for write operations (default: OFF).
- `LIBGKFS_READ_INLINE_PREFETCH` - Prefetch inline data when opening files (default: OFF).
- `LIBGKFS_USE_DIRENTS_COMPRESSION` - Enable compression for directory entries (default: OFF).
- `LIBGKFS_DIRENTS_BUFF_SIZE` - Buffer size for directory entries (default: 8MB).
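
As a minimal sketch, the client-side optimizations listed above could be switched on in the shell that launches the application. The variable names come from the list; the `ON` spelling is an assumption based on the defaults shown there, so the accepted values should be checked against the client code. The changelog pairs `LIBGKFS_USE_DIRENTS_COMPRESSION` with `GKFS_DAEMON_USE_DIRENTS_COMPRESSION`, so the daemon presumably needs the matching setting as well.

```shell
# Assumption: the variables accept ON/OFF, as the documented defaults suggest.
export LIBGKFS_USE_INLINE_DATA=ON
export LIBGKFS_CREATE_WRITE_OPTIMIZATION=ON
export LIBGKFS_READ_INLINE_PREFETCH=ON
export LIBGKFS_USE_DIRENTS_COMPRESSION=ON

# Show what the client library would see in its environment.
env | grep '^LIBGKFS_' | sort
```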

#### Caching
##### Dentry cache
@@ -624,10 +632,15 @@ Using two environment variables
#### Logging
- `GKFS_DAEMON_LOG_PATH` - Path to the log file of the daemon.
- `GKFS_DAEMON_LOG_LEVEL` - Log level of the daemon. Available levels are: `off`, `critical`, `err`, `warn`, `info`, `debug`, `trace`.
#### Optimization
- `GKFS_DAEMON_USE_INLINE_DATA` - Enable inline data storage (default: ON).
- `GKFS_DAEMON_USE_DIRENTS_COMPRESSION` - Enable compression for directory entries (default: OFF).
### Proxy
#### Logging
- `GKFS_PROXY_LOG_PATH` - Path to the log file of the proxy.
- `GKFS_PROXY_LOG_LEVEL` - Log level of the proxy. Available levels are: `off`, `critical`, `err`, `warn`, `info`, `debug`, `trace`.
#### Optimization
- `GKFS_PROXY_USE_DIRENTS_COMPRESSION` - Enable compression for directory entries (default: OFF).

# Acknowledgment

+1 −4
@@ -258,9 +258,6 @@ pfind_parse_args(int argc, char** argv, bool force_print_help) {
    int c;
    optind = 1; // Reset getopt's internal index for repeated calls.
    while((c = getopt(argc, modified_argv.data(), optstring)) != -1) {
        if(c == -1) {
            break;
        }

        switch(c) {
            case 'H':
@@ -414,7 +411,7 @@ dirProcess(const string& path, unsigned long long& checked,
    // Each process loops ONLY over its assigned servers
    for(int server = start_server; server < end_server; server++) {
        struct dirent_extended* entries = nullptr;
        long unsigned int n =
        int n =
                gkfs_getsingleserverdir(path.c_str(), &entries, server);

        if(n <= 0) { // Handle empty or error cases
+47 −23
@@ -36,38 +36,62 @@ GKFS_FIND=~/ADMIRE/iodeps/bin/sfind

srun -N $NUM_NODES -n $GKFS_FIND_PROCESS --overlap --overcommit --mem=0 --oversubscribe --export=ALL,LD_PRELOAD=${GKFS} $GKFS_FIND $@ -M $GKFS_MNT -S $GKFS_SERVERS

#!/bin/bash
# scripts/aggregate_sfind_results.sh
# Robustly aggregates gfind_results.rank-*.txt files

# Initialize total counters
total_found=0
total_checked=0

# Check if any result files exist
if ! ls gfind_results.rank-*.txt 1> /dev/null 2>&1; then
# Enable nullglob so *.txt doesn't return literal string if no files match
shopt -s nullglob
files=(gfind_results.rank-*.txt)

if [ ${#files[@]} -eq 0 ]; then
    echo "No result files found (gfind_results.rank-*.txt)."
    exit 1
fi

# Loop through all result files
for file in gfind_results.rank-*.txt; do
    # Read the line "MATCHED found/checked" from the file
    # and extract the numbers.
    read -r _ found_str checked_str < "$file"
echo "Found ${#files[@]} result files. Aggregating..."

for file in "${files[@]}"; do
    if [ ! -s "$file" ]; then
         echo "Warning: File $file is empty or missing. Skipping."
         continue
    fi

    # Read the line. Using -r to prevent backslash interpretation.
    if read -r line < "$file"; then
        # Expected format: MATCHED <found>/<checked>
        # Example: MATCHED 123/4567
        
        # Remove prefix "MATCHED " if present
        if [[ "$line" == MATCHED* ]]; then
            val_str="${line#MATCHED }"
        else
            # Try to handle cases where MATCHED might be missing or different
            val_str="$line" 
        fi
        
    # Use cut to handle the "found/checked" format
    found=$(echo "$found_str" | cut -d'/' -f1)
    checked=$(echo "$checked_str") # this will be the same as found_str's second part
        # Split by '/'
        # found is everything before /
        found="${val_str%%/*}"
        # checked is everything after /
        checked="${val_str##*/}"
        
    # Bash arithmetic to add to totals
        # Validate that they are numbers
        if [[ "$found" =~ ^[0-9]+$ ]] && [[ "$checked" =~ ^[0-9]+$ ]]; then
            total_found=$((total_found + found))
            total_checked=$((total_checked + checked))
        else
            echo "Error: Invalid number format in $file: '$line' -> found='$found' checked='$checked'"
            # If set -e is active in parent script, we might want to exit? 
            # Or just warn and continue. Warn is safer for now.
        fi
    else
        echo "Warning: Could not read line from $file"
    fi
done

# Print the final aggregated result
echo "MATCHED ${total_found}/${total_checked}"

# Optional: Clean up the intermediate files
# Uncomment the line below if you want to automatically remove the partial results
rm gfind_results.rank-*.txt
exit 0
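
The parsing step of the aggregation script can be tried in isolation. The sketch below is a POSIX-shell variant of the same idea, not the script itself: `parse_matched` is a hypothetical helper, and the two sample lines stand in for the contents of `gfind_results.rank-*.txt` files.

```shell
# Hypothetical helper: split "MATCHED <found>/<checked>" into "<found> <checked>".
parse_matched() {
    line=$1
    val_str=${line#MATCHED }     # drop the prefix if present
    found=${val_str%%/*}         # everything before the first '/'
    checked=${val_str##*/}       # everything after the last '/'
    # Reject empty or non-numeric fields.
    for n in "$found" "$checked"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    echo "$found $checked"
}

total_found=0
total_checked=0
# Sample lines standing in for per-rank result files.
for sample in "MATCHED 123/4567" "MATCHED 7/33"; do
    if pair=$(parse_matched "$sample"); then
        total_found=$((total_found + ${pair% *}))
        total_checked=$((total_checked + ${pair#* }))
    fi
done

echo "MATCHED ${total_found}/${total_checked}"   # prints "MATCHED 130/4600"
```

One property of this splitting approach is worth keeping in mind: a line with no `/` at all yields the same value for both fields, so an input such as `MATCHED 42` passes the numeric check as 42/42 rather than being rejected.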
