Verified commit 6d30ca4e authored by Marc Vef

Merge branch 'marc/stats_review' into rnou/stats_prometheus

parents b5907694 a6344c72
+10 −8
@@ -10,18 +10,20 @@ to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

### New

- Added Stats ([!132](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/132)) gathering in servers
  - Stats output can be enabled with --output-stats <filename>
  - --enable-collection collects normal stats
  - --enable-chunkstats collects extended chunk stats
- Added statistics gathering on daemons ([!132](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/132)).
    - Stats output can be enabled with:
    - `--enable-collection` collects normal statistics.
    - `--enable-chunkstats` collects extended chunk statistics.
- Statistics output to file is controlled by `--output-stats <filename>`
- Added Prometheus support for outputting
  statistics ([!132](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/132)):
    - Prometheus dependency optional and enabled at compile time with the CMake argument `GKFS_ENABLE_PROMETHEUS`.
    - `--enable-prometheus` enables statistics pushing to Prometheus if statistics are enabled.
    - `--prometheus-gateway` sets an IP and port for the Prometheus connection.
- Added new experimental metadata backend:
  Parallax ([!110](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/110)).
    - Added support to use multiple metadata backends.
    - Added `--clean-rootdir-finish` argument to remove rootdir/metadir at the end when the daemon finishes.
- Added Prometheus Output ([!132](https://storage.bsc.es/gitlab/hpc/gekkofs/-/merge_requests/132))
  - New option to define gateway --prometheus-gateway <gateway:port>
  - Prometheus output is optional with "GKFS_ENABLE_PROMETHEUS"
  - --enable-prometheus creates a thread to push the metrics.

### Changed

+21 −10
@@ -109,8 +109,11 @@ Options:
                              RocksDB is default if not set. Parallax support is experimental.
                              Note, parallaxdb creates a file called rocksdbx with 8GB created in metadir.
  --parallaxsize TEXT         parallaxdb - metadata file size in GB (default 8GB), used only with new files
  --output-stats TEXT         Enables the output of the stats on the FILE (each 10s) for debug
  --prometheus-gateway TEXT   Defines the ip:port of the Prometheus Push gateway
  --enable-collection         Enables collection of general statistics. Output requires either the --output-stats or --enable-prometheus argument.
  --enable-chunkstats         Enables collection of data chunk statistics in I/O operations. Output requires either the --output-stats or --enable-prometheus argument.
  --output-stats TEXT         Creates a thread that outputs the server stats each 10s to the specified file.
  --enable-prometheus         Enables prometheus output and a corresponding thread.
  --prometheus-gateway TEXT   Defines the prometheus gateway <ip:port> (Default 127.0.0.1:9091).
  --version                   Print version and exit.
```

@@ -233,22 +236,30 @@ Then, the `examples/distributors/guided/generate.py` scrpt is used to create the
Finally, modify `guided_config.txt` to your distribution requirements.

### Metadata Backends
There are two different metadata backends in GekkoFS. The default one uses `rocksdb`, however an alternative based on `PARALLAX` from `FORTH` 
is available.
To enable it use the `-DGKFS_ENABLE_PARALLAX:BOOL=ON` option, you can also disable `rocksdb` with `-DGKFS_ENABLE_ROCKSDB:BOOL=OFF`.

There are two different metadata backends in GekkoFS. The default one uses `rocksdb`, however an alternative based
on `PARALLAX` from `FORTH`
is available. To enable it use the `-DGKFS_ENABLE_PARALLAX:BOOL=ON` option, you can also disable `rocksdb`
with `-DGKFS_ENABLE_ROCKSDB:BOOL=OFF`.

Once it is enabled, `--dbbackend` option will be functional.

### Stats
Stats from each server are written to the file specified with `--output-stats <FILE>`. Collection is done with two separate flags `--enable-collection` and `--enable-chunkstats`. For normal and extended chunk stats. The extended chunk stats stores each chunk acccess. 
Pushing stats to Prometheus is enabled with the `-DGKFS_ENABLE_PROMETHEUS` and the flag `--enable-prometheus`. We are using a push model.
### Statistics

GekkoFS daemons are able to output general operations (`--enable-collection`) and data chunk
statistics (`--enable-chunkstats`) to a specified output file via `--output-stats <FILE>`. Prometheus can also be used
instead or in addition to the output file. It must be enabled at compile time via the CMake
argument `-DGKFS_ENABLE_PROMETHEUS` and the daemon argument `--enable-prometheus`. The corresponding statistics are then
pushed to the Prometheus instance.

### Acknowledgment

This software was partially supported by the EC H2020 funded NEXTGenIO project (Project ID: 671951, www.nextgenio.eu).

This software was partially supported by the ADA-FS project under the SPPEXA project (http://www.sppexa.de/) funded by the DFG.
This software was partially supported by the ADA-FS project under the SPPEXA project (http://www.sppexa.de/) funded by
the DFG.

This software is partially supported by the FIDIUM project funded by the DFG.

This software is partially supported by the ADMIRE project (https://www.admire-eurohpc.eu/) funded by the European Union’s Horizon 2020 JTI-EuroHPC Research and Innovation Programme (Grant 956748).
This software is partially supported by the ADMIRE project (https://www.admire-eurohpc.eu/) funded by the European
Union’s Horizon 2020 JTI-EuroHPC Research and Innovation Programme (Grant 956748).
+5 −2
@@ -79,8 +79,11 @@ Options:
                              RocksDB is default if not set. Parallax support is experimental.
                              Note, parallaxdb creates a file called rocksdbx with 8GB created in metadir.
  --parallaxsize TEXT         parallaxdb - metadata file size in GB (default 8GB), used only with new files
  --output-stats TEXT         Outputs the stats to the file each 10s.
  --prometheus-gateway TEXT   Defines the ip:port of the Prometheus Push gateway
  --enable-collection         Enables collection of general statistics. Output requires either the --output-stats or --enable-prometheus argument.
  --enable-chunkstats         Enables collection of data chunk statistics in I/O operations. Output requires either the --output-stats or --enable-prometheus argument.
  --output-stats TEXT         Creates a thread that outputs the server stats each 10s to the specified file.
  --enable-prometheus         Enables prometheus output and a corresponding thread.
  --prometheus-gateway TEXT   Defines the prometheus gateway <ip:port> (Default 127.0.0.1:9091).
  --version                   Print version and exit.
````

+11 −11
@@ -118,16 +118,16 @@ private:


    std::map<IopsOp, std::atomic<unsigned long>>
            IOPS; ///< Stores total value for global mean
            iops_mean; ///< Stores total value for global mean
    std::map<SizeOp, std::atomic<unsigned long>>
            SIZE; ///< Stores total value for global mean
            size_mean; ///< Stores total value for global mean

    std::mutex time_iops_mutex;
    std::mutex size_iops_mutex;

    std::map<IopsOp,
             std::deque<std::chrono::time_point<std::chrono::steady_clock>>>
            TimeIops; ///< Stores timestamp when an operation comes removes if
            time_iops; ///< Stores timestamp when an operation comes removes if
                       ///< first operation if > 10 minutes Different means will
                       ///< be stored and cached 1 minuted

@@ -135,7 +135,7 @@ private:
    std::map<SizeOp, std::deque<std::pair<
                             std::chrono::time_point<std::chrono::steady_clock>,
                             unsigned long long>>>
            TimeSize; ///< For size operations we need to store the timestamp
            time_size; ///< For size operations we need to store the timestamp
                       ///< and the size
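
The comments on `time_iops` and `time_size` describe a sliding window: each operation's timestamp is appended to a deque, and entries older than ten minutes are dropped from the front. A minimal sketch of that idea follows, assuming a hypothetical `SlidingWindowCounter` class that is not part of the GekkoFS sources; the class name, its methods, and the explicit `now` parameter (added so the behavior can be checked deterministically) are all illustrative assumptions.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical helper (not GekkoFS code): keeps operation timestamps
// inside a fixed time window, similar in spirit to `time_iops`.
class SlidingWindowCounter {
public:
    using Clock = std::chrono::steady_clock;

    explicit SlidingWindowCounter(std::chrono::seconds window)
        : window_(window) {}

    // Record one operation at `now` and drop entries older than the window.
    void record(Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> lock(mutex_);
        timestamps_.push_back(now);
        evict(now);
    }

    // Count operations whose age, measured from `now`, is at most `span`.
    std::size_t count_since(std::chrono::seconds span,
                            Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> lock(mutex_);
        evict(now);
        std::size_t n = 0;
        // Timestamps are appended in order, so walk back from the newest.
        for(auto it = timestamps_.rbegin();
            it != timestamps_.rend() && now - *it <= span; ++it) {
            ++n;
        }
        return n;
    }

private:
    // Caller must hold mutex_. Removes entries older than the window.
    void evict(Clock::time_point now) {
        while(!timestamps_.empty() && now - timestamps_.front() > window_) {
            timestamps_.pop_front();
        }
    }

    std::chrono::seconds window_;
    std::deque<Clock::time_point> timestamps_;
    std::mutex mutex_;
};
```

With a 600-second window, calling `count_since` with spans of 60, 300 and 600 seconds would give the counts from which per-interval means can be derived. Whether GekkoFS computes its means this way is not shown in this diff.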


@@ -159,10 +159,10 @@ private:

    std::map<std::pair<std::string, unsigned long long>,
             std::atomic<unsigned int>>
            chunkRead; ///< Stores the number of times a chunk/file is read
            chunk_reads; ///< Stores the number of times a chunk/file is read
    std::map<std::pair<std::string, unsigned long long>,
             std::atomic<unsigned int>>
            chunkWrite; ///< Stores the number of times a chunk/file is write
            chunk_writes; ///< Stores the number of times a chunk/file is write
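
The `chunk_reads` and `chunk_writes` maps key a counter by a (path, chunk id) pair. The sketch below shows one way such a map can be updated and queried, assuming a hypothetical `ChunkAccessCounter` class that is not taken from the GekkoFS sources. The mutex protects the map structure during insertion; the `std::atomic` value type only mirrors the declaration above and is not strictly needed once the mutex is held.

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper (not GekkoFS code): counts reads per (path, chunk id).
class ChunkAccessCounter {
public:
    // Increment the read counter for one chunk of a file.
    void add_read(const std::string& path, unsigned long long chunk_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        // operator[] value-initializes a missing counter to zero in place.
        ++reads_[{path, chunk_id}];
    }

    // Return how often the chunk was read, or 0 if it was never seen.
    unsigned int reads(const std::string& path,
                       unsigned long long chunk_id) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = reads_.find({path, chunk_id});
        return it == reads_.end() ? 0u : it->second.load();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::pair<std::string, unsigned long long>,
             std::atomic<unsigned int>>
            reads_;
};
```

A write counter would follow the same pattern with a second map.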

    /**
     * @brief Called by output to generate CHUNK map
@@ -189,8 +189,8 @@ private:
                                        ///< Prometheus cpp)
    Family<Summary>* family_summary;    ///< Prometheus SIZE counter (managed by
                                        ///< Prometheus cpp)
    std::map<IopsOp, Counter*> iops_Prometheus; ///< Prometheus IOPS metrics
    std::map<SizeOp, Summary*> size_Prometheus; ///< Prometheus SIZE metrics
    std::map<IopsOp, Counter*> iops_prometheus; ///< Prometheus IOPS metrics
    std::map<SizeOp, Summary*> size_prometheus; ///< Prometheus SIZE metrics
#endif

public:
+9 −7
@@ -52,18 +52,20 @@ target_sources(statistics
if(GKFS_ENABLE_PROMETHEUS)
    find_package(CURL REQUIRED)
    find_package(prometheus-cpp REQUIRED)
    set(PROMETHEUS_LIB
    prometheus-cpp-pull
    prometheus-cpp-push
    prometheus-cpp-core
    set(PROMETHEUS_LINK_LIBRARIES
        prometheus-cpp::pull
        prometheus-cpp::push
        prometheus-cpp::core
        curl)
    target_include_directories(statistics PRIVATE ${prometheus-cpp_INCLUDE_DIR})
endif()

  target_link_libraries(statistics
      PRIVATE
  ${PROMETHEUS_LIB}
      ${PROMETHEUS_LINK_LIBRARIES}
  )


if(GKFS_ENABLE_CODE_COVERAGE)
  target_code_coverage(distributor AUTO)
  target_code_coverage(statistics AUTO)