Commits on Source (29)
-
Ramon Nou authored 8dfac938
-
Marc Vef authored
-
Marc Vef authored
-
Alberto Miranda authored
Minor dependency script updates:
* adding more optional versions to the dependency scripts: RocksDB and syscall_intercept with glibc3 fix as more systems are updated to >glibc3
* Cloning libfabric instead of downloading the tarball. This is because it is configured with a specific autotools version which is not available on all systems. Cloning allows generating `configure` dynamically.
* daemon metadata backend now links to Dynamic Loader which is required by newer systems (not sure when this happened) and newer rocksdb versions
* removing python startup scripts since they are no longer supported and are confusing if part of the git repo

See merge request !57
dad26735 -
Marc Vef authored
-
Marc Vef authored
-
Marc Vef authored
-
Marc Vef authored
-
The `ChunkStorage` backend class on the daemon was throwing `system_errors` without being caught, crashing the server in the process. `ChunkStorage` now uses a designated error class for errors that might occur. In addition, the dependency on Argobots, which was used to trigger `ABT_eventuals`, was removed, laying the groundwork for future non-Argobots IO implementations. Further, the whole class was refactored for consistency and failure resistance.

A new class, `ChunkOperation`, is introduced. It wraps Argobots' IO task operations, which allows IO-queue-specific code to be removed from the RPC handlers, i.e., the read and write handlers. The idea is to separate eventuals, tasks, and their arguments from handler logic into a designated class. Therefore, an object of a class derived from `ChunkOperation` is instantiated within the handlers and drives all IO tasks. The corresponding code was added to the read and write RPC handlers. Note that `ChunkOperation` is not thread-safe and is supposed to be called by a single thread.

In addition, truncate was reworked for error handling (it crashed the server on error). It now uses the IO queue as well, since truncate causes a write operation and should not overtake IO tasks in the queue.

The chunk stat RPC handler was refactored for error handling and to use error codes as well.

Further minor changes:
- dead chunk stat code has been removed
- some namespaces were missing: `gkfs::rpc`
- more flexible handler cleanup and response code
- fixed a bug where the chunk dir wasn't removed when the metadata didn't exist on the same node
bcb30ac2 -
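The commit message above describes replacing uncaught `system_error` exceptions with a designated error class that handlers can catch and translate into error codes. The following is a minimal sketch of that pattern, not the actual GekkoFS code: the names `ChunkStorageErrorSketch`, `remove_chunk_dir_sketch`, and `handle_remove_sketch` are hypothetical, and the empty-path failure condition is an assumption chosen only to trigger the error path.

```cpp
#include <cerrno>
#include <iostream>
#include <string>
#include <system_error>

// Hypothetical designated error type: carries an errno-style code
// together with a human-readable message.
class ChunkStorageErrorSketch : public std::system_error {
public:
    ChunkStorageErrorSketch(int err_code, const std::string& msg)
        : std::system_error(err_code, std::generic_category(), msg) {}
};

// Hypothetical storage call. It reports failure by throwing the
// designated error instead of a bare std::system_error.
void remove_chunk_dir_sketch(const std::string& path) {
    if (path.empty()) {
        // Assumed failure condition, for illustration only.
        throw ChunkStorageErrorSketch(ENOENT, "chunk directory does not exist");
    }
    // ... the actual directory removal would happen here ...
}

// Hypothetical handler. It catches the storage error and returns its
// code to the caller, so the exception never escapes and the process
// keeps running.
int handle_remove_sketch(const std::string& path) {
    try {
        remove_chunk_dir_sketch(path);
        return 0;
    } catch (const ChunkStorageErrorSketch& e) {
        std::cerr << "storage error: " << e.what() << '\n';
        return e.code().value();
    }
}
```

Deriving from `std::system_error` keeps the errno value available through `code()`, which makes it straightforward to forward as an RPC response code; whether the real class does this is not stated in the commit message.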
f989f1f2
-
a9adb5e3
-
f9ddc8a8
-
80a034f6
-
07f62710
-
27839240
-
801f51ae
-
11f4107c
-
9700cccd
-
f00581ef
-
f2927d77
-
9ab18557
-
23483f6c
-
2e2c190b
-
e1054ae7
-
Alberto Miranda authored
Resolve "[daemon] chunk storage backend crashes the server on error"

Closes #75

The `ChunkStorage` backend class on the daemon was throwing `system_errors` without being caught, crashing the server in the process. `ChunkStorage` now uses a designated error class for errors that might occur. In addition, the dependency on Argobots, which was used to trigger `ABT_eventuals`, was removed, laying the groundwork for future non-Argobots IO implementations. Further, the whole class was refactored for consistency and failure resistance.

A new class, `ChunkOperation`, is introduced. It wraps Argobots' IO task operations, which allows IO-queue-specific code to be removed from the RPC handlers, i.e., the read and write handlers. The idea is to separate eventuals, tasks, and their arguments from handler logic into a designated class. Therefore, an object of a class derived from `ChunkOperation` is instantiated within the handlers and drives all IO tasks. The corresponding code was added to the read and write RPC handlers. Note that `ChunkOperation` is not thread-safe and is supposed to be called by a single thread.

In addition, truncate was reworked for error handling (it crashed the server on error). It now uses the IO queue as well, since truncate causes a write operation and should not overtake IO tasks in the queue.

The chunk stat RPC handler was refactored for error handling and to use error codes as well.

Further minor changes:
- dead chunk stat code has been removed
- some namespaces were missing: `gkfs::rpc`
- more flexible handler cleanup and response code
- fixed a bug where the chunk dir wasn't removed when the metadata didn't exist on the same node

Misc: There was some discussion about putting the removal of the chunk directory into the IO queue as well, with the same argument as for truncate, but I refrained from doing so as it would likely have a notable impact on remove performance. I think we can put this under *eventual consistency* and call it a day for now. Truncate was another story, as glibc makes heavy use of truncate in various operations.

See merge request hpc/gekkofs!32
7778f598 -
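The merge request description above separates tasks and their arguments from handler logic by instantiating an object of a class derived from `ChunkOperation` inside each handler. A rough sketch of that structure follows. All names here are hypothetical. The tasks run synchronously when waited on, as a stand-in for Argobots tasks and eventuals; this is an assumption of the sketch and not how the daemon schedules IO.

```cpp
#include <cerrno>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical base class: owns the queued per-chunk tasks so that the
// handlers do not have to. Not thread-safe; use from a single thread.
class ChunkOperationSketch {
public:
    virtual ~ChunkOperationSketch() = default;

    // Runs every queued task in submission order and returns the first
    // nonzero error code, or 0 if all tasks succeeded.
    int wait_for_tasks() {
        int first_err = 0;
        for (auto& task : tasks_) {
            const int err = task();
            if (err != 0 && first_err == 0) {
                first_err = err;
            }
        }
        tasks_.clear();
        return first_err;
    }

protected:
    void add_task(std::function<int()> task) {
        tasks_.push_back(std::move(task));
    }

private:
    std::vector<std::function<int()>> tasks_;
};

// Hypothetical derived class for writes: keeps task arguments and
// results out of the handler.
class ChunkWriteOperationSketch : public ChunkOperationSketch {
public:
    // Queues a write of `size` bytes to chunk `chunk_id`.
    void write_nonblock(std::size_t chunk_id, std::size_t size) {
        add_task([this, chunk_id, size]() -> int {
            (void) chunk_id;              // a real task would address this chunk
            if (size == 0) return EINVAL; // assumed failure, for illustration
            written_ += size;
            return 0;
        });
    }

    std::size_t bytes_written() const { return written_; }

private:
    std::size_t written_ = 0;
};

// Hypothetical handler: creates the operation object, queues one task
// per chunk, then waits and reports an error code plus the byte count.
int handle_write_sketch(const std::vector<std::size_t>& chunk_sizes,
                        std::size_t& total) {
    ChunkWriteOperationSketch op;
    for (std::size_t i = 0; i < chunk_sizes.size(); ++i) {
        op.write_nonblock(i, chunk_sizes[i]);
    }
    const int err = op.wait_for_tasks();
    total = op.bytes_written();
    return err;
}
```

With this split, the handler contains only the loop over chunks and the final response, while task bookkeeping and error aggregation live in the operation object.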
Marc Vef authored a765f138
-
Ramon Nou authored
data_integrity test remaining statx() to stat()

Closes #115

See merge request hpc/gekkofs!59
12ac0bc9
include/daemon/backend/data/data_module.hpp
0 → 100644
include/daemon/backend/data/file_handle.hpp
0 → 100644
include/daemon/ops/data.hpp
0 → 100644
scripts/shutdown_gkfs.py
deleted
100755 → 0
scripts/startup_gkfs.py
deleted
100755 → 0
scripts/util/__init__.py
deleted
100644 → 0
scripts/util/util.py
deleted
100755 → 0