README.md +73 −0

@@ -33,6 +33,7 @@ to I/O, which reduces interferences and improves performance.

- [Server-side statistics via Prometheus](#server-side-statistics-via-prometheus)
- [GekkoFS proxy](#gekkofs-proxy)
- [File system expansion](#file-system-expansion)
- [File system shrinking](#file-system-shrinking)
- [Miscellaneous](#miscellaneous)
- [External functions](#external-functions)
- [Data placement](#data-placement)

@@ -508,6 +509,78 @@ srun: sending Ctrl-C to StepId=282378.2

```
* [gkfs] Shutdown time: 1.032 seconds
```

## File system shrinking

GekkoFS supports **shrinking** the current daemon configuration: one or more nodes are removed from the cluster while all existing data and metadata are safely redistributed to the remaining nodes. As with expansion, it is the user's responsibility not to access the file system during redistribution.

Shrinking uses the same `gkfs_malleability` tool (built with `-DGKFS_BUILD_TOOLS=ON`) and requires two hostfiles:

| File | Description |
|---|---|
| `gkfs_hosts.txt` | Current (old) hostfile, set via `LIBGKFS_HOSTS_FILE` |
| `gkfs_hosts_new.txt` | New hostfile listing **only** the surviving nodes |

### Step-by-step

**1. Create the new hostfile** containing only the nodes that should remain after the shrink. The format is identical to `gkfs_hosts.txt`, and the order of entries does not matter: any node present in the old file but absent from the new file will be removed.

```bash
# Example: remove node4, keep node1–node3
# (-w matches whole words only, so entries such as node40 are kept)
grep -vw node4 gkfs_hosts.txt > gkfs_hosts_new.txt
```

**2. Start the shrink** process. Each surviving node redistributes the data that was owned by the removed nodes, and the removed nodes forward all their data to the surviving nodes. The removed daemons keep running until they are shut down in step 5:

```bash
LIBGKFS_HOSTS_FILE=gkfs_hosts.txt \
  gkfs_malleability shrink --new-hosts-file gkfs_hosts_new.txt start
Shrink process from 4 nodes to 3 nodes launched...
```

The old and new node counts are **auto-detected** from the respective hostfiles. They can be overridden with `--old-nodes <N>` and `--new-nodes <N>` if needed.

**3. Poll status** until all nodes have finished:

```bash
LIBGKFS_HOSTS_FILE=gkfs_hosts.txt gkfs_malleability shrink status
No shrink running/finished.
```

While a shrink is still active, the command reports the number of pending nodes instead, for example: `Shrink in progress: 2 nodes not finished.`

**4. Finalize** the shrink. This disables maintenance mode on all remaining daemons and atomically replaces `gkfs_hosts.txt` with `gkfs_hosts_new.txt`:

```bash
LIBGKFS_HOSTS_FILE=gkfs_hosts.txt \
  gkfs_malleability shrink --new-hosts-file gkfs_hosts_new.txt finalize
Shrink finalize 0
Hosts file updated: gkfs_hosts_new.txt -> gkfs_hosts.txt
```

After finalize, `gkfs_hosts.txt` contains only the surviving nodes, and all clients automatically use the updated configuration on their next initialization.

**5. Shut down the removed daemons** (they have already forwarded all their data but are still running):

```bash
# Send SIGTERM to each daemon on the removed nodes
pdsh -w node4 'kill $(cat /tmp/gkfs_daemon.pid)'
```

### Environment variables

| Variable | Description |
|---|---|
| `LIBGKFS_HOSTS_FILE` | Path to the **current** (old) hosts file |
| `LIBGKFS_HOSTS_FILE_NEW` | Alternative to `--new-hosts-file` for the new hosts file |

# Miscellaneous

## External functions
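The hostfile rule from step 1 of the README (a node is removed when it appears in the old hostfile but not in the new one) and the auto-detected node counts from step 2 can be sketched as follows. This is only an illustration, not the `gkfs_malleability` implementation: `ShrinkPlan`, `parse_hostnames` and `plan_shrink` are hypothetical names, and the sketch assumes that every hostfile line starts with the hostname followed by whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not part of GekkoFS.
struct ShrinkPlan {
    std::size_t old_nodes;            // entries in the old hostfile
    std::size_t new_nodes;            // entries in the new hostfile
    std::vector<std::string> removed; // hosts that leave the cluster
};

// Returns the first whitespace-separated token of every non-blank line.
// ASSUMPTION: that token is the hostname.
std::vector<std::string>
parse_hostnames(const std::string& hostfile_text) {
    std::vector<std::string> hosts;
    std::istringstream lines(hostfile_text);
    std::string line;
    while(std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string host;
        if(fields >> host) {
            hosts.push_back(host);
        }
    }
    return hosts;
}

// Derives both node counts and the removed hosts from the hostfile contents.
// The order of entries in the new hostfile is irrelevant.
ShrinkPlan
plan_shrink(const std::string& old_text, const std::string& new_text) {
    const std::vector<std::string> old_hosts = parse_hostnames(old_text);
    const std::vector<std::string> new_hosts = parse_hostnames(new_text);
    const std::set<std::string> survivors(new_hosts.begin(), new_hosts.end());

    ShrinkPlan plan{old_hosts.size(), new_hosts.size(), {}};
    for(const std::string& host : old_hosts) {
        if(survivors.count(host) == 0) {
            plan.removed.push_back(host);
        }
    }
    // Removed hosts are drawn from the old file, so there cannot be more.
    assert(plan.removed.size() <= plan.old_nodes);
    return plan;
}
```

For the README example (four hosts, `node4` dropped), `plan_shrink` reports 4 old nodes, 3 new nodes and `node4` as the only removed host, regardless of how the surviving entries are ordered.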
include/client/rpc/forward_malleability.hpp +12 −0

@@ -37,6 +37,8 @@

```cpp
  SPDX-License-Identifier: LGPL-3.0-or-later
*/

#include <string>

#ifndef GEKKOFS_CLIENT_FORWARD_MALLEABILITY_HPP
#define GEKKOFS_CLIENT_FORWARD_MALLEABILITY_HPP
```

@@ -50,6 +52,16 @@ forward_expand_status();

```cpp
int
forward_expand_finalize();

int
forward_shrink_start(int old_server_conf, int new_server_conf,
                     const std::string& new_hosts_file);

int
forward_shrink_status();

int
forward_shrink_finalize();

} // namespace gkfs::malleable::rpc

#endif // GEKKOFS_CLIENT_FORWARD_MALLEABILITY_HPP
```
include/client/user_functions.hpp +26 −0

@@ -179,6 +179,32 @@ expand_status();

```cpp
 */
int
expand_finalize();

/**
 * @brief Start a shrinking of the file system
 * @param old_server_conf old number of nodes
 * @param new_server_conf new number of nodes
 * @param new_hosts_file path to hostfile containing only the surviving nodes
 * @return error code
 */
int
shrink_start(int old_server_conf, int new_server_conf,
             const std::string& new_hosts_file);

/**
 * @brief Check for the current status of the shrinking process
 * @return 0 when finished, positive numbers indicate how many daemons
 * are still redistributing data
 */
int
shrink_status();

/**
 * @brief Finalize the shrinking process
 * @return error code
 */
int
shrink_finalize();

} // namespace malleable
} // namespace gkfs
```
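The three calls declared above map onto steps 2 to 4 of the README: start, poll until `shrink_status()` returns 0, then finalize. A minimal sketch of such a driver follows. `ShrinkApi` and `run_shrink` are hypothetical names introduced here so the example compiles without the GekkoFS client library. Treating 0 as success for the start and finalize error codes, and a negative status as an error, are assumptions rather than documented behavior.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical bundle of the three shrink calls (not part of GekkoFS), so the
// driver can be exercised with stand-ins instead of the real client library.
struct ShrinkApi {
    std::function<int(int, int, const std::string&)> start;
    std::function<int()> status;
    std::function<int()> finalize;
};

// Runs start -> poll -> finalize.
// Returns 0 on success, the first non-zero code reported by a call, or -1 if
// the shrink did not finish within max_polls status checks.
int
run_shrink(const ShrinkApi& api, int old_nodes, int new_nodes,
           const std::string& new_hosts_file,
           std::chrono::milliseconds poll_interval, int max_polls) {
    // A shrink must leave at least one node and remove at least one.
    assert(new_nodes > 0 && new_nodes < old_nodes);

    const int start_err = api.start(old_nodes, new_nodes, new_hosts_file);
    if(start_err != 0) { // ASSUMPTION: 0 means success
        return start_err;
    }
    for(int i = 0; i < max_polls; ++i) {
        const int pending = api.status();
        if(pending == 0) {
            // All daemons finished redistributing: safe to finalize.
            return api.finalize();
        }
        if(pending < 0) { // ASSUMPTION: negative values signal an error
            return pending;
        }
        std::this_thread::sleep_for(poll_interval);
    }
    return -1; // gave up waiting; nothing was finalized
}
```

With the real library, the struct would presumably be filled as `ShrinkApi{gkfs::malleable::shrink_start, gkfs::malleable::shrink_status, gkfs::malleable::shrink_finalize}`. Finalizing only after the status reaches 0 mirrors the README's ordering, where the hosts file is replaced once redistribution is complete.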
include/common/common_defs.hpp +3 −0

@@ -129,6 +129,9 @@ namespace malleable::rpc::tag {

```cpp
constexpr auto expand_start = "rpc_srv_expand_start";
constexpr auto expand_status = "rpc_srv_expand_status";
constexpr auto expand_finalize = "rpc_srv_expand_finalize";
constexpr auto shrink_start = "rpc_srv_shrink_start";
constexpr auto shrink_status = "rpc_srv_shrink_status";
constexpr auto shrink_finalize = "rpc_srv_shrink_finalize";
// migrate data uses the write rpc
constexpr auto migrate_metadata = "rpc_srv_migrate_metadata";
} // namespace malleable::rpc::tag
```
include/common/rpc/rpc_types_thallium.hpp +11 −0

@@ -433,6 +433,17 @@ struct rpc_expand_start_in_t {

```cpp
    }
};

struct rpc_shrink_start_in_t {
    uint32_t old_server_conf;
    uint32_t new_server_conf;
    std::string new_hosts_file;
    template <class Archive>
    void
    serialize(Archive& ar) {
        ar(old_server_conf, new_server_conf, new_hosts_file);
    }
};

struct rpc_migrate_metadata_in_t {
    std::string key;
    std::string value;
```
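The single `serialize` template in `rpc_shrink_start_in_t` passes every field to the archive in a fixed order, so the same member function can serve both sending and receiving, assuming the archive follows the usual cereal-style convention. The sketch below illustrates that pattern with two toy archives. `ToyOutputArchive`, `ToyInputArchive`, `encode` and `decode` are hypothetical stand-ins written for this example and say nothing about Thallium's actual wire format. The struct is copied here only to keep the block self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Copy of the struct from the diff, repeated so the sketch compiles alone.
struct rpc_shrink_start_in_t {
    std::uint32_t old_server_conf;
    std::uint32_t new_server_conf;
    std::string new_hosts_file;
    template <class Archive>
    void
    serialize(Archive& ar) {
        ar(old_server_conf, new_server_conf, new_hosts_file);
    }
};

// Toy archive (NOT Thallium's): writes each field on its own line.
class ToyOutputArchive {
public:
    explicit ToyOutputArchive(std::ostream& out) : out_(out) {}
    void operator()() {} // end of recursion
    template <class T, class... Rest>
    void operator()(const T& first, const Rest&... rest) {
        out_ << first << '\n';
        (*this)(rest...);
    }
private:
    std::ostream& out_;
};

// Toy archive (NOT Thallium's): reads the fields back in the same order.
class ToyInputArchive {
public:
    explicit ToyInputArchive(std::istream& in) : in_(in) {}
    void operator()() {} // end of recursion
    template <class T, class... Rest>
    void operator()(T& first, Rest&... rest) {
        read(first);
        (*this)(rest...);
    }
private:
    void read(std::uint32_t& value) {
        std::string line;
        std::getline(in_, line);
        assert(!in_.fail()); // a field is missing if this fires
        value = static_cast<std::uint32_t>(std::stoul(line));
    }
    void read(std::string& value) {
        std::getline(in_, value);
        assert(!in_.fail());
    }
    std::istream& in_;
};

// Serializes the request with the toy output archive.
std::string
encode(rpc_shrink_start_in_t in) {
    std::ostringstream os;
    ToyOutputArchive ar(os);
    in.serialize(ar);
    return os.str();
}

// Rebuilds the request by running the same serialize() on an input archive.
rpc_shrink_start_in_t
decode(const std::string& wire) {
    std::istringstream is(wire);
    ToyInputArchive ar(is);
    rpc_shrink_start_in_t out{};
    out.serialize(ar);
    return out;
}
```

Because both directions go through the same `ar(...)` call, adding a field to the struct only requires adding it to that one argument list. The toy line-based encoding would break on strings containing newlines, which is acceptable for an illustration but not for a real transport.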