# Slurm Docker Cluster

This is a multi-container Slurm cluster using `docker-compose` with `sshd` and `systemd` enabled. The compose file creates named volumes for persistent storage of MySQL data files as well as Slurm state and log directories.

It is heavily based on work by [giovtorres/slurm-docker-cluster](https://github.com/giovtorres/slurm-docker-cluster).

## Containers, Networks, and Volumes

The compose file will run the following containers:

* `mysql`
* `slurmdbd`
* `slurmctld`
* `login` (slurmd)
* `c1`, `c2`, `c3`, `c4` (slurmd)

The compose file will create the following named volumes:

* `etc_munge` ( -> `/etc/munge` )
* `slurm_jobdir` ( -> `/data` )
* `var_lib_mysql` ( -> `/var/lib/mysql` )

The compose file will create the `slurm_cluster` network for all containers and will assign the following static IPv4 addresses:

* `slurmctld`: 192.18.0.129
* `c1`: 192.18.0.10
* `c2`: 192.18.0.11
* `c3`: 192.18.0.12
* `c4`: 192.18.0.13
* `login`: 192.18.0.128

## Package contents

- `docker-compose.yml`: docker-compose file for running the cluster.
- `slurm-docker-cluster/Dockerfile`: Dockerfile for building the main cluster services.
- `slurm-docker-cluster-node/Dockerfile`: Dockerfile with the software specific to the compute nodes (specific to scord). NEEDS TO BE BUILT BEFORE RUNNING THE CLUSTER.
- `scripts/register_cluster.sh`: script for registering the cluster with the `slurmdbd` daemon.
- `scripts/refresh.sh`: script for refreshing the scord installation in the cluster. It uses the `slurm-docker-cluster-node` image to generate the binaries so that there are no compatibility issues with dependencies (see the invocation sketch after this list). The script relies on the following variables:
  - `REPO`: the repository where the `scord` source code is located.
  - `VOLUMES`: the host directory where the output of the build process will be placed.
  - `USER`: the container user that should run the build process (so that file ownership matches between the host and container users).

  The `scord` build process relies on a CMake preset for Rocky Linux that has been configured to match the container environment:

  ```json
  {
    "name": "rocky",
    "displayName": "Rocky Linux",
    "description": "Build options for Rocky Linux",
    "inherits": "base",
    "environment": {
      "PKG_CONFIG_PATH": "/usr/lib/pkgconfig;/usr/lib64/pkgconfig"
    },
    "generator": "Unix Makefiles",
    "cacheVariables": {
      "CMAKE_CXX_COMPILER_LAUNCHER": "",
      "CMAKE_C_COMPILER_LAUNCHER": "",
      "CMAKE_CXX_FLAGS": "-fdiagnostics-color=always",
      "CMAKE_C_FLAGS": "-fdiagnostics-color=always",
      "CMAKE_PREFIX_PATH": "/usr/lib;/usr/lib64",
      "CMAKE_INSTALL_PREFIX": "/scord_prefix",
      "SCORD_BUILD_EXAMPLES": true,
      "SCORD_BUILD_TESTS": true,
      "SCORD_BIND_ADDRESS": "192.18.0.128"
    }
  }
  ```

- `volumes`: directory for the volumes used by the cluster:
  - `etc_munge`: munge configuration files. A shared `munge.key` needs to be generated and placed here.
  - `etc_slurm`: slurm configuration files. At least a `slurm.conf` file needs to be placed here, configured with the compute node and partition information. For example:

    ```conf
    # COMPUTE NODES
    NodeName=c[1-4] RealMemory=1000 State=UNKNOWN

    # PARTITIONS
    PartitionName=normal Default=yes Nodes=c[1-4] Priority=50 DefMemPerCPU=500 Shared=NO MaxNodes=4 MaxTime=5-00:00:00 DefaultTime=5-00:00:00 State=UP
    ```

  - `etc_ssh`: ssh configuration files. Server keys and configuration files should be placed here.
  - `ld.so.conf.d`: ld.so configuration files.
  - `scord_prefix`: scord installation directory. The scord installation should be placed here. This should match the directory outside the container where the binaries are generated.
  - `user_home`: user home directory. Any files and directories that we want to have available in all compute nodes (e.g. `.ssh`) should be added here.
- `docker-entrypoint.sh`: overridden entry point.
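How `scripts/refresh.sh` actually reads `REPO`, `VOLUMES`, and `USER` is not documented here, so the following is only a sketch that assumes they are passed as environment variables; the paths and user name are placeholders:

```bash
# Hypothetical invocation of scripts/refresh.sh; the variable names come from
# the list above, but passing them as environment variables is an assumption.
REPO="$HOME/src/scord" \
VOLUMES="$PWD/volumes" \
USER="example-user" \
    scripts/refresh.sh

# Inside the node image, the build is expected to configure scord with the
# "rocky" CMake preset shown above (e.g. `cmake --preset rocky`).
```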
## Build arguments

The following build arguments are available:

* `SLURM_TAG`: The Slurm Git tag to build. Defaults to `slurm-21-08-6-1`.
* `GOSU_VERSION`: The gosu version to install. Defaults to `1.11`.
* `SHARED_USER_NAME`: The name of the user that will be shared with the cluster. Defaults to `user`.
* `SHARED_USER_UID`: The UID of the user that will be shared with the cluster. Defaults to `1000`.
* `SHARED_GROUP_NAME`: The name of the group that will be shared with the cluster. Defaults to `user`.
* `SHARED_GROUP_GID`: The GID of the group that will be shared with the cluster. Defaults to `1000`.

## Configuration

To run, the cluster services expect some files to be present in the host system. The simplest way to provide them is to place the files in the `volumes` directory with the correct ownership and permissions so that they can be mounted into the containers. The `volumes` directory should be placed in the same directory as the `docker-compose.yml` file and should have the following structure:

```bash
volumes/
├── docker-entrypoint.sh -> /usr/local/bin/docker-entrypoint.sh
├── etc_munge            -> /etc/munge
├── etc_slurm            -> /etc/slurm
├── etc_ssh              -> /etc/ssh
├── ld.so.conf.d         -> /etc/ld.so.conf.d
└── user_home            -> /home/$SHARED_USER_NAME
```
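The shared `munge.key` and the SSH server keys need to exist before the first start. A minimal sketch for generating them on the host, assuming the standard `dd` and `ssh-keygen` tools are available (adjust paths and key types as needed):

```bash
# Generate a shared munge key (1 KiB of random data, one common approach).
dd if=/dev/urandom bs=1 count=1024 of=volumes/etc_munge/munge.key

# Generate SSH host keys for the sshd running inside the containers.
# '-N ""' means no passphrase. The legacy dsa key shown in the listing
# below can usually be omitted.
for type in rsa ecdsa ed25519; do
    ssh-keygen -t "$type" -N "" -f "volumes/etc_ssh/ssh_host_${type}_key"
done
```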
The following ownership and permissions should be set for the cluster to work properly. The `slurm` and `munge` users are not actually required to exist in the host system, since they are created automatically while building the images, though it helps to create them rather than having a raw numeric UID/GID pop up each time `ls` is called. Note, however, that if they are created in the host, the `slurm` and `munge` users/groups need to have the same UIDs/GIDs in the host and container systems.

```bash
volumes
├── [-rwxrwxr-x example-user example-user 1.9K Jun 29 16:30] docker-entrypoint.sh
├── [drwxrwxr-x munge munge 4.0K Jun 17 09:11] etc_munge
│   └── [-r-------- munge munge 1.0K Jun 17 09:11] munge.key
├── [drwxrwxr-x slurm slurm 4.0K Jul 4 09:49] etc_slurm
│   ├── [-rw-r--r-- slurm slurm 216 Jun 16 15:48] cgroup.conf.example
│   ├── [-rw-r--r-- slurm slurm 213 Jun 30 14:28] plugstack.conf
│   ├── [drwxrwxr-x slurm slurm 4.0K Jun 16 16:13] plugstack.conf.d
│   ├── [-rw-r--r-- slurm slurm 2.2K Jun 23 15:24] slurm.conf
│   ├── [-rw-r--r-- slurm slurm 3.0K Jun 16 15:48] slurm.conf.example
│   ├── [-rw------- slurm slurm 722 Jun 16 15:48] slurmdbd.conf
│   └── [-rw-r--r-- slurm slurm 745 Jun 16 15:48] slurmdbd.conf.example
├── [drwxrwxr-x example-user example-user 4.0K Jun 29 12:46] etc_ssh
│   ├── [-rw------- root root 3.6K May 9 19:14] sshd_config
│   ├── [drwx------ root root 4.0K Jun 29 12:46] sshd_config.d [error opening dir]
│   ├── [-rw------- root root 1.4K Jun 29 11:17] ssh_host_dsa_key
│   ├── [-rw-r--r-- root root 600 Jun 29 11:17] ssh_host_dsa_key.pub
│   ├── [-rw------- root root 505 Jun 29 11:26] ssh_host_ecdsa_key
│   ├── [-rw-r--r-- root root 172 Jun 29 11:26] ssh_host_ecdsa_key.pub
│   ├── [-rw------- root root 399 Jun 29 11:26] ssh_host_ed25519_key
│   ├── [-rw-r--r-- root root 92 Jun 29 11:26] ssh_host_ed25519_key.pub
│   ├── [-rw------- root root 2.5K Jun 29 11:26] ssh_host_rsa_key
│   └── [-rw-r--r-- root root 564 Jun 29 11:26] ssh_host_rsa_key.pub
├── [drwxrwxr-x example-user example-user 4.0K Jun 19 10:46] ld.so.conf.d
├── [drwxrwxr-x example-user example-user 4.0K Jun 20 11:20] scord_prefix
└── [drwxr-xr-x example-user example-user 4.0K Jul 7 08:27] user_home

42 directories, 149 files
```
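The following commands are a hedged sketch of how to reproduce the ownership and modes above; they assume the `slurm` and `munge` users/groups have been created on the host with UIDs/GIDs matching the containers (otherwise use the numeric IDs directly):

```bash
# Assumes host users/groups 'munge' and 'slurm' exist with the same UIDs/GIDs
# as inside the containers.
sudo chown -R munge:munge volumes/etc_munge
sudo chmod 400 volumes/etc_munge/munge.key
sudo chown -R slurm:slurm volumes/etc_slurm
sudo chmod 600 volumes/etc_slurm/slurmdbd.conf
sudo chown root:root volumes/etc_ssh/sshd_config volumes/etc_ssh/ssh_host_*
sudo chmod 600 volumes/etc_ssh/sshd_config volumes/etc_ssh/ssh_host_*_key
sudo chmod 644 volumes/etc_ssh/ssh_host_*.pub
chmod +x volumes/docker-entrypoint.sh
```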
## Optional configurations

### Cluster registration

Though it's not required for the cluster to work properly, the newly created cluster can be registered with the internal `slurmdbd` daemon. To do so, run the `scripts/register_cluster.sh` script:

```console
scripts/register_cluster.sh
```

### Enabling name resolution

Though the cluster will internally be able to resolve the names of each service, the host will be unable to do so. The simplest solution is to edit the `/etc/hosts` file and add entries for the services which have static IPv4 addresses assigned:

```
192.18.0.128 login
192.18.0.129 slurmctld
192.18.0.10  c1
192.18.0.11  c2
192.18.0.12  c3
192.18.0.13  c4
```

## Usage

1. Find out the UID and GID of the host user that will be shared with the cluster. This can be done by running `id` in the host machine.

2. Build the services, making sure to set the `SHARED_USER_NAME`, `SHARED_USER_UID`, `SHARED_GROUP_NAME`, and `SHARED_GROUP_GID` build arguments to the values obtained in Step 1:

   ```shell
   $ docker compose build \
       --build-arg SHARED_USER_NAME=example-user \
       --build-arg SHARED_USER_UID=1000 \
       --build-arg SHARED_GROUP_NAME=example-user \
       --build-arg SHARED_GROUP_GID=1000
   ```

3. Start the cluster with `docker compose up -d`.

4. You can log into any of the cluster containers as root with `docker compose exec <service> bash` (e.g. `docker compose exec c1 bash`).

5. Alternatively, if ssh keys for the shared user have been configured in the `user_home` volume and the host's `/etc/hosts` file has been updated to include the cluster's IP addresses and hostnames, you can log into the cluster login or compute nodes as `$SHARED_USER_NAME` with ssh. For example, if the shared user is `example-user`:

   ```bash
   [example-user@host]$ ssh example-user@login
   ```

6. Jobs can be submitted to the cluster by ssh-ing into the `login` container and using the typical Slurm commands:

   ```bash
   [example-user@host]$ ssh example-user@login
   [example-user@login]$ sinfo
   PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
   normal*      up 5-00:00:00      4   idle c[1-4]
   [example-user@login]$ srun -N 4 hostname
   c2
   c3
   c1
   c4
   ```
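Batch submission works the same way. The script below is only a sketch (job name, file name, and node count are arbitrary); the `normal` partition matches the example `slurm.conf` shown earlier:

```bash
#!/bin/bash
# Minimal example batch script (hypothetical). The 'normal' partition comes
# from the example slurm.conf above; '%j' expands to the job ID.
#SBATCH --job-name=hello
#SBATCH --partition=normal
#SBATCH --nodes=2
#SBATCH --output=hello-%j.out

srun hostname
```

Submit it from the login node with `sbatch hello.sbatch`, preferably from a directory shared with the compute nodes (e.g. the shared home or `/data`) so that the output file is visible everywhere.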