# Slurm Docker Cluster

This is a multi-container Slurm cluster using `docker-compose` with `sshd` and `systemd` enabled. The compose file creates named volumes for persistent storage of MySQL data files as well as Slurm state and log directories.

It is heavily based on work by [giovtorres/slurm-docker-cluster](https://github.com/giovtorres/slurm-docker-cluster).

## Containers, Networks, and Volumes

The compose file will run the following containers:

* `mysql`
* `slurmdbd`
* `slurmctld`
* `login` (slurmd)
* `c1`, `c2`, `c3`, `c4` (slurmd)

The compose file will create the following named volumes:

* `etc_munge` ( -> `/etc/munge` )
* `slurm_jobdir` ( -> `/data` )
* `var_lib_mysql` ( -> `/var/lib/mysql` )

The compose file will create the `slurm_cluster` network for all containers and will assign the following static IPv4 addresses:

* slurmctld: 192.18.0.129
* c1: 192.18.0.10
* c2: 192.18.0.11
* c3: 192.18.0.12
* c4: 192.18.0.13
* login: 192.18.0.128

## Package contents

- `docker-compose.yml`: docker-compose file for running the cluster.
- `slurm-docker-cluster/Dockerfile`: Dockerfile for building the main cluster services.
- `slurm-docker-cluster-node/Dockerfile`: Dockerfile with the software specific to the compute nodes (i.e. for scord). **This image needs to be built before running the cluster.**
- `refresh.sh`: script for refreshing the scord installation in the cluster (see the example sketch after this list). The script uses the `slurm-docker-cluster-node` image to generate the binaries so that there are no compatibility issues with dependencies. It relies on the following variables:
  - `REPO`: the repository where the `scord` source code is located.
  - `VOLUMES`: the host directory where the output of the build process will be placed.
  - `USER`: the container user that should run the build process (so that ownership matches between the host user and the container user).

  The `scord` build process relies on a CMake preset for Rocky Linux that has been configured to match the container environment:

  ```json
  {
    "name": "rocky",
    "displayName": "Rocky Linux",
    "description": "Build options for Rocky Linux",
    "inherits": "base",
    "environment": {
      "PKG_CONFIG_PATH": "/usr/lib/pkgconfig;/usr/lib64/pkgconfig"
    },
    "generator": "Unix Makefiles",
    "cacheVariables": {
      "CMAKE_CXX_COMPILER_LAUNCHER": "",
      "CMAKE_C_COMPILER_LAUNCHER": "",
      "CMAKE_CXX_FLAGS": "-fdiagnostics-color=always",
      "CMAKE_C_FLAGS": "-fdiagnostics-color=always",
      "CMAKE_PREFIX_PATH": "/usr/lib;/usr/lib64",
      "CMAKE_INSTALL_PREFIX": "/scord_prefix",
      "SCORD_BUILD_EXAMPLES": true,
      "SCORD_BUILD_TESTS": true,
      "SCORD_BIND_ADDRESS": "192.18.0.128"
    }
  }
  ```

- `volumes`: directory for the volumes used by the cluster:
  - `etc_munge`: munge configuration files. A shared `munge.key` needs to be generated and placed here.
  - `etc_slurm`: slurm configuration files. At least a `slurm.conf` file needs to be placed here, configured with the compute node and partition information. For example:

    ```conf
    # COMPUTE NODES
    NodeName=c[1-4] RealMemory=1000 State=UNKNOWN

    # PARTITIONS
    PartitionName=normal Default=yes Nodes=c[1-4] Priority=50 DefMemPerCPU=500 Shared=NO MaxNodes=4 MaxTime=5-00:00:00 DefaultTime=5-00:00:00 State=UP
    ```

  - `etc_ssh`: ssh configuration files. Server keys and configuration files should be placed here.
  - `ld.so.conf.d`: ld.so configuration files.
  - `scord_prefix`: scord installation directory. The scord installation should be placed here; it should match the directory outside the container where the binaries are generated.
  - `user_home`: user home directory. Any files and directories that should be available on all compute nodes (e.g. `.ssh`) should be added here.
- `docker-entrypoint.sh`: the overridden container entry point.
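The listing below is a minimal sketch of the build flow that `refresh.sh` automates, assuming the node image is tagged `slurm-docker-cluster-node` and that `REPO`, `VOLUMES`, and `USER` are set as described above. The default values shown are hypothetical and the actual script may differ in its details:

```bash
#!/usr/bin/env bash
# Illustrative sketch of the refresh.sh workflow (not the actual script):
# compile scord inside the slurm-docker-cluster-node image so the binaries
# link against the same dependencies as the compute nodes, then install them
# into the scord_prefix volume shared with the host.

REPO=${REPO:-"$HOME/src/scord"}     # hypothetical default: host checkout of scord
VOLUMES=${VOLUMES:-"$PWD/volumes"}  # hypothetical default: host volumes directory
USER=${USER:-example-user}          # container user matching the host UID/GID

docker run --rm \
  --user "${USER}" \
  --volume "${REPO}:/scord" \
  --volume "${VOLUMES}/scord_prefix:/scord_prefix" \
  --workdir /scord \
  slurm-docker-cluster-node \
  bash -c 'cmake -S . -B build --preset rocky \
           && cmake --build build -j \
           && cmake --install build'
# The "rocky" preset sets CMAKE_INSTALL_PREFIX=/scord_prefix, so the install
# step lands directly in the mounted scord_prefix volume.
```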
## Build arguments

The following build arguments are available:

* `SLURM_TAG`: The Slurm Git tag to build. Defaults to `slurm-21-08-6-1`.
* `GOSU_VERSION`: The gosu version to install. Defaults to `1.11`.
* `SHARED_USER_NAME`: The name of the user that will be shared with the cluster. Defaults to `user`.
* `SHARED_USER_UID`: The UID of the user that will be shared with the cluster. Defaults to `1000`.
* `SHARED_GROUP_NAME`: The name of the group that will be shared with the cluster. Defaults to `user`.
* `SHARED_GROUP_GID`: The GID of the group that will be shared with the cluster. Defaults to `1000`.

## Configuration

To run, the cluster services expect some files to be present on the host system. The simplest way to provide them is to place the files in the `volumes` directory with the correct ownership and permissions so that they can be mounted into the containers. The `volumes` directory should live in the same directory as the `docker-compose.yml` file and should have the following structure:

```bash
volumes/
├── docker-entrypoint.sh -> /usr/local/bin/docker-entrypoint.sh
├── etc_munge -> /etc/munge
├── etc_slurm -> /etc/slurm
├── etc_ssh -> /etc/ssh
├── ld.so.conf.d -> /etc/ld.so.conf.d
└── user_home -> /home/$SHARED_USER_NAME
```
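As an illustration, the commands below show one way to seed these directories from the host. This is a sketch assuming the `munge` and `slurm` users/groups exist locally with matching UIDs/GIDs and that suitable `slurm.conf`/`slurmdbd.conf` files are already at hand; adjust names, paths, and ownership to your setup:

```bash
# Illustrative seeding of the volumes directory from the host (assumes the
# munge and slurm users/groups exist locally with the same UIDs/GIDs as in
# the containers; otherwise chown with the numeric IDs instead).

# Shared munge key, readable only by the munge user.
dd if=/dev/urandom bs=1 count=1024 of=volumes/etc_munge/munge.key
chmod 400 volumes/etc_munge/munge.key
sudo chown munge:munge volumes/etc_munge/munge.key

# Host keys for the sshd instances running in the login and compute nodes.
ssh-keygen -t rsa     -N '' -f volumes/etc_ssh/ssh_host_rsa_key
ssh-keygen -t ecdsa   -N '' -f volumes/etc_ssh/ssh_host_ecdsa_key
ssh-keygen -t ed25519 -N '' -f volumes/etc_ssh/ssh_host_ed25519_key

# Slurm configuration files, owned by the slurm user.
cp slurm.conf slurmdbd.conf volumes/etc_slurm/
chmod 600 volumes/etc_slurm/slurmdbd.conf
sudo chown -R slurm:slurm volumes/etc_slurm
```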
The following ownership and permissions should be set for the cluster to work properly. The `slurm` and `munge` users are not actually required to exist on the host system, since they are created automatically while building the images, but it helps to create them locally so that `ls` shows proper user names instead of bare numeric IDs. Note, however, that if they are created on the host, the `slurm` and `munge` users/groups need to have the same UIDs/GIDs on the host and in the containers.

```bash
volumes
├── [-rwxrwxr-x example-user example-user 1.9K Jun 29 16:30] docker-entrypoint.sh
├── [drwxrwxr-x munge        munge        4.0K Jun 17 09:11] etc_munge
│   └── [-r-------- munge        munge        1.0K Jun 17 09:11] munge.key
├── [drwxrwxr-x slurm        slurm        4.0K Jul  4 09:49] etc_slurm
│   ├── [-rw-r--r-- slurm        slurm         216 Jun 16 15:48] cgroup.conf.example
│   ├── [-rw-r--r-- slurm        slurm         213 Jun 30 14:28] plugstack.conf
│   ├── [drwxrwxr-x slurm        slurm        4.0K Jun 16 16:13] plugstack.conf.d
│   ├── [-rw-r--r-- slurm        slurm        2.2K Jun 23 15:24] slurm.conf
│   ├── [-rw-r--r-- slurm        slurm        3.0K Jun 16 15:48] slurm.conf.example
│   ├── [-rw------- slurm        slurm         722 Jun 16 15:48] slurmdbd.conf
│   └── [-rw-r--r-- slurm        slurm         745 Jun 16 15:48] slurmdbd.conf.example
├── [drwxrwxr-x example-user example-user 4.0K Jun 29 12:46] etc_ssh
│   ├── [-rw------- root         root         3.6K May  9 19:14] sshd_config
│   ├── [drwx------ root         root         4.0K Jun 29 12:46] sshd_config.d [error opening dir]
│   ├── [-rw------- root         root         1.4K Jun 29 11:17] ssh_host_dsa_key
│   ├── [-rw-r--r-- root         root          600 Jun 29 11:17] ssh_host_dsa_key.pub
│   ├── [-rw------- root         root          505 Jun 29 11:26] ssh_host_ecdsa_key
│   ├── [-rw-r--r-- root         root          172 Jun 29 11:26] ssh_host_ecdsa_key.pub
│   ├── [-rw------- root         root          399 Jun 29 11:26] ssh_host_ed25519_key
│   ├── [-rw-r--r-- root         root           92 Jun 29 11:26] ssh_host_ed25519_key.pub
│   ├── [-rw------- root         root         2.5K Jun 29 11:26] ssh_host_rsa_key
│   └── [-rw-r--r-- root         root          564 Jun 29 11:26] ssh_host_rsa_key.pub
├── [drwxrwxr-x example-user example-user 4.0K Jun 19 10:46] ld.so.conf.d
├── [drwxrwxr-x example-user example-user 4.0K Jun 20 11:20] scord_prefix
└── [drwxr-xr-x example-user example-user 4.0K Jul  7 08:27] user_home

42 directories, 149 files
```

## Usage

1. Find out the UID and GID of the host user that will be shared with the cluster by running `id` on the host machine.

2. Build the services, making sure to set the `SHARED_USER_NAME`, `SHARED_USER_UID`, `SHARED_GROUP_NAME`, and `SHARED_GROUP_GID` build arguments to the values obtained in Step 1:

   ```shell
   $ docker compose build \
       --build-arg SHARED_USER_NAME=example-user \
       --build-arg SHARED_USER_UID=1000 \
       --build-arg SHARED_GROUP_NAME=example-user \
       --build-arg SHARED_GROUP_GID=1000
   ```

3. Run the cluster with `docker compose up -d`.

4. You can log into any of the cluster containers as root with `docker compose exec <service> bash` (e.g. `docker compose exec slurmctld bash`).

5. Alternatively, if ssh keys for the shared user have been configured in the `user_home` volume and the host's `/etc/hosts` file has been updated with the cluster's IP addresses and hostnames, you can log into the login or compute nodes as `$SHARED_USER_NAME` with ssh. For example, if the shared user is `example-user`:

   ```bash
   [example-user@host]$ ssh example-user@login
   ```

6. Jobs can be submitted to the cluster by ssh-ing into the `login` container and using the usual Slurm commands:

   ```bash
   [example-user@host]$ ssh example-user@login
   [example-user@login]$ sinfo
   PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
   normal*      up 5-00:00:00      4   idle c[1-4]
   [example-user@login]$ srun -N 4 hostname
   c2
   c3
   c1
   c4
   ```
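Batch submission works the same way as on any Slurm cluster. The script below is a generic example (not part of this repository) shown only to illustrate the workflow:

```bash
[example-user@login]$ cat > hello.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --nodes=2
#SBATCH --output=hello-%j.out
srun hostname
EOF
[example-user@login]$ sbatch hello.sbatch
[example-user@login]$ squeue          # check the job state
[example-user@login]$ cat hello-*.out # inspect the output once it finishes
```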