Siku is a high-performance computer cluster installed in 2019 at Memorial University in St. John's, Newfoundland.
It is funded in large part by the Atlantic Canada Opportunities Agency (ACOA) with the intention of generating regional economic benefits through industry engagement, while recognizing the important work that ACENET does for academic research in the region.
- Globus end point, data transfer node:
- Multi-processing using libverbs is not working as expected. MPI implementations, however, should work.
- Directories are automatically created at first logon. This may produce a race condition that results in errors like the following:
  Could not chdir to home directory /home/username: No such file or directory
  /usr/bin/xauth: error in locking authority file /home/username/.Xauthority
  Lmod has detected the following error: Unable to load module because of error when evaluating modulefile: ...
Should this occur on first login, simply log out, wait a minute, and log back in again.
Similarities and differences with national GP clusters
- The filesystem is similarly structured. See Storage and file management.
- There is no "Nearline" archival filesystem.
- The same scheduler is used, Slurm, although with simpler policies. See "Job Scheduling", below.
- The same modules system provides access to the same list of available software.
Job scheduling
Tasks taking more than 10 CPU-minutes or 4 GB of RAM should not be run directly on a login node; submit them to the job scheduler, Slurm, instead.
Scheduling policies on Siku are simpler than those on Compute Canada general-purpose systems.
- The maximum run time is 24 hours for unpaid accounts and 72 hours for paid accounts.
- Paid clients have higher priority than academic (free) clients, but with usage limited by contract. See Tracking paid accounts.
- GPUs should be requested following this example:
  #SBATCH --gres=gpu:v100:2
  #SBATCH --partition=all_gpus
- See "Node characteristics" below for the numbers of GPUs installed.
- Your account name is not necessarily the same as your account name on Compute Canada clusters. If you see the message "Invalid account or account/partition combination specified", try submitting without the
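The GPU request above can be placed in a complete batch script. The following is a minimal sketch, assuming an unpaid account (hence the 24-hour ceiling) and a job that only reports which GPUs it was given. The job name, CPU count, and memory request are illustrative assumptions, not site recommendations.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test        # hypothetical name
#SBATCH --time=24:00:00            # at most 24 h on unpaid accounts, 72 h on paid ones
#SBATCH --gres=gpu:v100:2          # two V100 GPUs, as in the example above
#SBATCH --partition=all_gpus
#SBATCH --cpus-per-task=4          # assumption: adjust to your workload
#SBATCH --mem=16G                  # assumption: adjust to your workload

# Slurm normally sets CUDA_VISIBLE_DEVICES for jobs that request GPUs.
gpu_report="GPUs visible to this job: ${CUDA_VISIBLE_DEVICES:-none}"
echo "$gpu_report"

# List the allocated devices when the NVIDIA tools are on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --list-gpus || true
fi
```

Saved under a name such as `gpu_job.sh` (hypothetical), the script would be submitted with `sbatch gpu_job.sh`. Requesting two GPUs fits either GPU node; a request for three can only be satisfied by the node that has three installed.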
Storage quotas and filesystem characteristics
|Filesystem||Default Quota||Backed up?||Purged?||Mounted on Compute Nodes?|
|Home Space||52 GB and 512K files per user||Yes||No||Yes|
|Scratch Space||20 TB and 1M files per user||No||Not yet implemented||Yes|
|Project Space||1 TB and 512K files per group||Yes||No||Yes|
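Because the quotas above limit the number of files as well as the total size, it can help to count files before a job writes many small outputs. The helper below is a generic sketch using standard tools only; the function names are hypothetical, and the cluster may offer its own quota-reporting command that should be preferred where available.

```shell
#!/bin/bash
# Count regular files under a directory.
# Note: file names containing newlines would be over-counted.
count_files() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Report usage against a file-count limit.
# Returns non-zero when the directory holds more files than the limit.
check_file_quota() {
    local dir=$1 limit=$2 used
    used=$(count_files "$dir")
    echo "$dir: $used of $limit files"
    [ "$used" -le "$limit" ]
}
```

For example, `check_file_quota "$HOME" 512000` compares the home directory against a limit of 512,000 files; whether "512K" means 512,000 or 524,288 is an assumption to confirm with the site.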
Node characteristics
|Nodes||Cores||Available memory||CPU||Storage||GPU|
|40||40||186G or 191000M||2 x Intel Xeon Gold 6248 @ 2.5GHz||~720G||-|
|6||40||376G or 385024M||2 x Intel Xeon Gold 6248 @ 2.5GHz||~720G||-|
|1||40||186G or 191000M||2 x Intel Xeon Gold 6148 @ 2.4GHz||~720G||3 x NVIDIA Tesla V100 (32GB memory)|
|1||40||186G or 191000M||2 x Intel Xeon Gold 6148 @ 2.4GHz||~720G||2 x NVIDIA Tesla V100 (32GB memory)|
- "Available memory" is the amount of memory configured for use by Slurm jobs. Actual memory is slightly larger to allow for operating system overhead.
- "Storage" is node-local storage. Access it via the $SLURM_TMPDIR environment variable.
- Hyperthreading is turned off.
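A common way to use node-local storage is to copy input into `$SLURM_TMPDIR`, run there, and copy results back before the job ends. The function below is a sketch of that pattern, not a site-provided tool: its name, the `results/` output convention, and the fallback to a temporary directory outside a job are all assumptions made for illustration.

```shell
#!/bin/bash
# Copy an input file to node-local storage, run a command there,
# then copy everything the command wrote under results/ back out.
# Usage: run_on_local_disk INPUT_FILE OUTPUT_DIR COMMAND [ARGS...]
run_on_local_disk() {
    local input=$1 outdir=$2
    shift 2
    # SLURM_TMPDIR is set inside a job; fall back to a temp dir elsewhere.
    local work=${SLURM_TMPDIR:-$(mktemp -d)}
    mkdir -p "$work/results" "$outdir" || return 1
    cp "$input" "$work/" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$work" && "$@" ) || return 1
    # Copy the contents of results/ (not the directory itself) back.
    cp -r "$work/results/." "$outdir/"
}
```

Inside a job script this might be called as `run_on_local_disk input.dat "$SLURM_SUBMIT_DIR/out" my_program input.dat`, where `my_program` is a hypothetical executable on the `PATH` that writes its output under `results/`.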
Operating system: CentOS 7
SSH host keys
- ED25519 (256b)
- RSA (2048b)