Scheduling & Job Management: How to Get the Most from a Cluster

October 24, 2017 @ 4:00 pm – October 26, 2017 @ 6:00 pm

This online workshop is targeted at current Compute Canada account holders and focuses on learning how to use Compute Canada’s new scheduler, Slurm, on the Cedar and Graham clusters. Split into three parts, this series will provide hands-on training and give users a chance to experiment with job submission techniques and get the most out of the new Compute Canada systems.

Tuesday, October 24 – Thursday, October 26
4:00 – 6:00 pm ADT (each day)

Part 1 will discuss how Compute Canada’s new Slurm scheduler works and how to use this knowledge to get the most out of the Cedar and Graham clusters. Participants will practice submitting basic jobs in Slurm with a focus on productivity. Other topics covered include optimizing walltime, MPI and OpenMP jobs, job arrays, and interactive jobs.

Part 2 will discuss how to submit and run more advanced jobs using the Slurm scheduler. Participants will practice examining jobs and interpreting their state. Familiarity with the topics covered in Part 1, including hands-on practice, is a prerequisite for attending this part. Other topics covered include jobs and memory, partitions, GPUs, software licenses, job dependencies, accounting groups, and advanced requests.

Part 3 will discuss how a cluster chooses which jobs run first by examining the topics of fairness, priority, and reservations. Participants will examine the state of the cluster and their jobs in order to troubleshoot problems. Familiarity with the topics covered in Parts 1 and 2, including hands-on practice, is a prerequisite for attending this part.

This session is intended for current Compute Canada account holders who are unfamiliar with the Slurm scheduler. Note: attendees should have some previous experience submitting jobs to a Compute Canada system. If you have no experience submitting jobs, we recommend you first attend our session “Using Compute Canada: How to Submit Jobs & Move Data on a Supercomputer”.

This session is not recommended for new users or beginners. You must have a Compute Canada account to participate. Basic knowledge of Unix/Linux and scripting (or similar experience) is required. Participants should know: what a man page is; how to edit, copy, and delete files; how to use top and ps to see the resources used by a process; what Unix environment variables are; and how to set and display them.