Migrating between clusters

=Job submission=


All of our clusters use Slurm for job submission, so many parts of a job submission script will work across clusters. However, you should note that the number of CPU cores per node varies significantly across clusters, from 24 up to 64 cores, so check the page of the cluster you are using to verify how many cores can be used on a node. The amount of memory per node or per core also varies, so you may need to adapt your script to account for this as well.  Likewise, there are differences among the GPUs that are available.  
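For example, a job script that runs on one cluster can usually be reused elsewhere once the resource requests are adjusted. The sketch below only illustrates which directives typically need attention when migrating; the account name, core count, memory and program name are placeholders, not recommended values.

<pre>
#!/bin/bash
#SBATCH --account=def-jsmith       # your group's default allocation
#SBATCH --time=03:00:00            # must fit within the target cluster's maximum job duration
#SBATCH --cpus-per-task=32         # adjust: whole nodes have 24 to 64 cores depending on the cluster
#SBATCH --mem-per-cpu=2000M        # adjust: memory per core also differs across clusters
##SBATCH --gres=gpu:1              # uncomment and adjust if you need a GPU; available GPU models differ

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_program                       # placeholder for your own executable
</pre>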


On [[Cedar]], you may not submit jobs from your home directory, and its compute nodes have direct Internet access; on [[Graham]], [[Béluga/en|Béluga]] and [[Narval/en|Narval]], the compute nodes do not have Internet access. The maximum job duration is seven days on Béluga and Narval but 28 days on Cedar and Graham. All of the clusters except Cedar also limit each user to no more than 1000 jobs, counting both running and queued jobs.
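Before moving work, it can be worth checking that your requests fit these limits. The commands below are a minimal sketch; <code>my_job.sh</code> is a placeholder for your own script.

<pre>
# Request a walltime that fits the target cluster
# (up to 7 days on Béluga and Narval, up to 28 days on Cedar and Graham)
sbatch --time=6-00:00:00 my_job.sh

# Count your running and queued jobs to stay under the 1000-job limit
squeue -u $USER -h | wc -l
</pre>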


Each research group has access to a default allocation on every cluster, e.g. <code>#SBATCH --account=def-jsmith</code>; however, special compute allocations such as RRG or contributed allocations are tied to a particular cluster and will not be available on other clusters.
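If you are unsure which allocations are available to you on a given cluster, one way to check is with Slurm's <code>sacctmgr</code> utility; <code>rrg-jsmith</code> below is only an example of a special allocation name.

<pre>
# List the accounts you can charge jobs to on the current cluster
sacctmgr show associations user=$USER format=account
</pre>

A default account such as <code>def-jsmith</code> should appear in this list on every cluster, while a special allocation such as <code>rrg-jsmith</code> will only appear on the cluster where it was granted.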