(Updating to match new version of source page)
Line 13:
Béluga is a general purpose cluster designed for a variety of workloads and situated at the [http://www.etsmtl.ca/ École de technologie supérieure] in Montreal. The cluster is named in honour of the St. Lawrence River's [https://en.wikipedia.org/wiki/Beluga_whale Beluga whale] population.
<div class="mw-translate-fuzzy">
=Site-specific policies=
By policy, Béluga's compute nodes cannot access the internet. If you need an exception to this rule, contact [[Technical_support|technical support]] with information about the IP address, port number(s) and protocol(s) needed as well as the duration and a contact person.
Crontab is not offered on Béluga.
</div>
Each job on Béluga should have a duration of at least one hour (five minutes for test jobs) and a user cannot have more than 1000 jobs, running and queued, at any given moment. The maximum duration for a job on Béluga is 7 days (168 hours).
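As an illustration of these limits, here is a minimal sketch of a job script whose requested run time falls inside the allowed window. The account name <code>def-someuser</code> and the program name <code>my_program</code> are hypothetical placeholders, not values from this page; <code>--time</code> is given in Slurm's <code>D-HH:MM</code> format.

```shell
#!/bin/bash
# Hypothetical account name; replace it with your own allocation.
#SBATCH --account=def-someuser
# Requested run time in D-HH:MM format: 3 hours, which is above the
# one-hour minimum and below the 7-day (168-hour) maximum.
#SBATCH --time=0-03:00
#SBATCH --ntasks=1

# Placeholder workload; a real job would launch its program here.
workload="my_program"   # hypothetical program name
echo "Would run ${workload} on ${HOSTNAME:-unknown}"
```

A test job could request as little as <code>--time=0-00:05</code>, while any other job should request between <code>0-01:00</code> and <code>7-00:00</code>.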
<div class="mw-translate-fuzzy">
=Storage=
</div>
{| class="wikitable sortable"
Line 58: | Line 62:
For transferring data via Globus, you should use the endpoint <code>computecanada#beluga-dtn</code>, while for tools like rsync and scp you can use a login node.
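For the rsync case, a transfer through a login node might look like the sketch below. Every value in it is a placeholder: the user name, the host <code>beluga.example.org</code> and both directories are assumptions made for illustration, so substitute the real login node address and your own paths. The script only prints the command so that it can be checked first.

```shell
#!/bin/bash
# All values below are placeholders, not taken from this page.
remote_user="someuser"
login_node="beluga.example.org"   # substitute the real login node address
src_dir="./results/"
dest_dir="projects/results/"

# Build the command as an array and print it for review.
# -a preserves permissions and timestamps, -v is verbose,
# --partial keeps partially transferred files so a retry can resume.
cmd=(rsync -av --partial "${src_dir}" "${remote_user}@${login_node}:${dest_dir}")
echo "${cmd[@]}"
```

To perform the transfer, replace the final <code>echo</code> line with <code>"${cmd[@]}"</code>.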
<div class="mw-translate-fuzzy">
=High-performance interconnect=
</div>
A Mellanox InfiniBand EDR (100 Gb/s) network connects all the nodes of the cluster. A central 324-port switch links the cluster's islands with a maximum blocking factor of 5:1. The storage servers are networked with a non-blocking connection. The architecture permits multiple parallel jobs with up to 640 cores (or more) to run on a non-blocking network. For jobs requiring greater parallelism, the blocking factor is 5:1, but even for jobs executed across several islands, the interconnect remains high-performance.
<div class="mw-translate-fuzzy">
=Node characteristics=
Turbo mode is activated on all compute nodes of Béluga.
Line 80: | Line 87:
| 172 || 40 || 186G or 191000M || 2 x Intel Gold 6148 Skylake @ 2.4 GHz || 1 x NVMe SSD 1.6T || 4 x NVIDIA V100SXM2 (16G memory), connected via NVLink
|}
</div>
* To get a larger <code>$SLURM_TMPDIR</code> space, a job can be submitted with <code>--tmp=xG</code>, where <code>x</code> is a value between 350 and 2490.
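The <code>--tmp</code> request can also be placed in the job script itself. The following is a minimal sketch under stated assumptions: the account name is a hypothetical placeholder, and 800 is an arbitrary value chosen from the 350 to 2490 range. The fallback to <code>/tmp</code> is there only so that the script also runs outside a Slurm job, where <code>SLURM_TMPDIR</code> is not set.

```shell
#!/bin/bash
# Hypothetical account name; replace it with your own allocation.
#SBATCH --account=def-someuser
#SBATCH --time=0-03:00
# Ask for 800G of node-local temporary space (x must be between 350 and 2490).
#SBATCH --tmp=800G

# Inside a job, SLURM_TMPDIR points to the node-local scratch directory.
# Outside a job it is unset, so fall back to /tmp for this sketch.
workdir="${SLURM_TMPDIR:-/tmp}"
echo "Node-local scratch directory: ${workdir}"

# Report the space available in that directory.
df -h "${workdir}"
```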