Hélios/en
{{Warning|title=Retirement of Helios|content=Helios will be decommissioned at the end of 2022. In preparation for this, interactive access using SSH will be blocked starting on October 26th, 2022. It will be possible to retrieve files via Globus until November 28th, at which point Helios will be decommissioned.}}


<div class="mw-translate-fuzzy">
{| class="wikitable"
|-
| Data transfer node (rsync, scp, sftp,...): '''helios.calculquebec.ca'''
|}
</div>


<div class="mw-translate-fuzzy">
Hélios is a supercomputer with general purpose graphics processor nodes (GPGPU). It was installed in Laval University's computing centre in the spring of 2014. The server was purchased with funds from Calcul Québec and from researchers at Laval University and Université de Montréal. Until the commissioning of [[Cedar]] and [[Graham]], Helios had been the largest GPU deployment administered by Compute Canada, both in terms of computing power and number of GPUs.
</div>
= Transferring your data =
As of January 2020, Hélios is being aligned with the new Compute Canada standards.
* Connection identifiers are migrated from Calcul Québec to Compute Canada;
* Scheduling is migrated from Moab to Slurm;
* Filesystems are reorganised to conform to the structure used on the other national clusters;
* The module system and the available software are updated.
 
In this context, '''you must transfer your files from the old to the new filesystem'''. We suggest you use [[Globus]] from endpoint '''computecanada#colosse''' to endpoint '''computecanada#helios-dtn'''.
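
If you prefer to script the transfer, the same migration can also be done with the [https://docs.globus.org/cli/ Globus CLI]. The sketch below is only an illustration: the endpoint UUIDs and paths are placeholders that you must replace with the values returned by <code>globus endpoint search</code> and with your own directories.

  # Authenticate the Globus CLI (one-time step)
  globus login
  # Find the endpoint UUIDs from their canonical names
  globus endpoint search "computecanada#colosse"
  globus endpoint search "computecanada#helios-dtn"
  # Start a recursive transfer; COLOSSE_UUID, HELIOS_UUID and both paths are placeholders
  globus transfer COLOSSE_UUID:/home/username/mydata HELIOS_UUID:/project/username/mydata --recursive --label "colosse-to-helios"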
 
=Site-specific policies=
By policy, Hélios compute nodes cannot access the internet. If you need an exception to this rule, contact [[Technical_support|technical support]] with a justification of your needs.
 
The <code>crontab</code> tool is not offered.
 
Each job should have a duration of at least one hour (at least five minutes for test jobs) and a user cannot have more than 1000 jobs, running and queued, at any given moment. The maximum duration for a job is 7 days (168 hours).
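
To check how many of your jobs currently count toward this limit, you can list your running and queued jobs with the standard Slurm command <code>squeue</code>, for example:

  # Count your running and queued jobs; -h suppresses the header line
  squeue -u $USER -h | wc -l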
 
== Backups ==
A difference between Hélios and the new national clusters is that '''Hélios filesystems are not backed up'''. Please make sure your files are safely backed up elsewhere.
 
== Jupyter ==
In addition to the traditional SSH access interface, you can also use a JupyterHub interface by connecting through [https://jupyterhub.helios.calculquebec.ca/hub/spawn https://jupyterhub.helios.calculquebec.ca/hub/spawn].
 
== Software environment ==
Hélios uses the AVX version of the Compute Canada software environment. See the [[Available software|list of available software under the AVX (Helios) tab]].
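
Software in this environment is accessed with the usual Lmod module commands. The package name below is only an example; use <code>module avail</code> to see what is actually installed under the AVX stack.

  # List the modules available in the current (AVX) software environment
  module avail
  # Search for a package across all module versions (example package name)
  module spider cuda
  # Load a module once you know its name, ideally with an explicit version
  module load cuda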
 
== Available GPUs ==
Two types of GPUs are available: NVIDIA K20 and NVIDIA K80. To indicate the type you require, use the following options in your submission script:
 
  #SBATCH --gres=gpu:k20:1
 
or
 
  #SBATCH --gres=gpu:k80:1
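
As an illustration, a complete submission script requesting a single K80 GPU might look like the following sketch; the account name, resource values and program are placeholders to adapt to your own allocation and workload.

  #!/bin/bash
  #SBATCH --account=def-someuser   # placeholder: replace with your own allocation
  #SBATCH --gres=gpu:k80:1         # one K80 GPU (use k20:1 for the other node type)
  #SBATCH --cpus-per-task=2        # illustrative CPU request
  #SBATCH --mem=8G                 # illustrative memory request
  #SBATCH --time=03:00:00          # within the 1-hour minimum and 168-hour (7-day) maximum
  nvidia-smi                       # show the GPU assigned to this job
  ./my_gpu_program                 # placeholder for your own executable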
 
=Storage=
Hélios has a single 392TB Lustre filesystem. As on the new clusters, it is divided into three distinct spaces, each with its own quotas. The filesystem is not backed up.
 
{| class="wikitable sortable"
|-
| HOME <br> Lustre filesystem ||
*this is a small space that cannot be enlarged: we suggest you use your <code>project</code> space for large backups
*50GB storage space, 500K files per user
|-
| SCRATCH <br> Lustre filesystem ||
* large space to store temporary files during compute operations
* 20TB storage space, 1M files per user
|-
| PROJECT <br> Lustre filesystem ||
*space designed for sharing data within a group and for storing large amounts of data
*1TB storage space, 500K files per group
|}
 
For transferring data via Globus, you should use the endpoint <code>computecanada#helios-dtn</code>, while for tools like <code>rsync</code> and <code>scp</code> you can use a login node.
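
As an example, copying a local directory to your scratch space through the data transfer node '''helios.calculquebec.ca''' with <code>rsync</code> could look like this (the username and paths are placeholders):

  # username and paths are placeholders; -a preserves attributes, -v is verbose, -P shows progress and allows resuming
  rsync -avP mydata/ username@helios.calculquebec.ca:/scratch/username/mydata/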
 
=Node characteristics=
{| class="wikitable sortable"
! nodes !! cores !! available memory !! CPU !! storage !! GPU
|-
| 15 || 20 || 110GB || 2 x Intel Xeon Ivy Bridge E5-2670 v2 @ 2.5 GHz || 1 x 2TB HDD || 8 x NVIDIA K20 (5GB memory)
|-
| 8 || 24 || 256GB || 2 x Intel Xeon Ivy Bridge E5-2697 v2 @ 2.7 GHz || 2 x 180GB SSD (330GB usable) || 8 x NVIDIA K80 (16 GPUs, 12GB memory per GPU)
|}