Hélios

Revision as of 19:00, 13 January 2020 by Diane27
Availability: March 2014
Login node: helios3.calculquebec.ca
Globus endpoint: computecanada#helios-dtn
Data transfer node (rsync, scp, sftp,...): helios3.calculquebec.ca

Hélios is a supercomputer built around general-purpose GPU (GPGPU) nodes. It was installed in Université Laval's computing centre in the spring of 2014. The system was purchased with funds from researchers at Laval and Montreal universities, as well as funds from Calcul Québec. Until Cedar and Graham were commissioned, Hélios was the largest GPU deployment administered by Compute Canada, in both computing power and number of GPUs.

Transferring your data

As of January 2020, Hélios is being aligned with the new Compute Canada standards.

  • Connection identifiers are migrated from Calcul Québec to Compute Canada;
  • Scheduling is migrated from Moab to Slurm;
  • Filesystems are reorganised to conform to the structure used on the other national clusters;
  • The modules system and the software are updated.

In this context, you must transfer your files from the old to the new filesystem. We suggest you use Globus from endpoint computecanada#colosse to endpoint computecanada#helios-dtn.

Site-specific policies

By policy, Hélios compute nodes cannot access the internet. To request an exception, contact technical support and explain what you need and why. Note that the crontab tool is not offered.

Each job should run for at least one hour (at least five minutes for test jobs), and a user cannot have more than 1000 jobs (running and queued) at any one time. The maximum duration of a job is 7 days (168 hours).
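
The duration limits above can be checked before a job is submitted. The following is a minimal sketch, not an official tool: `check_walltime` is a hypothetical helper that takes a requested duration in whole hours and compares it with the one-hour minimum and the 168-hour maximum. It does not model the five-minute allowance for test jobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hélios or Slurm): validate a requested
# walltime, given in whole hours, against the limits stated on this page.
MIN_HOURS=1     # recommended minimum for regular jobs
MAX_HOURS=168   # maximum job duration (7 days)

check_walltime() {
    local hours=$1
    # Reject empty input and anything that is not a non-negative integer.
    case $hours in
        ''|*[!0-9]*) echo "invalid"; return 2 ;;
    esac
    if [ "$hours" -lt "$MIN_HOURS" ]; then
        echo "too short"; return 1
    elif [ "$hours" -gt "$MAX_HOURS" ]; then
        echo "too long"; return 1
    fi
    echo "ok"
}
```

For example, `check_walltime 24` prints `ok` and `check_walltime 200` prints `too long`. On the cluster itself, `squeue -u "$USER" -h -r | wc -l` gives a rough count of your running and queued jobs to compare with the 1000-job limit.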

Backups

A difference between Hélios and the new national clusters is that Hélios filesystems are not backed up. Please make sure your files are safely backed up elsewhere.

Jupyter

In addition to the traditional SSH access interface, you can also use a JupyterHub interface by connecting through https://jupyterhub.helios.calculquebec.ca/hub/spawn.

Software environment

Hélios uses the AVX version of the Compute Canada software environment. See the list of available software under the AVX (Helios) tab.

Available GPUs

Two types of GPUs are available: NVIDIA K20 and NVIDIA K80. To request a specific type, use one of the following options in your submission script:

 #SBATCH --gres=gpu:k20:1

or

 #SBATCH --gres=gpu:k80:1
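
To show where the `--gres` line sits in a complete submission script, here is a small sketch. `make_gpu_job` is a hypothetical helper that prints a job script for the chosen GPU type. The account name `def-someuser`, the one-hour time limit and the `nvidia-smi` payload are placeholders and should be replaced with your own values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal Slurm job script that requests one
# GPU of the given type (k20 or k80, the two types listed on this page).
make_gpu_job() {
    local gpu_type=$1
    case $gpu_type in
        k20|k80) ;;
        *) echo "unknown GPU type: $gpu_type" >&2; return 1 ;;
    esac
    # Placeholder values: account, time limit and the command to run.
    cat <<EOF
#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --gres=gpu:${gpu_type}:1
#SBATCH --time=01:00:00
nvidia-smi
EOF
}
```

A possible use is `make_gpu_job k80 > job.sh` followed by `sbatch job.sh`. Generating the script from a function keeps the GPU type in one place when the same job is run on both kinds of node.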

Storage

Hélios has a single 392 TB Lustre filesystem. As on the new clusters, it is divided into three distinct areas, each with its own quotas. The filesystem is not backed up.

HOME
Lustre filesystem
  • this is a small space that cannot be enlarged: we suggest you use your project space for large backups
  • 50GB storage space, 500K files per user
SCRATCH
Lustre filesystem
  • large space to store temporary files during compute operations
  • 20TB storage space, 1M files per user
PROJECT
Lustre filesystem
  • space designed for sharing data within a group and storing large volumes of data
  • 1TB storage space, 500K files per group
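
Because each area has a file-count limit as well as a size limit, it can help to count entries before copying a large tree. The sketch below is illustrative only: `check_file_count` is a hypothetical helper, the directory and limit are passed by the caller, and the count assumes that directories count toward the limit and that no file name contains a newline. Where the `lfs quota` command is available, it remains the authoritative source for actual usage.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the number of entries (files and
# directories) under a directory with a file-count limit, for example
# 500000 for HOME or 1000000 for SCRATCH as listed above.
check_file_count() {
    local dir=$1 limit=$2 count
    # Count everything below the directory, excluding the directory itself.
    count=$(find "$dir" -mindepth 1 | wc -l)
    count=$((count))   # normalise any padding that wc adds
    if [ "$count" -gt "$limit" ]; then
        echo "over: $count/$limit"
        return 1
    fi
    echo "ok: $count/$limit"
}
```

For instance, `check_file_count "$HOME" 500000` reports whether the home directory is within the 500K limit. Walking a very large tree generates many metadata requests on the filesystem, so this is best run occasionally rather than in a loop.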

For data transfers with Globus, use the endpoint computecanada#helios-dtn; for tools such as rsync and scp, you can use a login node.
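
As an illustration of a transfer through the login node, the sketch below wraps rsync in a hypothetical `push_to_helios` function. The host name is the login node listed at the top of this page; the user name and both paths are supplied by the caller and are assumptions here, not real locations. Setting `DRY_RUN=1` prints the command instead of running it, which makes it easy to review before any data moves.

```shell
#!/usr/bin/env bash
HELIOS_LOGIN=helios3.calculquebec.ca   # login node named on this page

# Hypothetical helper: copy a local directory to Hélios with rsync over SSH.
# Arguments: user name, local source path, remote destination path.
push_to_helios() {
    local user=$1 src=$2 dest=$3
    local target="${user}@${HELIOS_LOGIN}:${dest}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be run without contacting the cluster.
        echo "rsync -av --partial --progress $src $target"
    else
        rsync -av --partial --progress "$src" "$target"
    fi
}
```

For example, `DRY_RUN=1 push_to_helios someuser results/ scratch/results/` prints the rsync command for review. The `--partial` option keeps partially transferred files so that an interrupted copy can resume.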

Node characteristics

| nodes | cores | available memory | CPU | storage | GPU |
|---|---|---|---|---|---|
| 15 | 20 | 110G | 2 x Intel Xeon Ivy Bridge E5-2670 v2 @ 2.5 GHz | 1 x 2T HDD | 8 x NVIDIA K20 (5G memory) |
| 8 | 24 | 256G | 2 x Intel Xeon Ivy Bridge E5-2697 v2 @ 2.7 GHz | 2 x 180G SSD (330G usable) | 8 x NVIDIA K80 (16 GPUs, 12G memory per GPU) |