Translations:Tutoriel Apprentissage machine/34/en: Difference between revisions

From Alliance Doc
No edit summary
No edit summary
 
Line 1: Line 1:
The shared storage on Compute Canada clusters is not designed to handle large numbers of small files; it is optimized for very large files. Make sure that the data set you need for training is stored in an archive format such as ''tar'', which you can then transfer to your job's compute node when the job starts. '''If you do not follow these rules, you risk causing an enormous number of I/O operations on the shared filesystem, leading to performance issues on the cluster for all of its users.''' To learn more about handling collections of large numbers of files, we recommend reading [[Handling_large_collections_of_files|this page]].
Shared storage on our clusters is not designed to handle large numbers of small files; it is optimized for very large files. Make sure that the data set you need for training is stored in an archive format such as <code>tar</code>, which you can then transfer to your job's compute node when the job starts. <b>If you do not follow these rules, you risk causing an enormous number of I/O operations on the shared filesystem, leading to performance issues on the cluster for all of its users.</b> To learn more about handling collections of large numbers of files, we recommend reading [[Handling_large_collections_of_files|this page]].

Latest revision as of 19:08, 3 April 2023

Message definition (Tutoriel Apprentissage machine)
Shared storage on our clusters is not optimized for handling large numbers of small files (it is optimized instead for very large files). Make sure that the data set you will need for training is in an archive file (such as "tar"), which you will transfer to your compute node at the start of your job. '''If you do not, you risk causing high-frequency file reads from the storage node to your compute node, degrading the overall performance of the system'''. If you want to learn more about managing large collections of files, we recommend reading [https://docs.alliancecan.ca/wiki/Handling_large_collections_of_files/fr this page].

Shared storage on our clusters is not designed to handle large numbers of small files; it is optimized for very large files. Make sure that the data set you need for training is stored in an archive format such as tar, which you can then transfer to your job's compute node when the job starts. If you do not follow these rules, you risk causing an enormous number of I/O operations on the shared filesystem, leading to performance issues on the cluster for all of its users. To learn more about handling collections of large numbers of files, we recommend reading this page.
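
The archive-then-transfer workflow described above might look roughly like the following. This is a minimal sketch, not an official recipe: the function names <code>pack_dataset</code>, <code>node_local_dir</code> and <code>unpack_to_local</code> are hypothetical, and it assumes that the scheduler exposes the job's node-local directory through the <code>SLURM_TMPDIR</code> environment variable, falling back to the system temporary directory when that variable is not set.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def pack_dataset(src_dir: Path, archive_path: Path) -> None:
    """Bundle a directory of many small files into one uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps only the directory name inside the archive,
        # not the full path on the machine where it was created.
        tar.add(src_dir, arcname=src_dir.name)


def node_local_dir() -> Path:
    """Return node-local storage (assumption: SLURM_TMPDIR, else the system temp dir)."""
    return Path(os.environ.get("SLURM_TMPDIR", tempfile.gettempdir()))


def unpack_to_local(archive_path: Path, local_dir: Path) -> Path:
    """Extract the archive into local_dir, so the shared filesystem sees one large read."""
    local_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(local_dir)
    return local_dir
```

In this sketch, <code>pack_dataset</code> would be run once ahead of time, and <code>unpack_to_local(archive, node_local_dir())</code> at the start of each job, so that training reads the small files from local disk instead of the shared filesystem.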