Line 3:
<translate>
<!--T:1-->
To get the most out of our clusters for machine learning applications, special care must be taken. A cluster is a complicated beast, very different from the local machine you use for prototyping. Notably, a cluster uses a distributed filesystem that seamlessly links many storage devices. Accessing a file on <code>/project</code> may <i>feel the same</i> as accessing one from the current node, but under the hood these two I/O operations have very different performance implications. In short, you need to [[#Managing_your_datasets|choose wisely where to put your data]].
<!--T:2-->
Line 19:
<!--T:4-->
Python is very popular in the field of machine learning. If you (plan to) use it on our clusters, please refer to [[Python|our documentation about Python]] to get important information about Python versions, virtual environments on login or on compute nodes, <code>multiprocessing</code>, Anaconda, Jupyter, etc.
=== Avoid Anaconda === <!--T:21-->
Line 65:
<!--T:13-->
* filesystem [[Storage and file management#Filesystem_quotas_and_policies|quotas]] on our clusters limit the number of filesystem objects;
* your software could be significantly slowed down by streaming many small files from <code>/project</code> (or <code>/scratch</code>) to a compute node.
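One common way to address both points is to pack the small files into a single archive, so that the dataset counts as one filesystem object and can be read in one sequential pass. The following is a minimal sketch using Python's standard <code>tarfile</code> module; the function names <code>pack_dataset</code> and <code>unpack_dataset</code> are hypothetical helpers chosen for illustration, not part of any cluster tooling, and the archive is assumed to live outside the directory being packed.

```python
import tarfile
from pathlib import Path


def pack_dataset(src_dir, archive_path):
    """Pack every regular file under src_dir into one uncompressed tar archive.

    Hypothetical helper. archive_path is assumed to be outside src_dir.
    Returns the number of files written to the archive.
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive can be
                # extracted anywhere.
                tar.add(path, arcname=str(path.relative_to(src)))
                count += 1
    return count


def unpack_dataset(archive_path, dest_dir):
    """Extract the archive into dest_dir, creating it if needed.

    Hypothetical helper. Returns dest_dir as a Path.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(dest)
    return dest
```

In a job, one would typically call <code>pack_dataset</code> once ahead of time, keep the resulting archive on <code>/project</code> or <code>/scratch</code>, and call <code>unpack_dataset</code> with a destination on the compute node at the start of the job. Whether to compress the archive depends on the data; the sketch leaves it uncompressed.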
<!--T:14-->