<!--T:35-->
* If your dataset is around 10 GB or below, it can probably fit in memory, depending on how much memory your job has. You should not read the data from disk during your machine learning task.
* If your dataset is around 100 GB or below, it can fit in the local storage of the compute node; please transfer it there at the beginning of the job. This storage is orders of magnitude faster and more reliable than shared storage (home, project, scratch). A temporary directory is available for each job at $SLURM_TMPDIR. An example is given in [[Tutoriel_Apprentissage_machine/en|our tutorial]], and a minimal job-script sketch follows this list. A caveat of local node storage is that another job might be using it fully, leaving you no space (we are currently studying this problem). However, you might also get lucky and have a whole terabyte at your disposal.
* If your dataset is larger, you may have to leave it in shared storage. You can leave your datasets permanently in your project space. Scratch space can be faster, but it is not for permanent storage. Also, all shared storage (home, project, scratch) is meant for storing and reading large chunks of data at low frequency, i.e. at large intervals (one second or more).
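
Below is a minimal sketch of the local-storage pattern described above, assuming the dataset has been bundled into a single archive; the account name <code>def-someuser</code>, the archive name <code>dataset.tar</code>, and the <code>train.py</code> script with its <code>--data_dir</code> flag are all placeholders, not a prescribed layout:

<pre>
#!/bin/bash
#SBATCH --time=03:00:00
#SBATCH --mem=32G

# Copy the dataset from shared storage to node-local storage in one
# large sequential read (paths and file names are placeholders).
cp ~/projects/def-someuser/dataset.tar $SLURM_TMPDIR/

# Extract locally; any subsequent small-file reads now hit the fast
# local disk instead of the shared filesystem.
tar -xf $SLURM_TMPDIR/dataset.tar -C $SLURM_TMPDIR

# Run the training script (hypothetical) against the local copy.
python train.py --data_dir $SLURM_TMPDIR/dataset
</pre>

Transferring one large archive and extracting it locally keeps the access to shared storage as a single large chunk, which is exactly the pattern the shared filesystems handle well.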