Storage and file management: Difference between revisions

Precisions on storage types - SLURM_TMPDIR
Line 29: Line 29:
Unlike your personal computer, a Compute Canada system will typically have several storage spaces or filesystems and you should ensure that you are using the right space for the right task. In this section we will discuss the principal filesystems available on most Compute Canada systems and the intended use of each one along with some of its characteristics.  
* '''HOME:''' While your home directory may seem like the logical place to store all your files and do all your work, in general this isn't the case - your home normally has a relatively small quota and doesn't have especially good performance for the writing and reading of large amounts of data. The most logical use of your home directory is typically source code, small parameter files and job submission scripts.  
* '''PROJECT:''' The project space has a significantly larger quota and is well-adapted to [[Sharing data | sharing data]] among members of a research group since it, unlike the home or scratch, is linked to a professor's account rather than an individual user. The data stored in the project space should be fairly static, since frequently changing data - including just moving and renaming directories - in project can become a heavy burden on the tape-based backup system.  
* '''PROJECT:''' The project space has a significantly larger quota and is well-adapted to [[Sharing data | sharing data]] among members of a research group since it, unlike home or scratch, is linked to a professor's account rather than an individual user. The data stored in the project space should be fairly static, that is, unlikely to change many times in a month. Frequently changing data in project, including simply moving or renaming directories, can become a heavy burden on the tape-based backup system.
* '''SCRATCH''': For intensive read/write operations, scratch is the best choice. Remember however that important files must be copied off scratch since they are not backed up there, and older files are subject to [[Scratch purging policy|purging]]. The scratch storage should therefore be used for temporary files: checkpoint files, output from jobs and other data that can easily be recreated.
* '''SCRATCH''': For intensive read/write operations on large files (more than 100 MB per file), scratch is the best choice. Remember, however, that important files must be copied off scratch since they are not backed up there, and older files are subject to [[Scratch purging policy|purging]]. The scratch storage should therefore be used for temporary files: checkpoint files, output from jobs and other data that can easily be recreated.
* '''SLURM_TMPDIR''': While a job is running, <code>$SLURM_TMPDIR</code> is a unique path to a temporary folder on a fast local filesystem on each compute node allocated to the job. This is the best location to temporarily store large collections of small files (less than 1 MB per file). Note that this space is shared between jobs on each node, and the total available space depends on the node specifications. When the job ends, this folder is deleted.
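
The <code>$SLURM_TMPDIR</code> workflow described above can be sketched in Python as follows. This is only an illustration, not an official tool: the helper names <code>job_tmpdir</code>, <code>stage_in</code> and <code>stage_out</code> are hypothetical, and the fallback to an ordinary temporary directory when <code>SLURM_TMPDIR</code> is unset is an assumption made so that the sketch can also run outside a job.

```python
import os
import shutil
import tempfile
from pathlib import Path


def job_tmpdir() -> Path:
    """Return the job-local temporary directory.

    Inside a running job this is $SLURM_TMPDIR. The fallback to a fresh
    temporary directory is an assumption of this sketch, so that it can
    also be tried outside a job.
    """
    tmp = os.environ.get("SLURM_TMPDIR")
    if tmp:
        return Path(tmp)
    return Path(tempfile.mkdtemp(prefix="fake_slurm_tmpdir_"))


def stage_in(src_dir: Path, tmp_root: Path) -> Path:
    """Copy a directory of small input files to node-local storage."""
    local_copy = tmp_root / src_dir.name
    shutil.copytree(src_dir, local_copy, dirs_exist_ok=True)
    return local_copy


def stage_out(local_results: Path, dest_dir: Path) -> Path:
    """Copy results back to persistent storage before the job ends."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = dest_dir / local_results.name
    shutil.copytree(local_results, saved, dirs_exist_ok=True)
    return saved
```

A job would typically call <code>stage_in</code> once at the start, do its reading and writing under the returned path, and call <code>stage_out</code> with a destination in project or scratch before exiting, since the temporary folder is deleted when the job ends.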


== Best practices == <!--T:9-->