Handling large collections of files

==Local disk== <!--T:6-->
Note that one option is to use the local disk attached to the compute node, which offers roughly 190GB of disk space. The local disk is shared by all jobs running on that node and is not allocated by the scheduler; in general, its performance is considerably better than that of the project or scratch filesystems. You can access this local disk inside a job using the environment variable <tt>$SLURM_TMPDIR</tt>. One approach, therefore, is to keep your dataset archived as a single <tt>tar</tt> file in the project space, copy it to the local disk at the beginning of your job, extract it, and use the dataset during the job. If any changes were made, you can archive the contents into a <tt>tar</tt> file at the end of the job and copy it back to the project space.
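The staging workflow just described can be sketched as two shell helpers. This is a minimal illustration, not a standard utility: the function names and the example dataset path in the comments are our own, and in a real job <tt>$SLURM_TMPDIR</tt> is set by Slurm.

```shell
#!/bin/bash
# Copy the dataset archive from project space to the node-local disk
# and extract it there.
stage_in() {
    local archive="$1"        # e.g. ~/projects/def-mygroup/dataset.tar (hypothetical path)
    cp "$archive" "$SLURM_TMPDIR/"
    tar -xf "$SLURM_TMPDIR/$(basename "$archive")" -C "$SLURM_TMPDIR"
}

# Re-archive a directory on the local disk and copy the result back to
# project space (only needed if the job modified the dataset).
stage_out() {
    local dir="$1"            # directory name under $SLURM_TMPDIR
    local dest="$2"           # destination tar file in project space
    tar -cf "$SLURM_TMPDIR/$(basename "$dest")" -C "$SLURM_TMPDIR" "$dir"
    cp "$SLURM_TMPDIR/$(basename "$dest")" "$dest"
}
```

A job script would call <tt>stage_in</tt> once at the start, run the computation against the files under <tt>$SLURM_TMPDIR</tt>, and call <tt>stage_out</tt> at the end if the data changed.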
{{File
|name=job_script.sh