Handling large collections of files

* for other clusters you can assume the available disk size to be at least 190GB


You can access this local disk inside a job using the environment variable <tt>$SLURM_TMPDIR</tt>. One approach is therefore to keep your dataset archived as a single <tt>tar</tt> file in the project space, copy it to the local disk at the beginning of your job, extract it, and use the dataset during the job. If any changes were made, at the end of the job you can archive the contents into a <tt>tar</tt> file again and copy it back to the project space. Here is an example of a submission script that allocates an entire node.
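The script below is only a sketch of that workflow: the <tt>#SBATCH</tt> resource values and the dataset path <tt>~/projects/def-someuser/my_dataset.tar</tt> are placeholders that you would adapt to your own cluster, account, and data.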
{{File
|name=job_script.sh
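|lang="bash"
|contents=
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32   # example value: set to the number of cores on one node
#SBATCH --mem=0                # request all of the memory on the node
#SBATCH --time=02:00:00        # example walltime: adjust for your workload

# Work on the fast local disk attached to the compute node.
cd $SLURM_TMPDIR
mkdir work
cd work

# Placeholder path: replace with the location of your own archive in the project space.
tar -xf ~/projects/def-someuser/my_dataset.tar

# ... run your computation here, reading and writing files under $SLURM_TMPDIR/work ...

# If the dataset was modified, archive it again and copy it back to the
# project space before the job ends, since the local disk is cleared afterwards.
cd $SLURM_TMPDIR
tar -cf ~/projects/def-someuser/my_dataset_updated.tar work
}}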