Handling large collections of files

==Local disk== <!--T:6-->
Note that one option is to use the local disk attached to the compute node, which offers roughly 190GB of disk space. The local disk is shared by all jobs running on that node and is not allocated by the scheduler; in general, its performance is considerably better than that of the project or scratch filesystems. You can access this local disk inside a job using the environment variable <tt>$SLURM_TMPDIR</tt>. One approach, therefore, is to keep your dataset archived as a single <tt>tar</tt> file in the project space, copy it to the local disk at the beginning of your job, extract it, and use the dataset during the job. If any changes were made, you can archive the contents into a <tt>tar</tt> file at the end of the job and copy it back to the project space.
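The staging workflow just described can be sketched as two shell helpers. This is a minimal illustration, not a standard utility: the function names and the example dataset path in the comments are our own, and in a real job <tt>$SLURM_TMPDIR</tt> is set by Slurm.

```shell
#!/bin/bash
# Copy the dataset archive from project space to the node-local disk
# and extract it there.
stage_in() {
    local archive="$1"        # e.g. ~/projects/def-mygroup/dataset.tar (hypothetical path)
    cp "$archive" "$SLURM_TMPDIR/"
    tar -xf "$SLURM_TMPDIR/$(basename "$archive")" -C "$SLURM_TMPDIR"
}

# Re-archive a directory on the local disk and copy the result back to
# project space (only needed if the job modified the dataset).
stage_out() {
    local dir="$1"            # directory name under $SLURM_TMPDIR
    local dest="$2"           # destination tar file in project space
    tar -cf "$SLURM_TMPDIR/$(basename "$dest")" -C "$SLURM_TMPDIR" "$dir"
    cp "$SLURM_TMPDIR/$(basename "$dest")" "$dest"
}
```

A job script would call <tt>stage_in</tt> once at the start, run the computation against the files under <tt>$SLURM_TMPDIR</tt>, and call <tt>stage_out</tt> at the end if the data changed.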
{{File
|name=job_script.sh