Using node-local storage



Because this directory resides on local disk, input and output (I/O) to it
is almost always faster than I/O to a network file system (/project, /scratch, or /home).
Any job doing substantial input and output (which is most jobs!) may expect
to run more quickly if it uses $SLURM_TMPDIR instead of network disk.
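For illustration, here is a minimal sketch of a job script following this pattern; the file names <code>input.dat</code> and <code>output.dat</code>, the program <code>myprog</code>, and the paths are hypothetical placeholders, not a prescribed layout.

<pre>
#!/bin/bash
#SBATCH --time=01:00:00

# Stage input from network storage onto node-local disk.
cp /project/def-someuser/input.dat $SLURM_TMPDIR/

# Run the I/O-intensive work entirely on local disk.
cd $SLURM_TMPDIR
myprog input.dat > output.dat

# Copy results back to network storage before the job ends,
# since $SLURM_TMPDIR is cleaned up when the job finishes.
cp output.dat /scratch/someuser/
</pre>
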
=== Input ===


In order to ''read'' data from $SLURM_TMPDIR, the data must first be copied there.
MORE TO COME...
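
If the input consists of many small files, it is usually faster to copy a single archive and unpack it on the local disk than to copy the files one by one, since each file copied to a network file system carries metadata overhead. A minimal sketch, assuming a hypothetical archive <code>dataset.tar</code> in your project space:

<pre>
# Unpack an archive of input files directly into node-local storage.
tar -xf /project/def-someuser/dataset.tar -C $SLURM_TMPDIR
</pre>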


=== Multi-node jobs ===


If a job spans multiple nodes and some data is needed on every node, then a simple <code>cp</code> or <code>tar -x</code> will not suffice.
The Slurm utility [https://slurm.schedmd.com/sbcast.html sbcast] may be useful here.
It distributes a file to every node assigned to the job, but it operates on only a single file.
 
MORE TO COME ...
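
A minimal sketch of this approach for a two-node job, assuming a hypothetical archive <code>data.tar</code> containing the data needed on every node: broadcast the single archive with <code>sbcast</code>, then unpack it once per node.

<pre>
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1

# Copy the archive to $SLURM_TMPDIR on every node in the allocation.
sbcast data.tar $SLURM_TMPDIR/data.tar

# Unpack the archive once per node.
srun --ntasks=$SLURM_NNODES --ntasks-per-node=1 tar -xf $SLURM_TMPDIR/data.tar -C $SLURM_TMPDIR
</pre>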


=== Amount of space ===
 
At '''[[Niagara]]''' $SLURM_TMPDIR is implemented as "RAMdisk",
so the amount of space available is limited by the memory on the node,
less the amount of RAM used by your application.
See [[Data_management_at_Niagara#.24SLURM_TMPDIR_.28RAM.29|Data management at Niagara]] for more.
 
At the general-purpose clusters [[Béluga/en|Béluga]], [[Cedar]], and [[Graham]],
the amount of space available depends on the cluster and the node to which your job is assigned.


{| class="wikitable sortable"
! cluster !! space in $SLURM_TMPDIR !! size of disk
|-
| Béluga  || 370G || 480G
|-
| Cedar  || 840G || 960G
|-
| Graham  || 750G || 960G
|}


The table above gives the typical amount of free space in $SLURM_TMPDIR on the smallest node in each cluster. 
If your job reserves [[Advanced_MPI_scheduling#Whole_nodes|whole nodes]]
then you can reasonably assume that this much space is available to you in $SLURM_TMPDIR on each node.
However, if the job requests less than a whole node, then other jobs may also write to the same filesystem
(but not the same directory!), reducing the space available to your job.


Some nodes at each site have more local disk than shown above.
See "Node characteristics" at the appropriate page ([[Béluga/en|Béluga]], [[Cedar]], [[Graham]]) for guidance.