Using node-local storage



Because this directory resides on local disk, input and output (I/O) to it
is almost always faster than I/O to a network file system (/project, /scratch, or /home).
Any job doing substantial input and output (which is most jobs!) may expect
to run more quickly if it uses $SLURM_TMPDIR instead of network disk.
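For illustration, here is a minimal sketch of a job script following this pattern; the file names <code>input.dat</code> and <code>output.dat</code>, the program <code>myprog</code>, and the paths are hypothetical placeholders, not a prescribed layout.

<pre>
#!/bin/bash
#SBATCH --time=01:00:00

# Stage input from network storage onto node-local disk.
cp /project/def-someuser/input.dat $SLURM_TMPDIR/

# Run the I/O-intensive work entirely on local disk.
cd $SLURM_TMPDIR
myprog input.dat > output.dat

# Copy results back to network storage before the job ends,
# since $SLURM_TMPDIR is cleaned up when the job finishes.
cp output.dat /scratch/someuser/
</pre>
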
=== Input ===


In order to ''read'' data from $SLURM_TMPDIR, the data must first be copied there.
MORE TO COME...
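
If the input consists of many small files, it is usually faster to copy a single archive and unpack it on the local disk than to copy the files one by one, since each file copied to a network file system carries metadata overhead. A minimal sketch, assuming a hypothetical archive <code>dataset.tar</code> in your project space:

<pre>
# Unpack an archive of input files directly into node-local storage.
tar -xf /project/def-someuser/dataset.tar -C $SLURM_TMPDIR
</pre>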


=== Multi-node jobs ===


If a job spans multiple nodes and some data is needed on every node, then a simple <code>cp</code> or <code>tar -x</code> will not suffice.
The Slurm utility [https://slurm.schedmd.com/sbcast.html sbcast] may be useful here.
It distributes a file to every node assigned to the job, but it operates on only a single file.
 
MORE TO COME ...
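
A minimal sketch of this approach for a two-node job, assuming a hypothetical archive <code>data.tar</code> containing the data needed on every node: broadcast the single archive with <code>sbcast</code>, then unpack it once per node.

<pre>
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1

# Copy the archive to $SLURM_TMPDIR on every node in the allocation.
sbcast data.tar $SLURM_TMPDIR/data.tar

# Unpack the archive once per node.
srun --ntasks=$SLURM_NNODES --ntasks-per-node=1 tar -xf $SLURM_TMPDIR/data.tar -C $SLURM_TMPDIR
</pre>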


=== Amount of space ===
 
At '''[[Niagara]]''' $SLURM_TMPDIR is implemented as "RAMdisk",
so the amount of space available is limited by the memory on the node,
less the amount of RAM used by your application.
See [[Data_management_at_Niagara#.24SLURM_TMPDIR_.28RAM.29|Data management at Niagara]] for more.
 
At the general-purpose clusters [[Béluga/en|Béluga]], [[Cedar]], and [[Graham]],
the amount of space available depends on the cluster and the node to which your job is assigned.


{| class="wikitable sortable"
! cluster !! space in $SLURM_TMPDIR !! size of disk
|-
| Béluga  || 370G || 480G
|-
| Cedar  || 840G || 960G
|-
| Graham  || 750G || 960G
|}


The table above gives the typical amount of free space in $SLURM_TMPDIR on the smallest node in each cluster. 
If your job reserves [[Advanced_MPI_scheduling#Whole_nodes|whole nodes]]
then you can reasonably assume that this much space is available to you in $SLURM_TMPDIR on each node.
However, if the job requests less than a whole node, then other jobs may also write to the same filesystem
(but not the same directory!), reducing the space available to your job.


Some nodes at each site have more local disk than shown above.
See "Node characteristics" at the appropriate page ([[Béluga/en|Béluga]], [[Cedar]], [[Graham]]) for guidance.