Using nearline storage

Because of the delay in reading from tape, Nearline is not intended to be used by jobs, where the delay would waste allocated time. It is accessible as a directory only on certain nodes of the clusters; in particular, it is not available on the compute nodes.


<!--T:9-->
Nearline is intended for relatively large files; do not use it for large numbers of small files. In fact, files smaller than a certain threshold size may not be moved to tape at all. Files smaller than ~200MB should be combined into archive files ("tarballs") using [[Archiving and compressing files|tar]] or a similar tool.
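
The advice above about combining small files can be sketched as follows. This is a minimal illustration only: the directory name <tt>results</tt> and the file names are hypothetical, and a temporary directory stands in for your real data.

```shell
# Hypothetical layout: a scratch directory stands in for real data.
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
for i in 1 2 3; do
    echo "sample $i" > "$workdir/results/run_$i.txt"
done

# Bundle the whole directory into a single compressed archive.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents to check it before copying it to Nearline.
tar -tzf "$workdir/results.tar.gz"
```

Listing the archive before moving it is a cheap check that every small file made it into the tarball.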


== How to use == <!--T:6-->


<!--T:10-->
<tabs>
<tab name="General purpose clusters">
Nearline is only accessible as a directory on the login nodes and DTNs ("Data Transfer Nodes").


<!--T:11-->
To use Nearline, just put files into your <tt>~/nearline/PROJECT</tt> directory. After a period of time (currently 24 hours), they'll be copied onto tape. If the file remains unchanged for another period (also 24 hours), the copy on disk will be removed, making the file virtualized on tape.
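
A minimal sketch of the copy step, under stated assumptions: <tt>NEARLINE_DIR</tt> is a hypothetical variable introduced here, not something the clusters define, and it falls back to a temporary directory so the example runs anywhere. On a cluster you would point it at your real <tt>~/nearline/PROJECT</tt> directory. The file <tt>results.tar.gz</tt> is a placeholder.

```shell
# NEARLINE_DIR is a hypothetical variable; on a cluster, set it to your
# real ~/nearline/PROJECT directory. The default is a stand-in directory.
NEARLINE_DIR=${NEARLINE_DIR:-$(mktemp -d)}

# Placeholder source file (not a real archive), created for illustration.
src=$(mktemp -d)
echo "example payload" > "$src/results.tar.gz"

# Copy the file into the Nearline directory.
cp "$src/results.tar.gz" "$NEARLINE_DIR/"

# Compare checksums of source and copy before deleting the original.
src_sum=$(cksum < "$src/results.tar.gz")
dst_sum=$(cksum < "$NEARLINE_DIR/results.tar.gz")
[ "$src_sum" = "$dst_sum" ] && echo "copy verified"
```

Verifying the copy right away is worthwhile because, once the disk copy has been removed, reading the file back incurs the tape delay.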


There are three ways to access Nearline on Niagara:


<!--T:12-->
1. By submitting the HPSS-specific commands <tt>htar</tt> or <tt>hsi</tt> as an 'archive' job to SLURM; see [https://docs.scinet.utoronto.ca/index.php/HPSS the HPSS documentation] for detailed examples. Using job scripts offers the benefit of automating Nearline transfers and is the best method if you use HPSS regularly.


<!--T:13-->
2. For small-scale management of files in HPSS, you can use the VFS ("Virtual File System") node, which is accessed using the command: <tt>salloc --time=1:00:00 -pvfsshort</tt>


<!--T:14-->
3. You can also use [[Globus]] for transfers to and from HPSS using the endpoint <b>computecanada#hpss</b>. This is useful for occasional use and for transfers from other sites.


<!--T:15-->
In usage modes 1 and 2, your HPSS files can be found in the <tt>$ARCHIVE</tt> directory, whose path is the same as that of <tt>$PROJECT</tt> but with '/project' replaced by '/archive'.
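
The path relationship described above can be illustrated with a short sketch. The project path below is a made-up example, not a real allocation, and the variable names are chosen so that they do not override the real <tt>$PROJECT</tt> and <tt>$ARCHIVE</tt> set on the cluster.

```shell
# Hypothetical project path, used only to illustrate the substitution.
project_path="/project/g/examplegroup/exampleuser"

# Replace the first '/project' with '/archive' to get the matching path.
archive_path=$(printf '%s\n' "$project_path" | sed 's|/project|/archive|')

echo "$archive_path"
```

On the cluster itself there is no need to compute this; <tt>$ARCHIVE</tt> is already defined for you.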
</tab>