Using nearline storage: Difference between revisions

Line 3: Line 3:


==Nearline is a filesystem virtualized onto tape== <!--T:1-->
Nearline storage is  a disk-tape hybrid filesystem with a layout like [[Project layout|Project]], except that it may virtualize files by moving them to tape-based storage on criteria like age and size, and then back again upon read or recall operations. This is a way to manage less used files. On tape, the files do not consume your disk quota, but they can still be accessed, albeit more slowly.
Nearline storage is a disk-tape hybrid filesystem with a layout like [[Project layout|Project]], except that it can virtualize files by moving them to tape-based storage on criteria like age and size, and then back again upon read or recall operations. This is a way to manage less frequently used files. On tape, the files do not consume your disk quota, but they can still be accessed, albeit more slowly than on the home, scratch and project filesystems.


<!--T:2-->
This is useful because the capacity of our tape libraries is both large and expandable.  When a file has been moved to tape (or ''virtualized''), it still appears in the directory listing.  If the virtual file is read, the reading process will block for some time, probably a few minutes, while the file contents is read from tape to disk.   
This is useful because the capacity of our tape libraries is both large and expandable. When a file has been moved to tape (or ''virtualized''), it still appears in the directory listing. If a virtual file is read, the reading process will block for some time, probably a few minutes, while the file's contents are copied from tape to disk.


== Expected use == <!--T:3-->
Because of the delay in reading from tape, Nearline is not intended to be used by jobs where allocated time would be wasted.  It is only accessible as a directory on certain nodes of the clusters, and in particular, not on the compute nodes.  
Because of the delay in reading from tape, Nearline is not intended to be used by jobs where allocated time would be wasted.  It is only accessible as a directory on certain nodes of the clusters, but never on compute nodes.  
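Since Nearline is only mounted on certain nodes, a script can check for the directory before trying to use it. The following is a minimal sketch, not an official tool: the function name <tt>nearline_available</tt> and the default path <tt>$HOME/nearline</tt> are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any cluster tooling):
# succeeds only if the given directory exists and is readable on this node.
# The default path $HOME/nearline is an assumption based on ~/nearline/PROJECT.
nearline_available() {
    local dir="${1:-$HOME/nearline}"
    [ -d "$dir" ] && [ -r "$dir" ]
}

# Report whether Nearline is visible from the current node.
if nearline_available; then
    echo "Nearline is visible on $(hostname)"
else
    echo "Nearline is not visible on $(hostname); use a node where it is mounted"
fi
```

A job script could call this check early and exit, rather than spend allocated time waiting on a path that does not exist on that node.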


<!--T:9-->
Line 25: Line 25:


<!--T:11-->
To use Nearline, just put files into your <tt>~/nearline/PROJECT</tt> directory. After a period of time (currently 24 hours), they'll be copied onto tape.  If the file remains unchanged for another period (also 24h), the copy on disk will be removed, making the file virtualized on tape.  
To use Nearline, just put files into your <tt>~/nearline/PROJECT</tt> directory. After a period of time (24 hours as of February 2019), they'll be copied onto tape.  If the file remains unchanged for another period (24 hours as of February 2019), the copy on disk will be removed, making the file virtualized on tape.  
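The two 24-hour periods described above can be turned into a rough estimate of where a file probably resides. This is only a sketch under stated assumptions: it uses the file's modification time as a stand-in for the time it was last changed in Nearline, it relies on GNU <tt>stat</tt>, and the function and variable names are hypothetical. It does not query the storage system, so the result is a guess, not the actual state.

```shell
#!/usr/bin/env bash
# Delays taken from the text above (24 hours each as of February 2019).
NEARLINE_COPY_DELAY_HOURS=24    # hours before a file is copied to tape
NEARLINE_PURGE_DELAY_HOURS=24   # further hours before the disk copy is removed

# Hypothetical helper: guess a file's state from its age alone.
# Prints one of: disk-only, disk-and-tape, virtualized.
nearline_expected_state() {
    local file="$1"
    local now mtime age_hours
    now=$(date +%s)
    # %Y is the modification time in seconds since the epoch (GNU stat).
    mtime=$(stat -c %Y "$file") || return 1
    age_hours=$(( (now - mtime) / 3600 ))
    if [ "$age_hours" -lt "$NEARLINE_COPY_DELAY_HOURS" ]; then
        echo "disk-only"
    elif [ "$age_hours" -lt $(( NEARLINE_COPY_DELAY_HOURS + NEARLINE_PURGE_DELAY_HOURS )) ]; then
        echo "disk-and-tape"
    else
        echo "virtualized"
    fi
}

# Example call (PROJECT and results.tar are placeholders):
#   nearline_expected_state ~/nearline/PROJECT/results.tar
```

Because the delays may change, they are kept in variables at the top rather than hard-coded in the function.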


<!--T:8-->
Line 44: Line 44:


<!--T:13-->
2. For small data management of files in HPSS, you can use the VFS (''Virtual File System'') node, which is accessed with the command <tt>salloc --time=1:00:00 -pvfsshort</tt>.
2. To manage a small number of files in HPSS, you can use the VFS (''Virtual File System'') node, which is accessed with the command <tt>salloc --time=1:00:00 -pvfsshort</tt>.


<!--T:14-->
