Using nearline storage: Difference between revisions

<translate>
<translate>


==Nearline is a filesystem virtualized onto tape== <!--T:1-->
Nearline storage is like [[Project layout|Project]], except that the system may "virtualize" files by moving them to tape. This is a way to manage less-used files: once on tape, they no longer consume your disk quota, but they can still be accessed, albeit more slowly.


<!--T:2-->
This is useful because the capacity of our tape libraries is both large and expandable. When a file has been moved to tape (that is, "virtualized"), it still appears in directory listings. If a virtualized file is read, the reading process blocks for some time, probably a few minutes, while the file contents are copied from tape back to disk. After that, I/O to the file behaves like I/O to any other disk-based file.


== Expected use and Status== <!--T:3-->
Because of the delay in reading from tape, Nearline is not intended to be used by jobs, where the delay would waste allocated time. It is accessible only from login nodes and data transfer nodes (DTNs).


<!--T:4-->
Nearline is intended for relatively large files; do not use it for large numbers of small files. This is because retrieving a file from tape takes longer than reading it from disk, and the number of tape drives is limited.
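
To judge whether a directory is a poor fit for Nearline, it can help to count how many small files it holds. The following is a minimal sketch, not an official tool: the function name <tt>count_small_files</tt> and the size limit passed to it are illustrative assumptions, and this page does not state what the actual threshold is.

```shell
# Hypothetical helper (not part of any cluster tooling).
# Usage: count_small_files DIR LIMIT_BYTES
# Prints the number of regular files under DIR smaller than LIMIT_BYTES.
count_small_files() {
    local dir="$1"
    local limit_bytes="$2"
    # "-size -Nc" matches files whose size is strictly less than N bytes.
    # File names containing newlines would be counted more than once.
    find "$dir" -type f -size "-${limit_bytes}c" | wc -l
}
```

For example, <tt>count_small_files ~/nearline/PROJECT 1000000</tt> would count files smaller than one million bytes; that limit is only an example value.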


<!--T:5-->
Currently, Nearline is implemented on [[Graham]], with work underway for [[Cedar]] and eventually Béluga.


== How to use == <!--T:6-->
To use Nearline, just put files into your <tt>~/nearline/PROJECT</tt> directory. After a period of time (currently 24 hours), they will be copied onto tape. If a file then remains unchanged for a further period (also 24 hours), the copy on disk will be removed, leaving the file virtualized on tape.


<!--T:7-->
As with most HPC storage, it is bad practice to keep lots of small files on Nearline. In fact, files smaller than a certain threshold size may not be moved to tape at all. So if you have large collections of small files, you should first bundle them using a tool like [[Archiving and compressing files|tar]].
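
As an illustration of bundling, the sketch below packs one directory into a single tar archive and then lists the archive as a basic readability check. The function name <tt>bundle_dir</tt> is a hypothetical helper introduced for this example; only standard <tt>tar</tt> options (<tt>-c</tt>, <tt>-t</tt>, <tt>-f</tt>, <tt>-C</tt>) are used.

```shell
# Hypothetical helper (not part of any cluster tooling).
# Usage: bundle_dir SRC_DIR ARCHIVE
# Packs SRC_DIR into the tar file ARCHIVE, then checks that it can be listed.
bundle_dir() {
    local src="$1"
    local archive="$2"
    # -C switches to the parent directory first, so the archive stores
    # paths relative to it (e.g. "results/a.txt") rather than absolute paths.
    tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing the contents fails if the archive is unreadable.
    tar -tf "$archive" > /dev/null
}
```

A call might look like <tt>bundle_dir ~/scratch/results ~/nearline/PROJECT/results.tar</tt>, where <tt>~/scratch/results</tt> is an assumed example location for the small files.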


<!--T:8-->
If you remove a file in <tt>~/nearline</tt>, the tape copy will be retained for up to 60 days. To restore such a file, contact [[technical support]] with the full path of each file and the desired version (by date), just as you would for [[Storage and file management#Filesystem Quotas and Policies|backup]] restoration. Because you will need the full path, it is important to keep a record of the complete directory structure of your Nearline space. For example, you can run the command <tt>ls -R > ~/nearline_contents.txt</tt> from the <tt>~/nearline/PROJECT</tt> directory to record the location of all the files in your Nearline space.
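
A listing with one full path per line may be easier to quote in a restore request than <tt>ls -R</tt> output. The following is a sketch of one way to produce such a listing with <tt>find</tt>; the function name <tt>save_manifest</tt> is a hypothetical helper, and the output location is up to you.

```shell
# Hypothetical helper (not part of any cluster tooling).
# Usage: save_manifest DIR OUTFILE
# Writes the absolute path of every regular file under DIR to OUTFILE.
save_manifest() {
    local dir="$1"
    local out="$2"
    local abs
    # pwd -P prints the absolute path with symbolic links resolved,
    # so the listing records the physical location of the directory.
    abs="$(cd "$dir" && pwd -P)" || return 1
    # One path per line, sorted so successive listings are easy to compare.
    find "$abs" -type f | sort > "$out"
}
```

For example, <tt>save_manifest ~/nearline/PROJECT ~/nearline_contents.txt</tt> writes the listing to your home directory, outside the Nearline space itself.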


</translate>
</translate>