Using nearline storage

Nearline storage is like [[Project layout|Project]], except that files can be "virtualized" by moving them to tape.  Virtualized files remain directly accessible, which makes Nearline a good place for less-used files: once on tape, they no longer count against your disk quota, but they can still be read when needed.


This is useful because the capacity of our tape libraries is both large and expandable.  When a file has been moved to tape (that is, "virtualized"), it still appears in the <code>~/nearline</code> directory.  If a virtualized file is read, the reading process blocks, probably for a few minutes, while the file contents are copied from tape back to disk.  After that, I/O on the file behaves as it would for a normal disk-based file.


== Expected use ==


== How to use ==
To use Nearline, just put files into your <code>~/nearline/PROJECT</code> directory. After a period of time (currently 24 hours), they'll be copied onto tape.  If the file remains unchanged for another period (also 24h), the copy on disk will be removed, making the file virtualized on tape.   
 
As with most HPC storage, it's bad practice to keep lots of small files on Nearline.  In fact, files smaller than a certain threshold size may not be moved to tape at all.  So if you have large collections of small files, it's wise to bundle them into a single archive using a tool like [[Archiving and compressing files|tar]].
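
The bundling step can also be scripted. The following is a minimal sketch, not an official tool: it uses Python's standard <code>tarfile</code> module to pack a directory of small files into one uncompressed archive and returns the list of files it stored. The function name <code>bundle_directory</code> and the paths in the usage note are hypothetical, chosen only for illustration.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path):
    """Pack everything under src_dir into a single uncompressed tar archive.

    Returns the names of the regular files stored in the archive.
    The archive should be written outside src_dir so it is not added to itself.
    """
    src = Path(src_dir)
    archive = Path(archive_path)

    # "w" writes a plain (uncompressed) tar; use "w:gz" for gzip compression.
    with tarfile.open(archive, "w") as tar:
        # arcname keeps member paths relative to the directory name
        # instead of recording the full path of src_dir.
        tar.add(src, arcname=src.name)

    # Re-open the archive and list its regular files as a basic sanity check.
    with tarfile.open(archive, "r") as tar:
        return [member.name for member in tar.getmembers() if member.isfile()]
```

For example, assuming a hypothetical directory <code>~/scratch/run42</code> full of small output files, calling <code>bundle_directory(Path.home() / "scratch" / "run42", Path.home() / "nearline" / "PROJECT" / "run42.tar")</code> would place one large archive in Nearline instead of many small files. Here <code>PROJECT</code> is the same placeholder used above for your project directory.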
 
If you remove a file in <code>~/nearline</code>, the tape copy will be retained for up to 60 days. To restore such a file, contact [[technical support]] with the full path for the file(s) and desired version (by date), just as you would for [[Storage and file management#Filesystem Quotas and Policies|backup]] restoration.