Data management at Niagara: Difference between revisions

Niagara accesses several different filesystems.  Note that not all of these filesystems are available to all users.


=== /home ($HOME) === <!--T:5-->
/home is intended primarily for individual user files, and for common software or small datasets shared with others in the same group, provided the total stays within your individual quota. For anything larger, consider /scratch or /project. /home is read-only on the compute nodes.
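
Because /home is read-only on the compute nodes, a job has to write its output somewhere else, typically under $SCRATCH. The following is a minimal sketch of that choice, not an official recipe: the helper name <code>job_output_dir</code>, the <code>runs</code> subdirectory and the job name are hypothetical; only the $SCRATCH variable comes from this page.

```python
import os
from pathlib import Path


def job_output_dir(job_name, env=None):
    """Return a directory under $SCRATCH for a job's output files.

    The helper and the "runs" subdirectory are illustrative assumptions.
    """
    env = os.environ if env is None else env
    scratch = env.get("SCRATCH")
    if not scratch:
        # Do not silently fall back to $HOME: it is read-only on compute nodes.
        raise RuntimeError("SCRATCH is not set; cannot pick a writable output directory")
    return Path(scratch) / "runs" / job_name
```

A job script could call <code>job_output_dir("my_job")</code> once at start-up, create the directory, and write all results there.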


=== /scratch ($SCRATCH) === <!--T:6-->
/scratch is intended primarily for temporary or transient files, for the results of your computations and simulations, and for any material that can easily be recreated or reacquired. You may also use /scratch for intermediate steps in your workflow, provided they do not generate too much I/O or too many small files on this disk-based storage pool; otherwise, consider the burst buffer (/bb). Once you have final results that you want to keep for the long term, you may migrate them to /project or /archive. /scratch is purged on a regular basis and has no backups.
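
To judge whether a workflow produces "too many small files", it can help to count them in a single job's output directory. The sketch below is one way to do that; the function name <code>small_file_stats</code> and the 1 MiB threshold are assumptions chosen for illustration, not limits defined on this page.

```python
import os


def small_file_stats(root, small_bytes=1024 * 1024):
    """Return (total_files, small_files) for the directory tree under root.

    A file counts as "small" if it is below small_bytes (1 MiB by default,
    an arbitrary illustrative threshold).
    """
    total = 0
    small = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += 1
            if size < small_bytes:
                small += 1
    return total, small
```

Walking a tree is itself metadata-heavy, so it is better to point this at one job directory than at an entire /scratch allocation.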


=== /project ($PROJECT) === <!--T:7-->
/project is intended for common group software, large static datasets, or any material that would be very costly for the group to reacquire or regenerate. <font color=red>Material on /project is expected to remain relatively immutable over time.</font> Temporary or transient files should be kept on /scratch, not /project. High data turnover puts stress on the TSM backup system and consumes tapes unnecessarily, long after the material has been deleted, because of backup retention policies and the extra versions kept of each file. Even renaming a top-level directory is enough to make the backup system treat it as a completely new directory tree and the old one as deleted, so think carefully about your naming convention ahead of time and stick with it. Users who abuse the project filesystem by using it as scratch will be flagged and contacted. Note that on Niagara, /project is only available to groups with a RAC allocation.
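
The effect of renaming a top-level directory can be illustrated with a toy model. It assumes only that a backup catalogue identifies files by their full path; it is not a description of how TSM is actually implemented, and the directory and file names are made up.

```python
def backup_delta(before, after):
    """Compare two snapshots of a path-keyed backup catalogue.

    Returns (paths seen as new, paths seen as deleted), both sorted.
    """
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)


# Hypothetical contents of a project directory before the rename.
old = ["run_v1/input.dat", "run_v1/output/result.h5"]
# The same, unchanged files after renaming the top directory run_v1 -> run_final.
new = [p.replace("run_v1/", "run_final/", 1) for p in old]

new_paths, deleted_paths = backup_delta(old, new)
print(new_paths)      # every file appears as new
print(deleted_paths)  # every old path appears as deleted
```

Although no file content changed, every path differs, so in this model the whole tree is backed up again while the old copies are still held under the retention policy.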


=== /bb ($BBUFFER) === <!--T:8-->
/bb, the [https://docs.scinet.utoronto.ca/index.php/Burst_Buffer burst buffer], is a very fast, high-performance alternative to /scratch, built from solid-state drives (SSDs). You may request this resource if you anticipate a lot of IOPS (input/output operations per second), or if you notice that your job performs poorly on /scratch or /project because of I/O (input/output) bottlenecks. See [https://docs.scinet.utoronto.ca/index.php/Burst_Buffer here] for more details.


=== /archive ($ARCHIVE) === <!--T:9-->
/archive is a '''nearline''' storage pool to which you can temporarily offload semi-active material from any of the above filesystems. In practice, users offload and recall material as part of their regular workflow, or when they hit their quotas on /scratch or /project. That material can remain on HPSS for a few months to a few years. Note that on Niagara, /archive is only available to groups with a RAC allocation.

