Storage and file management

Revision as of 15:42, 18 April 2017


This article is a draft

This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.



Overview

Compute Canada provides a wide range of storage options to cover the needs of our very diverse users. These range from high-speed temporary local storage to several kinds of long-term storage, so you can choose the storage medium that best matches your needs and usage patterns. In most cases the filesystems on Compute Canada systems are a shared resource and should therefore be used responsibly: unwise behaviour can negatively affect dozens or hundreds of other users. These filesystems are also designed to store a limited number of very large files, which are typically binary rather than text files and so are not directly human-readable. You should therefore avoid storing thousands of small files (smaller than a few megabytes each), particularly in the same directory. A better approach is to use a command such as tar or zip to combine a directory containing many small files into a single large archive file.
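
The archiving advice above can also be followed from a script. The following is a minimal sketch using Python's standard tarfile module rather than the tar command itself; the function names and the choice of gzip compression are illustrative assumptions, not a Compute Canada recommendation.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir, archive_path):
    """Pack src_dir and everything below it into one gzip-compressed tar file."""
    src = Path(src_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the full path
        tar.add(str(src), arcname=src.name)
    return Path(archive_path)


def count_members(archive_path):
    """Return the number of regular files stored in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())
```

Checking the member count against the number of files in the original directory before deleting it is a cheap way to confirm that the archive is complete.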

It is also your responsibility to manage the age of your stored data. Most of the filesystems are not intended to provide an indefinite archiving service, so when a given file or directory is no longer needed on a system, you should move it to a more appropriate storage location, which may well be your personal workstation or some other storage system under your control. Moving significant amounts of data between your workstation and a Compute Canada system, or between two Compute Canada systems, should generally be done using Globus.

Note that Compute Canada storage systems are not for personal use and should only be used to store research data.

Storage Types

Unlike your personal computer, a Compute Canada system will typically have several storage spaces or filesystems, and you should ensure that you are using the right space for the right task. This section describes the principal filesystems available on most Compute Canada systems, along with the intended use and characteristics of each. Storage options are distinguished by the available hardware, access mode and write system. Most Compute Canada systems offer the following storage types:

Network Filesystem (NFS)
This type of storage is generally visible on both login and compute nodes. It is the appropriate place to put small but important files that are used regularly: source code, programs, job scripts and parameter files. Its performance is comparable to that of a conventional hard disk.
Parallel Filesystem (Lustre, GPFS)
This type of storage is generally visible on both login and compute nodes. By combining multiple disk arrays and fast servers, it offers excellent performance for large files and large input/output operations. Such systems often distinguish two kinds of storage: long-term storage and temporary storage (scratch). Performance varies with the load imposed by other users.
Local Filesystem
This type of storage consists of a local hard drive attached to each compute node. Its performance is high because it is very rarely shared: typically, only one user will access a local drive at a time. However, you must copy your files back to another storage medium, such as the scratch space or project space, before your job ends, because the local drive is cleaned after each job.
RAM (memory) Filesystem
This is a filesystem that exists within a compute node's RAM, so its use reduces the memory available for computations. Such filesystems are very fast for small files, and they are markedly faster than the other storage types when file access is random. A RAM disk is always cleaned at the end of a job.
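
The copy-back requirement for local storage can be sketched as follows. This is an illustration under stated assumptions, not a site-provided tool: the function name run_with_local_scratch and the local_root parameter are hypothetical, and the location of node-local storage depends on the system (when local_root is None, Python's tempfile module uses its default temporary directory).

```python
import shutil
import tempfile
from pathlib import Path


def run_with_local_scratch(work, persistent_dir, local_root=None):
    """Run work(tmp_dir) in a temporary directory, then copy its output to persistent_dir."""
    persistent = Path(persistent_dir)
    persistent.mkdir(parents=True, exist_ok=True)
    # local_root is a hypothetical parameter: point it at the node-local drive if known
    with tempfile.TemporaryDirectory(dir=local_root) as tmp:
        tmp_dir = Path(tmp)
        work(tmp_dir)
        # Copy everything back before the temporary directory is removed
        for item in tmp_dir.iterdir():
            target = persistent / item.name
            if item.is_dir():
                shutil.copytree(item, target, dirs_exist_ok=True)  # Python 3.8+
            else:
                shutil.copy2(item, target)
    return persistent
```

The copy happens inside the with block, so the results are saved before the temporary directory is deleted, mirroring the rule that files must be moved off the local drive before the job ends.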

The following table summarizes the properties of these storage types.

Description of storage type

Storage Type | Accessibility | Throughput (large operations, > 1 MB per operation) | Latency (small operations) | Longevity
Network Filesystem (NFS) | All nodes | Poor | High | Long term
Long-Term Parallel Filesystem | All nodes | Fair | High | Long term
Short-Term Parallel Filesystem | All nodes | Fair | High | Short term (periodically cleaned)
Local Filesystem | Local to the node | Fair | Medium | Very short term
Memory (RAM) Filesystem | Local to the node | Good | Very low | Very short term, cleaned after every job

Best practices

  • Only use text format for files that are smaller than a few megabytes.
  • As far as possible, use local storage for temporary files.
  • If your program must search within a file, it is fastest to read the whole file into memory first and then search, or to place the file on a RAM disk.
  • Regularly clean up your data in the scratch and project spaces, because those filesystems are used for huge data collections.
  • If you no longer use certain files but they must be retained, archive and compress them, and if possible copy them elsewhere.
  • If your needs are not well served by the available storage options please contact us by sending an e-mail to Compute Canada support.
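
The advice about searching within a file can be illustrated with a short sketch. It assumes the file fits comfortably in the memory available to the job; the function name count_occurrences is hypothetical.

```python
from pathlib import Path


def count_occurrences(path, needle):
    """Read the whole file with one large read, then search it in memory.

    needle must be a bytes object; non-overlapping matches are counted.
    """
    data = Path(path).read_bytes()  # a single sequential read of the entire file
    return data.count(needle)
```

A single large read suits shared filesystems, which the table above rates better for throughput than for latency, whereas many small seeks and reads would each pay the latency cost.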

Filesystem Quotas and Policies

To ensure that there is adequate space for all Compute Canada users, a variety of quotas and policy restrictions apply, covering back-ups and the automatic purging of certain filesystems. Every user has access to the home and scratch spaces by default, as well as a certain amount of project space. To have access to the full 10 TB quota of project space, users must submit a request. The nearline space is allocated through the annual RAC (resource allocation) process, which can also increase a group's quota for the project and scratch spaces.

Filesystem Characteristics

Filesystem | Quotas | Backed up? | Purged? | Available by Default? | Mounted on Compute Nodes?
Home Space | 50 GB, 500K files | Yes | No | Yes | Yes
Scratch Space | 20 TB and 1000K files per user, 100 TB and 10M files per group | No | Yes, all files older than 90 days | Yes | Yes
Project Space | Up to 10 TB and 5M files per group, 500K files per user | Yes | No | Yes | Yes
Nearline Space | 5 TB per group | No | No | No | No
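
To see how a directory tree compares with the file-count and size quotas above, and which files might fall under the 90-day scratch purge, a rough check can be scripted. This is a sketch under assumptions: the function name summarize_tree is hypothetical, and it uses each file's modification time to judge age, which may not be the criterion the actual purge applies.

```python
import os
import time
from pathlib import Path


def summarize_tree(root, max_age_days=90, now=None):
    """Return (file_count, total_bytes, stale_files) for everything under root.

    stale_files lists files whose modification time is older than max_age_days.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    n_files = 0
    total_bytes = 0
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.lstat()  # lstat does not follow symbolic links
            except OSError:
                continue  # the file vanished or is unreadable; skip it
            n_files += 1
            total_bytes += info.st_size
            if info.st_mtime < cutoff:
                stale.append(path)
    return n_files, total_bytes, stale
```

Walking a large tree generates many small metadata operations on a shared filesystem, so a check like this is best run occasionally rather than in a loop.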

See also