Storage and file management: Difference between revisions

no edit summary
(Marked this version for translation)
No edit summary
Line 4: Line 4:


<!--T:2-->
<!--T:2-->
Compute Canada provides a wide range of storage options to cover the needs of our very diverse users. These storage solutions range from high-speed temporary local storage to different kinds of long-term storage, so you can choose the storage medium that best corresponds to your needs and usage patterns. In most cases the [https://en.wikipedia.org/wiki/File_system filesystems] on Compute Canada systems are a ''shared'' resource and for this reason should be used responsibly: unwise behaviour can negatively affect dozens or hundreds of other users. These filesystems are also designed to store a limited number of very large files, typically binary rather than text files, i.e. they are not directly human-readable. You should therefore avoid storing thousands of small files, where small means less than a few megabytes, particularly in the same directory. A better approach is to use commands like [[Archiving and compressing files|<tt>tar</tt>]] or <tt>zip</tt> to convert a directory containing many small files into a single very large archive file.
Compute Canada provides a wide range of storage options to cover the needs of our very diverse users. These storage solutions range from high-speed temporary local storage to different kinds of long-term storage, so you can choose the storage medium that best corresponds to your needs and usage patterns. In most cases the [https://en.wikipedia.org/wiki/File_system filesystems] on Compute Canada systems are a ''shared'' resource and for this reason should be used responsibly: unwise behaviour can negatively affect dozens or hundreds of other users. These filesystems are also designed to store a limited number of very large files, which are typically binary, since very large text files (hundreds of MB or more) are of little practical use as human-readable text. You should therefore avoid storing tens of thousands of small files, where small means less than a few megabytes, particularly in the same directory. A better approach is to use commands like [[Archiving and compressing files|<tt>tar</tt>]] or <tt>zip</tt> to convert a directory containing many small files into a single very large archive file.
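As an illustration of the archiving advice above, the following is a minimal sketch that bundles a directory of small files into a single compressed archive with <tt>tar</tt>. The directory name <tt>results</tt>, the file names and the temporary working directory are assumptions made for the example; they are not paths that exist on the clusters.

```shell
# Create a throwaway working directory so the example does not touch real data.
workdir=$(mktemp -d)

# Simulate a directory containing several small files (hypothetical names).
mkdir "$workdir/results"
for i in 1 2 3 4 5; do
    printf 'sample %s\n' "$i" > "$workdir/results/file_$i.txt"
done

# Bundle the whole directory into one gzip-compressed archive.
# -C changes into $workdir first, so the archive stores the relative path "results/".
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents to verify it before deleting any originals.
tar -tzf "$workdir/results.tar.gz"

# Count the archived text files.
file_count=$(tar -tzf "$workdir/results.tar.gz" | grep -c '\.txt$')
echo "archived files: $file_count"
```

Once the listing has been checked, the original directory can be removed so that only the single archive remains on the shared filesystem; it can be restored later with <tt>tar -xzf</tt>.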


<!--T:3-->
<!--T:3-->
Line 13: Line 13:


<!--T:17-->
<!--T:17-->
When your account is created on Cedar and Graham, your home directory will not be entirely empty. It will contain references to your scratch and [[Project layout|project]] spaces through the mechanism of a [https://en.wikipedia.org/wiki/Symbolic_link symbolic link], a kind of shortcut that allows easy access to these other filesystems from your home directory. Note that these symbolic links may appear up to a few hours after you first connect to the cluster. While your home and scratch spaces are unique to you as an individual user, the project space is shared by a research group. This group may consist of those individuals with a Compute Canada account sponsored by a particular faculty member or members of a [https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/ RAC allocation]. A given individual may thus have access to several different project spaces, associated with one or more faculty members, with symbolic links to these different project spaces in the <tt>projects</tt> directory of your home. Every account has one or more projects. In the folder <tt>projects</tt> within your home directory, you have a link to each of the projects you have access to. For users with a single active sponsored role, the default project is that of their sponsor, while users with more than one active sponsored role will have a default project that corresponds to the default project of the faculty member with the most sponsored accounts.
When your account is created on a Compute Canada cluster, your home directory will not be entirely empty. It will contain references to your scratch and [[Project layout|project]] spaces through the mechanism of a [https://en.wikipedia.org/wiki/Symbolic_link symbolic link], a kind of shortcut that allows easy access to these other filesystems from your home directory. Note that these symbolic links may appear up to a few hours after you first connect to the cluster. While your home and scratch spaces are unique to you as an individual user, the project space is shared by a research group. This group may consist of those individuals with a Compute Canada account sponsored by a particular faculty member or members of a [https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/ RAC allocation]. A given individual may thus have access to several different project spaces, associated with one or more faculty members, with symbolic links to these different project spaces in the <tt>projects</tt> directory of your home. Every account has one or more projects. In the folder <tt>projects</tt> within your home directory, you have a link to each of the projects you have access to. For users with a single active sponsored role, the default project is that of their sponsor, while users with more than one active sponsored role will have a default project that corresponds to the default project of the faculty member with the most sponsored accounts.
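To make the symbolic-link mechanism concrete, the sketch below recreates the layout in a temporary directory: a stand-in project space and a <tt>projects</tt> folder containing a link to it. The project name <tt>def-sponsor</tt> and all paths are hypothetical and chosen only for the example; the real locations on a cluster will differ.

```shell
# Build a small stand-in for the layout described above (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/project/def-sponsor" "$demo/home/projects"

# Create the symbolic link: the "projects" folder in the mock home directory
# points at the mock project space, which lives on a different path.
ln -s "$demo/project/def-sponsor" "$demo/home/projects/def-sponsor"

# A long listing shows each link and the target it points to.
ls -l "$demo/home/projects"

# readlink prints the target of a single link.
link_target=$(readlink "$demo/home/projects/def-sponsor")
echo "link points to: $link_target"
```

On a cluster, running <tt>ls -l</tt> on the <tt>projects</tt> folder in your home directory should show, in the same way, which project spaces your account can reach.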


<!--T:16-->
<!--T:16-->
All users can check the available disk space and the current disk utilization for the ''project'', ''home'' and ''scratch'' file systems with the command line utility '''''diskusage_report''''', available on both '''Cedar''' and '''Graham'''. To use this utility, log into Cedar or Graham using SSH, type <tt>diskusage_report</tt> at the command prompt, and press the Enter key. The following is typical output of this utility:
All users can check the available disk space and the current disk utilization for the ''project'', ''home'' and ''scratch'' file systems with the command line utility '''''diskusage_report''''', available on both Compute Canada clusters. To use this utility, log into either cluster using SSH, type <tt>diskusage_report</tt> at the command prompt, and press the Enter key. The following is typical output of this utility:
<pre>
<pre>
# diskusage_report
# diskusage_report