https://docs.alliancecan.ca/mediawiki/api.php?action=feedcontributions&user=Rptaylor&feedformat=atomAlliance Doc - User contributions [en]2024-03-29T04:40:37ZUser contributionsMediaWiki 1.39.6https://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=150981CVMFS2024-03-08T18:48:28Z<p>Rptaylor: update monitor link</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes the CERN Virtual Machine File System (CVMFS). We use CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access content, and to the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. It is designed to deliver software in a fast, scalable and reliable fashion, and is now also used to distribute data. The scale of usage across dozens of projects involves ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [https://cvmfs-monitor-frontend.web.cern.ch/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of the software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository;<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0;<br />
** HTTP proxy servers cache requests from clients to stratum 1 servers;<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 2021-09-27 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimized scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
* [https://indico.cern.ch/event/1079490/contributions/4939532/ 2022-09-13 CVMFS in Canadian Advanced Research Computing] 2022 CernVM Workshop<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Cloud_storage_options&diff=149868Cloud storage options2024-02-05T22:35:59Z<p>Rptaylor: IP-based ACLs are only applicable for S3</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
<!--T:1--><br />
The existing storage types available in our clouds are:<br />
<br />
<!--T:2--><br />
* <b>[[Working_with_volumes | Volume storage]]</b>: The standard storage unit for cloud computing; can be attached to and detached from an instance. <br />
* <b>Ephemeral/Disk storage</b>: Virtual local disk storage tied to the lifecycle of a single instance.<br />
* <b>[[Arbutus object storage | Object storage]]</b>: Non-hierarchical storage where data is created or uploaded in whole-file form.<br />
* <b>[[Arbutus_CephFS | Shared filesystem storage]]</b>: Private network attached storage space (similar to NFS/SMB shares); must be configured on each instance where it is mounted.<br />
<br />
<!--T:3--><br />
Attributes of each storage type are compared in the following table:<br />
<br />
<!--T:4--><br />
{| class="wikitable sortable"<br />
! Attribute !! Volume storage !! Ephemeral/Disk storage !! Object storage !! Shared filesystem storage <br />
|-<br />
| Default storage option || Yes || Yes || No || No<br />
|-<br />
| Can be accessed via a web browser || No || No || Yes || No <br />
|-<br />
| Access can be restricted for specific source IP ranges || N/A || N/A || Yes (S3 ACL) || N/A <br />
|-<br />
| Can be mounted on a single VM || Yes || Yes || No || Yes <br />
|-<br />
| Can be mounted on multiple VMs (and across projects) simultaneously || No || No || No || Yes <br />
|-<br />
| Automatic backups || No (manually with snapshots) || No || No || Yes (nightly to tape)<br />
|-<br />
| Suitable for write once, read only, and public access || No || No || Yes || No <br />
|-<br />
| Suitable for data/files that change frequently || Yes || Yes || No || Yes<br />
|-<br />
| Hierarchical filesystem || Yes || Yes || No || Yes <br />
|-<br />
| Suitable for long-term storage || Yes || No || Yes || Yes <br />
|-<br />
| Suitable mountable dedicated storage for individual servers || Yes || Only for temporary data || No || No <br />
|-<br />
| Deleted automatically upon deletion of VM || No || Yes || No || No <br />
|- <br />
| Standard magnitude of allocation || GB || GB || TB || TB <br />
|- <br />
| Multi-disk fault tolerance || Yes || c-flavors No; p-flavors Yes || Yes || Yes <br />
|- <br />
| Physical disk-level encryption || No || No || No || No <br />
|- <br />
|}<br />
<br />
</translate><br />
[[Category:Cloud]]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Cloud_storage_options&diff=149867Cloud storage options2024-02-05T22:18:12Z<p>Rptaylor: explain which types are suitable for servers</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
<!--T:1--><br />
The existing storage types available in our clouds are:<br />
<br />
<!--T:2--><br />
* <b>[[Working_with_volumes | Volume storage]]</b>: The standard storage unit for cloud computing; can be attached to and detached from an instance. <br />
* <b>Ephemeral/Disk storage</b>: Virtual local disk storage tied to the lifecycle of a single instance.<br />
* <b>[[Arbutus object storage | Object storage]]</b>: Non-hierarchical storage where data is created or uploaded in whole-file form.<br />
* <b>[[Arbutus_CephFS | Shared filesystem storage]]</b>: Private network attached storage space (similar to NFS/SMB shares); must be configured on each instance where it is mounted.<br />
<br />
<!--T:3--><br />
Attributes of each storage type are compared in the following table:<br />
<br />
<!--T:4--><br />
{| class="wikitable sortable"<br />
! Attribute !! Volume storage !! Ephemeral/Disk storage !! Object storage !! Shared filesystem storage <br />
|-<br />
| Default storage option || Yes || Yes || No || No<br />
|-<br />
| Can be accessed via a web browser || No || No || Yes || No <br />
|-<br />
| Access can be restricted for specific source IP ranges || Yes || Yes || Yes (S3 ACL) || Yes <br />
|-<br />
| Can be mounted on a single VM || Yes || Yes || No || Yes <br />
|-<br />
| Can be mounted on multiple VMs (and across projects) simultaneously || No || No || No || Yes <br />
|-<br />
| Automatic backups || No (manually with snapshots) || No || No || Yes (nightly to tape)<br />
|-<br />
| Suitable for write once, read only, and public access || No || No || Yes || No <br />
|-<br />
| Suitable for data/files that change frequently || Yes || Yes || No || Yes<br />
|-<br />
| Hierarchical filesystem || Yes || Yes || No || Yes <br />
|-<br />
| Suitable for long-term storage || Yes || No || Yes || Yes <br />
|-<br />
| Suitable mountable dedicated storage for individual servers || Yes || Only for temporary data || No || No <br />
|-<br />
| Deleted automatically upon deletion of VM || No || Yes || No || No <br />
|- <br />
| Standard magnitude of allocation || GB || GB || TB || TB <br />
|- <br />
| Multi-disk fault tolerance || Yes || c-flavors No; p-flavors Yes || Yes || Yes <br />
|- <br />
| Physical disk-level encryption || No || No || No || No <br />
|- <br />
|}<br />
<br />
</translate><br />
[[Category:Cloud]]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=141557Accessing CVMFS2023-07-28T19:28:15Z<p>Rptaylor: mention CVMFS_REPOSITORIES</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer for our 2016 and 2018 environments, and 3.2 or newer for the 2020 environment. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:72--><br />
Refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
For standard client configuration, refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
<!--T:73--><br />
The <tt>soft.computecanada.ca</tt> repository is provided by the default configuration, so no additional steps are required to access it (though you may wish to include it in <tt>CVMFS_REPOSITORIES</tt> in your client configuration).<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* First ensure that the repositories you want to test are listed in <code>CVMFS_REPOSITORIES</code>.<br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on our software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Multifactor_authentication&diff=136370Multifactor authentication2023-05-16T18:47:40Z<p>Rptaylor: fix typo</p>
<hr />
<div><languages /><br />
<br />
<translate><br />
<br />
<!--T:26--><br />
{{Panel<br />
|title=This topic is in testing phase<br />
|panelstyle=draft<br />
|content=<b>This article currently applies only to staff members</b>: Multifactor authentication is still being tested by staff members. It will be made available to all users as an option at some later date. <br />
[[Category:Draft]]<br />
}}<br />
<br />
<!--T:1--><br />
Multifactor authentication (MFA) allows you to protect your account with more than a password. Once your account is configured to use this feature, you will need to enter your username and password as usual, and then perform a second action (the <i>second factor</i>) to access most of our services. <br><br />
<br />
<!--T:21--><br />
You can choose any of these factors for this second authentication step:<br />
*Approving a notification on a smart device through the Duo Mobile application.<br />
*Entering a code generated on demand.<br />
*Pushing a button on a hardware key (YubiKey).<br />
<br />
<!--T:22--><br />
This feature will be progressively deployed, that is, it will not be immediately available for all our services.<br />
<br />
= Registering factors = <!--T:2--><br />
== Registering multiple factors ==<br />
When you enable multifactor authentication for your account, we <b>strongly recommend</b> that you configure at least two options for your second factor. For example, you can use a phone and single-use codes; a phone and a hardware key; or two hardware keys. This will ensure that if you lose one factor, you can still use your other one to access your account.<br />
<br />
== To use a smartphone or tablet == <!--T:3--><br />
#Install the Duo Mobile authentication application from the [https://itunes.apple.com/us/app/duo-mobile/id422663827 Apple Store] or on [https://play.google.com/store/apps/details?id=com.duosecurity.duomobile Google Play]<br />
#Go to the [https://ccdb.alliancecan.ca CCDB], connect to your account and select <i>My account → [https://ccdb.alliancecan.ca/multi_factor_authentications Multifactor authentication management]</i>.<br />
#Under <i>Register a device</i>, click on <i>Duo Mobile</i>.<br />
#Enter a name for your device.<br />
#In the Duo Mobile application, click the "+" sign to add a new account, and scan the QR code that is shown to you.<br />
<br />
== To use a YubiKey == <!--T:4--><br />
A YubiKey is a hardware token made by the [https://www.yubico.com/ Yubico] company. If you do not have a smartphone or tablet, do not wish to use your phone or tablet for multifactor authentication, or are often in a situation when using your phone or tablet is not possible, then a YubiKey is your best option.<br />
<br />
<!--T:23--><br />
A YubiKey is the size of a small USB stick and costs between $50 and $100. Different models can fit in USB-A, USB-C, or Lightning ports, and some also support near-field communication (NFC) for use with a phone or tablet.<br />
<br />
<!--T:5--><br />
Among the many protocols supported by YubiKeys, the one which works with SSH connections to our clusters is the Yubico One-Time Password (OTP). After you have registered a YubiKey for multifactor authentication, when you log in to one of our clusters you will be prompted for a one-time password (OTP). You respond by touching a button on your YubiKey, which generates and transmits a string of 32 characters to complete your authentication.<br />
<br />
<!--T:6--><br />
To register your YubiKey you will need its Public ID, Private ID, and Secret Key. If you have this information, go to the [https://ccdb.computecanada.ca/multi_factor_authentications Multifactor authentication management page]. If you do not have this information, configure your key using the steps below.<br />
<br />
=== Configuring your YubiKey for Yubico OTP === <!--T:7--><br />
<br />
<!--T:8--><br />
# Download and install the YubiKey Manager software from the [https://www.yubico.com/support/download/yubikey-manager/ Yubico website].<br />
# Insert your YubiKey and launch the YubiKey Manager software.<br />
# In the YubiKey Manager software, select <i>Applications</i>, then <i>OTP</i>. (Images below illustrate this and the next few steps.)<br />
# Select <i>Configure</i> for either slot 1 or slot 2. Slot 1 corresponds to a short touch (pressing for 1s to 2.5), while slot 2 is a long touch on the key (pressing for 3s to 5s). Slot 1 is typically pre-registered for Yubico cloud mode. If you are already using this slot for other services, either use slot 2, or click on <i>Swap</i> to transfer the configuration to slot 2 before configuring slot 1. <br />
# Select <i>Yubico OTP</i>.<br />
# Select <i>Use serial</i>, then generate a private ID and a secret key. <b>Securely save a copy of the data in the Public ID, Private ID, and Secret Key fields before you click on <i>Finish</i>, as you will need the data for the next step.</b><br />
# <b>IMPORTANT: Make sure you clicked on "Finish" in the previous step.</b><br />
# Log into the CCDB to register your YubiKey in the <i>[https://ccdb.alliancecan.ca/multi_factor_authentications Multifactor authentication management page]</i>.<br />
<gallery widths=300px heights=300px><br />
File:Yubico Manager OTP.png|Step 3<br />
File:Yubico Manager OTP configuration.png|Step 4<br />
File:Select Yubico OTP.png|Step 5<br />
File:Generate Yubikey IDs.png|Step 6, Step 7<br />
CCDB Yubikeys.png|Step 8<br />
</gallery><br />
<br />
=== Configuring your YubiKey for Yubico OTP using the Command Line (<code>ykman</code>)=== <!--T:27--><br />
# Install the command line YubiKey Manager software (<code>ykman</code>) following instructions for your OS from Yubico's [https://docs.yubico.com/software/yubikey/tools/ykman/Install_ykman.html#download-ykman ykman guide].<br />
# Insert your YubiKey and read key information with the command <code>ykman info</code>.<br />
# Read OTP information with the command <code>ykman otp info</code>.<br />
# Select the slot you wish to program and use the command <code>ykman otp yubiotp</code> to program it.<br />
# <b>Securely save a copy of the data in the Public ID, Private ID, and Secret Key fields. You will need the data for the next step.</b><br />
# Log into the CCDB to register your YubiKey in the <i>[https://ccdb.alliancecan.ca/multi_factor_authentications Multifactor authentication management page]</i>.<br />
<br />
<!--T:28--><br />
:<source lang="console"><br />
[name@yourLaptop]$ ykman otp yubiotp -uGgP vvcccctffclk 2<br />
Using a randomly generated private ID: bc3dd98eaa12<br />
Using a randomly generated secret key: ae012f11bc5a00d3cac00f1d57aa0b12<br />
Upload credential to YubiCloud? [y/N]: y<br />
Upload to YubiCloud initiated successfully.<br />
Program an OTP credential in slot 2? [y/N]: y<br />
Opening upload form in browser: https://upload.yubico.com/proceed/4567ad02-c3a2-1234-a1c3-abe3f4d21c69<br />
</source><br />
<br />
= Using your second factor = <!--T:9--><br />
== When connecting via SSH == <br />
If your account has multifactor authentication enabled, when you connect via SSH to a cluster which supports MFA, you will be prompted to use your second factor after you first use either your password or your [[SSH Keys|SSH key]]. This prompt will look like this:<br />
{{Command|ssh cluster.computecanada.ca<br />
|result= Duo two-factor login for name<br />
<br />
<!--T:10--><br />
Enter a passcode or select one of the following options:<br />
<br />
<!--T:11--><br />
1. Duo Push to My phone (iOS)<br />
<br />
<!--T:12--><br />
Passcode or option (1-1):}}<br />
At this point, you can select which phone or tablet you want Duo to send a notification to. If you have multiple devices enrolled, you will be shown a list. You will then get a notification on your device, which you accept to complete the authentication.<br />
<br />
<!--T:13--><br />
If you are using a YubiKey, a backup code, or if you prefer to enter the time-based one-time password that the Duo Mobile application shows, you would write these instead of selecting an option. For example:<br />
{{Command|ssh cluster.computecanada.ca<br />
|result= Duo two-factor login for name<br />
<br />
<!--T:14--><br />
Enter a passcode or select one of the following options:<br />
<br />
<!--T:15--><br />
1. Duo Push to My phone (iOS)<br />
<br />
<!--T:16--><br />
Passcode or option (1-1):vvcccbhbllnuuebegkkbcfdftndjijlneejilrgiguki<br />
Success. Logging you in...}}<br />
<br />
=== Configuring your SSH client to only ask every so often === <!--T:17--><br />
If you use OpenSSH to connect, you can reduce the frequency with which you are asked for a second factor. To do so, edit your <code>.ssh/config</code> to add the lines:<br />
<br />
<!--T:24--><br />
<pre><br />
Host HOSTNAME<br />
ControlPath ~/.ssh/cm-%r@%h:%p<br />
ControlMaster auto<br />
ControlPersist 10m<br />
</pre><br />
where you would replace <code>HOSTNAME</code> with the host name of the server for which you want this configuration.<br />
<br />
== When authenticating to our account portal == <!--T:18--><br />
Once multifactor authentication is enabled on your account, you will be required to use it when connecting to our account portal. After entering your username and password, you will see a prompt similar to this, where you click on the option you want to use. <br><br />
(Note: <i>This screen will be updated</i>.)<br />
<gallery widths=300px heights=300px><br />
File:CCDB MFA prompt.png<br />
</gallery><br />
<br />
= Frequently asked questions = <!--T:19--><br />
== I have an Android phone which is older than Android 9. I do not find the Duo Mobile application. Can I still use Duo ? ==<br />
Yes. However, you have to download the application from the Duo website. See [https://help.duo.com/s/article/2211?language=en_US this page] for more details. <br />
<br />
== I do not have a smartphone or tablet, or they are too old. Can I still use multifactor authentication? == <!--T:25--><br />
Yes. In this case, you need [[#To use a YubiKey|to use a YubiKey]].<br />
<br />
== I have lost my second factor device. What can I do? == <!--T:20--><br />
* If you have backup codes, or if you have more than one device, use that other mechanism to connect to your account on our [https://ccdb.alliancecan.ca/multi_factor_authentications account portal], and then delete your lost device from the list. Then, register a new device. <br />
* If you do not have backup codes, or if have lost all of your devices, contact [[technical support]] for assistance.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Cloud_resources&diff=124463Cloud resources2022-11-30T21:47:01Z<p>Rptaylor: /* Arbutus cloud */ revert - persistent nodes are 10x CPU oversubscribed</p>
<hr />
<div><languages/><br />
<translate><br />
''Parent page: [[Cloud]]''<br />
==Hardware== <!--T:1--><br />
===Arbutus cloud===<br />
Address: [https://arbutus.cloud.computecanada.ca arbutus.cloud.computecanada.ca]<br />
<br />
<!--T:13--><br />
{| class="wikitable sortable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 156 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || N/A || 6,240 || 12,480<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 1024 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1] || 1 x 25GbE || N/A || 320 || 6,400<br />
|-<br />
| 26 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.6TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || 4 x [https://www.nvidia.com/en-us/data-center/v100/ V100 32GB] || 1,040 || 2,080<br />
|-<br />
| 32 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 1 x 10GbE || N/A || 1,024 || 2,048<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 768 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 2 x 10GbE || N/A || 128 || 2,560<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||4 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 256 || 512<br />
|-<br />
| 240 || 2 x [https://ark.intel.com/products/91754/Intel-Xeon-Processor-E5-2680-v4-35M-Cache-2_40-GHz E5-2680 v4] || 256 ||4 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 6,720 || 13,440<br />
|-<br />
| 8 || 2 x E5-2680 v4 || 512 || 4 x 900GB 10k SAS in RAID5 || 2 x 10GbE || N/A || 224 || 4,480<br />
|-<br />
| 2 || 2 x E5-2680 v4 || 128 || 4 x 900GB 10k SAS in RAID5 || 1 x 10GbE || 2 x [https://www.nvidia.com/en-us/data-center/tesla-k80/ Tesla K80] || 56 || 112<br />
|}<br />
Location: University of Victoria<br/><br />
Total CPUs: 16,008 (484 nodes)<br/><br />
Total vCPUs: 44,112<br/><br />
Total GPUs: 108 (28 nodes)<br/><br />
Total RAM: 157,184 GB<br/><br />
5.3 PB of Volume and Snapshot [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
12 PB of Object/Shared Filesystem [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
<br />
===Cedar cloud=== <!--T:3--><br />
Address: [http://cedar.cloud.computecanada.ca cedar.cloud.computecanada.ca]<br />
<br />
<!--T:14--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 28 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 896 || 1,792<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 128 || 256<br />
|}<br />
Location: Simon Fraser University<br/><br />
Total CPUs: 1,024<br/><br />
Total vCPUs: 2,048<br/><br />
Total RAM: 7,680 GB<br/><br />
500 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Graham cloud=== <!--T:8--><br />
Address: [https://graham.cloud.computecanada.ca graham.cloud.computecanada.ca]<br />
<br />
<!--T:15--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUS<br />
|-<br />
| 6 || 2 x E5-2683 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 192 || <br />
|-<br />
| 2 || 2 x E5-2683 v4 || 512 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 64 || <br />
|-<br />
| 8 || 2 x E5-2637 v4 || 128 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 8 || 2 x Xeon(R) Gold 6130 CPU || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 3 || 2 x E5-2640 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 120 ||<br />
|-<br />
| 12 || 2 x Xeon(R) Gold 6248 CPU || 768 || 2x 1TB SSD in RAID0 || 1 x 10GbE || N/A || 480 || <br />
|-<br />
|}<br />
Location: University of Waterloo<br/><br />
Total CPUs: 1,368<br/><br />
Total vCPUs: <br/><br />
Total RAM: 15,616 GB<br/><br />
84 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Béluga cloud=== <!--T:12--><br />
Address: [https://beluga.cloud.computecanada.ca beluga.cloud.computecanada.ca]<br />
<br />
<!--T:16--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 96 || 2 x Intel Xeon Gold 5218 || 256 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 3,072 || 6,144<br />
|-<br />
| 16 || 2 x Intel Xeon Gold 5218 || 768 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 512 || 1,024<br />
|-<br />
|}<br />
Location: École de Technologie Supérieure<br/><br />
Total CPUs: 3,584<br/><br />
Total vCPUs: 7,168<br/><br />
Total RAM: 36,864 GiB<br/><br />
200 TiB of replicated persistent SSD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
1.7 PiB of erasure coded persistent HDD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
==Software== <!--T:2--><br />
Compute Canada cloud OpenStack platform versions as of March 11, 2021<br/><br />
* Arbutus: Ussuri<br />
* Cedar: Train<br />
* Graham: Ussuri<br />
* Béluga: Victoria<br />
<br />
<br />
<br />
<!--T:4--><br />
See the [http://releases.openstack.org/ OpenStack releases] for a list of all OpenStack versions.<br />
<br />
==Images== <!--T:9--><br />
Below are the images provided by Compute Canada staff on the Compute Canada Clouds. New images will be added periodically as new releases and updates become available. As releases have an end of life (EOL) after which support and updates are no longer provided, we encourage you to migrate systems and platforms to newer releases in order to continue receiving patches and security updates. The EOL dates listed in the table are the dates at which these images will be removed from the Compute Canada clouds.<br />
<br />
<!--T:10--><br />
For more details about using images see [[Working_with_images|working with images]].<br />
<br />
<!--T:11--><br />
{| class="wikitable sortable" style="width:85%"<br />
! style="width: 15%" align="center" | Name<br />
! style="width: 25%" align="center" | Cloud<br />
! style="width: 15%" align="center" | End Of Life<br />
|-<br />
| align="center" | CentOS-8-x64-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''Dec 31, 2021''' <ref name="accelerated">Accelerated end-of-life for CentOS 8 [https://blog.centos.org/2020/12/future-is-centos-stream/ announced Dec 2020].</ref><br />
|-<br />
| align="center" | CentOS-7-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2024<br />
|-<br />
| align="center" | CentOS-6-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020''' <ref name="removed">These images have been removed from Compute Canada clouds.</ref><br />
|-<br />
| align="center" | Debian-10.6.2-Buster-x64-2020-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2023<br />
|-<br />
| align="center" | Debian-10.2.0-Buster-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2022<br />
|-<br />
| align="center" | Debian-9.11.6-Stretch-2019-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Fedora-33-1.2-x64-2020-10<br />
| align="center" | Arbutus<br />
| align="center" | TBD Fedora 33 will be maintained until four weeks after the release of Fedora 35<br />
|-<br />
| align="center" | Fedora-32-1.6-x64-2020-04<br />
| align="center" | Arbutus<br />
| align="center" | May 18, 2021<br />
|-<br />
| align="center" | Fedora-31-1.9-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | November 24, 2020<br />
|-<br />
| align="center" | Fedora-30-1.2-x86-2019-07<br />
| align="center" | Arbutus, East<br />
| align="center" | May 26, 2020<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-minimal-x64-2020-12<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-x64-2020-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-minimal-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-minimal-x64-2020-01<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
|}<br />
<br />
<!--T:6--><br />
[[Category:CC-Cloud]]<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Cloud_resources&diff=124451Cloud resources2022-11-30T21:40:00Z<p>Rptaylor: /* Arbutus cloud */ fix typo</p>
<hr />
<div><languages/><br />
<translate><br />
''Parent page: [[Cloud]]''<br />
==Hardware== <!--T:1--><br />
===Arbutus cloud===<br />
Address: [https://arbutus.cloud.computecanada.ca arbutus.cloud.computecanada.ca]<br />
<br />
<!--T:13--><br />
{| class="wikitable sortable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 156 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || N/A || 6,240 || 12,480<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 1024 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1] || 1 x 25GbE || N/A || 320 || 640<br />
|-<br />
| 26 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.6TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || 4 x [https://www.nvidia.com/en-us/data-center/v100/ V100 32GB] || 1,040 || 2,080<br />
|-<br />
| 32 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 1 x 10GbE || N/A || 1,024 || 2,048<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 768 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 2 x 10GbE || N/A || 128 || 2,560<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||4 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 256 || 512<br />
|-<br />
| 240 || 2 x [https://ark.intel.com/products/91754/Intel-Xeon-Processor-E5-2680-v4-35M-Cache-2_40-GHz E5-2680 v4] || 256 ||4 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 6,720 || 13,440<br />
|-<br />
| 8 || 2 x E5-2680 v4 || 512 || 4 x 900GB 10k SAS in RAID5 || 2 x 10GbE || N/A || 224 || 4,480<br />
|-<br />
| 2 || 2 x E5-2680 v4 || 128 || 4 x 900GB 10k SAS in RAID5 || 1 x 10GbE || 2 x [https://www.nvidia.com/en-us/data-center/tesla-k80/ Tesla K80] || 56 || 112<br />
|}<br />
Location: University of Victoria<br/><br />
Total CPUs: 16,008 (484 nodes)<br/><br />
Total vCPUs: 44,112<br/><br />
Total GPUs: 108 (28 nodes)<br/><br />
Total RAM: 157,184 GB<br/><br />
5.3 PB of Volume and Snapshot [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
12 PB of Object/Shared Filesystem [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
<br />
===Cedar cloud=== <!--T:3--><br />
Address: [http://cedar.cloud.computecanada.ca cedar.cloud.computecanada.ca]<br />
<br />
<!--T:14--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 28 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 896 || 1,792<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 128 || 256<br />
|}<br />
Location: Simon Fraser University<br/><br />
Total CPUs: 1,024<br/><br />
Total vCPUs: 2,048<br/><br />
Total RAM: 7,680 GB<br/><br />
500 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Graham cloud=== <!--T:8--><br />
Address: [https://graham.cloud.computecanada.ca graham.cloud.computecanada.ca]<br />
<br />
<!--T:15--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUS<br />
|-<br />
| 6 || 2 x E5-2683 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 192 || <br />
|-<br />
| 2 || 2 x E5-2683 v4 || 512 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 64 || <br />
|-<br />
| 8 || 2 x E5-2637 v4 || 128 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 8 || 2 x Xeon(R) Gold 6130 CPU || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 3 || 2 x E5-2640 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 120 ||<br />
|-<br />
| 12 || 2 x Xeon(R) Gold 6248 CPU || 768 || 2x 1TB SSD in RAID0 || 1 x 10GbE || N/A || 480 || <br />
|-<br />
|}<br />
Location: University of Waterloo<br/><br />
Total CPUs: 1,368<br/><br />
Total vCPUs: <br/><br />
Total RAM: 15,616 GB<br/><br />
84 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Béluga cloud=== <!--T:12--><br />
Address: [https://beluga.cloud.computecanada.ca beluga.cloud.computecanada.ca]<br />
<br />
<!--T:16--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 96 || 2 x Intel Xeon Gold 5218 || 256 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 3,072 || 6,144<br />
|-<br />
| 16 || 2 x Intel Xeon Gold 5218 || 768 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 512 || 1,024<br />
|-<br />
|}<br />
Location: École de Technologie Supérieure<br/><br />
Total CPUs: 3,584<br/><br />
Total vCPUs: 7,168<br/><br />
Total RAM: 36,864 GiB<br/><br />
200 TiB of replicated persistent SSD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
1.7 PiB of erasure coded persistent HDD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
==Software== <!--T:2--><br />
Compute Canada cloud OpenStack platform versions as of March 11, 2021<br/><br />
* Arbutus: Ussuri<br />
* Cedar: Train<br />
* Graham: Ussuri<br />
* Béluga: Victoria<br />
<br />
<br />
<br />
<!--T:4--><br />
See the [http://releases.openstack.org/ OpenStack releases] for a list of all OpenStack versions.<br />
<br />
==Images== <!--T:9--><br />
Below are the images provided by Compute Canada staff on the Compute Canada Clouds. New images will be added periodically as new releases and updates become available. As releases have an end of life (EOL) after which support and updates are no longer provided, we encourage you to migrate systems and platforms to newer releases in order to continue receiving patches and security updates. The EOL dates listed in the table are the dates at which these images will be removed from the Compute Canada clouds.<br />
<br />
<!--T:10--><br />
For more details about using images see [[Working_with_images|working with images]].<br />
<br />
<!--T:11--><br />
{| class="wikitable sortable" style="width:85%"<br />
! style="width: 15%" align="center" | Name<br />
! style="width: 25%" align="center" | Cloud<br />
! style="width: 15%" align="center" | End Of Life<br />
|-<br />
| align="center" | CentOS-8-x64-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''Dec 31, 2021''' <ref name="accelerated">Accelerated end-of-life for CentOS 8 [https://blog.centos.org/2020/12/future-is-centos-stream/ announced Dec 2020].</ref><br />
|-<br />
| align="center" | CentOS-7-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2024<br />
|-<br />
| align="center" | CentOS-6-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020''' <ref name="removed">These images have been removed from Compute Canada clouds.</ref><br />
|-<br />
| align="center" | Debian-10.6.2-Buster-x64-2020-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2023<br />
|-<br />
| align="center" | Debian-10.2.0-Buster-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2022<br />
|-<br />
| align="center" | Debian-9.11.6-Stretch-2019-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Fedora-33-1.2-x64-2020-10<br />
| align="center" | Arbutus<br />
| align="center" | TBD Fedora 33 will be maintained until four weeks after the release of Fedora 35<br />
|-<br />
| align="center" | Fedora-32-1.6-x64-2020-04<br />
| align="center" | Arbutus<br />
| align="center" | May 18, 2021<br />
|-<br />
| align="center" | Fedora-31-1.9-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | November 24, 2020<br />
|-<br />
| align="center" | Fedora-30-1.2-x86-2019-07<br />
| align="center" | Arbutus, East<br />
| align="center" | May 26, 2020<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-minimal-x64-2020-12<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-x64-2020-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-minimal-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-minimal-x64-2020-01<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
|}<br />
<br />
<!--T:6--><br />
[[Category:CC-Cloud]]<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Arbutus_object_storage&diff=121175Arbutus object storage2022-10-25T20:51:52Z<p>Rptaylor: clean up URL</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
= Introduction = <!--T:1--><br />
<br />
<!--T:27--><br />
Object storage is a storage facility that is simpler than a normal hierarchical filesystem, but benefits by avoiding some performance bottlenecks.<br />
<br />
<!--T:28--><br />
An object is a fixed file in a flat namespace: you can create/upload an object as a whole, but cannot modify bytes within it. Objects are named as bucket:tag with no further nesting. Since bucket operations are basically whole-file, the provider can use a simpler internal representation. The flat namespace allows the provider to avoid metadata bottlenecks; it's basically a key-value store.<br />
<br />
<!--T:29--><br />
The best use of object storage is to store and export items which do not need hierarchical naming; are accessed mostly atomically and mostly read-only; and with simplified access-control rules.<br />
<br />
<!--T:2--><br />
All Arbutus projects are allocated a default 1TB of Object Store. If more is required, you can either apply for a RAS allocation or a RAC allocation. <br />
<br />
<!--T:3--><br />
We offer access to the Object Store via two different protocols: Swift or S3.<br />
<br />
<!--T:5--><br />
These protocols are very similar and in most situations you can use whichever you like. You don't have to commit to one, as buckets and objects created with Swift or S3 can be accessed using both protocols. There are a few key differences in the context of Arbutus Object Store.<br />
<br />
<!--T:6--><br />
Swift is given by default and is simpler since you do not have to manage credentials yourself. Access is governed using your Arbutus account. However, Swift does not replicate all the functionality of S3. The main use case here is when you want to manage your buckets using bucket policies you must use S3 as Swift does not support bucket policies. You can also create and manage your own keys using S3, which could be useful if you for example want to create a read-only user for a specific application. A full list of Swift/S3 compatibility can be found here: <br />
<br />
<!--T:7--><br />
https://docs.openstack.org/swift/latest/s3_compat.html<br />
<br />
= Accessing and managing Object Store = <!--T:8--><br />
<br />
<!--T:10--><br />
You can manage your object storage using the Object Store tab for your project at https://arbutus.cloud.computecanada.ca/. This interface refers to buckets as containers (not to be confused with containers based on namespace functionality of the Linux kernel). You can create containers (AKA buckets) in this interface, upload files, and create directories. Containers can also be created using S3-compatible CLI clients. <br />
Please note that if you create a new container as ''Public'', any object placed within this container can be freely accessed (read-only) by anyone on the internet simply by navigating to <code><nowiki>https://object-arbutus.cloud.computecanada.ca/<YOUR CONTAINER NAME HERE>/<YOUR OBJECT NAME HERE></nowiki></code> with your container and object names inserted in place.<br />
<br />
<!--T:12--><br />
You can also use the OpenStack command line client.<br />
For instructions on how to install and operate the OpenStack command line clients, see [[OpenStack Command Line Clients]].<br />
<br />
<!--T:13--><br />
To generate your own S3 access ID and secret key for the S3 protocol, use the OpenStack command line client:<br />
<br />
<!--T:14--><br />
<code>openstack ec2 credentials create</code><br />
<br />
<!--T:15--><br />
The <tt>s3cmd</tt> tool which is available in Linux is the preferred way to access our S3 gateway; however there are [[Arbutus Object Storage Clients|other tools]] out there that will also work.<br />
<br />
<!--T:16--><br />
The users are responsible for operations inside the ''tenant''. As such, the buckets and management of those buckets are up to the user. <br />
<br />
=== General information === <!--T:17--><br />
<br />
<!--T:18--><br />
* Buckets are owned by the user who creates them, and no other user can manipulate them.<br />
* You can make a bucket accessible to the world, which then gives you a URL to share that will serve content from the bucket.<br />
* Bucket names must be unique across '''all''' users in the Object Store, so you may benefit by prefixing each bucket with your project name to maintain uniqueness. In other words, don't bother trying to create a bucket named ''test'', but ''def-myname-test'' is probably OK.<br />
* Bucket policies are managed via json files.<br />
<br />
= Connection details and s3cmd Configuration = <!--T:19--><br />
<br />
<!--T:20--><br />
Object storage is accessible via an HTTPS endpoint:<br />
<br />
<!--T:21--><br />
<code>object-arbutus.cloud.computecanada.ca:443</code><br />
<br />
<!--T:22--><br />
The following is an example of a minimal s3cmd configuration file. You will need these values, but are free to explore additional s3cmd configuration options to fit your use case. Note that in the example the keys are redacted and you will need to replace them with your provided key values:<br />
<br />
<!--T:23--><br />
<pre>[default]<br />
access_key = <redacted><br />
check_ssl_certificate = True<br />
check_ssl_hostname = True<br />
host_base = object-arbutus.cloud.computecanada.ca<br />
host_bucket = object-arbutus.cloud.computecanada.ca<br />
secret_key = <redacted><br />
use_https = True<br />
</pre><br />
<br />
<!--T:24--><br />
Using s3cmd's <code>--configure</code> feature is [[Arbutus_Object_Storage_Clients#Configuring_s3cmd | described here]].<br />
<br />
= Example operations on a bucket = <!--T:25--><br />
<br />
<!--T:26--><br />
<ul><br />
<li><p>Make a bucket public so that it is web accessible:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-public</code></p></li><br />
<li><p>Make the bucket private again:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-private</code></p></li><br />
<li><p>Example bucket policy:</p><br />
<p>You need to first create a policy json file:</p><br />
<pre>{<br />
&quot;Version&quot;: &quot;2012-10-17&quot;,<br />
&quot;Statement&quot;: [{<br />
&quot;Effect&quot;: &quot;Allow&quot;,<br />
&quot;Principal&quot;: {&quot;AWS&quot;: [<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/parsa7&quot;,<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/dilbar&quot;<br />
]},<br />
&quot;Action&quot;: [<br />
&quot;s3:ListBucket&quot;,<br />
&quot;s3:PutObject&quot;,<br />
&quot;s3:DeleteObject&quot;,<br />
&quot;s3:GetObject&quot;<br />
],<br />
&quot;Resource&quot;: [<br />
&quot;arn:aws:s3:::testbucket/*&quot;,<br />
&quot;arn:aws:s3:::testbucket&quot;<br />
]<br />
}]<br />
}<br />
</pre><br />
<p>This file allows you to set specific permissions for any number of users of that bucket.</p><br />
<p>You can even specify users from another tenant if there is a user from another project working with you.</p><br />
<p>Now that you have your policy file, you can implement that policy on the bucket:</p><br />
<p><code>s3cmd setpolicy testbucket.policy s3://testbucket</code></p><br />
<p>More extensive examples and actions can be found here: https://www.linode.com/docs/platform/object-storage/how-to-use-object-storage-acls-and-bucket-policies/</p></li></ul><br />
</translate><br />
[[Category:CC-Cloud]]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Arbutus_object_storage&diff=121174Arbutus object storage2022-10-25T20:48:07Z<p>Rptaylor: clarify and reorder</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
= Introduction = <!--T:1--><br />
<br />
<!--T:27--><br />
Object storage is a storage facility that is simpler than a normal hierarchical filesystem, but benefits by avoiding some performance bottlenecks.<br />
<br />
<!--T:28--><br />
An object is a fixed file in a flat namespace: you can create/upload an object as a whole, but cannot modify bytes within it. Objects are named as bucket:tag with no further nesting. Since bucket operations are basically whole-file, the provider can use a simpler internal representation. The flat namespace allows the provider to avoid metadata bottlenecks; it's basically a key-value store.<br />
<br />
<!--T:29--><br />
The best use of object storage is to store and export items which do not need hierarchical naming; are accessed mostly atomically and mostly read-only; and with simplified access-control rules.<br />
<br />
<!--T:2--><br />
All Arbutus projects are allocated a default 1TB of Object Store. If more is required, you can either apply for a RAS allocation or a RAC allocation. <br />
<br />
<!--T:3--><br />
We offer access to the Object Store via two different protocols: Swift or S3.<br />
<br />
<!--T:5--><br />
These protocols are very similar and in most situations you can use whichever you like. You don't have to commit to one, as buckets and objects created with Swift or S3 can be accessed using both protocols. There are a few key differences in the context of Arbutus Object Store.<br />
<br />
<!--T:6--><br />
Swift is given by default and is simpler since you do not have to manage credentials yourself. Access is governed using your Arbutus account. However, Swift does not replicate all the functionality of S3. The main use case here is when you want to manage your buckets using bucket policies you must use S3 as Swift does not support bucket policies. You can also create and manage your own keys using S3, which could be useful if you for example want to create a read-only user for a specific application. A full list of Swift/S3 compatibility can be found here: <br />
<br />
<!--T:7--><br />
https://docs.openstack.org/swift/latest/s3_compat.html<br />
<br />
= Accessing and managing Object Store = <!--T:8--><br />
<br />
<!--T:10--><br />
You can manage your object storage using the Object Store tab for your project at https://arbutus.cloud.computecanada.ca/. This interface refers to buckets as containers (not to be confused with containers based on namespace functionality of the Linux kernel). You can create containers (AKA buckets) in this interface, upload files, and create directories. Containers can also be created using S3-compatible CLI clients. Please note that if you create a new container as ''Public'', any object placed within this container can be freely accessed (read-only) by anyone on the internet simply by navigating to the following URL with your container and object names inserted in place:<br />
<br />
<!--T:11--><br />
https://object-arbutus.cloud.computecanada.ca/<YOUR CONTAINER NAME HERE>/<YOUR OBJECT NAME HERE><br />
<br />
<!--T:12--><br />
You can also use the OpenStack command line client.<br />
For instructions on how to install and operate the OpenStack command line clients, see [[OpenStack Command Line Clients]].<br />
<br />
<!--T:13--><br />
To generate your own S3 access ID and secret key for the S3 protocol, use the OpenStack command line client:<br />
<br />
<!--T:14--><br />
<code>openstack ec2 credentials create</code><br />
<br />
<!--T:15--><br />
The <tt>s3cmd</tt> tool which is available in Linux is the preferred way to access our S3 gateway; however there are [[Arbutus Object Storage Clients|other tools]] out there that will also work.<br />
<br />
<!--T:16--><br />
The users are responsible for operations inside the ''tenant''. As such, the buckets and management of those buckets are up to the user. <br />
<br />
=== General information === <!--T:17--><br />
<br />
<!--T:18--><br />
* Buckets are owned by the user who creates them, and no other user can manipulate them.<br />
* You can make a bucket accessible to the world, which then gives you a URL to share that will serve content from the bucket.<br />
* Bucket names must be unique across '''all''' users in the Object Store, so you may benefit by prefixing each bucket with your project name to maintain uniqueness. In other words, don't bother trying to create a bucket named ''test'', but ''def-myname-test'' is probably OK.<br />
* Bucket policies are managed via json files.<br />
<br />
= Connection details and s3cmd Configuration = <!--T:19--><br />
<br />
<!--T:20--><br />
Object storage is accessible via an HTTPS endpoint:<br />
<br />
<!--T:21--><br />
<code>object-arbutus.cloud.computecanada.ca:443</code><br />
<br />
<!--T:22--><br />
The following is an example of a minimal s3cmd configuration file. You will need these values, but are free to explore additional s3cmd configuration options to fit your use case. Note that in the example the keys are redacted and you will need to replace them with your provided key values:<br />
<br />
<!--T:23--><br />
<pre>[default]<br />
access_key = <redacted><br />
check_ssl_certificate = True<br />
check_ssl_hostname = True<br />
host_base = object-arbutus.cloud.computecanada.ca<br />
host_bucket = object-arbutus.cloud.computecanada.ca<br />
secret_key = <redacted><br />
use_https = True<br />
</pre><br />
<br />
<!--T:24--><br />
Using s3cmd's <code>--configure</code> feature is [[Arbutus_Object_Storage_Clients#Configuring_s3cmd | described here]].<br />
<br />
= Example operations on a bucket = <!--T:25--><br />
<br />
<!--T:26--><br />
<ul><br />
<li><p>Make a bucket public so that it is web accessible:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-public</code></p></li><br />
<li><p>Make the bucket private again:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-private</code></p></li><br />
<li><p>Example bucket policy:</p><br />
<p>You need to first create a policy json file:</p><br />
<pre>{<br />
&quot;Version&quot;: &quot;2012-10-17&quot;,<br />
&quot;Statement&quot;: [{<br />
&quot;Effect&quot;: &quot;Allow&quot;,<br />
&quot;Principal&quot;: {&quot;AWS&quot;: [<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/parsa7&quot;,<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/dilbar&quot;<br />
]},<br />
&quot;Action&quot;: [<br />
&quot;s3:ListBucket&quot;,<br />
&quot;s3:PutObject&quot;,<br />
&quot;s3:DeleteObject&quot;,<br />
&quot;s3:GetObject&quot;<br />
],<br />
&quot;Resource&quot;: [<br />
&quot;arn:aws:s3:::testbucket/*&quot;,<br />
&quot;arn:aws:s3:::testbucket&quot;<br />
]<br />
}]<br />
}<br />
</pre><br />
<p>This file allows you to set specific permissions for any number of users of that bucket.</p><br />
<p>You can even specify users from another tenant if there is a user from another project working with you.</p><br />
<p>Now that you have your policy file, you can implement that policy on the bucket:</p><br />
<p><code>s3cmd setpolicy testbucket.policy s3://testbucket</code></p><br />
<p>More extensive examples and actions can be found here: https://www.linode.com/docs/platform/object-storage/how-to-use-object-storage-acls-and-bucket-policies/</p></li></ul><br />
</translate><br />
[[Category:CC-Cloud]]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Arbutus_object_storage&diff=121173Arbutus object storage2022-10-25T20:31:10Z<p>Rptaylor: fix error</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
= Introduction = <!--T:1--><br />
<br />
<!--T:27--><br />
Object storage is a storage facility that is simpler than a normal hierarchical filesystem, but benefits by avoiding some performance bottlenecks.<br />
<br />
<!--T:28--><br />
An object is a fixed file in a flat namespace: you can create/upload an object as a whole, but cannot modify bytes within it. Objects are named as bucket:tag with no further nesting. Since bucket operations are basically whole-file, the provider can use a simpler internal representation. The flat namespace allows the provider to avoid metadata bottlenecks; it's basically a key-value store.<br />
<br />
<!--T:29--><br />
The best use of object storage is to store and export items which do not need hierarchical naming; are accessed mostly atomically and mostly read-only; and with simplified access-control rules.<br />
<br />
<!--T:2--><br />
All Arbutus projects are allocated a default 1TB of Object Store. If more is required, you can either apply for a RAS allocation or a RAC allocation. <br />
<br />
<!--T:3--><br />
We offer access to the Object Store via two different protocols: Swift or S3.<br />
<br />
<!--T:5--><br />
These protocols are very similar and in most situations you can use whichever you like. You don't have to commit to one, as buckets and objects created with Swift or S3 can be accessed using both protocols. There are a few key differences in the context of Arbutus Object Store.<br />
<br />
<!--T:6--><br />
Swift is given by default and is simpler since you do not have to manage credentials yourself. Access is governed using your Arbutus account. However, Swift does not replicate all the functionality of S3. The main use case here is when you want to manage your buckets using bucket policies you must use S3 as Swift does not support bucket policies. You can also create and manage your own keys using S3, which could be useful if you for example want to create a read-only user for a specific application. A full list of Swift/S3 compatibility can be found here: <br />
<br />
<!--T:7--><br />
https://docs.openstack.org/swift/latest/s3_compat.html<br />
<br />
= Accessing and managing Object Store = <!--T:8--><br />
<br />
<!--T:9--><br />
Users can generate their own keys using the openstack command line tool. Users can create their own containers through the OpenStack Dashboard or by using S3 or Swift compatible clients such as s3cmd.<br />
<br />
<!--T:10--><br />
You can interact with your Object Store using the Object Store tab for your project at https://arbutus.cloud.computecanada.ca/. This interface refers to buckets as containers. In this context the two terms are interchangeable. Please note that if you create a new container as ''Public'', any object placed within this container can be accessed (read-only) by anyone freely on the internet simply by navigating to the following URL with your container and object names inserted in place:<br />
<br />
<!--T:11--><br />
https://object-arbutus.cloud.computecanada.ca/<YOUR CONTAINER NAME HERE>/<YOUR OBJECT NAME HERE><br />
<br />
<!--T:12--><br />
You can also use the OpenStack command line client.<br />
For instructions on how to install and operate the OpenStack command line clients, see [[OpenStack Command Line Clients]].<br />
<br />
<!--T:13--><br />
If you wish to use the S3 protocol, you can generate your own S3 access and secret keys using the OpenStack command line client:<br />
<br />
<!--T:14--><br />
<code>openstack ec2 credentials create</code><br />
<br />
<!--T:15--><br />
The <tt>s3cmd</tt> tool which is available in Linux is the preferred way to access our S3 gateway; however there are [[Arbutus Object Storage Clients|other tools]] out there that will also work.<br />
<br />
<!--T:16--><br />
The users are responsible for operations inside the ''tenant''. As such, the buckets and management of those buckets are up to the user. <br />
<br />
=== General information === <!--T:17--><br />
<br />
<!--T:18--><br />
* Buckets are owned by the user who creates them, and no other user can manipulate them.<br />
* You can make a bucket accessible to the world, which then gives you a URL to share that will serve content from the bucket.<br />
* Bucket names must be unique across '''all''' users in the Object Store, so you may benefit by prefixing each bucket with your project name to maintain uniqueness. In other words, don't bother trying to create a bucket named ''test'', but ''def-myname-test'' is probably OK.<br />
* Bucket policies are managed via json files.<br />
<br />
= Connection details and s3cmd Configuration = <!--T:19--><br />
<br />
<!--T:20--><br />
Object storage is accessible via an HTTPS endpoint:<br />
<br />
<!--T:21--><br />
<code>object-arbutus.cloud.computecanada.ca:443</code><br />
<br />
<!--T:22--><br />
The following is an example of a minimal s3cmd configuration file. You will need these values, but are free to explore additional s3cmd configuration options to fit your use case. Note that in the example the keys are redacted and you will need to replace them with your provided key values:<br />
<br />
<!--T:23--><br />
<pre>[default]<br />
access_key = <redacted><br />
check_ssl_certificate = True<br />
check_ssl_hostname = True<br />
host_base = object-arbutus.cloud.computecanada.ca<br />
host_bucket = object-arbutus.cloud.computecanada.ca<br />
secret_key = <redacted><br />
use_https = True<br />
</pre><br />
<br />
<!--T:24--><br />
Using s3cmd's <code>--configure</code> feature is [[Arbutus_Object_Storage_Clients#Configuring_s3cmd | described here]].<br />
<br />
= Example operations on a bucket = <!--T:25--><br />
<br />
<!--T:26--><br />
<ul><br />
<li><p>Make a bucket public so that it is web accessible:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-public</code></p></li><br />
<li><p>Make the bucket private again:</p><br />
<p><code>s3cmd setacl s3://testbucket --acl-private</code></p></li><br />
<li><p>Example bucket policy:</p><br />
<p>You need to first create a policy json file:</p><br />
<pre>{<br />
&quot;Version&quot;: &quot;2012-10-17&quot;,<br />
&quot;Statement&quot;: [{<br />
&quot;Effect&quot;: &quot;Allow&quot;,<br />
&quot;Principal&quot;: {&quot;AWS&quot;: [<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/parsa7&quot;,<br />
&quot;arn:aws:iam::rrg_cjhuofw:user/dilbar&quot;<br />
]},<br />
&quot;Action&quot;: [<br />
&quot;s3:ListBucket&quot;,<br />
&quot;s3:PutObject&quot;,<br />
&quot;s3:DeleteObject&quot;,<br />
&quot;s3:GetObject&quot;<br />
],<br />
&quot;Resource&quot;: [<br />
&quot;arn:aws:s3:::testbucket/*&quot;,<br />
&quot;arn:aws:s3:::testbucket&quot;<br />
]<br />
}]<br />
}<br />
</pre><br />
<p>This file allows you to set specific permissions for any number of users of that bucket.</p><br />
<p>You can even specify users from another tenant if there is a user from another project working with you.</p><br />
<p>Now that you have your policy file, you can implement that policy on the bucket:</p><br />
<p><code>s3cmd setpolicy testbucket.policy s3://testbucket</code></p><br />
<p>More extensive examples and actions can be found here: https://www.linode.com/docs/platform/object-storage/how-to-use-object-storage-acls-and-bucket-policies/</p></li></ul><br />
</translate><br />
[[Category:CC-Cloud]]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Arbutus_object_storage_clients&diff=121172Arbutus object storage clients2022-10-25T20:20:58Z<p>Rptaylor: typo</p>
<hr />
<div><languages /><br />
<translate><br />
<br />
<!--T:1--><br />
For information on obtaining Arbutus Object Storage, please see [[Arbutus_Object_Storage|this page]]. Below, we describe how to configure and use three common object storage clients:<br />
# s3cmd<br />
# WinSCP<br />
# awscli<br />
<br />
<!--T:2--><br />
It is important to note that Arbutus' Object Storage solution does not use Amazon's [https://documentation.help/s3-dg-20060301/VirtualHosting.html S3 Virtual Hosting] (i.e. DNS-based bucket) approach which these clients assume by default. They need to be configured not to use that approach as described below.<br />
<br />
== s3cmd == <!--T:3--><br />
=== Installing s3cmd ===<br />
Depending on your Linux distribution, the <code>s3cmd</code> command can be installed using the appropriate <code>yum</code> (RHEL, CentOS) or <code>apt</code> (Debian, Ubuntu) command:<br />
<br />
<!--T:4--><br />
<code>$ sudo yum install s3cmd</code><br/><br />
<code>$ sudo apt install s3cmd </code><br />
<br />
=== Configuring s3cmd === <!--T:5--><br />
To configure the <code>s3cmd</code> tool use the command:</br><br />
<code>$ s3cmd --configure</code><br />
<br />
<!--T:6--><br />
And make the following configurations with the keys provided or created with the <code>openstack ec2 credentials create</code> command:<br />
<pre><br />
Enter new values or accept defaults in brackets with Enter.<br />
Refer to user manual for detailed description of all options.<br />
<br />
<!--T:7--><br />
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.<br />
Access Key []: 20_DIGIT_ACCESS_KEY<br />
Secret Key []: 40_DIGIT_SECRET_KEY<br />
Default Region [US]:<br />
<br />
<!--T:8--><br />
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.<br />
S3 Endpoint []: object-arbutus.cloud.computecanada.ca<br />
<br />
<!--T:9--><br />
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used<br />
if the target S3 system supports dns based buckets.<br />
DNS-style bucket+hostname:port template for accessing a bucket []: object-arbutus.cloud.computecanada.ca<br />
<br />
<!--T:10--><br />
Encryption password is used to protect your files from reading<br />
by unauthorized persons while in transfer to S3<br />
Encryption password []: PASSWORD<br />
Path to GPG program []: /usr/bin/gpg<br />
<br />
<!--T:11--><br />
When using secure HTTPS protocol all communication with Amazon S3<br />
servers is protected from 3rd party eavesdropping. This method is<br />
slower than plain HTTP, and can only be proxied with Python 2.7 or newer<br />
Use HTTPS protocol []: Yes<br />
<br />
<!--T:12--><br />
On some networks all internet access must go through a HTTP proxy.<br />
Try setting it here if you can't connect to S3 directly<br />
HTTP Proxy server name:<br />
</pre><br />
<br />
=== Create buckets === <!--T:13--><br />
The next task is to make a bucket. Buckets contain files. Bucket names must be unique across the Arbutus object storage solution. Therefore, you will need to create a uniquely named bucket which will not conflict with other users. For example, buckets <tt>s3://test/</tt> and <tt>s3://data/</tt> are likely already taken. Consider creating buckets reflective of your project, for example <tt>s3://def-test-bucket1</tt> or <tt>s3://atlas_project_bucket</tt>. Valid bucket names may only use the upper case characters, lower case characters, digits, periods, hyphens, and underscores (i.e. A-Z, a-z, 0-9, ., -, and _ ).<br />
<br />
<!--T:14--><br />
To create a bucket, use the tool's <code>mb</code> (make bucket) command:<br />
<br />
<!--T:15--><br />
<code>$ s3cmd mb s3://BUCKET_NAME/</code><br />
<br />
<!--T:16--><br />
To see the status of a bucket, use the <code>info</code> command:<br />
<br />
<!--T:17--><br />
<code>$ s3cmd info s3://BUCKET_NAME/</code><br />
<br />
<!--T:18--><br />
The output will look something like this:<br />
<br />
<!--T:19--><br />
<pre><br />
s3://BUCKET_NAME/ (bucket):<br />
Location: default<br />
Payer: BucketOwner<br />
Expiration Rule: none<br />
Policy: none<br />
CORS: none<br />
ACL: *anon*: READ<br />
ACL: USER: FULL_CONTROL<br />
URL: http://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/<br />
</pre><br />
<br />
=== Upload files === <!--T:20--><br />
To upload a file to the bucket, use the <code>put</code> command similar to this:<br />
<br />
<!--T:21--><br />
<code>$ s3cmd put --guess-mime-type FILE_NAME.dat s3://BUCKET_NAME/FILE_NAME.dat</code><br />
<br />
<!--T:22--><br />
Where the bucket name and the file name are specified. Multipurpose Internet Mail Extensions (MIME) is a mechanism for handling files based on their type. The <code>--guess-mime-type</code> command parameter will guess the MIME type based on the file extension. The default MIME type is <code>binary/octet-stream</code>.<br />
<br />
=== Delete File === <!--T:23--><br />
To delete a file from the bucket, use the <code>rm</code> command similar to this:<br/><br />
<code>$ s3cmd rm s3://BUCKET_NAME/FILE_NAME.dat</code><br />
<br />
=== Access Control Lists (ACLs) and Policies === <!--T:24--><br />
Buckets can have ACLs and policies which govern who can access what resources in the object store. These features are quite sophisticated. Here are two simple examples of using ACLs using the tool's <code>setacl</code> command.<br />
<br />
<!--T:25--><br />
<code>$ s3cmd setacl --acl-public -r s3://BUCKET_NAME/</code><br />
<br />
<!--T:26--><br />
The result of this command is that the public can access the bucket and recursively (-r) every file in the bucket. Files can be accessed via URLs such as<br/><br />
<code><nowiki>https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat</nowiki></code><br />
<br />
<!--T:27--><br />
The second ACL example limits access to the bucket to only the owner:<br />
<br />
<!--T:28--><br />
<code>$ s3cmd setacl --acl-private s3://BUCKET_NAME/</code><br />
<br />
<!--T:29--><br />
Other more sophisticated examples can be found in the s3cmd [https://www.s3express.com/help/help.html help site] or s3cmd(1) man page.<br />
<br />
== WinSCP == <!--T:30--><br />
<br />
=== Installing WinSCP === <!--T:31--><br />
WinSCP can be installed from https://winscp.net/.<br />
<br />
=== Configuring WinSCP === <!--T:32--><br />
Under "New Session", make the following configurations:<br />
<ul><br />
<li>File protocol: Amazon S3</li><br />
<li>Host name: object-arbutus.cloud.computecanada.ca</li><br />
<li>Port number: 443</li><br />
<li>Access key ID: 20_DIGIT_ACCESS_KEY provided by the Arbutus team</li><br />
</ul><br />
and "Save" these settings as shown below<br />
<br />
<!--T:33--><br />
[[File:WinSCP Configuration.png|600px|thumb|center|WinSCP configuration screen]]<br />
<br />
<!--T:34--><br />
Next, click on the "Edit" button and then click on "Advanced..." and navigate to "Environment" to "S3" to "Protocol options" to "URL style:" which <b>must</b> changed from "Virtual Host" to "Path" as shown below:<br />
<br />
<!--T:35--><br />
[[File:WinSCP Path Configuration.png|600px|thumb|center|WinSCP Path Configuration]]<br />
<br />
<!--T:36--><br />
This "Path" setting is important, otherwise WinSCP will not work and you will see hostname resolution errors, like this:<br />
[[File:WinSCP resolve error.png|400px|thumb|center|WinSCP resolve error]]<br />
<br />
=== Using WinSCP === <!--T:37--><br />
Click on the "Login" button and use the WinSCP GUI to create buckets and to transfer files:<br />
<br />
<!--T:38--><br />
[[File:WinSCP transfers.png|800px|thumb|center|WinSCP file transfer screen]]<br />
<br />
=== Access Control Lists (ACLs) and Policies === <!--T:41--><br />
Right-clicking on a file will allow you to set a file's ACL, like this:<br />
[[File:WinSCP ACL.png|400px|thumb|center|WinSCP ACL screen]]<br />
<br />
== AWS CLI == <!--T:43--><br />
<br />
<!--T:44--><br />
The <code>awscli</code> client also works with the Object Store service with better support for large (>5GB) files and the helpful <code>sync</code> command. However, not all features have not been tested.<br />
<br />
=== Installing awscli === <!--T:45--><br />
<br />
<!--T:46--><br />
<pre><br />
pip install awscli awscli-plugin-endpoint<br />
</pre><br />
<br />
=== Configuring awscli === <!--T:47--><br />
<br />
<!--T:48--><br />
Generate an access key ID & secret key<br />
<br />
<!--T:49--><br />
<pre><br />
openstack ec2 credentials create<br />
</pre><br />
<br />
<!--T:50--><br />
Edit or create <code>~/.aws/credentials</code> and add the credentials generated above<br />
<br />
<!--T:51--><br />
<pre><br />
[default]<br />
aws_access_key_id = <access_key><br />
aws_secret_access_key = <secret_key><br />
</pre><br />
<br />
<!--T:52--><br />
Edit <code>~/.aws/config</code> and add the following configuration<br />
<br />
<!--T:53--><br />
<pre><br />
[plugins]<br />
endpoint = awscli_plugin_endpoint<br />
<br />
<!--T:54--><br />
[profile default]<br />
s3 =<br />
endpoint_url = https://object-arbutus.cloud.computecanada.ca<br />
signature_version = s3v4<br />
s3api =<br />
endpoint_url = https://object-arbutus.cloud.computecanada.ca<br />
</pre><br />
<br />
=== Using awscli === <!--T:55--><br />
<br />
<!--T:56--><br />
<pre><br />
export AWS_PROFILE=default<br />
aws s3 ls <container-name><br />
aws s3 sync local_directory s3://container-name/prefix<br />
</pre><br />
<br />
<!--T:57--><br />
More examples can be found here: https://docs.ovh.com/us/en/storage/getting_started_with_the_swift_S3_API/<br />
<br />
<!--T:42--><br />
[[Category:CC-Cloud]]<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=119371CVMFS2022-09-13T18:07:13Z<p>Rptaylor: add 2022 CernVM workshop presentation</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes the CERN Virtual Machine File System (CVMFS). We use CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access content, and to the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. It is designed to deliver software in a fast, scalable and reliable fashion, and is now also used to distribute data. The scale of usage across dozens of projects involves ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of the software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository;<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0;<br />
** HTTP proxy servers cache requests from clients to stratum 1 servers;<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 2021-09-27 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimized scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
* [https://indico.cern.ch/event/1079490/contributions/4939532/ 2022-09-13 CVMFS in Canadian Advanced Research Computing] 2022 CernVM Workshop<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=118006CVMFS2022-07-20T18:55:38Z<p>Rptaylor: split sentence</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes the CERN Virtual Machine File System (CVMFS). We use CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access content, and to the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. It is designed to deliver software in a fast, scalable and reliable fashion, and is now also used to distribute data. The scale of usage across dozens of projects involves ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of the software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository;<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0;<br />
** HTTP proxy servers cache requests from clients to stratum 1 servers;<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimized scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=117465Accessing CVMFS2022-07-12T23:30:15Z<p>Rptaylor: wording</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
Refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
For standard client configuration, refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
The <tt>soft.computecanada.ca</tt> repository is available out-of-the-box via the default configuration, so no additional steps are required to access it (though you may wish to include it in <tt>CVMFS_REPOSITORIES</tt> in your client configuration).<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=117464Accessing CVMFS2022-07-12T23:28:21Z<p>Rptaylor: remove CC config repo warning, no longer applicable for external use</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
Refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
For standard client configuration, refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
The <tt>soft.computecanada.ca</tt> repository is available by default so no additional configuration is required to access it (though you may wish to include it in <tt>CVMFS_REPOSITORIES</tt> in your client configuration).<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=117463Accessing CVMFS2022-07-12T23:19:59Z<p>Rptaylor: remove unnecessary config for external use</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
Refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
For standard client configuration, refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
The <tt>soft.computecanada.ca</tt> repository is available by default so no additional configuration is required to access it (though you may wish to include it in <tt>CVMFS_REPOSITORIES</tt> in your client configuration).<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=117462Accessing CVMFS2022-07-12T23:15:01Z<p>Rptaylor: remove unnecessary configuration instructions</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
Refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
Please note that this configuration should not be used for a production environment. If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
<!--T:71--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=117028Accessing CVMFS2022-07-06T20:07:20Z<p>Rptaylor: name agnostic</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8 (including clones such as AlmaLinux 8)<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
Please note that this configuration should not be used for a production environment. If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
<!--T:71--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On our systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available externally, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=117027CVMFS2022-07-06T19:44:34Z<p>Rptaylor: reword</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes the CERN Virtual Machine File System (CVMFS). We use CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access content, and to the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, it is now also used to distribute data, and the scale of usage across dozens of projects involves ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of the software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository;<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0;<br />
** HTTP proxy servers cache requests from clients to stratum 1 servers;<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimized scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=116221CVMFS2022-06-02T19:37:39Z<p>Rptaylor: name agnosticism</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). We use CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world, and distribution of data and other content in addition to software. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimised scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=116220CVMFS2022-06-02T19:36:01Z<p>Rptaylor: generalize to not just software</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only content distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world, and distribution of data and other content in addition to software. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimised scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=115839Accessing CVMFS2022-05-17T19:18:10Z<p>Rptaylor: EL8 clones</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8 (including clones such as AlmaLinux 8)<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
Please note that this configuration should not be used for a production environment. If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
<!--T:71--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=115669Accessing CVMFS2022-05-13T01:57:21Z<p>Rptaylor: reorder sentences</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
Please note that this configuration should not be used for a production environment. If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
<!--T:71--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114738Accessing CVMFS2022-04-25T21:17:59Z<p>Rptaylor: fix conjugation error</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provide repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance, disk usage, or production-readiness:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
<!--T:71--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114377Accessing CVMFS2022-04-20T01:11:41Z<p>Rptaylor: explicitly non-production</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provides repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance, disk usage, or production-readiness:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114376Accessing CVMFS2022-04-20T01:03:10Z<p>Rptaylor: fix formatting quirk</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provides repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
<nowiki><br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers</nowiki><br />
<br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114375Accessing CVMFS2022-04-20T00:55:46Z<p>Rptaylor: CVMFS_HTTP_PROXY is required now for the standard setup</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
We provides repositories of software and data via a file system called the [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On our systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using our software environment, please refer to wiki pages [[Available software]], [[Using modules]], [[Python]], [[R]] and [[Installing software in your home directory]].<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on ours.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Note to staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding our software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We would appreciate that you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by our CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to our CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Our staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. Our CVMFS repositories are provided '''without any warranty'''. We reserve the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html other option].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done '''before''' CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated file system so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that file system '''before''' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages, you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128" # example definition of proxy servers<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* <code>CVMFS_HTTP_PROXY</code> defines the proxy servers to use. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114064Accessing CVMFS2022-04-12T20:46:36Z<p>Rptaylor: update group link</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/u/0/a/gw.alliancecan.ca/g/cvmfs-announce/about here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=114063Accessing CVMFS2022-04-12T20:45:41Z<p>Rptaylor: change to DRAC mailing list</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@gw.alliancecan.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@gw.alliancecan.ca cvmfs-announce+subscribe@gw.alliancecan.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=113167Accessing CVMFS2022-03-21T20:58:59Z<p>Rptaylor: change link</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers at your site to improve performance and bandwidth usage, especially if you have a large number of clients. Refer to [https://cvmfs.readthedocs.io/en/stable/cpt-squid.html setting up a local squid proxy].<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=112414Accessing CVMFS2022-03-01T01:30:27Z<p>Rptaylor: /* Use of software environment by system administrators */ clarify, simplify</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
If you perform privileged system operations, or operations related to CVMFS, [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that your session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may interfere with CVMFS operations, or hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully without encountering a circular dependency.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=112263CVMFS2022-02-22T23:41:34Z<p>Rptaylor: </p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimised scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community and with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=112262CVMFS2022-02-22T23:40:59Z<p>Rptaylor: update PEARC link</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://dl.acm.org/doi/10.1145/3332186.3332210 2019-07-28 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] Proceedings of the Practice and Experience in Advanced Research Computing '19<br />
** PDF also available [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf here]<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 2022-02-16 EESSI: A cross-platform ready-to-use optimised scientific software stack] Journal of Software: Practice and Experience, 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=112261CVMFS2022-02-22T23:35:54Z<p>Rptaylor: add EESSI paper</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf 2019-08-01 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] PEARC 2019<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS).<br />
* [https://onlinelibrary.wiley.com/doi/10.1002/spe.3075 EESSI: A cross-platform ready-to-use optimised scientific software stack] Journal of Software: Practice and Experience, Feb 16 2022<br />
** Illustrates an extension to the Compute Canada approach to distributing software, for a broader research community with wider hardware support.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=109701Accessing CVMFS2022-01-10T23:35:03Z<p>Rptaylor: update link to github</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://github.com/cvmfs-contrib/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=109262Accessing CVMFS2022-01-04T21:34:52Z<p>Rptaylor: fix terms of use link</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://ccdb.computecanada.ca/agreements/user_aup_2021/user_display terms of use] or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:67--><br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Using_ipv6_in_cloud&diff=105367Using ipv6 in cloud2021-10-29T17:34:08Z<p>Rptaylor: fix typo</p>
<hr />
<div><languages /><br />
<translate><br />
<!--T:2--><br />
===== IPv6 in Arbutus Cloud =====<br />
IPv6 Link-Local (LLA) and Global Unicast (GUA) addresses are generally available within the Arbutus Cloud environment.<br />
GUA can be set up via a separate interface, which in turn also handles only the IPv6 traffic.<br />
Addresses are being setup using ''Stateless Address Auto Configuration'' (SLAAC), which automatically sets up the IP on the VM interface. By default, the security group rules will allow all outbound traffic from the VM via the IPv6 GUA, but no traffic that originates from outside the VM will be allowed until specific security group rules have been defined. This is the same behaviour as IPv4.<br />
<br />
===== Example configuration =====<br />
Login into the dashboard and go to the Instances menu, click on ''Attach Interface'', which will open a dialog.<br />
Use IPv6-GUA (2607:f8f0:c11:7004::/64) from the network menu and click on attach.<br />
<br />
<gallery widths=300px heights=200px><br />
Instancemenu.png|Dashboard showing Instances<br />
Interface menu attach.png.png|Drop down menu to attach an interface<br />
netlist.png|Available Networks Menu<br />
show_attached.png| Show the second IPv6 interface<br />
</gallery><br />
<br />
The shown IPv6 address is now available and can be used until the interface is detached. Every time the interface is detached, <br />
the GUA is released and put back into the pool and thus, can be used by anyone else. Rebuilding or restarting the VM however, will not<br />
release the GUA.<br />
<br />
Access from any IPv6 GUA can be granted via the ''Security Groups'' in Openstack; the only difference is the CIDR which automatically detects the address type.<br />
<br />
[[File:secpol.png|thumb|left|Allow icmp from any IPv6 GUA]]<br />
<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=104665CVMFS2021-10-12T19:56:10Z<p>Rptaylor: add blog post</p>
<hr />
<div><languages /><br />
[[Category:CVMFS]]<br />
<translate><br />
<!--T:1--><br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction == <!--T:2--><br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features === <!--T:3--><br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material == <!--T:4--><br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf 2019-08-01 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] PEARC 2019<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]<br />
* [https://towardsdatascience.com/unlimited-scientific-libraries-and-applications-in-kubernetes-instantly-b69b192ec5e5 Unlimited scientific libraries and applications in Kubernetes, instantly!] Towards Data Science article, Sep 27 2021<br />
** Illustrates the Compute Canada approach to distributing research applications for users (although the deployment described in the article is only used for a single demo cluster, and uses CephFS instead of CVMFS)<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=101890Accessing CVMFS2021-07-13T19:57:11Z<p>Rptaylor: specify uninstall would be needed</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, uninstall this package and follow the standard setup instructions instead.<br />
</tab><br />
<tab name="Standard setup"><br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=101889Accessing CVMFS2021-07-13T19:56:12Z<p>Rptaylor: add simple "quickstart" config option</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<tabs><br />
<tab name="Simple setup"><br />
On RPM-based systems, if you want an easy way to get started and are not concerned with performance or disk usage, just do:<br />
{{Command|sudo yum install cvmfs-quickstart-computecanada}}<br />
If you encounter any issues, follow the standard setup instructions.<br />
</tab><br />
<tab name="Standard setup"><br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
</tab><br />
</tabs><br />
<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=101881Accessing CVMFS2021-07-09T23:45:00Z<p>Rptaylor: </p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site to improve performance and bandwidth usage, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:66--><br />
You can create this file with the command:<br />
{{Command|sudo bash -c 'cat > /etc/cvmfs/default.local <<EOF<br />
CVMFS_REPOSITORIES{{=}}"cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT{{=}}"yes"<br />
CVMFS_QUOTA_LIMIT{{=}}10000<br />
EOF'}}<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=101880Accessing CVMFS2021-07-09T23:29:56Z<p>Rptaylor: more noticeable note for staff, to see important internal docs as well</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems.<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
{{Note|Compute Canada staff: see the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].|reminder}}<br />
<br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
<br />
<!--T:66--><br />
You can create this file with the command:<br />
{{Command|sudo bash -c 'cat > /etc/cvmfs/default.local <<EOF<br />
CVMFS_REPOSITORIES{{=}}"cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT{{=}}"yes"<br />
CVMFS_QUOTA_LIMIT{{=}}10000<br />
EOF'}}<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to under 85% the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=101620Accessing CVMFS2021-07-05T18:33:24Z<p>Rptaylor: add some guidelines for cache size, encourage reading the doc and discourage copy pasting</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems. (If you are a Compute Canada staff member, refer to the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].)<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository:<br />
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb<br />
sudo dpkg -i cvmfs-release-latest_all.deb<br />
rm -f cvmfs-release-latest_all.deb<br />
sudo apt-get update<br />
* Install the CVMFS client from that repository:<br />
sudo apt-get install cvmfs cvmfs-config-default<br />
* Apply the initial client setup:<br />
sudo cvmfs_config setup<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
sudo dpkg -i cvmfs-config-computecanada-latest.all.deb<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=10000 # see below and adjust as needed<br />
CVMFS_HTTP_PROXY="DIRECT"<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to (at most) 15% less than the size of your local cache filesystem. It should be at least 50 GB for compute nodes in heavy use, while ~ 5-10 GB may suffice for light use.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running the bash script <tt>/cvmfs/soft.computecanada.ca/config/profile/bash.sh</tt>. <br />
This will load some default modules. If you want to mimic a specific cluster exactly, simply define the environment variable <tt>CC_CLUSTER</tt> to one of <tt>beluga</tt>, <tt>cedar</tt> or <tt>graham</tt> before using the script, for example: <br />
{{Command|export CC_CLUSTER{{=}}beluga}}<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Cloud_resources&diff=100078Cloud resources2021-05-10T21:48:36Z<p>Rptaylor: /* Arbutus cloud (arbutus.cloud.computecanada.ca) */ fix broken link</p>
<hr />
<div><languages/><br />
<translate><br />
''Parent page: [[Cloud]]''<br />
==Hardware== <!--T:1--><br />
===Arbutus cloud ([https://arbutus.cloud.computecanada.ca arbutus.cloud.computecanada.ca])===<br />
{| class="wikitable sortable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 156 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || N/A || 6,240 || 12,480<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 1024 ||2 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1] || 1 x 25GbE || N/A || 320 || 6,400<br />
|-<br />
| 26 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Gold 6248] || 384 ||2 x 1.6TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0 RAID0] || 1 x 25GbE || 4 x [https://www.nvidia.com/en-us/data-center/v100/ V100 32GB] || 1,040 || 2,080<br />
|-<br />
| 32 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 1 x 10GbE || N/A || 1,024 || 2,048<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 768 ||6 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#Nested_RAID RAID10] || 2 x 10GbE || N/A || 128 || 2,560<br />
|-<br />
| 8 || 2 x [https://ark.intel.com/products/120492/Intel-Xeon-Gold-6130-Processor-22M-Cache-2_10-GHz Gold 6130] || 256 ||4 x 1.92TB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 256 || 512<br />
|-<br />
| 240 || 2 x [https://ark.intel.com/products/91754/Intel-Xeon-Processor-E5-2680-v4-35M-Cache-2_40-GHz E5-2680 v4] || 256 ||4 x 900GB 10k SAS in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 RAID5] || 1 x 10GbE || N/A || 6,720 || 13,440<br />
|-<br />
| 8 || 2 x E5-2680 v4 || 512 || 4 x 900GB 10k SAS in RAID5 || 2 x 10GbE || N/A || 224 || 4,480<br />
|-<br />
| 2 || 2 x E5-2680 v4 || 128 || 4 x 900GB 10k SAS in RAID5 || 1 x 10GbE || 2 x [https://www.nvidia.com/en-us/data-center/tesla-k80/ Tesla K80] || 56 || 112<br />
|}<br />
Location: University of Victoria<br/><br />
Total CPUs: 16,008 (484 nodes)<br/><br />
Total vCPUs: 44,112<br/><br />
Total GPUs: 108 (28 nodes)<br/><br />
Total RAM: 157,184 GB<br/><br />
5.3 PB of Volume and Snapshot [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
12 PB of Object/Shared Filesystem [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage.<br /><br />
<br />
===Cedar cloud ([http://cedar.cloud.computecanada.ca cedar.cloud.computecanada.ca])=== <!--T:3--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 28 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 896 || 1,792<br />
|-<br />
| 4 || 2 x [https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e5-2683-v4-40m-cache-2-10-ghz.html E5-2683 v4] || 256 || 2 x 480GB SSD in [https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_1 RAID1]|| 1 x 10GbE || N/A || 128 || 2,560<br />
|}<br />
Location: Simon Fraser University<br/><br />
Total CPUs: 1,024<br/><br />
Total vCPUs: 4,352<br/><br />
Total RAM: 7,680 GB<br/><br />
500 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Graham cloud ([https://graham.cloud.computecanada.ca graham.cloud.computecanada.ca])=== <!--T:8--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUS<br />
|-<br />
| 6 || 2 x E5-2683 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 192 || <br />
|-<br />
| 2 || 2 x E5-2683 v4 || 512 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 64 || <br />
|-<br />
| 8 || 2 x E5-2637 v4 || 128 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 8 || 2 x Xeon(R) Gold 6130 CPU || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 256 ||<br />
|-<br />
| 3 || 2 x E5-2640 v4 || 256 || 2x 500GB SSD in RAID0 || 1 x 10GbE || N/A || 120 ||<br />
|-<br />
| 12 || 2 x Xeon(R) Gold 6248 CPU || 768 || 2x 1TB SSD in RAID0 || 1 x 10GbE || N/A || 480 || <br />
|-<br />
|}<br />
Location: University of Waterloo<br/><br />
Total CPUs: 1,368<br/><br />
Total vCPUs: <br/><br />
Total RAM: 15,616 GB<br/><br />
84 TB of persistent [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
===Béluga cloud ([https://beluga.cloud.computecanada.ca beluga.cloud.computecanada.ca])=== <!--T:12--><br />
{| class="wikitable"<br />
|-<br />
! Node count !! CPU !! Memory (GB) !! Local (ephemeral) storage !! Interconnect !! GPU !! Total CPUs !! Total vCPUs<br />
|-<br />
| 96 || 2 x Intel Xeon Gold 5218 || 256 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 3,072 || 6,144<br />
|-<br />
| 16 || 2 x Intel Xeon Gold 5218 || 768 || N/A, ephemeral storage in ceph || 1 x 25GbE || N/A || 512 || 10,240<br />
|-<br />
|}<br />
Location: École de Technologie Supérieure<br/><br />
Total CPUs: 3,584<br/><br />
Total vCPUs: 16,384<br/><br />
Total RAM: 36,864 GiB<br/><br />
200 TiB of replicated persistent SSD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
1.7 PiB of erasure coded persistent HDD [https://en.wikipedia.org/wiki/Ceph_(software) Ceph] storage. <br/><br />
<br />
==Software== <!--T:2--><br />
Compute Canada cloud OpenStack platform versions as of March 11, 2021<br/><br />
* Arbutus: Train<br />
* Cedar: Train<br />
* Graham: Train<br />
* Béluga: Victoria<br />
<br />
<br />
<br />
<!--T:4--><br />
See the [http://releases.openstack.org/ OpenStack releases] for a list of all OpenStack versions.<br />
<br />
==Images== <!--T:9--><br />
Below are the images provided by Compute Canada staff on the Compute Canada Clouds. New images will be added periodically as new releases and updates become available. As releases have an end of life (EOL) after which support and updates are no longer provided, we encourage you to migrate systems and platforms to newer releases in order to continue receiving patches and security updates. The EOL dates listed in the table are the dates at which these images will be removed from the Compute Canada clouds.<br />
<br />
<!--T:10--><br />
For more details about using images see [[OpenStack#Working_with_images|working with images]].<br />
<br />
<!--T:11--><br />
{| class="wikitable sortable" style="width:85%"<br />
! style="width: 15%" align="center" | Name<br />
! style="width: 25%" align="center" | Sites<br />
! style="width: 15%" align="center" | End Of Life<br />
|-<br />
| align="center" | CentOS-8-x64-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''Dec 31, 2021''' <ref name="accelerated">Accelerated end-of-life for CentOS 8 [https://blog.centos.org/2020/12/future-is-centos-stream/ announced Dec 2020].</ref><br />
|-<br />
| align="center" | CentOS-7-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2024<br />
|-<br />
| align="center" | CentOS-6-x64-2019-07<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020''' <ref name="removed">These images have been removed from Compute Canada clouds.</ref><br />
|-<br />
| align="center" | Debian-10.6.2-Buster-x64-2020-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2023<br />
|-<br />
| align="center" | Debian-10.2.0-Buster-2019-11<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | March 31, 2022<br />
|-<br />
| align="center" | Debian-9.11.6-Stretch-2019-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Fedora-33-1.2-x64-2020-10<br />
| align="center" | Arbutus<br />
| align="center" | TBD Fedora 33 will be maintained until four weeks after the release of Fedora 35<br />
|-<br />
| align="center" | Fedora-32-1.6-x64-2020-04<br />
| align="center" | Arbutus<br />
| align="center" | May 18, 2021<br />
|-<br />
| align="center" | Fedora-31-1.9-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | November 24, 2020<br />
|-<br />
| align="center" | Fedora-30-1.2-x86-2019-07<br />
| align="center" | Arbutus, East<br />
| align="center" | May 26, 2020<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-minimal-x64-2020-12<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-20.04-Focal-x64-2020-12<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | June 30, 2024<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-minimal-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-18.04.3-Bionic-x64-2020-01<br />
| align="center" | Arbutus<br />
| align="center" | March 31, 2023<br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-minimal-x64-2020-01<br />
| align="center" | Arbutus, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
| align="center" | Ubuntu-16.04.6-Xenial-x64-2020-01<br />
| align="center" | Arbutus, Cedar, East, Graham<br />
| align="center" | '''March 31, 2020'''<ref name="removed"/><br />
|-<br />
|}<br />
<br />
<!--T:6--><br />
[[Category:CC-Cloud]]<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=95982CVMFS2021-02-20T00:09:51Z<p>Rptaylor: /* Compute Canada CVMFS Reference Material */</p>
<hr />
<div>[[Category:CVMFS]]<br />
<br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction ==<br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features ===<br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material ==<br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf 2019-08-01 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] PEARC 2019<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021<br />
** [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/eum21-cvmfs-tutorial-slides.pdf tutorial slides]</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=95981CVMFS2021-02-20T00:08:39Z<p>Rptaylor: /* Compute Canada CVMFS Reference Material */ add tutorial link</p>
<hr />
<div>[[Category:CVMFS]]<br />
<br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction ==<br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features ===<br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material ==<br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf 2019-08-01 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] PEARC 2019<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020<br />
* [https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/ 2021-01-26 CVMFS Tutorial] Easybuild User Meeting 2021</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=Accessing_CVMFS&diff=95977Accessing CVMFS2021-02-19T23:43:13Z<p>Rptaylor: link to cloud script</p>
<hr />
<div>[[Category:CVMFS]]<br />
<languages /><br />
<br />
<translate><br />
= Introduction = <!--T:1--><br />
Compute Canada provides repositories of software and data via a file system called [[CVMFS|CERN Virtual Machine File System]] (CVMFS). On Compute Canada systems, CVMFS is already set up for you, so the repositories are automatically available for your use. For more information on using the Compute Canada software environment, please refer to [[available software]], [[using modules]], [[Python]], [[R]] and [[Installing software in your home directory]] pages.<br />
<br />
<!--T:2--><br />
The purpose of this page is to describe how you can install and configure CVMFS on ''your'' computer or cluster, so that you can access the same repositories (and software environment) on your system that are available on Compute Canada systems. (If you are a Compute Canada staff member, refer to the [https://wiki.computecanada.ca/staff/CVMFS_client_setup internal documentation].)<br />
<br />
<!--T:3--><br />
The software environment described on this page has been [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf presented] at Practices and Experience in Advanced Research Computing 2019 (PEARC 2019).<br />
<br />
= Before you start = <!--T:4--><br />
</translate><br />
{{Panel<br />
|title=Important<br />
|panelstyle=callout<br />
|content=<br />
<translate><!--T:55--> '''Please [[Accessing_CVMFS#Subscribe_to_announcements|subscribe to announcements]] to remain informed of important changes regarding the Compute Canada software environment and CVMFS, and fill out the [https://docs.google.com/forms/d/1eDJEeaMgooVoc4lTkxcZ9y65iR8hl4qeXMOEU9slEck/viewform registration form]. If use of our software environment contributes to your research, please acknowledge it according to [https://www.computecanada.ca/research-portal/accessing-resources/acknowledging-compute-canada/ these guidelines].''' (We appreciate it if you also cite our [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf paper]). </translate><br />
}}<br />
<translate><br />
== Subscribe to announcements == <!--T:5--><br />
Occasionally, changes will be made regarding CVMFS or the software or other content provided by Compute Canada CVMFS repositories, which '''may affect users''' or '''require administrators to take action''' in order to ensure uninterrupted access to the Compute Canada CVMFS repositories. Subscribe to the cvmfs-announce@computecanada.ca mailing list in order to receive important but infrequent notifications about these changes, by emailing [mailto:cvmfs-announce+subscribe@computecanada.ca cvmfs-announce+subscribe@computecanada.ca] and then replying to the confirmation email you subsequently receive. (Compute Canada staff can alternatively subscribe [https://groups.google.com/a/computecanada.ca/forum/#!forum/cvmfs-announce here].)<br />
<br />
== Terms of use and support == <!--T:6--><br />
The CVMFS client software is provided by CERN. The Compute Canada CVMFS repositories are provided by Compute Canada '''without any warranty'''. Compute Canada reserves the right to limit or block your access to the CVMFS repositories and software environment if you violate applicable [https://www.computecanada.ca/research-portal/information-security/terms-of-use/ terms of use] (such as, by way of example and without limitation, sections 3.5 or 3.11), or at our discretion.<br />
<br />
== CVMFS requirements == <!--T:7--><br />
=== For a single system ===<br />
To install CVMFS on an individual system, such as your laptop or desktop, you will need:<br />
* A supported operating system (see [[Accessing_CVMFS#Installation|installation]]).<br />
* Support for [https://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE].<br />
* Approximately 50 GB of available local storage, for the cache. (It will only be filled based on usage, and a larger or smaller cache may be suitable in different situations. For light use on a personal computer, just ~ 5-10 GB may suffice. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#sct-cache cache settings] for more details.)<br />
* Outbound HTTP access to the internet.<br />
** Or at least outbound HTTP access to one or more local proxy servers.<br />
<br />
<!--T:8--><br />
If your system lacks FUSE support or local storage, or has limited network connectivity or other restrictions, you may be able to use some [https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html alternative approaches].<br />
<br />
=== For multiple systems === <!--T:9--><br />
If multiple CVMFS clients are deployed, for example in a cluster, laboratory, campus or other site, each system must meet the above requirements, and the following considerations apply as well:<br />
* We recommend that you deploy forward caching HTTP proxy servers (such as [http://www.squid-cache.org/ Squid]) at your site, especially if you have a large number of clients.<br />
** Note that if you have only one such proxy server it will be a single point of failure for your site. Generally you should have at least two local proxies at your site, and potentially additional nearby or regional proxies as backups.<br />
* It is recommended to synchronize the identity of the <code>cvmfs</code> service account across all client nodes (e.g. using LDAP or other means).<br />
** This facilitates use of an [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache alien cache] and should be done before CVMFS is installed. Even if you do not anticipate using an alien cache at this time, it is easier to synchronize the accounts initially than to try to potentially change them later.<br />
<br />
== Software environment requirements == <!--T:10--><br />
=== Minimal requirements ===<br />
*Supported operating systems:<br />
** Linux: with a Kernel 2.6.32 or newer. <br />
** Windows: with Windows Subsystem for Linux version 2, with a distribution of Linux that matches the requirement above.<br />
** Mac OS: only through a virtual machine.<br />
* CPU: x86 CPU supporting at least one of SSE3, AVX, AVX2 or AVX512 instruction sets.<br />
<br />
=== Optimal requirements === <!--T:11--><br />
* Scheduler: Slurm or Torque, for tight integration with OpenMPI applications.<br />
* Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications.<br />
* GPU: NVidia GPU with CUDA drivers (7.5 or newer) installed, for CUDA-enabled applications. (See below for caveats about CUDA.)<br />
* As few Linux packages installed as possible (fewer packages reduce the odds of conflicts).<br />
<br />
= Installing CVMFS = <!--T:12--><br />
If you wish to use [https://docs.ansible.com/ansible/latest/index.html Ansible], a [https://git.computecanada.ca/cc-cvmfs-public/ansible-cvmfs-client CVMFS client role] is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. <br />
Also, some [https://github.com/ComputeCanada/CVMFS/tree/main/cvmfs-cloud-scripts scripts] may be used to facilitate installing CVMFS on cloud instances.<br />
Otherwise, use the following instructions.<br />
<br />
== Pre-installation == <!--T:54--><br />
It is recommended that the local CVMFS cache (located at <code>/var/lib/cvmfs</code> by default, configurable via the <code>CVMFS_CACHE_BASE</code> setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem ''before'' installing CVMFS.<br />
<br />
== Installation == <!--T:13--><br />
<br />
<!--T:14--><br />
Follow the instructions relative to your operating system in order to install CVMFS. These instructions have been tested on the following distributions: <br />
* CentOS 6, CentOS 7, CentOS 8<br />
* Fedora 29, Fedora 32<br />
* Debian 9<br />
* Ubuntu 18.04<br />
<br />
<!--T:15--><br />
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:<br />
* CernVM key: <code>70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7</code><br />
* Compute Canada CVMFS key one: <code>C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC</code><br />
* Compute Canada CVMFS key two: <code>DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476</code><br />
</translate><br />
<tabs><br />
<tab name="RedHat/CentOS"><br />
<translate><br />
<!--T:16--><br />
* Install the CERN YUM repository and GPG key:<br />
{{Command|sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the CVMFS client and configuration packages from those YUM repositories: <br />
{{Command|sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup}}<br />
</translate><br />
</tab><br />
<tab name="Fedora"><br />
<translate><br />
<!--T:17--><br />
* Install the default configuration package:<br />
{{Command|sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm}}<br />
* Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with <code>dnf</code> (or <code>yum</code>).<br />
** Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Install the Compute Canada YUM repository and GPG keys:<br />
{{Command|sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/prod/RPM/computecanada-release-latest.noarch.rpm}}<br />
* Install the Compute Canada CVMFS configuration from that YUM repository:<br />
{{Command|sudo dnf install cvmfs-config-computecanada}}<br />
</translate><br />
</tab><br />
<tab name="Debian/Ubuntu"><br />
<translate><br />
<!--T:18--><br />
* Follow the instructions [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#debian-ubuntu here] to add the CERN apt repository.<br />
* Install the CVMFS client from that repository:<br />
{{Command|sudo apt-get install cvmfs cvmfs-config-default}}<br />
* Apply the initial client setup:<br />
{{Command|sudo cvmfs_config setup}}<br />
* Download and install the Compute Canada CVMFS configuration package: <br />
{{Commands|wget https://package.computecanada.ca/yum/cc-cvmfs-public/prod/other/cvmfs-config-computecanada-latest.all.deb<br />
|sudo dpkg -i cvmfs-config-computecanada-latest.all.deb}}<br />
:* Since an apt repository is not available for this package, make sure you are [[Accessing_CVMFS#Subscribe_to_announcements|subscribed]] to be informed of updates.<br />
</translate><br />
</tab><br />
<tab name="SLES/openSuSE"><br />
<translate><br />
<!--T:19--><br />
As these operating systems are RPM-based, following the same instructions as for Fedora should work.<br />
</translate><br />
</tab><br />
<tab name="Windows"><br />
<translate><br />
<!--T:20--><br />
* For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [https://docs.microsoft.com/en-us/windows/wsl/wsl2-install]. <br />
* Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs. <br />
* Under WSL2, with Ubuntu, <tt>/dev/fuse</tt> is not usable by other users than <tt>root</tt>. This does not allow CVMFS to work properly. To fix this, run<br />
{{Command|chmod go+rw /dev/fuse}}<br />
</translate><br />
</tab><br />
</tabs><br />
<br />
<translate><br />
<!--T:21--><br />
For more information refer to the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#getting-the-software quickstart guide].<br />
<br />
== Configuration == <!--T:22--><br />
<br />
<!--T:23--><br />
Do not create any CVMFS configuration files ending with <code>.conf</code>. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in <code>.local</code> files. See [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#structure-of-etc-cvmfs structure of /etc/cvmfs] for more information. <br />
<br />
<!--T:24--><br />
In particular, create the file <code>/etc/cvmfs/default.local</code>, with at least the following minimal configuration:<br />
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"<br />
CVMFS_STRICT_MOUNT="yes"<br />
CVMFS_QUOTA_LIMIT=44500<br />
<br />
<!--T:25--><br />
* <code>CVMFS_REPOSITORIES</code> is a comma-separated list of the repositories to use.<br />
* <code>CVMFS_QUOTA_LIMIT</code> is the amount of local cache space in MB for CVMFS to use; set it to about 15% less than the size of your local cache filesystem.<br />
* If you have proxy servers, specify them with <code>CVMFS_HTTP_PROXY</code>. See the [https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-lists documentation] about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.<br />
<br />
<!--T:26--><br />
For more information on client configuration see the [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#setting-up-the-software quickstart guide] and [http://cvmfs.readthedocs.io/en/stable/apx-parameters.html#client-parameters client parameters documentation].<br />
<br />
== Testing == <!--T:27--><br />
<br />
<!--T:28--><br />
* Validate the configuration:<br />
{{Command|sudo cvmfs_config chksetup}}<br />
* Make sure to address any warnings or errors that are reported.<br />
* Check that the repositories are OK:<br />
{{Command|cvmfs_config probe}}<br />
<br />
<!--T:29--><br />
If you encounter problems, [https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#troubleshooting this debugging guide] may help.<br />
<br />
= Enabling our environment in your session = <!--T:33--><br />
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running<br />
{{Command|source /cvmfs/soft.computecanada.ca/config/profile/bash.sh}}<br />
<br />
<!--T:34--><br />
The above command '''will not run anything if your user ID is below 1000'''. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable <tt>FORCE_CC_CVMFS=1</tt>, with the command<br />
{{Command|export FORCE_CC_CVMFS{{=}}1}}<br />
or you can create a file <tt>$HOME/.force_cc_cvmfs</tt> in your home folder if you want it to always be active, with<br />
{{Command|touch $HOME/.force_cc_cvmfs}}<br />
<br />
<!--T:35--><br />
If, on the contrary, you want to avoid enabling our environment, you can define <tt>SKIP_CC_CVMFS=1</tt> or create the file <tt>$HOME/.skip_cc_cvmfs</tt> to ensure that the environment is never enabled in a given account.<br />
<br />
== Customizing your environment == <!--T:36--><br />
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below. <br />
<br />
=== Environment variables === <!--T:37--><br />
==== <tt>CC_CLUSTER</tt> ====<br />
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is <tt>computecanada</tt>. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.<br />
<br />
==== <tt>RSNT_ARCH</tt> ==== <!--T:38--><br />
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on <tt>/proc/cpuinfo</tt>. However if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:<br />
* sse3<br />
* avx<br />
* avx2<br />
* avx512<br />
<br />
==== <tt>RSNT_INTERCONNECT</tt> ==== <!--T:39--><br />
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of <tt>/sys/module/opa_vnic</tt> (for Intel OmniPath) or <tt>/sys/module/ib_core</tt> (for InfiniBand). The fall-back value is <tt>ethernet</tt>. The supported values are<br />
* omnipath<br />
* infiniband<br />
* ethernet<br />
<br />
<!--T:40--><br />
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.<br />
<br />
==== <tt>RSNT_CUDA_DRIVER_VERSION</tt> ==== <!--T:61--><br />
This environment variable is used to hide or show some versions of our CUDA modules, according to the required version of NVidia drivers, as documented [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html here]]. If not defined, this is detected based on the files founds under <tt>/usr/lib64/nvidia</tt>. <br />
<br />
<!--T:62--><br />
For backward compatibility reasons, if no library is found under <tt>/usr/lib64/nvidia</tt>, we assume that the driver versions are enough for CUDA 10.2. This is because this feature was introduced just as CUDA 11.0 was released.<br />
<br />
<!--T:63--><br />
Defining <tt>RSNT_CUDA_DRIVER_VERSION=0.0</tt> will hide all versions of CUDA.<br />
<br />
==== <tt>RSNT_LOCAL_MODULEPATHS</tt> ==== <!--T:64--><br />
This environment variable allows to define locations for local module trees, which will be automatically mesh into our central tree. To use it, define<br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
and then install your EasyBuild recipe using <br />
{{Command|eb --installpath /opt/software/easybuild <your recipe>.eb}}<br />
<br />
<!--T:65--><br />
This will use our module naming scheme to install your recipe locally, and it will be picked up by the module hierarchy. For example, if this recipe was using the <tt>iompi,2018.3</tt> toolchain, the module will become available after loading the <tt>intel/2018.3</tt> and the <tt>openmpi/3.1.2</tt> modules.<br />
<br />
==== <tt>LMOD_SYSTEM_DEFAULT_MODULES</tt> ==== <!--T:41--><br />
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the <tt>StdEnv</tt> module, which will load by default a version of the Intel compiler, and a version of OpenMPI.<br />
<br />
==== <tt>MODULERCFILE</tt> ==== <!--T:42--><br />
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own <tt>modulerc</tt> file and add it to the environment variable <tt>MODULERCFILE</tt>. This will take precedence over what is defined in our environment.<br />
<br />
=== System paths === <!--T:43--><br />
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths. <br />
<br />
==== <tt>/opt/software/modulefiles</tt> ==== <!--T:44--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also maintaining locally installed modules. <br />
<br />
==== <tt>$HOME/modulefiles</tt> ==== <!--T:45--><br />
If this path exists, it will automatically be added to the default <tt>MODULEPATH</tt>. This allows the use of our software environment while also allowing installation of modules inside of home directories.<br />
<br />
==== <tt>/opt/software/slurm/bin</tt>, <tt>/opt/software/bin</tt>, <tt>/opt/slurm/bin</tt> ==== <!--T:46--><br />
These paths are all automatically added to the default <tt>PATH</tt>. This allows your own executable to be added in the search path.<br />
<br />
== Installing software locally == <!--T:57--><br />
Since June 2020, we support installing additional modules locally and have it discovered by our central hierarchy. This was discussed and implemented in [https://github.com/ComputeCanada/software-stack/issues/11 this issue]. <br />
<br />
<!--T:58--><br />
To do so, first identify a path where you want to install local software. For example <tt>/opt/software/easybuild</tt>. Make sure that folder exists. Then, export the environment variable <tt>RSNT_LOCAL_MODULEPATHS</tt>: <br />
{{Command|export RSNT_LOCAL_MODULEPATHS{{=}}/opt/software/easybuild/modules}}<br />
<br />
<!--T:59--><br />
If you want this branch of the software hierarchy to be found by your users, we recommend you define this environment variable in the cluster's common profile. Then, install the software packages you want using EasyBuild: <br />
{{Command|eb --installpath /opt/software/easybuild <some easyconfig recipe>}}<br />
<br />
<!--T:60--><br />
This will install the piece of software locally, using the hierarchical layout driven by our module naming scheme. It will also be automatically found when users load our compiler, MPI and Cuda modules.<br />
<br />
= Caveats = <!--T:47--><br />
== Use of software environment by system administrators ==<br />
System administrators (or users managing their own personal system) who perform diagnostic operations on CVMFS, or privileged system operations, should [[Accessing_CVMFS#Enabling_our_environment_in_your_session|ensure]] that their session does ''not'' depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)<br />
<br />
== Compute Canada configuration repository == <!--T:48--><br />
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a [http://cvmfs.readthedocs.io/en/stable/cpt-configure.html#the-config-repository configuration repository], be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, ''which may conflict with your use of any other configuration repository'' and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in <tt>/cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/</tt> .)<br />
<br />
== Software packages that are not available == <!--T:49--><br />
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.<br />
<br />
== CUDA location == <!--T:50--><br />
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path <tt>/usr/lib64/nvidia</tt>. However on some platforms, recent NVidia drivers will install libraries in <tt>/usr/lib64</tt> instead. Because it is not possible to add <tt>/usr/lib64</tt> to the <tt>LD_LIBRARY_PATH</tt> without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in <tt>/usr/lib64/nvidia</tt> pointing to the installed NVidia libraries. The script below will install the drivers and create the symbolic links that are needed (adjust the driver version that you want) <br />
<br />
<!--T:56--><br />
{{File|name=script.sh|contents=<br />
NVIDIA_DRV_VER="410.48"<br />
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )<br />
yum -y install ${nv_pkg[@]/%/-${NVIDIA_DRV_VER{{)}}{{)}}<br />
for file in $(rpm -ql ${nv_pkg[@]}); do<br />
[ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \ <br />
ln -snf "$file" "${file%/*}/nvidia/${file##*/}"<br />
done<br />
}}<br />
<br />
== <tt>LD_LIBRARY_PATH</tt> == <!--T:51--><br />
Our software environment is designed to use [https://en.wikipedia.org/wiki/Rpath RUNPATH]. Defining <tt>LD_LIBRARY_PATH</tt> is [https://gms.tf/ld_library_path-considered-harmful.html not recommended] and can lead to the environment not working. <br />
<br />
== Missing libraries == <!--T:52--><br />
Because we do not define <tt>LD_LIBRARY_PATH</tt>, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on [[Installing_software_in_your_home_directory#Installing_binary_packages|Installing binary packages]].<br />
<br />
== dbus == <!--T:53--><br />
For some applications, <tt>dbus</tt> needs to be installed. This needs to be installed locally, on the host operating system.<br />
</translate></div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=95976CVMFS2021-02-19T21:17:03Z<p>Rptaylor: add CC CVMFS public presentations and papers</p>
<hr />
<div>[[Category:CVMFS]]<br />
<br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction ==<br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features ===<br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.<br />
<br />
== Compute Canada CVMFS Reference Material ==<br />
* [https://indico.cern.ch/event/608592/contributions/2858287/ 2018-01-31 Compute Canada Software Installation and Distribution] 2018 CernVM Workshop<br />
* [https://indico.cern.ch/event/757415/contributions/3433887/ 2019-06-03 CVMFS at Compute Canada] 2019 CernVM Workshop<br />
* [https://guidebook.com/g/canheitarc2019/#/session/23411098 2019-06-20 Providing A Unified User Environment for Canada’s National Advanced Computing Centers] CANHEIT 2019<br />
* [https://ssl.linklings.net/conferences/pearc/pearc19_program/views/includes/files/pap139s3-file1.pdf 2019-08-01 Providing a Unified Software Environment for Canada’s National Advanced Computing Centers] PEARC 2019<br />
* [https://bc.net/distributing-software-across-campuses-and-world-cernvm-fs-0 2020-09-24 Distributing software across campuses and the world with CVMFS] BCNET Connect 2020</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=95975CVMFS2021-02-19T21:08:12Z<p>Rptaylor: </p>
<hr />
<div>[[Category:CVMFS]]<br />
<br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction ==<br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features ===<br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees and content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.</div>Rptaylorhttps://docs.alliancecan.ca/mediawiki/index.php?title=CVMFS&diff=95974CVMFS2021-02-19T21:06:42Z<p>Rptaylor: add note about Merkle trees</p>
<hr />
<div>[[Category:CVMFS]]<br />
<br />
This page describes CERN Virtual Machine File System (CVMFS). Compute Canada uses CVMFS to distribute software, data and other content. Refer to [[accessing CVMFS]] for instructions on configuring a CVMFS client to access this content, and the official [https://cvmfs.readthedocs.io/ documentation] and [https://cernvm.cern.ch/fs/ webpage] for further information.<br />
<br />
== Introduction ==<br />
CVMFS is a distributed read-only software distribution system, implemented as a POSIX filesystem in user space (FUSE) using HTTP transport. It was originally developed for the LHC (Large Hadron Collider) experiments at CERN to deliver software to virtual machines and to replace diverse shared software installation areas and package management systems at numerous computing sites. Designed to deliver software in a fast, scalable and reliable fashion, its successful use has rapidly grown over recent years to include dozens of projects, ~10<sup>10</sup> files and directories, ~10<sup>2</sup> compute sites, and ~10<sup>5</sup> clients around the world. The [http://cernvm-monitor.cern.ch/cvmfs-monitor/ CernVM Monitor] shows many research groups which use CVMFS and the stratum sites which replicate their repositories.<br />
<br />
=== Features ===<br />
* Only one copy of software needs to be maintained, and can be propagated to and used at multiple sites. Commonly used software can be installed on CVMFS in order to reduce remote software administration.<br />
* Software applications and their prerequisites can be run from CVMFS, eliminating any requirement on the Linux distribution type or release level of a client node.<br />
* The project software stack and OS can be decoupled. For the cloud use case in particular, this allows software to be accessed in a VM without being embedded in the VM image, enabling VM images and software to be updated and distributed separately.<br />
* Content versioning is provided via repository catalog revisions. Updates are committed in transactions and can be rolled back to a previous state.<br />
* Updates are propagated to clients automatically and atomically.<br />
* Clients can view historical versions of repository content.<br />
* Files are fetched using the standard HTTP protocol. Client nodes do not require ports or firewalls to be opened.<br />
* Fault-tolerance and reliability are achieved by using multiple redundant proxy and stratum servers. Clients transparently fail over to the next available proxy or server.<br />
* Hierarchical caching makes the CVMFS model highly scalable and robust and minimizes network traffic. There can be several levels in the content delivery and caching hierarchy:<br />
** The stratum 0 holds the master copy of the repository<br />
** Multiple stratum 1 servers replicate the repository contents from the stratum 0<br />
** HTTP proxy servers cache network requests from clients to stratum 1 servers<br />
** The CVMFS client downloads files on demand into the local client cache(s).<br />
*** Two tiers of local cache can be used, e.g. a fast SSD cache and a large HDD cache. A cluster filesystem can also be used as a shared cache for all nodes in a cluster.<br />
* CVMFS clients have read-only access to the filesystem.<br />
* By using Merkle trees, content-addressable storage, and encoding metadata in catalogs, all metadata is treated as data, and practically all data is immutable and highly amenable to caching.<br />
* Metadata storage and operations scale by using nested catalogs, allowing resolution of metadata queries to be performed locally by the client.<br />
* File integrity and authenticity are verified using signed cryptographic hashes, avoiding data corruption or tampering. <br />
* Automatic de-duplication and compression minimize storage usage on the server side. File chunking and on-demand access minimize storage usage on the client side.<br />
* Versatile configurations can be deployed by writing authorization helpers or cache plugins to interact with external authorization or storage providers.</div>Rptaylor