Accessing CVMFS
Revision as of 18:16, 14 August 2019
Introduction
The software and data repositories provided by Compute Canada are accessible via CVMFS (CERN Virtual Machine File System). On Compute Canada systems, CVMFS is already configured for you, so you can use the repositories directly. For more information on the Compute Canada software environment, see the wiki pages Available software, Using modules, Python, R and Installing software in your home directory.
This page describes how to install and configure CVMFS on your own computer or cluster, so that you have access to the same repositories and software environments as on Compute Canada systems.
We use as an example the software environment presented at the PEARC 2019 conference (Practices and Experience in Advanced Research Computing).
Before you start
Please subscribe to the announcements service and fill out this registration form. If you use our software environment in your research, please acknowledge Compute Canada's contribution according to these guidelines.
We would also appreciate it if you cite our presentation.
Subscribe to announcements
Changes may be made to CVMFS or to the software and other content of the repositories provided by Compute Canada; such changes may affect users or require action by an administrator to ensure uninterrupted service.
Subscribe to the cvmfs-announce@calculcanada.ca mailing list to receive occasional important announcements. You can subscribe with a Google account, or by writing to cvmfs-announce+subscribe@calculcanada.ca and replying to the confirmation email.
Terms of use and support
The CVMFS client software is provided by CERN. Compute Canada provides its CVMFS repositories without any warranty. Your access to the repositories and software environment may be limited or blocked if you violate the terms of use (for example, but not limited to, sections 3.5 or 3.11), or at Compute Canada's discretion.
Technical requirements
For a single system
To install CVMFS on a personal computer, you need:
- a supported operating system (see the Installation section);
- the free software FUSE;
- about 50 GB of local storage for the cache; less may be enough, and space is consumed only for the files you actually access;
- outbound HTTP access to the internet,
- or HTTP access to one or more local proxy servers.
If you cannot meet these requirements or have other restrictions, consider these alternative options.
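As a minimal sketch, some of these prerequisites can be checked from a shell before installing anything. The helper names `has_fuse` and `free_mb` are illustrative and are not part of CVMFS; the script only reports what it finds.

```shell
# Hypothetical helpers (not part of CVMFS) for checking local prerequisites.

# Succeeds if the given path is a character device; FUSE exposes /dev/fuse.
has_fuse() {
  [ -c "${1:-/dev/fuse}" ]
}

# Prints the free space, in MB, of the filesystem holding the given path.
# Uses POSIX df output with 1024-byte blocks; column 4 is "Available".
free_mb() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024) }'
}

# Report only; nothing is installed or modified.
if has_fuse; then
  echo "FUSE device found"
else
  echo "FUSE device missing"
fi
echo "Free space under /var: $(free_mb /var) MB"
```

Compare the reported free space with the cache size you plan to allocate.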
For multiple systems
To deploy multiple CVMFS clients, for example on a cluster, in a laboratory or across a campus, each system must meet the single-system requirements above. In addition:
- You must deploy forward caching HTTP proxy servers, such as Squid, at your site.
- Having only one proxy server creates a single point of failure. As a general rule, you should have at least two local proxy servers, and preferably one or more additional proxy servers nearby to take over in case of problems.
- We recommend synchronizing the identity of the cvmfs service account across all client nodes, using LDAP or other means.
  - This facilitates the use of an external cache and should be done before CVMFS is installed. Even if you do not plan to use an external cache, it is easier to synchronize the accounts at the outset than to try to change them later.
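One way to check that the cvmfs account is consistent is to print its numeric IDs on each node and compare the results. The sketch below assumes `getent` is available; the helper name `uid_gid_of` is hypothetical.

```shell
# Hypothetical helper: extract "uid:gid" from a passwd-format line.
uid_gid_of() {
  printf '%s\n' "$1" | awk -F: '{ print $3 ":" $4 }'
}

# Local lookup; the result is empty if the cvmfs account does not exist yet.
local_entry=$(getent passwd cvmfs 2>/dev/null || true)
if [ -n "$local_entry" ]; then
  echo "local cvmfs account: $(uid_gid_of "$local_entry")"
else
  echo "no local cvmfs account found"
fi
```

Running the same lines on every client node (for instance over ssh) and comparing the printed pairs shows whether the account is synchronized.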
Software environment requirements
Minimal requirements
- Operating system:
  - Linux: with kernel 2.6.32 or newer,
  - Windows: with version 2 of the Windows Subsystem for Linux (WSL) and a Linux distribution with kernel 2.6.32 or newer,
  - Mac OS: only through a virtual machine;
- CPU: x86, supporting the SSE3, AVX, AVX2 or AVX512 instruction sets.
Optimal requirements
- Scheduler: Slurm or Torque, for tight integration with OpenMPI applications;
- Network interconnect: Ethernet, InfiniBand or OmniPath, for parallel applications;
- GPU: NVidia with CUDA drivers 7.5 or newer, for CUDA-enabled applications (see the caveat below);
- As few Linux packages installed as possible, to reduce the risk of conflicts.
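The kernel requirement can be checked with a short version comparison. This is a sketch that assumes a `sort` supporting the `-V` (version sort) option, as GNU coreutils does; `version_ge` is an illustrative name.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumes a sort implementation with -V (version sort), e.g. GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

kernel="$(uname -r)"       # for example 3.10.0-957.el7.x86_64
kernel="${kernel%%-*}"     # keep only the part before the first dash
if version_ge "$kernel" "2.6.32"; then
  echo "kernel $kernel meets the 2.6.32 minimum"
else
  echo "kernel $kernel is older than 2.6.32"
fi
```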
Installing CVMFS
If you wish to use Ansible, a CVMFS client role is provided as-is, for basic minimal configuration of a CVMFS client on an RPM-based system. Otherwise, use the following instructions.
Pre-installation
It is recommended that the local CVMFS cache (located at /var/lib/cvmfs by default, configurable via the CVMFS_CACHE_BASE setting) be on a dedicated filesystem so that the storage usage of CVMFS is not shared with that of other applications. Accordingly, you should provision that filesystem before installing CVMFS. The cache should typically be about 50 GB in size, but more or less may be suitable in different situations. For more details see the client configuration documentation.
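To see whether the cache directory sits on its own filesystem, compare the directory with the mount point that holds it. This is a sketch: `mount_point_of` is a hypothetical helper, and reading `CVMFS_CACHE_BASE` from the shell environment is an assumption made here for convenience (CVMFS itself reads it from its configuration files).

```shell
# Hypothetical helper: print the mount point of the filesystem holding a path.
# Uses POSIX df output; column 6 is "Mounted on".
mount_point_of() {
  df -P "$1" | awk 'NR==2 { print $6 }'
}

# Assumption: fall back to the default location when the variable is unset.
cache_dir="${CVMFS_CACHE_BASE:-/var/lib/cvmfs}"
if [ -d "$cache_dir" ]; then
  if [ "$(mount_point_of "$cache_dir")" = "$cache_dir" ]; then
    echo "$cache_dir is its own filesystem"
  else
    echo "$cache_dir shares the filesystem mounted at $(mount_point_of "$cache_dir")"
  fi
else
  echo "$cache_dir does not exist yet"
fi
```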
Installation
Follow the instructions for your operating system to install CVMFS. These instructions have been tested on the following distributions:
- CentOS 6, CentOS 7
- Fedora 29
- Debian 9
- Ubuntu 18.04
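If you script the installation across several machines, the distribution can be mapped to the package tool used in the instructions on this page. The sketch below relies on the `ID` field of /etc/os-release, which older releases such as CentOS 6 may not provide; `pkg_tool_for` is an illustrative name.

```shell
# Map a distribution ID (the ID= field of /etc/os-release) to the package
# tool used in the instructions on this page.
pkg_tool_for() {
  case "$1" in
    centos)        echo "yum" ;;
    fedora)        echo "dnf" ;;
    debian|ubuntu) echo "apt-get" ;;
    *)             echo "unknown" ;;
  esac
}

if [ -r /etc/os-release ]; then
  # Source the file in a subshell so its variables do not leak into this shell.
  distro_id=$( . /etc/os-release && printf '%s' "$ID" )
  echo "distribution: $distro_id, package tool: $(pkg_tool_for "$distro_id")"
fi
```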
When installing packages you may be prompted to accept some GPG keys. You should ensure that their fingerprints match these expected values:
- CernVM key:
70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7
- Compute Canada CVMFS key one:
C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC
- Compute Canada CVMFS key two:
DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476
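Package managers may display fingerprints with different spacing or letter case, so a comparison is easier after normalizing both sides. A minimal sketch, with hypothetical helper names:

```shell
# Expected fingerprints, copied from the list above.
EXPECTED_FPRS="70B9 8904 8820 8E31 5ED4 5208 230D 389D 8AE4 5CE7
C0C4 0F04 70A3 6AF2 7CC4 4D5A 3B9F C55A CF21 4CFC
DDCD 3C84 ACDF 133F 4BEC FBFA 49DE 2015 FF55 B476"

# Remove whitespace and upper-case the hex digits so formatting does not matter.
normalize_fpr() {
  printf '%s' "$1" | tr -d ' \t\n' | tr 'abcdef' 'ABCDEF'
}

# Succeeds if the argument matches one of the expected fingerprints.
is_expected_fpr() {
  candidate=$(normalize_fpr "$1")
  while IFS= read -r expected; do
    if [ "$candidate" = "$(normalize_fpr "$expected")" ]; then
      return 0
    fi
  done <<EOF
$EXPECTED_FPRS
EOF
  return 1
}
```

For example, `is_expected_fpr "<fingerprint shown by the prompt>" && echo "fingerprint OK"` confirms a key before you accept it.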
CentOS
- Install the CERN YUM repository and GPG key:
[name@server ~]$ sudo yum install https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
- Install the Compute Canada YUM repository and GPG keys:
[name@server ~]$ sudo yum install https://package.computecanada.ca/yum/cc-cvmfs-public/Packages/computecanada-release-latest.noarch.rpm
- Install the CVMFS client and configuration packages from those YUM repositories:
[name@server ~]$ sudo yum install cvmfs cvmfs-config-default cvmfs-config-computecanada cvmfs-auto-setup
Fedora
- Install the default configuration package:
[name@server ~]$ sudo dnf install https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/cvmfs-config-default-latest.noarch.rpm
- Download the CVMFS client RPM for your operating system from https://cernvm.cern.ch/portal/filesystem/downloads and install it with dnf (or yum).
  - Since a yum repository for CVMFS is not available for this operating system, you will need to periodically check for updates to the CVMFS client and default configuration and install them manually.
- Apply the initial client setup:
[name@server ~]$ sudo cvmfs_config setup
- Install the Compute Canada YUM repository and GPG keys:
[name@server ~]$ sudo dnf install https://package.computecanada.ca/yum/cc-cvmfs-public/Packages/computecanada-release-latest.noarch.rpm
- Install the Compute Canada CVMFS configuration from that YUM repository:
[name@server ~]$ sudo dnf install cvmfs-config-computecanada
Debian and Ubuntu
- Follow the instructions here to add the CERN apt repository.
- Install the CVMFS client from that repository:
[name@server ~]$ sudo apt-get install cvmfs cvmfs-config-default uuid-runtime
- Apply the initial client setup:
[name@server ~]$ sudo cvmfs_config setup
- Download and install the Compute Canada CVMFS configuration package:
[name@server ~]$ wget https://package.computecanada.ca/yum/cc-cvmfs-public/OtherPackages/cvmfs-config-computecanada-latest.all.deb
[name@server ~]$ sudo dpkg -i cvmfs-config-computecanada-latest.all.deb
- Since an apt repository is not available for this package, make sure you are subscribed to be informed of updates.
For other RPM-based operating systems, following the same instructions as for Fedora should work.
- For Windows, you first need to have Windows Subsystem for Linux, version 2. As of this writing (July 2019), this is supported only in a developer version of Windows. The instructions for installing it are here [1].
- Once it is installed, install the Linux distribution of your choice, and follow the appropriate instructions from one of the other tabs.
- Under WSL2, with Ubuntu, /dev/fuse is not usable by users other than root, which prevents CVMFS from working properly. To fix this, run
[name@server ~]$ sudo chmod go+rw /dev/fuse
For more information refer to the quickstart guide.
Configuration
Do not create any CVMFS configuration files ending with .conf. In order to avoid collisions with upstream configuration sources, all locally-applied configuration must be in .local files. See structure of /etc/cvmfs for more information.
In particular, create the file /etc/cvmfs/default.local, with at least the following minimal configuration:
CVMFS_REPOSITORIES="cvmfs-config.computecanada.ca,soft.computecanada.ca"
CVMFS_QUOTA_LIMIT=44500
CVMFS_HTTP_PROXY="<define this parameter according to the information below>"
- CVMFS_REPOSITORIES is a comma-separated list of the repositories that you are interested in.
- CVMFS_QUOTA_LIMIT is the amount of space in MB that CVMFS will use for the local cache; it should be about 15% less than the size of the local cache filesystem.
- CVMFS_HTTP_PROXY lists the proxy servers to use. See the documentation about this parameter, including syntax, examples, and use of load-balancing groups and round-robin DNS.
  - If you are an individual user installing CVMFS on a single computer for your own use, you may use CVMFS_HTTP_PROXY="DIRECT". However, you should ask your local system administration team at your organization (if applicable) if there are forward caching HTTP proxy servers available for your use, as this will improve the performance of your CVMFS client.
  - If you are an administrator installing CVMFS on multiple systems (such as in a cluster, laboratory, campus or other site), the proxies that you have deployed according to the requirements must be specified here in the CVMFS_HTTP_PROXY parameter. Moreover, you should inform users at your site or organization (if applicable) that they may use these proxy servers.
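The 15% guideline for CVMFS_QUOTA_LIMIT can be turned into a small calculation. This is a sketch with hypothetical helper names; it applies the rule of thumb literally with integer arithmetic, and you may prefer a different margin for your site.

```shell
# Size, in MB, of the filesystem holding the given path.
# Uses POSIX df output with 1024-byte blocks; column 2 is the total size.
fs_size_mb() {
  df -Pk "$1" | awk 'NR==2 { print int($2 / 1024) }'
}

# Quota set 15% below the given size in MB, using integer arithmetic.
quota_limit_mb() {
  echo $(( $1 * 85 / 100 ))
}

cache_dir="/var/lib/cvmfs"
if [ -d "$cache_dir" ]; then
  echo "CVMFS_QUOTA_LIMIT=$(quota_limit_mb "$(fs_size_mb "$cache_dir")")"
fi
```

The printed line is only a suggestion to copy into /etc/cvmfs/default.local; the script does not modify any configuration.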
For more information on client configuration see the quickstart guide and client parameters documentation.
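As an illustration of the proxy syntax described in the CernVM-FS documentation, proxies within a load-balancing group are separated by `|`, and fail-over groups are separated by `;`. The host names below are hypothetical placeholders, and `lb_group` is an illustrative helper.

```shell
# Join proxy URLs into one load-balancing group ("|" separator).
# The subshell keeps the IFS change local to this function.
lb_group() {
  ( IFS='|'; printf '%s' "$*" )
}

# Hypothetical proxy host names; replace them with your own servers.
local_group=$(lb_group "http://squid1.example.org:3128" "http://squid2.example.org:3128")
backup_group=$(lb_group "http://squid-backup.example.org:3128")

# Fail-over groups are separated by ";" and are tried in order.
echo "CVMFS_HTTP_PROXY=\"${local_group};${backup_group}\""
```

This layout matches the recommendation of two local proxies plus a nearby backup.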
Testing
- Validate the configuration:
[name@server ~]$ sudo cvmfs_config chksetup
- Make sure to address any warnings or errors that are reported.
- Check that the repositories are OK:
[name@server ~]$ cvmfs_config probe
If you encounter problems, this debugging guide may help.
Enabling our environment in your session
Once you have mounted the CVMFS repository, enabling our environment in your sessions is as simple as running
[name@server ~]$ source /cvmfs/soft.computecanada.ca/config/profile/bash.sh
The above command will not run anything if your user ID is below 1000. This is a safeguard, because you should not rely on our software environment for privileged operation. If you nevertheless want to enable our environment, you can first define the environment variable FORCE_CC_CVMFS=1, with the command
[name@server ~]$ export FORCE_CC_CVMFS=1
or you can create a file $HOME/.force_cc_cvmfs in your home folder if you want it to always be active, with
[name@server ~]$ touch $HOME/.force_cc_cvmfs
If, on the contrary, you want to avoid enabling our environment, you can define SKIP_CC_CVMFS=1 or create the file $HOME/.skip_cc_cvmfs to ensure that the environment is never enabled in a given account.
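If you enable the environment from a shell startup file, it can help to guard the command so that shells still start normally when the repository is not mounted. A minimal sketch; the function name `enable_cc_env` is hypothetical:

```shell
# Source the profile script only if it is readable.
# Returns non-zero, with a message on stderr, if nothing was sourced.
enable_cc_env() {
  profile="${1:-/cvmfs/soft.computecanada.ca/config/profile/bash.sh}"
  if [ -r "$profile" ]; then
    . "$profile"
  else
    echo "profile not readable: $profile" >&2
    return 1
  fi
}

# Try to enable the environment; carry on quietly if CVMFS is unavailable.
enable_cc_env 2>/dev/null || true
```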
Customizing your environment
By default, enabling our environment will automatically detect a number of features of your system, and load default modules. You can control the default behaviour by defining specific environment variables prior to enabling the environment. These are described below.
Environment variables
CC_CLUSTER
This variable is used to identify a cluster. It is used to send some information to the system logs, as well as define behaviour relative to licensed software. By default, its value is computecanada. You may want to set the value of this variable if you want to have system logs tailored to the name of your system.
RSNT_ARCH
This environment variable is used to identify the set of CPU instructions supported by the system. By default, it will be automatically detected based on /proc/cpuinfo. However, if you want to force a specific one to be used, you can define it before enabling the environment. The supported instruction sets for our software environment are:
- sse3
- avx
- avx2
- avx512
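To preview which value a given machine is likely to get, you can inspect the CPU flags yourself. The sketch below is a simplification and not the environment's actual detection logic; it assumes the Linux flag names `pni` (SSE3) and `avx512f` (AVX-512 Foundation) as they appear in /proc/cpuinfo, and `arch_from_flags` is an illustrative name.

```shell
# Pick the most capable supported instruction set from a CPU flag list.
# Assumption: SSE3 appears as "pni" and AVX-512 as "avx512f" in /proc/cpuinfo.
arch_from_flags() {
  case " $1 " in
    *" avx512f "*) echo "avx512" ;;
    *" avx2 "*)    echo "avx2" ;;
    *" avx "*)     echo "avx" ;;
    *" pni "*)     echo "sse3" ;;
    *)             echo "unsupported" ;;
  esac
}

if [ -r /proc/cpuinfo ]; then
  # Use the flags of the first CPU listed.
  flags=$(grep -m 1 '^flags' /proc/cpuinfo | cut -d: -f2)
  echo "RSNT_ARCH candidate: $(arch_from_flags "$flags")"
fi
```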
RSNT_INTERCONNECT
This environment variable is used to identify the type of interconnect supported by the system. By default, it will be automatically detected based on the presence of /sys/module/opa_vnic (for Intel OmniPath) or /sys/module/ib_core (for InfiniBand). The fall-back value is ethernet. The supported values are:
- omnipath
- infiniband
- ethernet
The value of this variable will trigger different options of transport protocol to be used in OpenMPI.
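The detection described above can be sketched as follows. Checking OmniPath before InfiniBand is an assumption based on the order in which the paths are listed; the optional root prefix is an addition that makes the function easy to try against a scratch directory.

```shell
# Sketch of the detection rule. The optional argument is a root prefix
# (empty by default), so the real /sys tree is inspected when it is omitted.
detect_interconnect() {
  root="${1:-}"
  if [ -e "$root/sys/module/opa_vnic" ]; then
    echo "omnipath"
  elif [ -e "$root/sys/module/ib_core" ]; then
    echo "infiniband"
  else
    echo "ethernet"
  fi
}

echo "RSNT_INTERCONNECT candidate: $(detect_interconnect)"
```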
LMOD_SYSTEM_DEFAULT_MODULES
This environment variable defines which modules are loaded by default. If it is left undefined, our environment will define it to load the StdEnv module, which will load by default a version of the Intel compiler, and a version of OpenMPI.
MODULERCFILE
This is an environment variable used by Lmod to define the default version of modules and aliases. You can define your own modulerc file and add it to the environment variable MODULERCFILE. This will take precedence over what is defined in our environment.
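Overrides for the variables in this section must be set before the profile script is sourced. The sketch below validates the chosen values against the supported lists first; the values shown (including the cluster name `mycluster`) are hypothetical examples, and `is_one_of` is an illustrative helper.

```shell
# Succeeds if the first argument equals one of the remaining arguments.
is_one_of() {
  value="$1"
  shift
  for allowed in "$@"; do
    if [ "$value" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical site settings; leave a variable unset to keep auto-detection.
CC_CLUSTER="mycluster"
RSNT_ARCH="avx2"
RSNT_INTERCONNECT="ethernet"

if is_one_of "$RSNT_ARCH" sse3 avx avx2 avx512 &&
   is_one_of "$RSNT_INTERCONNECT" omnipath infiniband ethernet; then
  export CC_CLUSTER RSNT_ARCH RSNT_INTERCONNECT
  echo "overrides exported; source the profile script next"
else
  echo "unsupported RSNT_ARCH or RSNT_INTERCONNECT value" >&2
fi
```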
System paths
While our software environment strives to be as independent from the host operating system as possible, there are a number of system paths that are taken into account by our environment to facilitate interaction with tools installed on the host operating system. Below are some of these paths.
/opt/software/modulefiles
If this path exists, it will automatically be added to the default MODULEPATH. This allows the use of our software environment while also maintaining locally installed modules.
$HOME/modulefiles
If this path exists, it will automatically be added to the default MODULEPATH. This allows the use of our software environment while also allowing installation of modules inside of home directories.
/opt/software/slurm/bin, /opt/software/bin, /opt/slurm/bin
These paths are all automatically added to the default PATH. This allows your own executables in these locations to be found in the search path.
Caveats
Use of software environment by system administrators
System administrators (or users managing their own personal system) who perform privileged system operations should ensure that their session does not depend on the Compute Canada software environment when performing any such operations. For example, if you attempt to update CVMFS using YUM while your session uses a Python module loaded from CVMFS, YUM may run using that module and lose access to it during the update, and the update may become deadlocked. Similarly, if your environment depends on CVMFS and you reconfigure CVMFS in a way that temporarily interrupts access to CVMFS, your session may hang. (When these precautions are taken, in most cases CVMFS can be updated and reconfigured without interrupting access to CVMFS for users, because the update or reconfiguration itself will complete successfully.)
Compute Canada configuration repository
If you already have CVMFS installed and configured in order to use other repositories (like CERN's repositories), and if your CVMFS client configuration relies on the use of a configuration repository, be aware that the cvmfs-config-computecanada package sets up and enables the cvmfs-config.computecanada.ca configuration repository, which may conflict with your use of any other configuration repository and potentially break your pre-existing CVMFS client configuration, since clients can only use a single configuration repository. (The Compute Canada CVMFS configuration repository is a central source of configuration that makes all other Compute Canada CVMFS repositories available. It provides all site-independent client configuration required for Compute Canada usage and allows client configuration updates to be automatically propagated. The contents can be seen in /cvmfs/cvmfs-config.computecanada.ca/etc/cvmfs/ .)
Software packages that are not available
On Compute Canada systems, a number of commercial software packages are made available to authorized users according to the terms of the license owners, but they are not available outside of Compute Canada systems, and following the instructions on this page will not grant you access to them. This includes for example the Intel and Portland Group compilers. While the modules for the Intel and PGI compilers are available, you will only have access to the redistributable parts of these packages, usually the shared objects. These are sufficient to run software packages compiled with these compilers, but not to compile new software.
CUDA location
For CUDA-enabled software packages, our software environment relies on having driver libraries installed in the path /usr/lib64/nvidia. However, on some platforms, recent NVidia drivers will install libraries in /usr/lib64 instead. Because it is not possible to add /usr/lib64 to the LD_LIBRARY_PATH without also pulling in all system libraries (which may have incompatibilities with our software environment), we recommend that you create symbolic links in /usr/lib64/nvidia pointing to the installed NVidia libraries. The script below creates the symbolic links that are needed (adjust the driver version to match the one you have installed):
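# Run as root. Set this to the driver version installed on your system.
NVIDIA_DRV_VER="410.48"
nv_pkg=( "nvidia-driver" "nvidia-driver-libs" "nvidia-driver-cuda" "nvidia-driver-cuda-libs" "nvidia-driver-NVML" "nvidia-driver-NvFBCOpenGL" "nvidia-modprobe" )
# Install each package at the chosen version (appends -$NVIDIA_DRV_VER to every name).
yum -y install "${nv_pkg[@]/%/-${NVIDIA_DRV_VER}}"
# Make sure the link directory exists before populating it.
mkdir -p /usr/lib64/nvidia
# Link every non-directory file that the packages installed directly in /usr/lib64.
for file in $(rpm -ql "${nv_pkg[@]}"); do
  [ "${file%/*}" = '/usr/lib64' ] && [ ! -d "${file}" ] && \
  ln -snf "$file" "${file%/*}/nvidia/${file##*/}"
done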
LD_LIBRARY_PATH
Our software environment is designed to use RUNPATH. Defining LD_LIBRARY_PATH is not recommended and can lead to the environment not working.
Missing libraries
Because we do not define LD_LIBRARY_PATH, and because our libraries are not installed in default Linux locations, binary packages, such as Anaconda, will often not find libraries that they would usually expect. Please see our documentation on Installing binary packages.
dbus
For some applications, dbus needs to be installed. This needs to be installed locally, on the host operating system.