Anaconda/en: Difference between revisions
Revision as of 16:49, 19 February 2020
Anaconda is a Python distribution. We ask our users to not install Anaconda on our clusters.
Do not install Anaconda on our clusters
We are aware that Anaconda is widely used in several domains, such as data science, AI, and bioinformatics. Anaconda is a useful solution for simplifying the management of Python and scientific libraries on a personal computer. However, on a cluster like those supported by Compute Canada, these libraries and their dependencies should be managed by our staff, in order to ensure compatibility and optimal performance. Here is a list of reasons:
- Anaconda very often installs software (compilers, scientific libraries, etc.) that already exists on Compute Canada clusters as modules, and installs it with a configuration that is not optimal.
- It installs binaries which are not optimized for the processor architecture on our clusters.
- It makes incorrect assumptions about the location of various system libraries.
- Anaconda uses the $HOME directory for its installation, where it writes an enormous number of files. A single Anaconda installation can easily use up almost half of your quota for the number of files in your home directory.
- Installing packages with Anaconda is slower than installing them from Python wheels.
- Anaconda modifies the $HOME/.bashrc file, which can easily cause conflicts.
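To get a rough sense of the file-count point above, the following minimal sketch counts the regular files under a directory tree. The function name `count_files` and the example path `~/anaconda3` are hypothetical and used only for illustration; the way a cluster actually accounts for the file quota may differ (for example, directories may also be counted).

```python
import os


def count_files(root):
    """Return the number of files found under `root`, recursively.

    Only entries that os.walk reports as files are counted; directories
    themselves are not, and symbolic links to directories are not followed.
    """
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    # Hypothetical location of an Anaconda installation on a personal computer.
    install_dir = os.path.expanduser("~/anaconda3")
    print(install_dir, "contains", count_files(install_dir), "files")
```

Running this against an existing Anaconda installation on a personal computer gives an idea of how many files a similar installation would write to $HOME.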
How to transition from Conda to Virtualenv
A virtual environment offers you all the functionality which you need to use Python on our clusters. Here is how to convert to the use of virtual environments if you use Anaconda on your personal computer:
- List the dependencies (requirements) of the application you want to use.
- Find which dependencies are Python modules and which are libraries provided by Anaconda. For example, CUDA and cuDNN are libraries which are available on Anaconda Cloud but which you should not install yourself on our clusters, because they are already installed.
- Remove from the list of dependencies everything which is not a Python module (e.g. cudatoolkit and cudnn).
- Use a virtual environment in which you will install your dependencies.
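The filtering and environment-creation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the set `NON_PYTHON_PACKAGES` contains just the two examples named above (`cudatoolkit` and `cudnn`) and would need to be extended for a real application; the helper names and the directory `~/my_env` are hypothetical; and the standard-library `venv` module is used here as a generic way to create a virtual environment, which may not be the command recommended for our clusters.

```python
import os
import re
import venv

# Assumption: only the two examples from the text; extend for your application.
NON_PYTHON_PACKAGES = {"cudatoolkit", "cudnn"}


def package_name(spec):
    """Extract the package name from a spec such as 'numpy=1.18.1' or 'scipy>=1.4'."""
    return re.split(r"[=<>!~\s]", spec.strip(), maxsplit=1)[0].lower()


def keep_python_dependencies(specs, non_python=NON_PYTHON_PACKAGES):
    """Drop blank lines, comments, and packages listed in `non_python`."""
    kept = []
    for spec in specs:
        spec = spec.strip()
        if not spec or spec.startswith("#"):
            continue
        if package_name(spec) in non_python:
            continue
        kept.append(spec)
    return kept


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in `env_dir` with the standard venv module."""
    venv.create(env_dir, with_pip=with_pip)


if __name__ == "__main__":
    dependencies = ["numpy=1.18.1", "cudatoolkit=10.1", "cudnn", "scipy"]
    print(keep_python_dependencies(dependencies))
    # Hypothetical target directory for the new environment.
    create_environment(os.path.expanduser("~/my_env"))
```

Note that the remaining entries are returned unchanged, so a conda-style spec such as `numpy=1.18.1` would still have to be rewritten in pip's format (`numpy==1.18.1`) before being installed in the virtual environment.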
Your software should run - if it doesn't, don't hesitate to contact us.