Anaconda/en: Difference between revisions
Revision as of 16:38, 5 March 2020
Anaconda is a Python distribution. We ask users not to install Anaconda on our clusters.
Do not install Anaconda on our clusters
We are aware that Anaconda is widely used in several domains, such as data science, AI and bioinformatics. Anaconda is a useful solution for simplifying the management of Python and scientific libraries on a personal computer. However, on a cluster like those supported by Compute Canada, these libraries and dependencies should be managed by our staff in order to ensure compatibility and optimal performance. Here is a list of reasons:
- Anaconda very often installs software (compilers, scientific libraries, etc.) that already exists on Compute Canada clusters as modules, and it installs that software with a configuration that is not optimal.
- It installs binaries which are not optimized for the processor architecture on our clusters.
- It makes incorrect assumptions about the location of various system libraries.
- Anaconda uses the $HOME directory for its installation, where it writes an enormous number of files. A single Anaconda installation can easily absorb almost half of your quota for the number of files in your home directory.
- Installing packages with Anaconda is slower than installing them from Python wheels.
- Anaconda modifies the $HOME/.bashrc file, which can easily cause conflicts.
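To put a number on the file-count point in the list above, the sketch below counts the files under a directory tree. It is only an illustration: the function name count_files is hypothetical, and the count covers non-directory entries only, whereas a cluster quota may also count directories, so treat the result as a lower bound.

```python
import os


def count_files(root):
    """Count non-directory entries below `root`.

    Symbolic links to directories are not followed, so files reached only
    through such links are not counted.
    """
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        total += len(filenames)
    return total
```

For example, on a personal computer you could call count_files(os.path.expanduser("~/anaconda3")), assuming that is where Anaconda was installed, and compare the result with the file quota of your home directory on the cluster.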
How to transition from Conda to Virtualenv
A virtual environment offers you all the functionality which you need to use Python on our clusters. Here is how to convert to the use of virtual environments if you use Anaconda on your personal computer:
- List the dependencies (requirements) of the application you want to use. To do so, you can:
  - Run pip show <package_name> from your virtual environment (if the package exists on PyPI).
  - Or, check if there is a requirements.txt file in the Git repository.
  - Or, check the variable install_requires of the file setup.py, which lists the requirements.
- Find which dependencies are Python modules and which are libraries provided by Anaconda. For example, CUDA and cuDNN are libraries which are available on Anaconda Cloud, but you should not install them yourself on our clusters because they are already installed.
- Remove from the list of dependencies everything which is not a Python module (e.g. cudatoolkit and cudnn).
- Use a virtual environment in which you will install your dependencies.
Your software should run - if it doesn't, don't hesitate to contact us.
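The step of removing non-Python libraries from the dependency list can be sketched in a few lines of Python. This is a minimal illustration and not an official tool: the function names and the NON_PYTHON_LIBRARIES set are hypothetical, and the set contains only the two names mentioned above, so you would need to extend it for your own application.

```python
import re

# Assumption: only the two library names mentioned in the text; extend as needed.
NON_PYTHON_LIBRARIES = {"cudatoolkit", "cudnn"}


def requirement_name(line):
    """Return the lowercase package name at the start of a requirement line."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", line)
    return match.group(1).lower() if match else None


def keep_python_modules(lines, excluded=NON_PYTHON_LIBRARIES):
    """Drop blank lines, comments and every requirement named in `excluded`."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        name = requirement_name(stripped)
        if name is None or name in excluded:
            continue
        kept.append(stripped)
    return kept


# Example input mixing Python modules and libraries provided by Anaconda.
example = ["numpy>=1.17", "cudatoolkit=10.1", "# GPU support", "cudnn", "scipy"]
print("\n".join(keep_python_modules(example)))
```

The lines that remain can be written to a requirements.txt file and installed inside the virtual environment with pip install -r requirements.txt.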
Examples where Anaconda does not work
- R
- A conda recipe forces the installation of R. This installation does not perform as well as the R available through modules (which uses Intel MKL). This conda-installed R also works poorly and jobs die, wasting resources and your time.
- File quota
- Conda installs a large number of files in your $HOME, to the point of reaching the quotas in effect.