Python: Difference between revisions

Uniformise “Python packages” vs “Lmod modules” to avoid confusion
(Marked this version for translation)
(Uniformise “Python packages” vs “Lmod modules” to avoid confusion)
Line 4: Line 4:


== Description == <!--T:1-->
== Description == <!--T:1-->
[http://www.python.org/ Python] is an interpreted programming language with a design philosophy stressing the readability of code. Its syntax is simple and expressive. Python has an extensive, easy-to-use library of standard modules.
[http://www.python.org/ Python] is an interpreted programming language with a design philosophy stressing the readability of code. Its syntax is simple and expressive. Python has an extensive, easy-to-use standard library.


<!--T:2-->
<!--T:2-->
The capabilities of Python can be extended with modules developed by third parties. In general, to simplify operations, it is left up to individual users and groups to install these third-party modules in their own directories. However, most systems offer several versions of Python as well as tools to help you install the third-party modules that you need.
The capabilities of Python can be extended with packages developed by third parties. In general, to simplify operations, it is left up to individual users and groups to install these third-party packages in their own directories. However, most systems offer several versions of Python as well as tools to help you install the third-party packages that you need.


<!--T:3-->
<!--T:3-->
The following sections discuss the Python interpreter, and how to install and use modules.
The following sections discuss the Python interpreter, and how to install and use packages.


== Loading an interpreter == <!--T:4-->
== Loading an interpreter == <!--T:4-->
Line 39: Line 39:


<!--T:32-->
<!--T:32-->
If you want to use any of these Python modules, load a Python version of your choice and then <code>module load scipy-stack</code>.  
If you want to use any of these Python packages, load a Python version of your choice and then <code>module load scipy-stack</code>.  


<!--T:33-->
<!--T:33-->
Line 47: Line 47:


<!--T:8-->
<!--T:8-->
With each version of Python, we provide the tool [http://pypi.python.org/pypi/virtualenv virtualenv]. This tool lets you create virtual environments within which you can easily install Python modules. These environments let you install several versions of the same module, for example, or compartmentalize a Python installation according to the needs of a specific project. We recommend that you create your Python virtual environment(s) in your home directory.
With each version of Python, we provide the tool [http://pypi.python.org/pypi/virtualenv virtualenv]. This tool lets you create virtual environments within which you can easily install Python packages. These environments let you install several versions of the same package, for example, or compartmentalize a Python installation according to the needs of a specific project. We recommend that you create your Python virtual environment(s) in your home directory.
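
As an illustration of what a virtual environment changes, the following minimal sketch checks from within Python whether the running interpreter belongs to a virtual environment. The helper name <code>in_virtualenv</code> is hypothetical and not part of virtualenv; the check relies on <code>sys.prefix</code> and <code>sys.base_prefix</code>, which differ inside an environment for Python 3 with recent versions of virtualenv or <code>venv</code>.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment.

    Hypothetical helper for illustration only.
    """
    # Inside an environment, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtualenv())
```

Running this before and after activating an environment should show the prefix change to the environment's directory.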


<!--T:9-->
<!--T:9-->
Line 65: Line 65:
{{Command|prompt=(ENV) [name@server ~]|deactivate}}
{{Command|prompt=(ENV) [name@server ~]|deactivate}}


=== Installing modules === <!--T:13-->
=== Installing packages === <!--T:13-->


<!--T:14-->
<!--T:14-->
Once you have a virtual environment loaded, you will be able to run the [http://www.pip-installer.org/ pip] command. This command takes care of compiling and installing most Python modules and their dependencies. A comprehensive index of Python packages can be found at [https://pypi.python.org/pypi PyPI].
Once you have a virtual environment loaded, you will be able to run the [http://www.pip-installer.org/ pip] command. This command takes care of compiling and installing most Python packages and their dependencies. A comprehensive index of Python packages can be found at [https://pypi.python.org/pypi PyPI].


<!--T:15-->
<!--T:15-->
All of <tt>pip</tt>'s commands are explained in detail in the [https://pip.pypa.io/en/stable/user_guide/ user guide]. We will cover only the most important commands and use the [http://numpy.scipy.org/ Numpy] module as an example.
All of <tt>pip</tt>'s commands are explained in detail in the [https://pip.pypa.io/en/stable/user_guide/ user guide]. We will cover only the most important commands and use the [http://numpy.scipy.org/ Numpy] package as an example.


<!--T:16-->
<!--T:16-->
Line 88: Line 88:


<!--T:29-->
<!--T:29-->
In the first invocation of the <tt>pip</tt> command above, it is not obvious where the <tt>numpy</tt> module is being installed from. One might assume it comes from [https://pypi.org/ PyPI], but in the particular case of numpy it is actually installed from a distribution package offered by Compute Canada. This kind of distribution package is called a Python [https://pythonwheels.com/ wheel]. <tt>pip</tt> installs Compute Canada's local wheel provided the version Compute Canada provides is current. If PyPI has a newer version, that version is installed instead of the one Compute Canada provides. To disable this default behaviour and use the older Compute Canada wheel, use the [https://pip.pypa.io/en/stable/reference/pip_wheel/#cmdoption-no-index <tt>--no-index</tt>] option.
In the first invocation of the <tt>pip</tt> command above, it is not obvious where the <tt>numpy</tt> package is being installed from. One might assume it comes from [https://pypi.org/ PyPI], but in the particular case of numpy it is actually installed from a distribution package offered by Compute Canada. This kind of distribution package is called a Python [https://pythonwheels.com/ wheel]. <tt>pip</tt> installs Compute Canada's local wheel provided the version Compute Canada provides is current. If PyPI has a newer version, that version is installed instead of the one Compute Canada provides. To disable this default behaviour and use the older Compute Canada wheel, use the [https://pip.pypa.io/en/stable/reference/pip_wheel/#cmdoption-no-index <tt>--no-index</tt>] option.


<!--T:30-->
<!--T:30-->
To see where the <tt>pip</tt> command is installing a Python module from, you can make it more verbose with the <tt>-vvv</tt> option. Compute Canada provides Python wheels for many common Python modules, configured to make the best use of the hardware and installed libraries on our clusters.
To see where the <tt>pip</tt> command is installing a Python package from, you can make it more verbose with the <tt>-vvv</tt> option. Compute Canada provides Python wheels for many common Python packages, configured to make the best use of the hardware and installed libraries on our clusters.
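
After an installation, it can also help to confirm from Python which version ended up in the environment. The sketch below is only an illustration: the helper name <code>installed_version</code> is hypothetical, and it assumes Python 3.8 or later, where the standard library provides <code>importlib.metadata</code>. It reports the installed version but does not reveal whether the package came from a local wheel or from PyPI; the <tt>-vvv</tt> output remains the way to see that.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a distribution, or None if absent.

    Hypothetical helper for illustration only.
    """
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # numpy is the example package used in this section.
    print("numpy:", installed_version("numpy"))
```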


==== Installing dependent packages ==== <!--T:35-->
==== Installing dependent packages ==== <!--T:35-->
In some cases, such as TensorFlow or PyTorch, Compute Canada provides wheels for a specific host (CPU or GPU), suffixed with <tt>_cpu</tt> or <tt>_gpu</tt>. Packages dependent on <tt>torch</tt> will then fail to install.
In some cases, such as TensorFlow or PyTorch, Compute Canada provides wheels for a specific host (CPU or GPU), suffixed with <tt>_cpu</tt> or <tt>_gpu</tt>. Packages dependent on <tt>torch</tt> will then fail to install.
If <tt>my_package</tt> depends on <tt>numpy</tt> and <tt>torch</tt>, then the following will allow us to install it:
If <tt>my_package</tt> depends on <tt>numpy</tt> and <tt>torch</tt>, then the following will allow us to install it:
Line 103: Line 104:


=== Creating virtual environments inside of your jobs === <!--T:36-->
=== Creating virtual environments inside of your jobs === <!--T:36-->
Parallel filesystems such as the ones used on our clusters are very good at reading or writing large chunks of data, but can perform poorly under intensive use of small files. Launching software and loading libraries, such as starting Python and loading a virtual environment, can be slow for this reason.
Parallel filesystems such as the ones used on our clusters are very good at reading or writing large chunks of data, but can perform poorly under intensive use of small files. Launching software and loading libraries, such as starting Python and loading a virtual environment, can be slow for this reason.


Line 236: Line 238:
# Then, when installing, use the path for file <tt>pip install tensorboardX-1.9-py2.py3-none-any.whl</tt>.
# Then, when installing, use the path for file <tt>pip install tensorboardX-1.9-py2.py3-none-any.whl</tt>.


== Parallel programming with Python <tt>multiprocessing</tt> module == <!--T:45-->
== Parallel programming with the Python <tt>multiprocessing</tt> module == <!--T:45-->
 
Parallel programming with Python can be an easy way to get results faster. A common way of doing so is to use the [https://sebastianraschka.com/Articles/2014_multiprocessing.html <tt>multiprocessing</tt>] module. Of particular interest is the <tt>Pool</tt> class of this module, since it allows one to control the number of processes started in parallel and to apply the same calculation to multiple data. As an example, suppose we want to calculate the <tt>cube</tt> of a list of numbers. The serial code would look like this:
Parallel programming with Python can be an easy way to get results faster. A common way of doing so is to use the [https://sebastianraschka.com/Articles/2014_multiprocessing.html <tt>multiprocessing</tt>] module. Of particular interest is the <tt>Pool</tt> class of this module, since it allows one to control the number of processes started in parallel and to apply the same calculation to multiple data. As an example, suppose we want to calculate the <tt>cube</tt> of a list of numbers. The serial code would look like this:
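
The cube example can be sketched as follows, with a serial version next to one based on <tt>Pool</tt>. This is an illustrative sketch rather than the page's own listing: the function names <code>cubes_serial</code> and <code>cubes_parallel</code>, the sample data and the default of two processes are all assumptions chosen for the example.

```python
from multiprocessing import Pool


def cube(x):
    """Return the cube of x."""
    return x ** 3


def cubes_serial(data):
    """Apply cube to each element, one after the other."""
    return [cube(x) for x in data]


def cubes_parallel(data, processes=2):
    """Apply cube to each element using a pool of worker processes.

    The process count of 2 is an arbitrary value for illustration.
    """
    with Pool(processes=processes) as pool:
        # map splits the data among the workers and preserves input order.
        return pool.map(cube, data)


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(cubes_serial(data))
    print(cubes_parallel(data))
```

The <code>if __name__ == "__main__":</code> guard matters here: on platforms where worker processes are started by re-importing the script, it keeps the workers from creating pools of their own.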
<tabs>
<tabs>