Python: Difference between revisions

no edit summary
(Marked this version for translation)
No edit summary
Line 124: Line 124:


<!--T:68-->
<!--T:68-->
The <code>pip</code> command can install packages from a variety of sources, including PyPI and pre-built distribution packages called Python [https://pythonwheels.com/ wheels]. Compute Canada provides Python wheels for a number of packages. In the above example, the [https://pip.pypa.io/en/stable/reference/pip_wheel/#cmdoption-no-index <code>--no-index</code>] option tells <code>pip</code> to ''not'' install from PyPI, but instead to install only from locally-available packages, i.e. the Compute Canada wheels.
The <code>pip</code> command can install packages from a variety of sources, including PyPI and pre-built distribution packages called Python [https://pythonwheels.com/ wheels]. We provide Python wheels for a number of packages. In the above example, the [https://pip.pypa.io/en/stable/reference/pip_wheel/#cmdoption-no-index <code>--no-index</code>] option tells <code>pip</code> to ''not'' install from PyPI, but instead to install only from locally-available packages, i.e. our wheels.


<!--T:69-->
<!--T:69-->
Whenever a Compute Canada wheel is available for a given package, we strongly recommend to use it by way of the <code>--no-index</code> option. Compared to using packages from PyPI, wheels that have been compiled by Compute Canada staff can prevent issues with missing or conflicting dependencies, and were optimised for our clusters hardware and libraries. See [[#Available_wheels|Available wheels]].
Whenever we provide a wheel for a given package, we strongly recommend using it by way of the <code>--no-index</code> option. Compared to packages from PyPI, wheels compiled by our staff can prevent issues with missing or conflicting dependencies, and are optimised for our clusters' hardware and libraries. See [[#Available_wheels|Available wheels]].


<!--T:34-->
<!--T:34-->
If you omit the <code>--no-index</code> option, <code>pip</code> will search both PyPI and local packages, and use the latest version available. If PyPI has a newer version, it will be installed instead of the Compute Canada wheel, possibly causing issues. If you are certain that you prefer to download a package from PyPI rather than use a wheel, you can use the <code>--no-binary</code> option, which tells <code>pip</code> to ignore pre-built packages entirely. Note that this will also ignore wheels that are distributed through PyPI, and will always compile the package from source.
If you omit the <code>--no-index</code> option, <code>pip</code> will search both PyPI and local packages, and use the latest version available. If PyPI has a newer version, it will be installed instead of our wheel, possibly causing issues. If you are certain that you prefer to download a package from PyPI rather than use a wheel, you can use the <code>--no-binary</code> option, which takes either a comma-separated list of package names or <code>:all:</code> and tells <code>pip</code> to ignore pre-built packages for them. Note that this also ignores wheels distributed through PyPI, so the affected packages are always compiled from source.
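The three behaviours described above can be summarised with a small Python sketch that only builds the corresponding <code>pip</code> command lines without running them. The function name <code>pip_install_command</code> and the mode labels are hypothetical, chosen for illustration; the <code>--no-index</code> and <code>--no-binary</code> flags are the ones discussed above, with <code>:all:</code> assumed as the argument to <code>--no-binary</code>.

```python
def pip_install_command(package, mode="local-only"):
    """Build (but do not run) a pip command line for one install mode.

    The mode labels are illustrative names, not pip terminology:
      "local-only"  -> only locally available wheels (--no-index)
      "default"     -> PyPI and local packages, latest version wins
      "source-only" -> ignore every pre-built wheel (--no-binary :all:)
    """
    command = ["pip", "install"]
    if mode == "local-only":
        command.append("--no-index")
    elif mode == "source-only":
        command += ["--no-binary", ":all:"]
    elif mode != "default":
        raise ValueError(f"unknown mode: {mode}")
    command.append(package)
    return command


if __name__ == "__main__":
    for mode in ("local-only", "default", "source-only"):
        print(" ".join(pip_install_command("numpy", mode)))
```

In most cases the first form is the one to use; the other two are shown only to make the contrast explicit.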


<!--T:30-->
<!--T:30-->
Line 138: Line 138:


<!--T:65-->
<!--T:65-->
In some cases, such as TensorFlow, Compute Canada provides wheels for a specific host (cpu or gpu), suffixed with <tt>_cpu</tt> or <tt>_gpu</tt>. Packages dependent on <tt>tensorflow</tt> will then fail to install.  
In some cases, such as TensorFlow, we provide wheels for a specific host type (CPU or GPU), suffixed with <tt>_cpu</tt> or <tt>_gpu</tt>. Packages that depend on <tt>tensorflow</tt> will then fail to install.
If <tt>my_package</tt> depends on <tt>numpy</tt> and <tt>tensorflow</tt>, then the following will allow us to install it:
If <tt>my_package</tt> depends on <tt>numpy</tt> and <tt>tensorflow</tt>, then the following will allow us to install it:
{{Commands|prompt=(ENV) [name@server ~]
{{Commands|prompt=(ENV) [name@server ~]
Line 241: Line 241:


<!--T:43-->
<!--T:43-->
Note that the above instructions require all of the packages you need to be available in the python wheels that we provide (see "Available wheels" below). If the wheel is not available in our wheelhouse, you can pre-download it (see "Pre-downloading packages" section below). If you think that the missing wheel should be included in the Compute Canada wheelhouse, please contact [[Technical support]] to make a request.
Note that the above instructions require all of the packages you need to be available in the Python wheels that we provide (see "Available wheels" below). If a wheel is not available in our wheelhouse, you can pre-download it (see the "Pre-downloading packages" section below). If you think that the missing wheel should be included in our wheelhouse, please contact [[Technical support]] to make a request.


=== Available wheels === <!--T:23-->
=== Available wheels === <!--T:23-->
Line 500: Line 500:


<!--T:60-->
<!--T:60-->
Note that the multiprocessing module is restricted to using a single compute node, so the speedup achievable by your program is usually limited to the total number of CPU cores in that node.  If you want to go beyond this limit and use multiple nodes, consider using mpi4py or [[Apache Spark/en#PySpark|PySpark]].  Other methods of parallelizing Python (not all of them necessarily supported on Compute Canada clusters) are listed [https://wiki.python.org/moin/ParallelProcessing here]. Also note that you can greatly improve  the performance of your Python program by ensuring it is written efficiently, so that should be done first before parallelizing.  If you are not sure if your Python code is efficient, please contact [[technical support]] and have them look at your code.
Note that the multiprocessing module is restricted to using a single compute node, so the speedup achievable by your program is usually limited to the total number of CPU cores in that node. If you want to go beyond this limit and use multiple nodes, consider using mpi4py or [[Apache Spark/en#PySpark|PySpark]]. Other methods of parallelizing Python (not all of them necessarily supported on our clusters) are listed [https://wiki.python.org/moin/ParallelProcessing here]. Also note that you can greatly improve the performance of your Python program by ensuring it is written efficiently, so do that before parallelizing. If you are not sure whether your Python code is efficient, please contact [[technical support]] and have them look at your code.
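As a minimal sketch of the single-node limit, the following sizes a <code>multiprocessing.Pool</code> from the number of cores granted to the job. It assumes the scheduler exports that number in the <code>SLURM_CPUS_PER_TASK</code> environment variable and falls back to one worker otherwise; the helper names <code>available_cores</code> and <code>square</code> are hypothetical.

```python
import multiprocessing
import os


def square(x):
    """Toy workload; replace with the real per-item computation."""
    return x * x


def available_cores():
    """Number of worker processes to start on this node.

    Assumption: the job scheduler sets SLURM_CPUS_PER_TASK. When the
    variable is absent (e.g. on a laptop), fall back to a single worker.
    """
    return int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))


if __name__ == "__main__":
    # All workers run on the current node; the pool cannot span nodes.
    with multiprocessing.Pool(processes=available_cores()) as pool:
        results = pool.map(square, range(10))
    print(results)
```

Starting more workers than the cores allocated to the job would not make the program faster, which is why the pool size is taken from the allocation rather than hard-coded.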


== Anaconda == <!--T:21-->
== Anaconda == <!--T:21-->