Python: Difference between revisions

no edit summary
Line 133: Line 133:


<!--T:30-->
<!--T:30-->
To see where the <code>pip</code> command is installing a python package from, diagnosing installation issues, you can tell it to be more verbose with the <code>-vvv</code> option. It also worth mentioning that when installing multiple packages it is advisable to install them with one command as it helps pip resolve dependencies.
To see where the <code>pip</code> command is installing a Python package from, or to diagnose installation issues, you can make it more verbose with the <code>-vvv</code> option. It is also worth mentioning that when installing multiple packages, it is advisable to install them with one command, as this helps <code>pip</code> resolve dependencies.


==== Installing dependent packages ==== <!--T:35-->
==== Installing dependent packages ==== <!--T:35-->
Line 401: Line 401:


<!--T:48-->
<!--T:48-->
Using the <tt>Pool</tt> class, running in parallel, the above codes become :  
Using the <code>Pool</code> class to run in parallel, the above code becomes:
<tabs>
<tabs>
<tab name="Using a loop">
<tab name="Using a loop">
Line 443: Line 443:


<!--T:53-->
<!--T:53-->
The above examples will however be limited to using <tt>4</tt> processes. On a cluster, it is very important to use the cores that are allocated to your job. Launching more processes than you have cores requested will slow down your calculation and possibly overload the compute node. Launching fewer processes than you have cores will result in wasted resources and cores remaining idle. The correct number of cores to use in your code is determined by the amount of resources you requested to the scheduler. For example, if you have the same computation to perform on many tens of data or more, it would make sense to use all of the cores of a node. In this case, you can write your job submission script with the following header :  
The above examples will, however, be limited to using <code>4</code> processes. On a cluster, it is very important to use the cores that are allocated to your job. Launching more processes than the number of cores you requested will slow down your calculation and possibly overload the compute node. Launching fewer processes than you have cores will waste resources and leave cores idle. The correct number of cores to use in your code is determined by the amount of resources you requested from the scheduler. For example, if you have the same computation to perform on several dozen data items or more, it would make sense to use all of the cores of a node. In this case, you can write your job submission script with the following header:
{{File|language=bash|name=submit.sh|contents=
{{File|language=bash|name=submit.sh|contents=
#SBATCH --ntasks-per-node=1
#SBATCH --ntasks-per-node=1
Line 497: Line 497:


<!--T:59-->
<!--T:59-->
Note that in the above example, the function <tt>cube</tt> itself is sequential. If you are calling some external library, such as <tt>numpy</tt>, it is possible that the functions called by your code are themselves parallel. If you want to distribute processes with the technique above, you should verify whether the functions you call are themselves parallel, and if they are, you need to control how many threads they will take themselves. If, for example, they take all the cores available (32 in the above example), and you are yourself starting 32 processes, this will slow down your code and possibly overload the node as well.
Note that in the above example, the function <code>cube</code> itself is sequential. If you are calling an external library, such as <code>numpy</code>, the functions called by your code may themselves be parallel. If you want to distribute processes with the technique above, you should verify whether the functions you call are themselves parallel and, if they are, control how many threads they use. If, for example, they take all the cores available (32 in the above example) and you are also starting 32 processes, this will slow down your code and possibly overload the node as well.
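
As a minimal sketch of the two points above, the following script sizes the pool from the scheduler's allocation and asks threaded numerical libraries to use a single thread per process. It rests on several assumptions: that the job was submitted with <code>--cpus-per-task</code>, so that Slurm sets the <code>SLURM_CPUS_PER_TASK</code> environment variable; that the numerical library in use honours <code>OMP_NUM_THREADS</code>, <code>OPENBLAS_NUM_THREADS</code> or <code>MKL_NUM_THREADS</code>, which depends on how it was built; and that the helper name <code>allocated_cores</code> is purely illustrative.

```python
import os

# Assumption: the threaded backend (OpenMP, OpenBLAS or MKL) reads these
# variables when it is first loaded, so they are set before any numerical
# library is imported.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

from multiprocessing import Pool

# If your function uses numpy, import it here, after the variables are set.


def cube(x):
    """Sequential function applied to each input value."""
    return x ** 3


def allocated_cores(default=1):
    """Return the number of cores granted to the job (hypothetical helper).

    Falls back to `default` when SLURM_CPUS_PER_TASK is not set,
    for example when the script runs outside of a job.
    """
    return int(os.environ.get("SLURM_CPUS_PER_TASK", default))


if __name__ == "__main__":
    ncpus = allocated_cores()
    # One worker process per allocated core, each limited to one thread.
    with Pool(processes=ncpus) as pool:
        print(pool.map(cube, range(1, 7)))
```

With this arrangement, the total number of busy threads should stay close to the number of cores requested, rather than being multiplied by the library's own thread count.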


<!--T:60-->
<!--T:60-->
Line 553: Line 553:
ERROR: No matching distribution found for X
ERROR: No matching distribution found for X
}}
}}
<tt>pip</tt> did not find a package to install that satisfies the requirements (name, version or tags).
<code>pip</code> did not find a package to install that satisfies the requirements (name, version or tags).
Verify that the name and version are correct.  
Verify that the name and version are correct.  
Note also that <tt>manylinux_x_y</tt> wheels are discarded.
Note also that <code>manylinux_x_y</code> wheels are discarded.


<!--T:87-->
<!--T:87-->
Line 567: Line 567:
|pip install package1 package2 package3 package4
|pip install package1 package2 package3 package4
}}
}}
as this helps <tt>pip</tt> resolve dependencies issues.
as this helps <code>pip</code> resolve dependency issues.


=== My virtual environment was working yesterday but not anymore === <!--T:92-->
=== My virtual environment was working yesterday but not anymore === <!--T:92-->
rsnt_translations (56,420 edits)