MPI4py
MPI for Python provides Python bindings for the Message Passing Interface (MPI) standard, allowing Python applications to exploit multiple processors on workstations, clusters and supercomputers.
Available versions
mpi4py is available as a module, and not from the wheelhouse as typical Python packages are.
You can find the available versions with
[name@server ~]$ module spider mpi4py
and look for more information on a specific version with
[name@server ~]$ module spider mpi4py/X.Y.Z
where X.Y.Z is the exact desired version, for instance 4.0.0.
Famous first words: Hello World
1. Run a short interactive job.
[name@server ~]$ salloc --account=<your account> --ntasks=5
2. Load the module.
[name@server ~]$ module load mpi4py/4.0.0 python/3.12
3. Run a Hello World test.
[name@server ~]$ srun python -m mpi4py.bench helloworld
Hello, World! I am process 0 of 5 on node1.
Hello, World! I am process 1 of 5 on node1.
Hello, World! I am process 2 of 5 on node3.
Hello, World! I am process 3 of 5 on node3.
Hello, World! I am process 4 of 5 on node3.
In the case above, two nodes (node1 and node3) were allocated, and the processes were distributed across the available resources.
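If you prefer to see what this benchmark does in code, here is a minimal sketch using only standard mpi4py calls; the file name hello-mpi4py.py is just an illustrative choice, not part of the module.
# hello-mpi4py.py -- prints one line per MPI process, similar to mpi4py.bench helloworld
from mpi4py import MPI

comm = MPI.COMM_WORLD             # communicator containing every process started by srun
rank = comm.Get_rank()            # rank of this process, from 0 to size-1
size = comm.Get_size()            # total number of processes
node = MPI.Get_processor_name()   # name of the node this process runs on

print(f"Hello, World! I am process {rank} of {size} on {node}.")
Running it with srun python hello-mpi4py.py inside the same allocation should produce output similar to the benchmark above.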
mpi4py as a package dependency
Often mpi4py is a dependency of another package. To fulfill this dependency:
1. Deactivate any Python virtual environment.
[name@server ~]$ test $VIRTUAL_ENV && deactivate
Note: If you had a virtual environment activated, it is important to deactivate it first, load the module, and then reactivate your virtual environment.
2. Load the module.
[name@server ~]$ module load mpi4py/4.0.0 python/3.12
3. Check that it is visible to pip
[name@server ~]$ pip list | grep mpi4py
mpi4py 4.0.0
and is accessible from your currently loaded Python module.
[name@server ~]$ python -c 'import mpi4py'
If no errors are raised, then everything is OK!
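For a slightly more detailed check than the bare import, you can also print where mpi4py comes from and which MPI library it was built against; this short sketch uses only documented mpi4py attributes, and the file name check-mpi4py.py is hypothetical.
# check-mpi4py.py -- report which mpi4py is visible and how it was built
import mpi4py

print("version :", mpi4py.__version__)    # should match the loaded module, e.g. 4.0.0
print("location:", mpi4py.__file__)       # should point at the module installation, not a local venv
print("config  :", mpi4py.get_config())   # MPI compilers/libraries the bindings were built with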
4. Create a virtual environment and install your packages.
Running jobs
You can run MPI jobs distributed across multiple nodes or cores. For efficient MPI scheduling, please see:
- Advanced MPI scheduling
CPU
1. Write your Python code, for instance, broadcasting a NumPy array.
File: mpi4py-np-bc.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = np.arange(100, dtype='i')
else:
    data = np.empty(100, dtype='i')

comm.Bcast(data, root=0)

for i in range(100):
    assert data[i] == i
The example above is based on the mpi4py tutorial.
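A note on the API: uppercase methods such as Bcast work on buffer-like objects (for example NumPy arrays) and are the fast path, while lowercase counterparts such as bcast send arbitrary picklable Python objects. Here is a minimal sketch of the lowercase variant for comparison; it is not part of the tutorial file above.
# bcast of a generic Python object; pickle-based, convenient but slower for large arrays
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = {"step": 1, "note": "hello from rank 0"}   # any picklable object
else:
    data = None

data = comm.bcast(data, root=0)   # every rank gets its own copy of the object
assert data["note"] == "hello from rank 0"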
2. Write your submission script.
Distributed (submit-mpi4py-distributed.sh):
#!/bin/bash
#SBATCH --account=def-someprof # adjust this to match the accounting group you are using to submit jobs
#SBATCH --time=08:00:00 # adjust this to match the walltime of your job
#SBATCH --ntasks=4 # adjust this to match the number of tasks/processes to run
#SBATCH --mem-per-cpu=4G # adjust this according to the memory you need per process
# Run on cores across the system : https://docs.alliancecan.ca/wiki/Advanced_MPI_scheduling#Few_cores,_any_number_of_nodes
# Load modules dependencies.
module load StdEnv/2023 gcc mpi4py/4.0.0 python/3.12
# create the virtual environment on each allocated node:
srun --ntasks $SLURM_NNODES --tasks-per-node=1 bash << EOF
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index numpy==2.1.1
EOF
# activate only on main node
source $SLURM_TMPDIR/env/bin/activate;
# srun exports the current env, which contains $VIRTUAL_ENV and $PATH variables
srun python mpi4py-np-bc.py;
Whole nodes:
#!/bin/bash
#SBATCH --account=def-someprof # adjust this to match the accounting group you are using to submit jobs
#SBATCH --time=01:00:00 # adjust this to match the walltime of your job
#SBATCH --nodes=2 # adjust this to match the number of whole node
#SBATCH --ntasks-per-node=40 # adjust this to match the number of tasks/processes to run per node
#SBATCH --mem-per-cpu=1G # adjust this according to the memory you need per process
# Run on N whole nodes : https://docs.alliancecan.ca/wiki/Advanced_MPI_scheduling#Whole_nodes
# Load modules dependencies.
module load StdEnv/2023 gcc openmpi mpi4py/4.0.0 python/3.12
# create the virtual environment on each allocated node:
srun --ntasks $SLURM_NNODES --tasks-per-node=1 bash << EOF
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index numpy==2.1.1
EOF
# activate only on main node
source $SLURM_TMPDIR/env/bin/activate;
# srun exports the current env, which contains $VIRTUAL_ENV and $PATH variables
srun python mpi4py-np-bc.py;
3. Test your script.
Before submitting your job, it is important to test that your submission script will start without errors. You can do a quick test in an interactive job.
4. Submit your job to the scheduler.
[name@server ~]$ sbatch submit-mpi4py-distributed.sh
GPU
1. From a login node, download the demo example.
[name@server ~]$ wget https://raw.githubusercontent.com/mpi4py/mpi4py/refs/heads/master/demo/cuda-aware-mpi/use_cupy.py
The example above and others can be found in the demo folder.
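To give an idea of what such a CUDA-aware test looks like, here is a minimal sketch (not the demo file itself) that sums a CuPy array across all ranks; it assumes the loaded MPI library is CUDA-aware and that cupy is installed in your environment, and the file name gpu-allreduce.py is illustrative.
# gpu-allreduce.py -- element-wise sum of a GPU array across all ranks
from mpi4py import MPI
import cupy as cp

comm = MPI.COMM_WORLD
size = comm.Get_size()

sendbuf = cp.arange(10, dtype='f')           # buffer allocated on the GPU
recvbuf = cp.empty_like(sendbuf)
cp.cuda.get_current_stream().synchronize()   # ensure GPU work is done before MPI reads the buffer

comm.Allreduce(sendbuf, recvbuf, op=MPI.SUM) # CUDA-aware MPI accesses GPU memory directly

assert cp.allclose(recvbuf, sendbuf * size)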
2. Write your submission script.
#!/bin/bash
#SBATCH --account=def-someprof # adjust this to match the accounting group you are using to submit jobs
#SBATCH --time=08:00:00 # adjust this to match the walltime of your job
#SBATCH --ntasks=2 # adjust this to match the number of tasks/processes to run
#SBATCH --mem-per-cpu=2G # adjust this according to the memory you need per process
#SBATCH --gpus=1
# Load modules dependencies.
module load StdEnv/2023 gcc cuda/12 mpi4py/4.0.0 python/3.11
# create the virtual environment on the allocated node:
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index cupy numba
srun python use_cupy.py;
3. Test your script.
Before submitting your job, it is important to test that your submission script will start without errors. You can do a quick test in an interactive job.
4. Submit your job to the scheduler.
[name@server ~]$ sbatch submit-mpi4py-gpu.sh
Troubleshooting
ModuleNotFoundError: No module named 'mpi4py'
If mpi4py is not accessible, you may get the following error when importing it:
ModuleNotFoundError: No module named 'mpi4py'
Possible solutions:
- Check which Python versions are compatible with your loaded mpi4py module using module spider mpi4py/X.Y.Z. Once a compatible Python module is loaded, check that python -c 'import mpi4py' works; a small verification snippet is also sketched below.
- Load the module before activating your virtual environment; please see the mpi4py as a package dependency section above.
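As a quick way to see exactly which interpreter is running and whether it can find mpi4py, you can run a short sketch like the following with the same python that raises the error; the file name which-mpi4py.py is illustrative.
# which-mpi4py.py -- confirm which Python is in use and whether mpi4py is importable
import sys

print("interpreter:", sys.executable)   # the venv's python or the python module's binary
try:
    import mpi4py
    print("mpi4py", mpi4py.__version__, "found at", mpi4py.__file__)
except ModuleNotFoundError:
    print("mpi4py is not visible; load the mpi4py module before activating your virtual environment")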
See also ModuleNotFoundError: No module named 'X'.