No edit summary |
(Adding a Python example where we iterate a parameter through a list or NumPy array.) |
||
Line 84: | Line 84: | ||
* Take care that the number of tasks you request matches the number of entries in the file. | * Take care that the number of tasks you request matches the number of entries in the file. | ||
* The file <code>case_list</code> should not be changed until all the tasks in the array have run, since it will be read each time a new task starts. | * The file <code>case_list</code> should not be changed until all the tasks in the array have run, since it will be read each time a new task starts. | ||
== Example: Data-parallel Python script == | |||
Suppose you have a Python script that performs a calculation for each of several parameter values defined in a Python list or a NumPy array, such as |||
{{File | |||
|name=my_script.py | |||
|language=python | |||
|contents= | |||
import time |||
import numpy as np |||
def calculation(x, beta): |||
    time.sleep(2)  # simulate a long run |||
    return beta * np.linalg.norm(x**2) |||
if __name__ == "__main__": |||
    x = np.random.rand(100) |||
    betas = np.linspace(10, 36.5, 100)  # 100 evenly spaced values over the interval [10, 36.5] |||
    for i in range(len(betas)):  # iterate through the beta parameter |||
        res = calculation(x, betas[i]) |||
        print(res)  # show the results on screen |||
# Run with: python my_script.py |||
}} | |||
The above task can be processed in a job array so that each value of the beta parameter is handled in parallel by its own array task. |||
The idea is to pass the value of <code>$SLURM_ARRAY_TASK_ID</code> to the Python script as a command-line argument and use it as an index to select the beta parameter. |||
The Python script becomes |||
{{File | |||
|name=my_script_parallel.py | |||
|language=python | |||
|contents= | |||
import time |||
import numpy as np |||
import sys |||
def calculation(x, beta): |||
    time.sleep(2)  # simulate a long run |||
    return beta * np.linalg.norm(x**2) |||
if __name__ == "__main__": |||
    x = np.random.rand(100)  # note: drawn independently in each array task |||
    betas = np.linspace(10, 36.5, 100)  # 100 evenly spaced values over the interval [10, 36.5] |||
    i = int(sys.argv[1])  # get the value of $SLURM_ARRAY_TASK_ID |||
    res = calculation(x, betas[i]) |||
    print(res)  # show the result on screen |||
# Run with: python my_script_parallel.py $SLURM_ARRAY_TASK_ID |||
}} | |||
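The index handling in the script above can be made more defensive. The following is a minimal sketch, not part of the original example: the helper name <code>get_task_index</code> is hypothetical, and it assumes the index comes either from the first command-line argument or, as a fallback, from the <code>SLURM_ARRAY_TASK_ID</code> environment variable that Slurm sets for each array task. It rejects indices that fall outside the parameter array instead of letting NumPy raise an error (or silently accept a negative index).

```python
import os
import sys


def get_task_index(argv, environ, n_values):
    """Return a validated array task index (hypothetical helper).

    The index is taken from argv[1] if present, otherwise from the
    SLURM_ARRAY_TASK_ID environment variable.
    """
    if len(argv) > 1:
        raw = argv[1]
    elif "SLURM_ARRAY_TASK_ID" in environ:
        raw = environ["SLURM_ARRAY_TASK_ID"]
    else:
        raise ValueError("no task index given and SLURM_ARRAY_TASK_ID is not set")

    index = int(raw)  # raises ValueError if raw is not an integer
    if not 0 <= index < n_values:
        raise ValueError(f"task index {index} is outside 0..{n_values - 1}")
    return index


# Possible use inside the script, with betas defined as above:
#     i = get_task_index(sys.argv, os.environ, len(betas))
```

With this check, a mismatch between the <code>--array</code> range and the number of parameter values makes the affected task fail with a clear message.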
The job submission script is as follows (note that the array indices go from 0 to 99, like the indices of the NumPy array) |||
{{File | |||
|name=data_parallel_python.sh | |||
|language=bash | |||
|contents= | |||
#!/bin/bash | |||
#SBATCH --array=0-99 | |||
#SBATCH --time=1:00:00 | |||
module load scipy-stack | |||
python my_script_parallel.py $SLURM_ARRAY_TASK_ID | |||
}} | |||
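The one-to-one mapping between array indices and parameter values used above is not the only option. If you prefer to request fewer array tasks than there are parameter values, each task can process a contiguous block of indices. The following is a minimal sketch under that assumption; the helper name <code>chunk_for_task</code> and the choice of 10 tasks are hypothetical and not part of the original example.

```python
def chunk_for_task(task_id, n_tasks, n_values):
    """Return the range of parameter indices handled by one array task.

    The n_values indices are split into n_tasks contiguous blocks whose
    sizes differ by at most one (hypothetical helper).
    """
    if not 0 <= task_id < n_tasks:
        raise ValueError(f"task id {task_id} is outside 0..{n_tasks - 1}")
    base, extra = divmod(n_values, n_tasks)
    # The first `extra` tasks each take one additional index.
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return range(start, stop)


# Possible use with "#SBATCH --array=0-9" and 100 beta values:
#     for i in chunk_for_task(task_id, 10, len(betas)):
#         print(calculation(x, betas[i]))
```

In this variant the time limit requested for each task must cover all the values in its block, not just one.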
</translate> | </translate> |