Many additional details on MPS can be found in this document: [https://docs.nvidia.com/deploy/mps/index.html CUDA Multi Process Service (MPS) - NVIDIA Documentation].
==GPU farming==
One situation where the MPS feature can be very useful is when you need to run multiple instances of your CUDA code, but the code is too small to saturate a modern GPU. MPS lets you run multiple instances of your code sharing a single GPU, as long as there is enough GPU memory for all of the instances. In many cases this should result in a significantly increased collective throughput from all of your GPU processes.
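Before scaling up, it is worth checking how much GPU memory a single instance of your code actually uses; dividing the GPU's total memory by that footprint gives an estimate of how many instances will fit. A minimal sketch of such a check (my_code is the hypothetical executable used in the job script below; the per-process memory query is a standard nvidia-smi feature, run here without MPS):

 # Run a single instance and, while it is still running, report its GPU memory use.
 ./my_code 0 &
 sleep 30   # give the code time to initialize and allocate its GPU memory
 nvidia-smi --query-compute-apps=pid,process_name,used_gpu_memory --format=csv
 wait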
Here is an example of a job script to set up GPU farming:
 #!/bin/bash
 #SBATCH --gpus-per-node=v100:1
 #SBATCH -t 0-10:00
 #SBATCH --mem=64G
 #SBATCH -c 8
 # Start the MPS control daemon; it needs a writable log directory.
 mkdir -p $HOME/tmp
 export CUDA_MPS_LOG_DIRECTORY=$HOME/tmp
 nvidia-cuda-mps-control -d
 # Launch 8 instances of the code in the background, one per CPU core.
 for ((i=0; i<8; i++))
 do
    echo $i
    ./my_code $i &
 done
 # Do not exit until all background instances have finished.
 wait
In the above example, we share a single V100 GPU between 8 instances of "my_code" (which takes a single argument, the loop index $i). We request 8 CPU cores (#SBATCH -c 8) for the farm, so there is one CPU core per code instance. The two important elements are the "&" on the code execution line, which sends the code processes to the background, and the "wait" command at the end of the script, which ensures that the job does not exit until all background processes have finished running.
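Optionally, once all instances have finished you can shut down the MPS control daemon explicitly; the quit command below is part of the standard MPS control interface (see the NVIDIA MPS documentation linked above for the full set of control commands):

 # After "wait" returns, stop the MPS control daemon cleanly.
 echo quit | nvidia-cuda-mps-control

The daemon is normally terminated anyway when the job ends, so this step is mainly useful if your script continues with non-GPU work afterwards.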
[[Category:Software]]