</pre>
You can submit several tasks this way. The <code>-j4</code> parameter makes GNU Parallel run four tasks concurrently, starting a new task as soon as a previous one finishes. To prevent two tasks from competing for the same GPU, use CUDA_VISIBLE_DEVICES.
== CUDA Compute Capability ==
When compiling CUDA code on the clusters, you need to know the Compute Capability of the GPU you are targeting. If you get the following error at compile time:
<pre>
nvcc fatal : Unsupported gpu architecture 'compute_XX'
</pre>
or this error when running your CUDA code on a GPU compute node:
<pre>
no kernel image is available for execution on the device (209)
</pre>
you can fix it by adding the correct flag to the <code>nvcc</code> call:
<pre>
-gencode arch=compute_XX,code=[sm_XX,compute_XX]
</pre>
or, if you are using CMake to build your project, by providing the following option:
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=XX
</pre>
where <code>XX</code> is the Compute Capability of the NVIDIA GPU you are going to use. You can find the correct value in the Compute Capability column of the table above.
For example, if you are running your code on a Narval A100 node, its Compute Capability is 80, so the correct flag to pass to the compiler is
<pre>
-gencode arch=compute_80,code=[sm_80,compute_80]
</pre>
or, with CMake, the following configuration command:
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=80
</pre>
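The substitution of <code>XX</code> described above can be sketched as a small shell function. This is only an illustration: <code>cuda_arch_flags</code> is a hypothetical helper name, not part of nvcc or CMake, and it assumes the value is given either as in the table (for example 80) or in dotted form (for example 8.0).

```shell
# Hypothetical helper (not part of nvcc or CMake): print the two flags
# shown above for a given Compute Capability, e.g. "80" or "8.0".
cuda_arch_flags() {
    # Drop the dot so that "8.0" becomes "80".
    cc=$(printf '%s' "$1" | tr -d '.')

    # Reject empty or non-numeric input.
    case "$cc" in
        ''|*[!0-9]*)
            echo "error: expected a value such as 80 or 8.0, got '$1'" >&2
            return 1
            ;;
    esac

    # Flag for a direct nvcc call.
    printf '%s\n' "-gencode arch=compute_${cc},code=[sm_${cc},compute_${cc}]"
    # Option for a CMake configuration.
    printf '%s\n' "-DCMAKE_CUDA_ARCHITECTURES=${cc}"
}

# Example for a Narval A100 node:
cuda_arch_flags 80
```

If the NVIDIA driver on the node supports the <code>compute_cap</code> query field, <code>nvidia-smi --query-gpu=compute_cap --format=csv,noheader</code> should report the value in dotted form, which the function above accepts as well.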
[[Category:SLURM]]