CUDA: Difference between revisions

move section on Compute Capability from Slurm page
<!--T:47-->
To learn more about how the above program works and how to make use of a GPU's parallelism, see [[CUDA tutorial]].
== Troubleshooting ==
=== "Compute Capability" ===
NVIDIA uses the technical term "compute capability", which it describes as follows:
<blockquote>
The ''compute capability'' of a device is represented by a version number, also sometimes called its "SM version". This version number identifies the features supported by the GPU hardware and is used by applications at runtime to determine which hardware features and/or instructions are available on the present GPU. ([https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capability CUDA Toolkit Documentation, section 2.6])
</blockquote>
The following errors are connected with "compute capability":
<pre>
nvcc fatal : Unsupported gpu architecture 'compute_XX'
</pre>
<pre>
no kernel image is available for execution on the device (209)
</pre>
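The first message is printed by the compiler when the installed <code>nvcc</code> does not recognize the requested architecture; the second appears at run time when the binary contains no code the GPU can execute. The following is a rough sketch for checking the first case, not a definitive procedure. It assumes a recent <code>nvcc</code> that supports the <code>--list-gpu-arch</code> option and prints one <code>compute_XX</code> name per line; the function name <code>arch_supported</code> and the sample list are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if compute_XX appears in a list of
# architecture names (one per line) read from standard input.
arch_supported() {
    # -x matches the whole line, -q suppresses output
    grep -qx "compute_$1"
}

# Sample list standing in for real compiler output (an assumption,
# not the output of any particular CUDA version).
sample_list='compute_60
compute_70
compute_80'

if printf '%s\n' "$sample_list" | arch_supported 80; then
    echo "compute_80 is supported"
else
    echo "compute_80 is NOT supported"
fi
```

With a real toolkit, the same check might be run as <code>nvcc --list-gpu-arch | arch_supported 80</code>, under the assumption stated above.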
If you encounter either of these errors, you may be able to fix it by adding the correct flag to the <code>nvcc</code> call:
<pre>
-gencode arch=compute_XX,code=[sm_XX,compute_XX]
</pre>
Or if you are using <code>cmake</code> instead of <code>nvcc</code> directly, provide the following flag:
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=XX
</pre>
where "XX" is the "compute capability" of the NVIDIA GPU that you expect to run the application on.
To find the value to replace "XX", see the Available Hardware table on the page [[Using GPUs with Slurm]].
'''For example,''' if you will run your code on a Narval A100 node, its "compute capability" is 80 (that is, version 8.0 written without the decimal point).
The correct flag to use when compiling with <code>nvcc</code> is
<pre>
-gencode arch=compute_80,code=[sm_80,compute_80]
</pre>
The flag to supply to <code>cmake</code> is:
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=80
</pre>
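The substitution of "XX" described above can be sketched as a small shell helper. This is only an illustration: the function names <code>cc_to_xx</code>, <code>nvcc_arch_flag</code> and <code>cmake_arch_flag</code> are hypothetical, and the helper assumes the compute capability is given either as a dotted version such as 8.0 or already in the two-digit form such as 80.

```shell
#!/bin/sh
# Hypothetical helpers: turn a compute capability into the XX form
# and print the matching nvcc and cmake flags.

cc_to_xx() {
    # Remove the decimal point, if any: "8.0" -> "80", "80" -> "80"
    printf '%s\n' "$1" | tr -d '.'
}

nvcc_arch_flag() {
    xx=$(cc_to_xx "$1")
    printf '%s\n' "-gencode arch=compute_${xx},code=[sm_${xx},compute_${xx}]"
}

cmake_arch_flag() {
    xx=$(cc_to_xx "$1")
    printf '%s\n' "-DCMAKE_CUDA_ARCHITECTURES=${xx}"
}

# Example: flags for a GPU with compute capability 8.0
nvcc_arch_flag 8.0
cmake_arch_flag 8.0
```

When pasting the <code>nvcc</code> flag into a shell command line, it may be safer to quote it, since some shells treat square brackets as pattern characters.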


</translate>