CUDA: Difference between revisions

To learn more about how the above program works and how to make use of a GPU's parallelism, see [[CUDA tutorial]].


== Troubleshooting == <!--T:49-->


=== "Compute Capability" === <!--T:50-->


<!--T:51-->
NVIDIA uses the technical term "compute capability", which they describe as follows:


<!--T:52-->
<blockquote>
The ''compute capability'' of a device is represented by a version number, also sometimes called its "SM version". This version number identifies the features supported by the GPU hardware and is used by applications at runtime to determine which hardware features and/or instructions are available on the present GPU.  ([https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capability CUDA Toolkit Documentation, section 2.6])
</blockquote>


<!--T:53-->
The following errors are connected with "compute capability":


<!--T:54-->
<pre>
nvcc fatal : Unsupported gpu architecture 'compute_XX'
</pre>


<!--T:55-->
<pre>
no kernel image is available for execution on the device (209)
</pre>


<!--T:56-->
If you encounter either of these errors, you may be able to fix it by adding the correct flag to the <code>nvcc</code> call:


<!--T:57-->
<pre>
-gencode arch=compute_XX,code=[sm_XX,compute_XX]
</pre>


<!--T:58-->
If you are using <code>cmake</code> rather than calling <code>nvcc</code> directly, provide the following flag instead:


<!--T:59-->
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=XX
</pre>


<!--T:60-->
where “XX” is the "compute capability" of the NVIDIA GPU that you expect to run the application on, written without the decimal point (for example, 8.0 becomes 80).
To find the value to replace “XX”, see the Available Hardware table on the page [[Using GPUs with Slurm]].
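
The first error above is reported when the installed <code>nvcc</code> does not support the requested architecture, so it can help to check that before settling on a value. The following is a minimal sketch, not an official tool: <code>arch_in_list</code> is a hypothetical helper name, and the usage shown in the comment assumes a CUDA toolkit recent enough to provide <code>nvcc --list-gpu-arch</code>, which prints one <code>compute_XX</code> name per line.

```shell
# Hypothetical helper (not part of the CUDA toolkit): succeeds if the line
# "compute_XX" appears in the list of architectures read from standard input.
arch_in_list() {
    # -x matches whole lines only, -q suppresses output
    grep -qx "compute_$1"
}

# Assumed usage on a machine where a CUDA module is loaded:
#   nvcc --list-gpu-arch | arch_in_list 80 && echo "compute_80 is supported"
```

If the check fails, the flag alone will not help. Loading a CUDA version that supports the target GPU is the likely remedy.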


<!--T:61-->
'''For example,''' if you will run your code on a Narval A100 node, its "compute capability" is 80.
The correct flag to use when compiling with <code>nvcc</code> is:


<!--T:62-->
<pre>
-gencode arch=compute_80,code=[sm_80,compute_80]
</pre>
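
The substitution of “XX” can also be scripted, which avoids typing the number three times. The sketch below is illustrative only: the function names <code>cc_to_xx</code>, <code>gencode_flag</code> and <code>cmake_arch_flag</code> are hypothetical and are not part of <code>nvcc</code> or <code>cmake</code>. It accepts the compute capability either as <code>8.0</code> or as <code>80</code>.

```shell
# Hypothetical helpers: build the nvcc and cmake options from a compute
# capability given as "8.0" or "80".

# Strip the decimal point: 8.0 -> 80
cc_to_xx() {
    printf '%s\n' "$1" | tr -d '.'
}

# Print the nvcc option for one compute capability.
gencode_flag() {
    xx=$(cc_to_xx "$1")
    printf '%s\n' "-gencode arch=compute_${xx},code=[sm_${xx},compute_${xx}]"
}

# Print the matching cmake option.
cmake_arch_flag() {
    printf '%s\n' "-DCMAKE_CUDA_ARCHITECTURES=$(cc_to_xx "$1")"
}

gencode_flag 8.0      # prints: -gencode arch=compute_80,code=[sm_80,compute_80]
cmake_arch_flag 8.0   # prints: -DCMAKE_CUDA_ARCHITECTURES=80
```

The printed option can then be pasted into the compile command, for instance <code>nvcc -gencode 'arch=compute_80,code=[sm_80,compute_80]' -o my_app my_app.cu</code>, where <code>my_app.cu</code> is a placeholder file name. The quotes keep the shell from treating the square brackets as a filename pattern.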


<!--T:63-->
The flag to supply to <code>cmake</code> is:


<!--T:64-->
<pre>
cmake .. -DCMAKE_CUDA_ARCHITECTURES=80
</pre>