CUDA tutorial: Difference between revisions

474 bytes added, 7 years ago

no edit summary

@@ Line 93: / Line 93: @@
 add <<< N, 1 >>> (dev_a, dev_b, dev_c);
 </syntaxhighlight>
+Here we replaced 1 with N, so the kernel is launched as N CUDA blocks that can run in parallel. To actually get parallelism, however, the kernel itself must change as well:
+<syntaxhighlight lang="cpp" line highlight="1,5">
+__global__ void add (int *a, int *b, int *c){
+	// Each block handles the single element at its own block index.
+	c[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x];
+}
+</syntaxhighlight>
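+where blockIdx.x is the index of the current CUDA block within the grid, which is different for every block (0 to N-1 for this launch). This way each CUDA block adds one element of a[ ] to the corresponding element of b[ ] and stores the sum in c[ ].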
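
To see how the launch and the kernel fit together, here is a minimal sketch of a host-side wrapper around the kernel above. The names add_on_gpu and check are hypothetical helpers introduced only for this illustration and are not part of the tutorial. The sketch assumes both input vectors have the same non-zero length N and that N does not exceed the device's limit on the number of blocks per grid.
<syntaxhighlight lang="cpp">
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Kernel from the tutorial: block i computes c[i] = a[i] + b[i].
__global__ void add(int *a, int *b, int *c) {
    c[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x];
}

// Hypothetical helper (not from the tutorial): abort on any CUDA error.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical host wrapper: returns the element-wise sum of a and b.
// Assumes a.size() == b.size() and that the size is greater than zero.
std::vector<int> add_on_gpu(const std::vector<int> &a, const std::vector<int> &b) {
    const int N = static_cast<int>(a.size());
    const size_t bytes = N * sizeof(int);
    std::vector<int> c(N);
    int *dev_a = nullptr, *dev_b = nullptr, *dev_c = nullptr;

    // Allocate device buffers and copy the inputs to the GPU.
    check(cudaMalloc((void **)&dev_a, bytes), "cudaMalloc dev_a");
    check(cudaMalloc((void **)&dev_b, bytes), "cudaMalloc dev_b");
    check(cudaMalloc((void **)&dev_c, bytes), "cudaMalloc dev_c");
    check(cudaMemcpy(dev_a, a.data(), bytes, cudaMemcpyHostToDevice), "copy a");
    check(cudaMemcpy(dev_b, b.data(), bytes, cudaMemcpyHostToDevice), "copy b");

    // Launch N blocks of one thread each, so blockIdx.x runs from 0 to N-1.
    add<<<N, 1>>>(dev_a, dev_b, dev_c);
    check(cudaGetLastError(), "kernel launch");

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(c.data(), dev_c, bytes, cudaMemcpyDeviceToHost), "copy c");

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return c;
}
</syntaxhighlight>
Compiled with nvcc and run on a machine with a CUDA-capable GPU, a call such as add_on_gpu({1, 2, 3}, {10, 20, 30}) should return the vector 11, 22, 33. Because the launch uses exactly N blocks, no bounds check is needed inside the kernel; a launch with more blocks than elements would need one.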