CUDA tutorial

3. Copy results from GPU memory back to CPU memory
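A minimal host-side sketch of this workflow (the buffer name, its size, and the kernel are illustrative assumptions, not part of this page): step 3 corresponds to a single <code>cudaMemcpy</code> call in the <code>cudaMemcpyDeviceToHost</code> direction.

<syntaxhighlight lang="cpp">
#include <cuda_runtime.h>
#include <stdlib.h>

int main(void) {
    const int N = 1024;
    const size_t size = N * sizeof(float);

    float *h_data = (float *) malloc(size);      // CPU (host) buffer
    for (int i = 0; i < N; i++)
        h_data[i] = (float) i;                   // fill with sample input

    float *d_data;
    cudaMalloc((void **) &d_data, size);         // allocate GPU (device) memory

    // copy input data from CPU memory to GPU memory
    cudaMemcpy(d_data, h_data, size, cudaMemcpyHostToDevice);

    // ... launch the kernel here, e.g. my_kernel<<<blocks, threads>>>(d_data); ...

    // step 3: copy results from GPU memory back to CPU memory
    cudaMemcpy(h_data, d_data, size, cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    free(h_data);
    return 0;
}
</syntaxhighlight>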


= CUDA block-threading model = <!--T:3-->


<!--T:4-->
[[File:Cuda-threads-blocks.png|thumbnail|CUDA block-threading model: threads are organized into blocks, and blocks are further organized into a grid.]]
A CUDA kernel can involve a very large number of threads, and to achieve massive parallelism one has to use all the threads possible, so they need to be organized somehow. In CUDA, all the threads are structured into threading blocks, and the blocks are further organized into a grid, as shown in the accompanying figure. In distributing the threads we must make sure that the following conditions are satisfied:
* threads within a block cooperate via shared memory
* threads in different blocks cannot cooperate
In this model the threads within a block work on the same set of instructions (but perhaps with different data sets) and exchange data with each other via shared memory. Threads in other blocks do the same thing (see the figure).
[[File:Cuda_threads.png|thumbnail|Threads within a block intercommunicate via shared memory.]]
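A brief sketch of such intra-block cooperation (the kernel and the reversal operation are illustrative assumptions): each thread stages one element in shared memory, the block synchronizes, and every thread then reads an element written by a different thread of the same block.

<syntaxhighlight lang="cpp">
#define BLOCK_SIZE 256

// Illustrative kernel: threads of one block cooperate through shared memory
// to reverse a BLOCK_SIZE-element chunk of the input array.
__global__ void reverse_in_block(float *d_out, const float *d_in) {
    __shared__ float s[BLOCK_SIZE];           // visible to all threads of this block
    int t = threadIdx.x;

    s[t] = d_in[blockIdx.x * BLOCK_SIZE + t]; // each thread writes one element
    __syncthreads();                          // wait until the whole block has written

    // read an element that another thread of the same block wrote
    d_out[blockIdx.x * BLOCK_SIZE + t] = s[BLOCK_SIZE - 1 - t];
}
</syntaxhighlight>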


<!--T:5-->
* Block IDs: 1D or 2D (blockIdx.x, blockIdx.y)
* Thread IDs: 1D, 2D, or 3D (threadIdx.x, threadIdx.y, threadIdx.z)
Such a model simplifies memory addressing when processing multi-dimensional data.
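As a sketch of this addressing scheme (the vector-addition kernel and its names are assumptions for illustration), each thread combines its block ID, the block size, and its thread ID into a unique global index:

<syntaxhighlight lang="cpp">
// Illustrative kernel: derive a unique 1D global index from the IDs above.
__global__ void add_vectors(float *c, const float *a, const float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x; // global index of this thread
    if (i < n)                                     // guard: the grid may be larger than n
        c[i] = a[i] + b[i];
}

// Possible launch: enough 256-thread blocks to cover all n elements, e.g.
//   add_vectors<<<(n + 255) / 256, 256>>>(d_c, d_a, d_b, n);
</syntaxhighlight>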


= Thread scheduling = <!--T:6-->