* Transfer data back to the host memory

=CUDA execution model= <!--T:2-->
Simple CUDA code executed on the GPU is called a ''kernel''. There are several questions we may ask at this point:
* How do you run a kernel on a bunch of streaming multiprocessors (SMs)?
* How do you make such a kernel run in a massively parallel fashion?
Here is the execution recipe that will answer the above questions:
* each GPU core (streaming processor) executes a sequential '''thread''', where a '''thread''' is the smallest set of instructions handled by the operating system's scheduler.
* all GPU cores execute the kernel in a SIMT fashion (Single Instruction, Multiple Threads), as the sketch after this list illustrates.
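To make the SIMT idea concrete, here is a minimal kernel sketch; the kernel name <code>add</code> and its parameters are illustrative rather than taken from this page. Every thread executes the same code, and the built-in variables <code>blockIdx</code>, <code>blockDim</code> and <code>threadIdx</code> give each thread a unique index so it can work on its own array element.

<syntaxhighlight lang="cpp">
// Illustrative kernel: element-wise vector addition
__global__ void add(const float *a, const float *b, float *c, int n)
{
    // Global thread index: each thread handles exactly one array element
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)   // guard threads that fall past the end of the array
        c[i] = a[i] + b[i];
}
</syntaxhighlight>

All threads run this identical instruction stream; only the computed index <code>i</code> differs, which is precisely the SIMT model described above.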
Usually, the following procedure is recommended when executing code on the GPU (a host-side sketch follows the steps):
1. Copy input data from CPU memory to GPU memory
2. Load GPU program (kernel) and execute it
3. Copy results from GPU memory back to CPU memory
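A minimal host-side sketch of these three steps, reusing the illustrative <code>add</code> kernel above (error checking omitted for brevity):

<syntaxhighlight lang="cpp">
#include <cstdlib>
#include <cuda_runtime.h>

int main(void)
{
    const int n = 1 << 20;                  // number of array elements
    const size_t size = n * sizeof(float);

    // Allocate and fill host (CPU) arrays
    float *a = (float *)malloc(size);
    float *b = (float *)malloc(size);
    float *c = (float *)malloc(size);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // Allocate device (GPU) arrays
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, size);
    cudaMalloc(&d_b, size);
    cudaMalloc(&d_c, size);

    // 1. Copy input data from CPU memory to GPU memory
    cudaMemcpy(d_a, a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b, size, cudaMemcpyHostToDevice);

    // 2. Load the GPU program (kernel) and execute it:
    //    launch enough 256-thread blocks to cover all n elements
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // 3. Copy results from GPU memory back to CPU memory
    //    (this call also waits for the kernel to finish)
    cudaMemcpy(c, d_c, size, cudaMemcpyDeviceToHost);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(a); free(b); free(c);
    return 0;
}
</syntaxhighlight>

Saved as, say, <code>add.cu</code>, such a file would be compiled with the NVIDIA compiler, e.g. <code>nvcc add.cu -o add</code>.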