CUDA tutorial

** Accessible by both CPU and GPU
*Streaming multiprocessors (SMs)
** Each SM consists of many streaming processors (SPs)
**They perform the actual computations
**Each SM has its own control unit, registers, execution pipelines, etc.
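For example, the number of SMs on the installed GPU can be queried at run time through the CUDA runtime API (a minimal sketch; <code>cudaGetDeviceProperties()</code> fills a structure describing the device's hardware):

<syntaxhighlight lang="cpp">
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);   // query device 0
    printf("Device: %s\n", prop.name);
    printf("Streaming multiprocessors: %d\n", prop.multiProcessorCount);
    return 0;
}
</syntaxhighlight>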
* Transfer data back to the Host memory
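Putting these steps together, here is a minimal sketch of the complete workflow (the kernel <code>doubleElements</code> and the array size <code>N</code> are illustrative, not part of this tutorial):

<syntaxhighlight lang="cpp">
#include <cstdio>
#include <cuda_runtime.h>

__global__ void doubleElements(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;   // guard: do not run past the array end
}

int main() {
    const int N = 1024;
    float hostData[N];
    for (int i = 0; i < N; ++i) hostData[i] = (float)i;

    float *deviceData;
    cudaMalloc(&deviceData, N * sizeof(float));              // allocate device memory
    cudaMemcpy(deviceData, hostData, N * sizeof(float),
               cudaMemcpyHostToDevice);                      // copy input to the device
    doubleElements<<<(N + 255) / 256, 256>>>(deviceData, N); // execute the kernel
    cudaMemcpy(hostData, deviceData, N * sizeof(float),
               cudaMemcpyDeviceToHost);                      // transfer data back to the host memory
    cudaFree(deviceData);

    printf("hostData[10] = %f\n", hostData[10]);             // expect 20.0
    return 0;
}
</syntaxhighlight>

The guard <code>if (i < n)</code> matters because the launch configuration rounds the number of threads up to a multiple of the block size, so a few threads may fall outside the array.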
=CUDA Execution Model=
Simple CUDA code executed on the GPU is called a '''kernel'''. There are several questions we may ask at this point:
* How do you run a kernel on a bunch of streaming multiprocessors (SMs)?
* How do you make such a run massively parallel?
Here is the execution recipe that answers the above questions:
* each GPU core (streaming processor) executes a sequential '''thread''', where a '''thread''' is the smallest set of instructions handled by the operating system's scheduler
* all GPU cores execute the kernel in a SIMT fashion (Single Instruction, Multiple Threads), as shown in the sketch below
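A minimal sketch of the SIMT model (the kernel name <code>whoAmI</code> is illustrative): every thread executes the same kernel body, and the only thing distinguishing one thread from another is its block and thread index:

<syntaxhighlight lang="cpp">
#include <cstdio>
#include <cuda_runtime.h>

// Every GPU thread runs this same kernel body (SIMT);
// each thread computes its own global index.
__global__ void whoAmI() {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    printf("Hello from thread %d (block %d, thread %d)\n",
           id, blockIdx.x, threadIdx.x);
}

int main() {
    whoAmI<<<2, 4>>>();        // 2 blocks of 4 threads = 8 threads total
    cudaDeviceSynchronize();   // wait for the kernel (and its printf output)
    return 0;
}
</syntaxhighlight>

Note that the order of the printed lines is not guaranteed, since the threads execute concurrently.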