CUDA tutorial

** Accessible by both CPU and GPU
*Streaming multiprocessors (SMs)
** Each SM consists of many streaming processors (SPs)
**They perform the actual computations
**Each SM has its own control unit, registers, execution pipelines, etc. (these properties can be queried at runtime, as sketched below)
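A minimal sketch of such a runtime query, using the CUDA runtime call cudaGetDeviceProperties to report the SM count and global memory size of device 0 (error checking omitted for brevity):

<syntaxhighlight lang="cpp">
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    cudaDeviceProp prop;
    // Fill prop with the properties of GPU device 0
    cudaGetDeviceProperties(&prop, 0);
    printf("Device name   : %s\n", prop.name);
    printf("Number of SMs : %d\n", prop.multiProcessorCount);
    printf("Global memory : %zu bytes\n", prop.totalGlobalMem);
    return 0;
}
</syntaxhighlight>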
* Transfer data back to the Host memory (the complete host-side flow is sketched below)
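A minimal sketch of this workflow in one program; the kernel square, the array size, and the launch configuration are hypothetical, chosen only to illustrate the allocate / copy / launch / copy-back pattern:

<syntaxhighlight lang="cpp">
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel used for illustration: squares each element in place
__global__ void square(float *a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = a[i] * a[i];
}

int main(void) {
    const int n = 256;
    float h_a[n];                              // Host memory
    for (int i = 0; i < n; i++) h_a[i] = (float)i;

    float *d_a;                                // Device (GPU) memory
    cudaMalloc(&d_a, n * sizeof(float));

    // Transfer data from the Host to the Device memory
    cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);

    // Execute the kernel on the GPU: 2 blocks of 128 threads cover n = 256
    square<<<(n + 127) / 128, 128>>>(d_a, n);

    // Transfer data back to the Host memory
    cudaMemcpy(h_a, d_a, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("h_a[2] = %g\n", h_a[2]);           // expected: 4
    cudaFree(d_a);
    return 0;
}
</syntaxhighlight>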
=CUDA Execution Model=
A simple CUDA function executed on the GPU is called a '''kernel'''. There are several questions we may ask at this point:
* How do you run a kernel on a bunch of streaming multiprocessors (SMs)?
* How do you make such a run massively parallel?
Here is the execution recipe that will answer the above questions:
* each GPU core (streaming processor) executes a sequential '''thread''', where a '''thread''' is the smallest set of instructions that can be managed independently by a scheduler
* all GPU cores execute the kernel in a SIMT fashion (Single Instruction, Multiple Threads), as illustrated in the sketch below
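A minimal sketch of the SIMT model: the hypothetical kernel below is launched on 2 blocks of 4 threads, and every thread executes the same instructions while computing its own global index (device-side printf requires compute capability 2.0 or higher):

<syntaxhighlight lang="cpp">
#include <stdio.h>
#include <cuda_runtime.h>

// Every thread runs this same kernel body (SIMT), but each computes a
// unique global index from its block and thread coordinates
__global__ void hello(void) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    printf("Hello from global thread %d (block %d, thread %d)\n",
           i, blockIdx.x, threadIdx.x);
}

int main(void) {
    hello<<<2, 4>>>();        // 2 blocks x 4 threads = 8 threads in total
    cudaDeviceSynchronize();  // wait for the GPU and flush its printf output
    return 0;
}
</syntaxhighlight>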