CUDA tutorial: Difference between revisions

Line 89:
* Constant memory


= A few basic CUDA operations = <!--T:8-->
== CUDA memory allocation ==
* cudaMalloc((void**)&array, size)
** Allocates an object in device memory. Requires the address of the pointer that will refer to the allocated array, and the size in bytes.
Line 96:
** Deallocates the object from device memory. Requires only the pointer to the array.
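
The allocation and deallocation calls described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name <code>alloc_and_free</code>, the element type <code>float</code> and the error handling are assumptions made for the example, and the deallocation call used is <code>cudaFree</code> from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the tutorial): allocates n floats in
// device memory and frees them again. Returns true if both calls succeed.
bool alloc_and_free(size_t n)
{
    float *d_array = nullptr;          // will receive a device address
    size_t size = n * sizeof(float);   // the size is given in bytes

    // cudaMalloc takes the ADDRESS of the pointer so that it can fill it in.
    cudaError_t err = cudaMalloc((void**)&d_array, size);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return false;
    }

    // Deallocation only needs the pointer itself.
    err = cudaFree(d_array);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFree failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

The helper can be called from <code>main</code>, for example as <code>alloc_and_free(1024)</code>, in a file compiled with <code>nvcc</code>. Checking the returned <code>cudaError_t</code> is optional but makes failures such as running out of device memory visible.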


== CUDA data transfer == <!--T:9-->
* cudaMemcpy(array_dest, array_orig, size, direction)
** Copies data between host and device memory (or between two device arrays). Requires pointers to the destination and source arrays, the size in bytes and the direction type (cudaMemcpyHostToDevice, cudaMemcpyDeviceToHost, cudaMemcpyDeviceToDevice, etc.).
* cudaMemcpyAsync
** Same as cudaMemcpy, but transfers the data asynchronously, which means the call does not block the execution of other operations while the copy is in progress.
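
The two transfer calls above can be compared with a small round-trip sketch. It is illustrative only: the helper names <code>roundtrip_sync</code> and <code>roundtrip_async</code> are hypothetical, and the asynchronous variant additionally assumes a CUDA stream and page-locked host buffers (<code>cudaMallocHost</code>), neither of which is covered in the text above.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: copies n ints host -> device -> host with the
// blocking cudaMemcpy and checks that the data came back unchanged.
bool roundtrip_sync(int n)
{
    size_t size = n * sizeof(int);
    int *h_in = new int[n];
    int *h_out = new int[n];
    for (int i = 0; i < n; ++i) { h_in[i] = i; h_out[i] = -1; }

    int *d_array = nullptr;
    bool ok = cudaMalloc((void**)&d_array, size) == cudaSuccess;
    // Argument order: destination, source, size in bytes, direction.
    ok = ok && cudaMemcpy(d_array, h_in, size, cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(h_out, d_array, size, cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == h_in[i]);

    cudaFree(d_array);   // freeing a null pointer is a no-op
    delete[] h_in;
    delete[] h_out;
    return ok;
}

// Hypothetical helper: same round trip with cudaMemcpyAsync on a stream.
bool roundtrip_async(int n)
{
    size_t size = n * sizeof(int);
    int *h_in = nullptr, *h_out = nullptr, *d_array = nullptr;
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return false;

    // Assumption: page-locked host buffers, so the copies can overlap host work.
    bool ok = cudaMallocHost((void**)&h_in, size) == cudaSuccess;
    ok = ok && cudaMallocHost((void**)&h_out, size) == cudaSuccess;
    ok = ok && cudaMalloc((void**)&d_array, size) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i) { h_in[i] = i; h_out[i] = -1; }

    // Both calls return without waiting; the stream keeps them in order.
    ok = ok && cudaMemcpyAsync(d_array, h_in, size, cudaMemcpyHostToDevice, stream) == cudaSuccess;
    ok = ok && cudaMemcpyAsync(h_out, d_array, size, cudaMemcpyDeviceToHost, stream) == cudaSuccess;

    // The host must wait for the stream before reading h_out.
    ok = ok && cudaStreamSynchronize(stream) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == h_in[i]);

    cudaFree(d_array);
    if (h_in) cudaFreeHost(h_in);
    if (h_out) cudaFreeHost(h_out);
    cudaStreamDestroy(stream);
    return ok;
}
```

In the asynchronous version the host is free to do unrelated work between issuing the copies and calling <code>cudaStreamSynchronize</code>; the results in <code>h_out</code> should only be read after that synchronization.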


= First CUDA C Program = <!--T:10-->