OpenACC Tutorial - Adding directives: Difference between revisions

OpenACC directives are much like OpenMP directives. They take the form of <tt>pragma</tt> statements in C/C++, and of comments in Fortran. This approach has several advantages. First, since it involves only minor modifications to the code, changes can be made ''incrementally'', one <tt>pragma</tt> at a time. This is especially useful for debugging, since making a single change at a time lets one quickly identify which change introduced a bug. Second, OpenACC support can be disabled at compile time. When OpenACC support is disabled, the <tt>pragma</tt> statements are treated as comments and ignored by the compiler. This means that a single source code can be used to compile both an accelerated version and a normal version. Third, since all of the offloading work is done by the compiler, the same code can be compiled for various accelerator types: GPUs, MIC (Xeon Phi) or CPUs. It also means that a new generation of devices only requires updating the compiler, not changing the code.
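As a minimal sketch of the "single source" idea, the C fragment below compiles both with and without OpenACC support. The function names <tt>openacc_enabled</tt> and <tt>fill_doubles</tt> are hypothetical and chosen for illustration only. The <tt>copyout</tt> data clause is an added precaution and is not discussed in the text above. The <tt>_OPENACC</tt> macro is defined by the compiler only when OpenACC support is enabled.

```c
/* Returns 1 if this file was compiled with OpenACC enabled, 0 otherwise.
   _OPENACC is defined only by compilers running in OpenACC mode. */
int openacc_enabled(void)
{
#ifdef _OPENACC
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: fills v[i] = 2*i for i in [0, n).
   Without OpenACC support the pragma is ignored and the loop runs
   serially on the host, producing the same values. */
void fill_doubles(int n, float *v)
{
    /* copyout is an assumption added here so the array bounds are explicit. */
    #pragma acc kernels copyout(v[0:n])
    for (int i = 0; i < n; i++) {
        v[i] = 2.0f * (float)i;
    }
}
```

Building this file with OpenACC enabled (for example <tt>-acc</tt> with the NVIDIA HPC compilers or <tt>-fopenacc</tt> with GCC) and again without that flag should give the same array contents; only <tt>openacc_enabled()</tt> changes its return value.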


In the following example, we take a code composed of two loops. The first one initializes two vectors, and the second performs a <tt>[https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_1 SAXPY]</tt>, a basic scaled vector addition (<tt>y = a*x + y</tt>).
</translate>


!$acc end kernels
</syntaxhighlight>
|}
In both the C/C++ and the Fortran cases, the compiler will identify '''two''' kernels. In C/C++, the two kernels correspond to the bodies of the two loops. In Fortran, the kernels are the body of the first loop and the body of the implicit loop that Fortran performs when it does an array operation.
 
Note that in C/C++, the OpenACC block is delimited by curly braces, while in Fortran, the directive comment must be repeated with the <tt>end</tt> keyword added.
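The C form described above can be sketched as follows. This is an illustrative reconstruction, not the page's original listing: the function name <tt>init_and_saxpy</tt>, the initial values 1 and 2, and the <tt>copyout</tt> data clause are all assumptions made for this sketch. Whether the compiler actually generates a parallel kernel for each loop depends on its own analysis.

```c
/* Hypothetical example: initializes x and y, then computes y = a*x + y.
   Both loops sit inside one kernels region delimited by curly braces,
   so each loop is a candidate for its own kernel. */
void init_and_saxpy(int n, float a, float *x, float *y)
{
    /* copyout is an assumption of this sketch: both arrays are fully
       written inside the region before they are read. */
    #pragma acc kernels copyout(x[0:n], y[0:n])
    {
        /* First loop: initialization (candidate for the first kernel). */
        for (int i = 0; i < n; i++) {
            x[i] = 1.0f;
            y[i] = 2.0f;
        }

        /* Second loop: SAXPY (candidate for the second kernel). */
        for (int i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The closing curly brace plays the role that <tt>!$acc end kernels</tt> plays in Fortran: it marks the end of the region the compiler is allowed to offload.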


{{Callout
|title=Loops vs Kernels
|content=
}}




<translate>
</translate>