ARM software: Difference between revisions

Line 5: Line 5:


<!--T:2-->
<!--T:2-->
[https://www.arm.com/products/development-tools/hpc-tools/cross-platform/forge/ddt ARM DDT] (formerly known as Allinea DDT) is a powerful commercial parallel debugger with a graphical user interface. It can be used to debug serial, MPI, multi-threaded, and CUDA programs, or any combination of the above, written in C, C++, and FORTRAN. [https://www.arm.com/products/development-tools/hpc-tools/cross-platform/forge/map MAP] - an efficient parallel profiler - is another very useful tool from ARM (formerly Allinea).
[https://www.arm.com/products/development-tools/hpc-tools/cross-platform/forge/ddt ARM DDT] (formerly known as Allinea DDT) is a powerful commercial parallel debugger with a graphical user interface. It can be used to debug serial, MPI, multi-threaded, and CUDA programs, or any combination of the above, written in C, C++, and FORTRAN. [https://www.arm.com/products/development-tools/hpc-tools/cross-platform/forge/map MAP], an efficient parallel profiler, is another very useful tool from ARM (formerly Allinea).


<!--T:3-->
<!--T:3-->
This software is available on Graham as two separate modules:
The following modules are available on Graham:
* allinea-cpu, for CPU debugging and profiling;
* allinea-cpu, for CPU debugging and profiling;
* allinea-gpu, for GPU or mixed CPU/GPU debugging.
* allinea-gpu, for GPU or mixed CPU/GPU debugging.
As this is a GUI application, you should log in using <code>ssh -Y</code>, and use an [[SSH|SSH client]] like [[Connecting with MobaXTerm|MobaXTerm]] (Windows) or [https://www.xquartz.org/ XQuartz] (Mac) to ensure proper X11 tunnelling.
As this is a GUI application, log in using <code>ssh -Y</code>, and use an [[SSH|SSH client]] like [[Connecting with MobaXTerm|MobaXTerm]] (Windows) or [https://www.xquartz.org/ XQuartz] (Mac) to ensure proper X11 tunnelling.
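
Before launching the GUI, it can help to confirm that X11 forwarding is actually active in the remote shell. The following is a minimal sketch, not part of the official instructions: the helper name <code>x11_ready</code> is hypothetical, and the login line in the comment uses placeholder values for the username and host.

```shell
# From your workstation (placeholders, not real values):
#   ssh -Y username@<login-node>

# Hypothetical helper: reports whether this shell has an X11 display.
x11_ready() {
  if [ -n "${DISPLAY:-}" ]; then
    echo "X11 forwarding looks active (DISPLAY=${DISPLAY})"
  else
    echo "DISPLAY is not set; reconnect with ssh -Y"
    return 1
  fi
}
```

Calling <code>x11_ready</code> after logging in returns a non-zero status when no display is available, which usually means the session was opened without <code>-Y</code> or the local X server is not running.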


<!--T:4-->
<!--T:4-->
Line 17: Line 17:


<!--T:5-->
<!--T:5-->
The current license limits the use of DDT/MAP to a maximum of 512 CPU cores across all users at any given time. DDT-GPU is likewise limited to 8 GPUs.
The current license limits the use of DDT/MAP to a maximum of 512 CPU cores across all users at any given time, while DDT-GPU is limited to 8 GPUs.


= Usage = <!--T:6-->
= Usage = <!--T:6-->
Line 23: Line 23:


<!--T:7-->
<!--T:7-->
Allocate the node or nodes on which to do the debugging or profiling with <code>salloc</code>, e.g.:
#Allocate the node or nodes on which to do the debugging or profiling with <code>salloc</code>. This will open a shell session on the allocated node.


  <!--T:8-->
  <!--T:8-->
Line 29: Line 29:


<!--T:9-->
<!--T:9-->
This will open a shell session on the allocated node. Then load the appropriate module:
#Load the appropriate module, for example


  <!--T:10-->
  <!--T:10-->
Line 35: Line 35:


<!--T:11-->
<!--T:11-->
This may fail with a suggestion to load an older version of OpenMPI first. If that happens, reload the OpenMPI module with the suggested command, and then reload the allinea-cpu module:
:This may fail with a suggestion to load an older version of OpenMPI first. In this case, reload the OpenMPI module with the suggested command, and then reload the allinea-cpu module.


  <!--T:12-->
  <!--T:12-->
Line 42: Line 42:


<!--T:13-->
<!--T:13-->
You can then run the ddt or map command as:
#Run the ddt or map command, for example


  <!--T:14-->
  <!--T:14-->
Line 49: Line 49:


<!--T:15-->
<!--T:15-->
Make sure the MPI implementation is the default "OpenMPI" in the Allinea application window, before pressing the Run button. If this is not the case, press the Change button next to the "Implementation:" string, and pick the correct option from the drop down menu.
:Make sure the MPI implementation is the default OpenMPI in the Allinea application window before pressing the ''Run'' button. If this is not the case, press the ''Change'' button next to the ''Implementation:'' string, and select the correct option from the drop-down menu.


<!--T:16-->
<!--T:16-->
When done, exit the shell. This will terminate the allocation.
#When done, exit the shell to terminate the allocation.
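
The numbered steps above can be summarized as one short interactive session. The sketch below only prints the commands instead of running them, because <code>salloc</code> opens a new shell in which the remaining steps are typed by hand. The <code>salloc</code> options and their values, the helper name <code>ddt_cpu_steps</code>, and the program path are assumptions for illustration; only the <code>allinea-cpu</code> module name and the <code>ddt</code>/<code>map</code> commands come from this page.

```shell
# Hypothetical helper: prints the CPU debugging/profiling session, step by step.
# Arguments: program path, number of MPI tasks, tool (ddt or map).
ddt_cpu_steps() {
  prog="${1:-./my_program}"   # placeholder path to your executable
  ntasks="${2:-4}"            # assumed task count
  tool="${3:-ddt}"            # ddt (debugger) or map (profiler)
  cat <<EOF
salloc --x11 --time=0-1:00 --mem-per-cpu=4G --ntasks=${ntasks}
module load allinea-cpu
${tool} ${prog}
exit
EOF
}

ddt_cpu_steps ./my_program 4 ddt
```

The resource values are examples only; keep the total number of requested cores well under the licensed limit, since that limit is shared across all users.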


== CUDA code == <!--T:17-->
== CUDA code == <!--T:17-->


<!--T:18-->
<!--T:18-->
Allocate the node or nodes on which to do the debugging or profiling with <code>salloc</code>, e.g.:
#Allocate the node or nodes on which to do the debugging or profiling with <code>salloc</code>. This will open a shell session on the allocated node.  


  <!--T:19-->
  <!--T:19-->
Line 63: Line 63:


<!--T:20-->
<!--T:20-->
This will open a shell session on the allocated node. Then load the appropriate module:
#Load the appropriate module, for example


  <!--T:21-->
  <!--T:21-->
Line 69: Line 69:


<!--T:22-->
<!--T:22-->
This may fail with a suggestion to load an older version of OpenMPI first. If that happens, reload the OpenMPI module with the suggested command, and then reload the allinea-gpu module:
:This may fail with a suggestion to load an older version of OpenMPI first. In this case, reload the OpenMPI module with the suggested command, and then reload the allinea-gpu module.


  <!--T:23-->
  <!--T:23-->
Line 76: Line 76:


<!--T:24-->
<!--T:24-->
Ensure a cuda module is loaded:
#Ensure a <code>cuda</code> module is loaded.


  <!--T:25-->
  <!--T:25-->
Line 82: Line 82:


<!--T:26-->
<!--T:26-->
You can then run the ddt command as:
#Run the ddt command.


  <!--T:27-->
  <!--T:27-->
Line 88: Line 88:


<!--T:28-->
<!--T:28-->
When done, exit the shell. This will terminate the allocation.
#When done, exit the shell to terminate the allocation.
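
The CUDA steps above follow the same pattern, with a GPU request and an extra module. As a rough sketch that only prints the commands to type: the <code>salloc</code> options (including <code>--gres=gpu:1</code>), their values, the helper name <code>ddt_gpu_steps</code>, and the program path are assumptions; the <code>allinea-gpu</code> and <code>cuda</code> module names and the <code>ddt</code> command come from this page.

```shell
# Hypothetical helper: prints the CUDA debugging session, step by step.
# Arguments: program path, number of GPUs to request.
ddt_gpu_steps() {
  prog="${1:-./my_cuda_program}"   # placeholder path to your executable
  gpus="${2:-1}"                   # assumed GPU count
  cat <<EOF
salloc --x11 --time=0-1:00 --mem-per-cpu=4G --ntasks=1 --gres=gpu:${gpus}
module load allinea-gpu
module load cuda
ddt ${prog}
exit
EOF
}

ddt_gpu_steps ./my_cuda_program 1
```

The GPU count here is illustrative; the license caps DDT-GPU use at 8 GPUs in total.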


= Known issues = <!--T:29-->
= Known issues = <!--T:29-->


=== MPI DDT === <!--T:30-->
=== MPI DDT === <!--T:30-->
* For some reason the debugger doesn't show queued MPI messages (e.g. when paused in an MPI deadlock).
* The debugger does not show queued MPI messages (e.g. when paused in an MPI deadlock); the cause is not yet known.


=== OpenMP DDT === <!--T:31-->
=== OpenMP DDT === <!--T:31-->
* Memory debugging module (which is off by default) doesn't work.
* The memory debugging module (which is off by default) does not work.


=== CUDA DDT === <!--T:32-->
=== CUDA DDT === <!--T:32-->
* Memory debugging module (which is off by default) doesn't work.
* The memory debugging module (which is off by default) does not work.


=== MAP === <!--T:33-->
=== MAP === <!--T:33-->
* MAP currently doesn't work correctly on Graham. We are working on resolving this issue. For now the workaround is to request a SHARCNET account from your Compute Canada account (ccdb), and then run MAP on the SHARCNET's legacy cluster orca's development nodes using [https://www.sharcnet.ca/help/index.php/MAP these instructions].
* MAP currently does not work correctly on Graham; we are working on resolving this issue. For the moment, the workaround is to request a SHARCNET account from your Compute Canada account (via CCDB) and run MAP on Orca's development nodes using [https://www.sharcnet.ca/help/index.php/MAP these instructions].


</translate>
</translate>
