rsnt_translations
57,772
edits
No edit summary |
(Marked this version for translation) |
||
Line 43: | Line 43: | ||
* <code>vasp_ncl</code> for NPT ensemble and non-gamma-point calculations | * <code>vasp_ncl</code> for NPT ensemble and non-gamma-point calculations | ||
<!--T:24--> | |||
For VASP-5.4.4 and 6.1.0 with the CUDA module there are two different executable files as well: | For VASP-5.4.4 and 6.1.0 with the CUDA module there are two different executable files as well: | ||
* <code>vasp_gpu</code> for standard NVT calculations with gamma and non-gamma k-points | * <code>vasp_gpu</code> for standard NVT calculations with gamma and non-gamma k-points | ||
Line 55: | Line 56: | ||
If you need a version of VASP that does not appear here, you can either build it yourself (see below) or [[Technical support | write to us]] and ask that it be built and installed. | If you need a version of VASP that does not appear here, you can either build it yourself (see below) or [[Technical support | write to us]] and ask that it be built and installed. | ||
== Vasp-GPU == | == Vasp-GPU == <!--T:25--> | ||
Vasp-GPU executable files run on both the GPUs and the CPUs of a node. Because computing on a GPU is much more expensive than on a CPU, we strongly recommend running a benchmark with one or two GPUs to make sure the GPUs are fully utilized. Fig.1 shows a benchmark of a Si crystal containing 256 Si atoms in the unit cell. The blue, black and red lines show simulation time as a function of the number of CPUs for GPU=0, 1, and 2 respectively. With CPU=1, the performance for GPU=1 or 2 is more than 5 times better than for GPU=0. However, a comparison between GPU=1 and GPU=2 shows little performance gain from the second GPU. In fact, our monitoring system shows GPU utilization of only around 50% for GPU=2. We therefore recommend that users first run a benchmark like this on their own system to make sure they are not wasting computing resources. | Vasp-GPU executable files run on both the GPUs and the CPUs of a node. Because computing on a GPU is much more expensive than on a CPU, we strongly recommend running a benchmark with one or two GPUs to make sure the GPUs are fully utilized. Fig.1 shows a benchmark of a Si crystal containing 256 Si atoms in the unit cell. The blue, black and red lines show simulation time as a function of the number of CPUs for GPU=0, 1, and 2 respectively. With CPU=1, the performance for GPU=1 or 2 is more than 5 times better than for GPU=0. However, a comparison between GPU=1 and GPU=2 shows little performance gain from the second GPU. In fact, our monitoring system shows GPU utilization of only around 50% for GPU=2. We therefore recommend that users first run a benchmark like this on their own system to make sure they are not wasting computing resources. | ||
<!--T:26--> | |||
[[File:Vasp-GPU-benchmark.pdf|thumb|Fig.1 Simulation time as a function of number of CPU for GPU=0, 1, and 2]] | [[File:Vasp-GPU-benchmark.pdf|thumb|Fig.1 Simulation time as a function of number of CPU for GPU=0, 1, and 2]] | ||
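When comparing benchmark runs like those in Fig.1, it can help to express the result as a speedup ratio. The following is a minimal sketch, not part of the original page: the <code>speedup</code> helper and the timings are hypothetical and only illustrate the arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: ratio of a baseline wall time to a test wall time.
# Both arguments are elapsed times in seconds and must be positive.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

# Illustrative timings only (assumed values, NOT read from Fig.1).
t_gpu0=1000   # CPU=1, GPU=0
t_gpu1=190    # CPU=1, GPU=1
t_gpu2=180    # CPU=1, GPU=2

echo "GPU=1 vs GPU=0: $(speedup "$t_gpu0" "$t_gpu1")x"
echo "GPU=2 vs GPU=1: $(speedup "$t_gpu1" "$t_gpu2")x"
```

A ratio close to 1 between the GPU=1 and GPU=2 runs suggests that the second GPU adds little for that system.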
Line 90: | Line 92: | ||
*<VASP> is the name of the executable. The above section "Executable programs" shows the various executables that you can choose for each version. | *<VASP> is the name of the executable. The above section "Executable programs" shows the various executables that you can choose for each version. | ||
<!--T:27--> | |||
{{File | {{File | ||
|name=vasp_gpu_job.sh | |name=vasp_gpu_job.sh | ||
Line 104: | Line 107: | ||
}} | }} | ||
<!--T:28--> | |||
*The above job script requests one CPU core and 1024MB memory. | *The above job script requests one CPU core and 1024MB memory. | ||
*The above job script requests one GPU of type p100, which is only available on Cedar. For any other cluster, please check which GPU type is available | *The above job script requests one GPU of type p100, which is only available on Cedar. For any other cluster, please check which GPU type is available | ||
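The resource requests described above could be adapted to another GPU type along the following lines. This is only a sketch under stated assumptions: the <code>make_gpu_job</code> function is hypothetical, the use of <code>srun</code> and the omission of account, time limit and module lines are assumptions, and the real <code>vasp_gpu_job.sh</code> on the page may differ.

```shell
#!/bin/bash
# Hypothetical generator: prints a Slurm job script that requests one CPU core,
# 1024MB of memory and one GPU of the given type (default: p100).
make_gpu_job() {
    local gpu_type="${1:-p100}"
    cat <<EOF
#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=1024M
#SBATCH --gres=gpu:${gpu_type}:1
# Assumption: add your account, time limit and module lines here.
srun <VASP>
EOF
}

# Print a script for the default GPU type.
make_gpu_job
```

Passing a different GPU type as the first argument changes only the <code>--gres</code> line; <VASP> remains a placeholder for the executable name.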