= Environment modules = <!--T:4-->
<!--T:48-->
The latest version of NAMD is 2.14 and it has been installed on all clusters. We recommend users run the newest version.
<!--T:49-->
Older versions 2.13 and 2.12 are also available.
<!--T:50-->
To run jobs that span nodes, use the OFI versions on Cedar and the UCX versions on other clusters.
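For example, a minimal sketch of the corresponding <code>module load</code> commands; the UCX module name is an assumption that follows the naming of the OFI modules used below, so check <code>module spider</code> on your cluster for the exact names:
<pre>
# On Cedar (OFI build; module name taken from the example scripts below):
module load StdEnv/2020 namd-ofi/2.14

# On other clusters (UCX build; module name is an assumption following the same naming scheme):
module load StdEnv/2020 namd-ucx/2.14
</pre>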
== Verbs jobs == <!--T:16-->
<!--T:51-->
NOTE: For NAMD 2.14, use the OFI GPU version on Cedar and the UCX GPU version on other clusters. The instructions below apply only to NAMD versions 2.13 and 2.12.
<!--T:52-->
These provisional instructions will be refined further once this configuration can be fully tested on the new clusters.
This example uses 64 processes in total on 2 nodes, each node running 32 processes, thus fully utilizing its 32 cores. The script assumes that full nodes are used, so <code>ntasks-per-node</code> should be 32 (on Graham). For best performance, NAMD jobs should use full nodes.
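Below is a minimal sketch of a submission script matching this layout. It is not the page's tested script: the module name (<code>namd-verbs/2.13</code>), the placeholder input file <code>stmv.namd</code>, and the way the Charm++ nodelist is built are assumptions.
<pre>
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32   # full 32-core nodes on Graham
#SBATCH --mem=0                # all memory on the node
#SBATCH --time=0:30:00

# Assumed module name for the verbs build of NAMD 2.13.
module load namd-verbs/2.13

# Build a Charm++ nodelist file from the nodes Slurm assigned to this job.
NODEFILE=nodefile.$SLURM_JOB_ID
echo "group main" > $NODEFILE
for host in $(scontrol show hostnames $SLURM_JOB_NODELIST); do
  echo "host $host" >> $NODEFILE
done

# Launch one NAMD process per Slurm task; stmv.namd is a placeholder input file.
charmrun +p$SLURM_NTASKS ++nodelist $NODEFILE namd2 +idlepoll stmv.namd
</pre>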
<translate>
== UCX GPU jobs == <!--T:44-->
This example is for Béluga and it assumes that full nodes are used, which gives the best performance for NAMD jobs. It uses 8 processes in total on 2 nodes, each process (task) using 10 threads and 1 GPU. This fully utilizes the Béluga GPU nodes, which have 40 cores and 4 GPUs per node. Note that 1 core per task has to be reserved for a communications thread, so NAMD will report that only 72 cores are being used, but this is normal.
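The page's full script for this case is not reproduced in this fragment; below is a minimal sketch matching the layout described above (2 nodes, 4 tasks per node, 10 cores and 1 GPU per task). The module name <code>namd-ucx-smp/2.14</code> and the exact launch options are assumptions.
<pre>
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4    # one task per GPU
#SBATCH --cpus-per-task=10     # 4 x 10 = 40 cores per Béluga GPU node
#SBATCH --gres=gpu:4           # all 4 GPUs on each node
#SBATCH --mem=0                # all memory on the node
#SBATCH --time=0:30:00

# Assumed module name for the UCX SMP (GPU) build of NAMD 2.14.
module load StdEnv/2020 cuda/11.0 namd-ucx-smp/2.14

# One core per task is reserved for the communication thread,
# so each process runs CPUS_PER_TASK - 1 worker threads.
NUM_PES=$(expr $SLURM_CPUS_PER_TASK - 1)

# stmv.namd is a placeholder input file; the launch options are a sketch.
srun --mpi=pmi2 namd2 +ppn $NUM_PES +idlepoll stmv.namd
</pre>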
== OFI jobs == <!--T:53-->
<!--T:54-->
'''NOTE''': OFI versions will run '''ONLY''' on Cedar because of its different interconnect.
{{File
#SBATCH -o slurm.%N.%j.out # STDOUT
<!--T:55-->
module load StdEnv/2020 namd-ofi/2.14
srun --mpi=pmi2 namd2 stmv.namd
}}
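Only part of the script is visible above; the resource-request lines are omitted. A complete minimal sketch could look like the following, where the node and task counts are example values and not the page's exact script:
<pre>
#!/bin/bash
#SBATCH --nodes=2              # example values; use full nodes for best performance
#SBATCH --ntasks-per-node=32   # match the core count of the Cedar nodes you request
#SBATCH --mem=0                # all memory on the node
#SBATCH --time=0:30:00
#SBATCH -o slurm.%N.%j.out     # STDOUT

module load StdEnv/2020 namd-ofi/2.14
srun --mpi=pmi2 namd2 stmv.namd
</pre>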
== OFI GPU jobs == <!--T:56-->
<!--T:57-->
'''NOTE''': OFI versions will run '''ONLY''' on Cedar because of its different interconnect.
{{File
#SBATCH --mem=0 # memory per node, 0 means all memory
<!--T:58-->
module load StdEnv/2020 cuda/11.0 namd-ofi-smp/2.14
NUM_PES=$(expr $SLURM_CPUS_PER_TASK - 1 )
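# The rest of this script is not shown in this fragment. A hedged sketch of
# the launch line (the exact options on the full page may differ):
srun --mpi=pmi2 namd2 +ppn $NUM_PES +idlepoll stmv.namd
}}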
== Verbs-GPU jobs == <!--T:20-->
<!--T:59-->
NOTE: For NAMD 2.14, use the OFI GPU version on Cedar and the UCX GPU version on other clusters. The instructions below apply only to NAMD versions 2.13 and 2.12.
<!--T:60-->
This example uses 64 processes in total on 2 nodes, each node running 32 processes, thus fully utilizing its 32 cores. Each node uses 2 GPUs, so the job uses 4 GPUs in total. The script assumes that full nodes are used, so <code>ntasks-per-node</code> should be 32 (on Graham). For best performance, NAMD jobs should use full nodes.
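A minimal sketch of a matching script is given below; it mirrors the Verbs sketch above with a GPU request added. The module name and the <code>+devices</code> GPU selection are assumptions, not the page's tested script.
<pre>
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32   # full 32-core nodes on Graham
#SBATCH --gres=gpu:2           # 2 GPUs per node, 4 GPUs in total
#SBATCH --mem=0
#SBATCH --time=0:30:00

# Assumed module name for a CUDA-enabled verbs build of NAMD 2.13.
module load cuda namd-verbs/2.13

# Build a Charm++ nodelist from the Slurm allocation, as in the Verbs sketch above.
NODEFILE=nodefile.$SLURM_JOB_ID
echo "group main" > $NODEFILE
for host in $(scontrol show hostnames $SLURM_JOB_NODELIST); do
  echo "host $host" >> $NODEFILE
done

# +devices lists the GPUs NAMD may use on each node (sketch values).
charmrun +p$SLURM_NTASKS ++nodelist $NODEFILE namd2 +idlepoll +devices 0,1 stmv.namd
</pre>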