Ansys: Difference between revisions
Revision as of 17:12, 26 January 2023
Ansys is a software suite for engineering simulation and 3-D design. It includes packages such as Ansys Fluent and Ansys CFX.
Licensing
We are a hosting provider for Ansys. This means that we have the software installed on our clusters, but we do not provide a generic license accessible to everyone. However, many institutions, faculties, and departments already have licenses that can be used on our clusters. Once the legal aspects are worked out for licensing, there will be remaining technical aspects. The license server on your end will need to be reachable by our compute nodes. This will require our technical team to get in touch with the technical people managing your license software. In some cases, this has already been done. You should then be able to load the Ansys module, and it should find its license automatically. If this is not the case, please contact our Technical support so that they can arrange this for you.
Configuring your own license file
Our module for Ansys is designed to look for license information in a few places. One of those places is your /home folder. If you have your own license server, write the information needed to access it to the file $HOME/.licenses/ansys.lic using the following format:
setenv("ANSYSLMD_LICENSE_FILE", "port@hostname")
setenv("ANSYSLI_SERVERS", "port@hostname")
Cluster-specific settings for port@hostname are given in the following table:
License | Cluster | ANSYSLMD_LICENSE_FILE | ANSYSLI_SERVERS | Notices |
---|---|---|---|---|
CMC | beluga | 6624@10.20.73.21 | 2325@10.20.73.21 | None |
CMC | cedar | 6624@172.16.0.101 | 2325@172.16.0.101 | None |
CMC | graham | 6624@199.241.167.222 | 2325@199.241.167.222 | None |
CMC | narval | 6624@10.100.64.10 | 2325@10.100.64.10 | None |
SHARCNET | beluga/cedar/graham/gra-vdi/narval | 1055@license3.sharcnet.ca | 2325@license3.sharcnet.ca | None |
Researchers who purchase a CMC license subscription must send their Alliance account username to <cmcsupport@cmc.ca>; otherwise license checkouts will fail. The number of cores that can be used with a CMC license is described in the Other Tricks and Tips section found here.
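As a minimal sketch of the file format described above, the following shell function writes the two setenv lines to an ansys.lic file. The function name write_ansys_lic and its optional third argument (target directory) are hypothetical and only serve the illustration; the port@hostname values in the usage comment are the SHARCNET entries from the table.

```shell
#!/bin/bash
# Sketch only: write_ansys_lic is a hypothetical helper, not part of the ansys module.
# Arguments: 1) value for ANSYSLMD_LICENSE_FILE  2) value for ANSYSLI_SERVERS
#            3) target directory (assumed default: $HOME/.licenses)
write_ansys_lic() {
    local flex="$1"
    local interconnect="$2"
    local dir="${3:-$HOME/.licenses}"
    mkdir -p "$dir" || return 1
    # Emit exactly the two lines of the format shown above.
    printf 'setenv("ANSYSLMD_LICENSE_FILE", "%s")\nsetenv("ANSYSLI_SERVERS", "%s")\n' \
        "$flex" "$interconnect" > "$dir/ansys.lic"
}

# Example call (SHARCNET values from the table):
#   write_ansys_lic 1055@license3.sharcnet.ca 2325@license3.sharcnet.ca
```

Calling the function with only two arguments overwrites any existing $HOME/.licenses/ansys.lic, so it may be worth keeping a copy of a working file first.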
Local License Servers
Before a local institutional Ansys license server can be reached from our systems, firewall configuration changes will need to be made on both the institution side and on our side. To start this process, contact your local Ansys license server administrator and obtain the following information: 1) fully qualified hostname of the local Ansys license server; 2) Ansys flex port (commonly 1055); 3) Ansys licensing interconnect port (commonly 2325); and 4) Ansys static vendor port (site specific). Ensure the administrator is willing to open the firewall on these three ports to accept license checkout requests from your Ansys jobs running on our systems. Then contact our Technical support, send us the four pieces of information, and indicate which system(s) you want to run Ansys on, for example Cedar, Beluga, Graham/gra-vdi or Niagara.
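Once the firewall changes are in place, a quick TCP reachability test can help confirm that the ports respond. The sketch below is one possible way to do this and is not an official procedure: check_port is a hypothetical helper, it relies on bash's /dev/tcp redirection and the coreutils timeout command, and the hostname in the usage comment is a placeholder. A port reported as unreachable may be closed, filtered, or simply not opened for the machine the test is run from.

```shell
#!/bin/bash
# Sketch only: check_port is a hypothetical helper.
# It attempts a TCP connection and reports whether it succeeded within 5 seconds.
check_port() {
    local host="$1" port="$2"
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port open"
    else
        echo "$host:$port unreachable"
    fi
}

# Example (placeholder hostname; the vendor port is site specific):
#   for p in 1055 2325; do check_port license.example.org "$p"; done
```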
Checking License Usage
Ansys comes with an lmutil tool that can be used to check the detailed usage of your licenses. Before using this tool as shown below, make sure the ansys.lic file is configured, and an ansys module is loaded:
[name@server ~]$ module load ansys
[name@server ~]$ $EBROOTANSYS/shared_files/licensing/linx64/lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -S ansyslmd
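The lmstat output can be long. As a rough sketch, the filter below condenses it to one line per feature. It assumes feature lines have the usual FlexLM form "Users of FEATURE:  (Total of N licenses issued;  Total of M licenses in use)"; summarize_lmstat is a hypothetical name, and the field positions would need adjusting if your server prints a different layout.

```shell
#!/bin/bash
# Sketch only: summarize_lmstat is a hypothetical filter for lmstat output.
# Assumed input line format:
#   Users of cfd_base:  (Total of 25 licenses issued;  Total of 3 licenses in use)
summarize_lmstat() {
    awk '/^Users of / {
        feature = $3
        sub(/:$/, "", feature)      # strip the trailing colon from the feature name
        issued = $6                 # number after the first "Total of"
        used = $11                  # number after the second "Total of"
        printf "%s %d/%d in use\n", feature, used, issued
    }'
}

# Example: pipe the lmutil command shown above into the filter:
#   $EBROOTANSYS/shared_files/licensing/linx64/lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -S ansyslmd | summarize_lmstat
```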
Version Compatibility
As explained in ANSYS Platform Support, the current release (2021R2) was tested to read and open databases from the five previous releases. In addition, some products can read and open databases from releases before Ansys 18.1.
Cluster Batch Job Submission
The Ansys software suite comes with multiple implementations of MPI to support parallel computation. Unfortunately, none of them supports our Slurm scheduler. For this reason, we need special instructions for each ANSYS package on how to start a parallel job. In the sections below, we give examples of submission scripts for some of the packages. If one is not covered and you want us to investigate and help you start it, please contact our Technical support.
Ansys Fluent
Typically you would use the following procedure for running Fluent on one of our clusters:
- Prepare your Fluent job using Fluent from the "ANSYS Workbench" on your Desktop machine up to the point where you would run the calculation.
- Export the "case" file using File > Export > Case... or find the folder where Fluent saves your project's files. The "case" file will often have a name like FFF-1.cas.gz.
- If you already have data from a previous calculation, which you want to continue, export a "data" file as well (File > Export > Data...) or find it in the same project folder (FFF-1.dat.gz).
- Transfer the "case" file (and if needed the "data" file) to a directory on the project or scratch filesystem on the cluster. When exporting, you can save the file(s) under a more instructive name than FFF-1.* or rename them when uploading them.
- Now you need to create a "journal" file. Its purpose is to load the case file (and optionally the data file), run the solver and finally write the results. See examples below and remember to adjust the filenames and desired number of iterations.
- If jobs frequently fail to start due to license shortages (and manual resubmission of failed jobs is not convenient), consider modifying your slurm script to requeue your job (up to 4 times) as shown in the following "Fluent Slurm Script (by node + requeue)" tab. Be aware that doing this will also requeue simulations that fail due to non-license related issues (such as divergence), resulting in lost compute time. It is therefore strongly recommended to monitor and inspect each slurm output file to confirm that each requeue attempt is license related. When it is determined that a job was requeued due to a simulation issue, immediately kill the job progression manually with scancel jobid and correct the problem.
- After running the job you can download the "data" file and import it back into Fluent with File > Import > Data....
Slurm Scripts
General Purpose
Most Fluent jobs should use the following by node slurm script to minimize solution latency and maximize performance over as few nodes as possible. Very large jobs, however, may benefit from starting significantly sooner with the by core version, although the actual time to launch a job over many nodes may be significantly longer, offsetting some of the benefit. In addition, be aware that a large job spread over an unspecified and potentially very large number of nodes is far more vulnerable to crashing if any of the compute nodes fails during the simulation, which again offsets the benefit of a shorter startup time compared with the fixed number of nodes approach. The scripts ensure Fluent uses shared memory for communication when run on a single node, or distributed memory (utilizing MPI and the appropriate HPC interconnect) when run over multiple nodes.
#!/bin/bash
#SBATCH --account=def-group # Specify account name
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
#SBATCH --nodes=1 # Specify number of compute nodes (1 or more)
#SBATCH --ntasks-per-node=32 # Specify number of cores per node (graham 32 or 44, cedar 48, beluga 40, narval 64)
#SBATCH --mem=0 # Do not change (allocates all memory per compute node)
#SBATCH --cpus-per-task=1 # Do not change
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
fi
#!/bin/bash
#SBATCH --account=def-group # Specify account
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
##SBATCH --nodes=2 # Optional (uncomment to specify 1 or more compute nodes)
#SBATCH --ntasks=16 # Specify total number of cores
#SBATCH --mem-per-cpu=4G # Specify memory per core
#SBATCH --cpus-per-task=1 # Do not change
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
fi
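The launch logic shared by the two scripts above (compute the core count, then pick the single-node or multi-node command line) can be checked without a cluster. The following dry-run sketch only prints the command that would be executed; fluent_cmd is a hypothetical helper whose arguments stand in for the SLURM_NNODES, SLURM_NTASKS, SLURM_CPUS_PER_TASK and SLURM_JOB_ID variables, and the job ID 12345 is made up.

```shell
#!/bin/bash
# Sketch only: fluent_cmd prints the Fluent command line; it does not start Fluent.
# Arguments: number of nodes, number of tasks, cpus per task, job id.
fluent_cmd() {
    local nnodes="$1" ntasks="$2" cpus_per_task="$3" jobid="$4"
    local ncores=$((ntasks * cpus_per_task))
    if [ "$nnodes" -eq 1 ]; then
        # Single node: shared memory, no machinefile needed.
        echo "fluent -g 2ddp -t $ncores -affinity=0 -i sample.jou"
    else
        # Multiple nodes: distributed memory with a machinefile and Intel MPI.
        echo "fluent -g 2ddp -t $ncores -affinity=0 -cnf=/tmp/machinefile-$jobid -mpi=intel -ssh -i sample.jou"
    fi
}

fluent_cmd 1 32 1 12345   # one full 32-core node
fluent_cmd 2 64 1 12345   # two 32-core nodes
```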
License Requeue
These two scripts can be used to automatically requeue Fluent jobs that exit abnormally during startup and require several requeue attempts to check out the required licenses from a remote server. The cause could be the license server randomly experiencing heavy load, transient network instability or a shortage of licenses. It is not recommended to use these scripts with Fluent jobs that start normally and run a long time before crashing (for instance due to a node failure, divergence or running out of slurm runtime before completing normally), since they will be restarted from the beginning instead of from the most recent saved solution dat file, resulting in a significant amount of lost compute time and resources.
#!/bin/bash
#SBATCH --account=def-group # Specify account
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
#SBATCH --nodes=1 # Specify number of compute nodes (1 or more)
#SBATCH --ntasks-per-node=32 # Specify number of cores per node (graham 32 or 44, cedar 48, beluga 40, narval 64)
#SBATCH --mem=0 # Do not change (allocates all memory per compute node)
#SBATCH --cpus-per-task=1 # Do not change
#SBATCH --array=1-5%1 # Specify number of requeue attempts (2 or more, 5 is shown)
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
fi
if [ $? -eq 0 ]; then
echo "Job completed successfully! Exiting now."
scancel $SLURM_ARRAY_JOB_ID
else
echo "Job attempt $SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT failed due to license or simulation issue!"
if [ $SLURM_ARRAY_TASK_ID -lt $SLURM_ARRAY_TASK_COUNT ]; then
echo "Resubmitting job now ..."
else
echo "All job attempts failed exiting now."
fi
fi
#!/bin/bash
#SBATCH --account=def-group # Specify account
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
##SBATCH --nodes=2 # Optional (uncomment to specify 1 or more compute nodes)
#SBATCH --ntasks=16 # Specify total number of cores
#SBATCH --mem-per-cpu=4G # Specify memory per core
#SBATCH --cpus-per-task=1 # Do not change
#SBATCH --array=1-5%1 # Specify number of requeue attempts (2 or more, 5 is shown)
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
fi
if [ $? -eq 0 ]; then
echo "Job completed successfully! Exiting now."
scancel $SLURM_ARRAY_JOB_ID
else
echo "Job attempt $SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT failed due to license or simulation issue!"
if [ $SLURM_ARRAY_TASK_ID -lt $SLURM_ARRAY_TASK_COUNT ]; then
echo "Resubmitting job now ..."
else
echo "All job attempts failed exiting now."
fi
fi
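The requeue behaviour comes from the job array: with --array=1-5%1 only one array task runs at a time, a successful task cancels the remaining tasks with scancel, and a failed task simply lets the next task start. The decision at the end of both scripts can be sketched as a small function that can be exercised without Slurm; next_action and the action names it prints are hypothetical.

```shell
#!/bin/bash
# Sketch only: next_action mirrors the if/else block at the end of the requeue scripts.
# Arguments: exit status of fluent, array task id, array task count.
next_action() {
    local status="$1" task_id="$2" task_count="$3"
    if [ "$status" -eq 0 ]; then
        echo "cancel-remaining"   # success: the scripts call scancel on the array job
    elif [ "$task_id" -lt "$task_count" ]; then
        echo "retry"              # failure: the next array task will run
    else
        echo "give-up"            # failure on the last array task
    fi
}

next_action 0 1 5
next_action 1 3 5
next_action 1 5 5
```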
Solution Restart
The following two scripts automate restarting very large jobs that require more than the typical 7-day maximum runtime window available on most clusters to complete; each restart resumes from the most recent saved time step files. A fundamental requirement is that the first time step can be completed within the requested job array time limit (specified at the top of your slurm script) when starting a simulation from an initialized solution field. It is assumed that a standard fixed time step size is being used.
To begin, a working set of sample.cas, sample.dat and sample.jou files must be present. Next, edit your sample.jou file to contain "/solve/dual-time-iterate 1" and "/file/auto-save/data-frequency 1". Then create a restart journal file by doing "cp sample.jou sample-restart.jou" and edit the sample-restart.jou file to contain "/file/read-cas-data sample-restart" instead of "/file/read-cas-data sample", and comment out the initialization line with a semicolon, for instance ";/solve/initialize/initialize-flow".
If your second and subsequent time steps are known to run twice as fast as the initial time step, edit sample-restart.jou to specify "/solve/dual-time-iterate 2". The solution will then only be restarted after 2 time steps are completed following the initial time step. An output file for each time step will still be saved in the output subdirectory. The value 2 is arbitrary but should be chosen such that the time for 2 steps fits within the job array time limit. Doing this minimizes the number of solution restarts, which are computationally expensive. If the first time step performed by sample.jou starts from a converged (previous) solution, choose 1 instead of 2, since all time steps will likely require a similar amount of wall time to complete.
Assuming 2 is chosen, the total simulated time will be 1*Dt+2*Nrestart*Dt, where Nrestart is the number of solution restarts specified in the slurm script. The total number of time steps (and hence the number of output files generated) will therefore be 1+2*Nrestart. The slurm time request should be chosen so that the initial time step and the subsequent time steps complete comfortably within the slurm time window, which can be set up to a maximum of 7 days ("#SBATCH --time=07-00:00").
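The step and time arithmetic above can be sketched as two small shell functions. The names total_steps and total_time are hypothetical, the number of steps per restart is passed as a parameter (2 in the discussion above), and the example values (Dt = 0.0001, 4 restarts) are made up for illustration.

```shell
#!/bin/bash
# Sketch only: helper names and example values are assumptions.
# total_steps STEPS_PER_RESTART NRESTART  ->  1 + STEPS_PER_RESTART * NRESTART
total_steps() {
    echo $((1 + $1 * $2))
}

# total_time DT STEPS_PER_RESTART NRESTART  ->  DT * (1 + STEPS_PER_RESTART * NRESTART)
# awk is used because the shell has no floating-point arithmetic.
total_time() {
    awk -v dt="$1" -v k="$2" -v n="$3" 'BEGIN { printf "%g\n", dt * (1 + k * n) }'
}

total_steps 2 4          # time steps (and output files) for 4 restarts of 2 steps each
total_time 0.0001 2 4    # simulated time for the same case
```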
#!/bin/bash
#SBATCH --account=def-group # Specify account
#SBATCH --time=07-00:00 # Specify time limit dd-hh:mm
#SBATCH --nodes=1 # Specify number of compute nodes (1 or more)
#SBATCH --ntasks-per-node=32 # Specify number of cores per node (graham 32 or 44, cedar 48, beluga 40, narval 64)
#SBATCH --mem=0 # Do not change (allocates all memory per compute node)
#SBATCH --cpus-per-task=1 # Do not change
#SBATCH --array=1-5%1 # Specify number of solution restarts (2 or more, 5 is shown)
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
if [ "$SLURM_ARRAY_TASK_ID" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -i sample-restart.jou
fi
else
if [ "$SLURM_ARRAY_TASK_ID" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample-restart.jou
fi
fi
if [ $? -eq 0 ]; then
echo
echo "SLURM_ARRAY_TASK_ID = $SLURM_ARRAY_TASK_ID"
echo "SLURM_ARRAY_TASK_COUNT = $SLURM_ARRAY_TASK_COUNT"
echo
if [ $SLURM_ARRAY_TASK_ID -lt $SLURM_ARRAY_TASK_COUNT ]; then
echo "Restarting job with the most recent output dat file ..."
ln -sfv output/$(ls -ltr output | grep '\.cas' | tail -n1 | awk '{print $9}') sample-restart.cas.gz
ln -sfv output/$(ls -ltr output | grep '\.dat' | tail -n1 | awk '{print $9}') sample-restart.dat.gz
ls -lh sample* output/*
else
echo "Job completed successfully! Exiting now."
scancel $SLURM_ARRAY_JOB_ID
fi
else
echo "Simulation failed. Exiting ..."
fi
#!/bin/bash
#SBATCH --account=def-group # Specify account
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
##SBATCH --nodes=2 # Optional (uncomment to specify 1 or more compute nodes)
#SBATCH --ntasks=16 # Specify total number of cores
#SBATCH --mem-per-cpu=4G # Specify memory per core
#SBATCH --cpus-per-task=1 # Do not change
#SBATCH --array=1-5%1 # Specify number of restart aka time steps (2 or more, 5 is shown)
rm -f cleanup* core*
#module load StdEnv/2016 # Applies to cedar, beluga, graham
#module load ansys/2020R2 # or older module versions
#module load StdEnv/2020 # Applies to narval only
#module load ansys/2019R3 # or newer module versions
module load StdEnv/2020 # Applies to cedar, beluga, graham
module load ansys/2021R2 # or newer module versions
if [[ "${CC_CLUSTER}" == narval && "${EBVERSIONANSYS}" == @(2019R3|2020R2) ]]; then
module load intel/2021 intelmpi
export INTELMPI_ROOT=$I_MPI_ROOT/mpi/latest
unset I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS
unset I_MPI_ROOT
fi
slurm_hl2hl.py --format ANSYS-FLUENT > /tmp/machinefile-$SLURM_JOB_ID
NCORES=$((SLURM_NTASKS * SLURM_CPUS_PER_TASK))
# Specify 2d, 2ddp, 3d or 3ddp and replace sample with your journal filename ...
if [ "$SLURM_NNODES" == 1 ]; then
if [ "$SLURM_ARRAY_TASK_ID" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -i sample-restart.jou
fi
else
if [ "$SLURM_ARRAY_TASK_ID" == 1 ]; then
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample.jou
else
fluent -g 2ddp -t $NCORES -affinity=0 -cnf=/tmp/machinefile-$SLURM_JOB_ID -mpi=intel -ssh -i sample-restart.jou
fi
fi
if [ $? -eq 0 ]; then
echo
echo "SLURM_ARRAY_TASK_ID = $SLURM_ARRAY_TASK_ID"
echo "SLURM_ARRAY_TASK_COUNT = $SLURM_ARRAY_TASK_COUNT"
echo
if [ $SLURM_ARRAY_TASK_ID -lt $SLURM_ARRAY_TASK_COUNT ]; then
echo "Restarting job with the most recent output dat file ..."
ln -sfv output/$(ls -ltr output | grep '\.cas' | tail -n1 | awk '{print $9}') sample-restart.cas.gz
ln -sfv output/$(ls -ltr output | grep '\.dat' | tail -n1 | awk '{print $9}') sample-restart.dat.gz
ls -lh sample* output/*
else
echo "Job completed successfully! Exiting now."
scancel $SLURM_ARRAY_JOB_ID
fi
else
echo "Simulation failed. Exiting ..."
fi
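Both restart scripts pick the newest case and data files in the output subdirectory before linking them as sample-restart.cas.gz and sample-restart.dat.gz. As an alternative sketch of that selection step, the function below sorts by modification time with ls -t instead of taking the ninth column of ls -ltr. The name latest_file is hypothetical, and the approach assumes filenames without newlines.

```shell
#!/bin/bash
# Sketch only: latest_file is a hypothetical helper.
# Prints the most recently modified file in DIR whose name contains EXT,
# or nothing if no file matches.
latest_file() {
    local dir="$1" ext="$2"
    ls -t "$dir"/*"$ext"* 2>/dev/null | head -n 1
}

# Example use inside a restart script:
#   ln -sfv "$(latest_file output .cas)" sample-restart.cas.gz
#   ln -sfv "$(latest_file output .dat)" sample-restart.dat.gz
```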
Journal Files
Fluent journal files can include essentially any command from Fluent's Text-User-Interface (TUI); commands can be used to change simulation parameters like temperature, pressure and flow speed. With this you can run a series of simulations under different conditions with a single case file, by only changing the parameters in the journal file. Refer to the Fluent User's Guide for more information and a list of all commands that can be used. The Fluent journal files below are set up with "/file/cff-files no" to use the legacy .cas/.dat file format (the default in module versions 2019R3 or older). Set this instead to "/file/cff-files yes" to use the more efficient .cas.h5/.dat.h5 file format (the default in module versions 2020R1 or newer).
; SAMPLE FLUENT JOURNAL FILE - STEADY SIMULATION
; ----------------------------------------------
; lines beginning with a semicolon are comments
; Read/Write ANSYS common fluids format (CFF) files? [yes by default]
/file/cff-files no
; Read input case and data files:
/file/read-case-data FFF-in
; Run the solver for this many iterations:
/solve/iterate 1000
; Overwrite output files by default:
/file/confirm-overwrite n
; Write final output data file:
/file/write-data FFF-out
; Write simulation report to file (optional):
/report/summary y "My_Simulation_Report.txt"
; Exit fluent (required to requeue):
exit
; SAMPLE FLUENT JOURNAL FILE - STEADY SIMULATION
; ----------------------------------------------
; lines beginning with a semicolon are comments
; Read/Write ANSYS common fluids format (CFF) files? [yes by default]
/file/cff-files no
; Read input files (.cas/.dat or .cas.h5/.dat.h5):
/file/read-case-data FFF-in
; Write a data file every 100 iterations:
/file/auto-save/data-frequency 100
; Retain data files from 5 most recent iterations:
/file/auto-save/retain-most-recent-files y
; Write data files to output sub-directory (appends iteration)
/file/auto-save/root-name output/FFF-out
; Run the solver for this many iterations:
/solve/iterate 1000
; Write final output case and data files:
/file/write-case-data FFF-out
; Write simulation report to file (optional):
/report/summary y "My_Simulation_Report.txt"
; Exit fluent (required to requeue):
exit
; SAMPLE FLUENT JOURNAL FILE - TRANSIENT SIMULATION
; -------------------------------------------------
; lines beginning with a semicolon are comments
; Read/Write ANSYS common fluids format (CFF) files? [yes by default]
/file/cff-files no
; Read only the input case file:
/file/read-case FFF-transient-inp
; For continuation (restart) read in both case and data input files:
;/file/read-case-data FFF-transient-inp
; Write a data (and maybe case) file every 100 time steps:
/file/auto-save/data-frequency 100
/file/auto-save/case-frequency if-case-is-modified
; Retain only the most recent 5 data (and maybe case) files:
/file/auto-save/retain-most-recent-files y
; Write to output sub-directory (appends flowtime and timestep)
/file/auto-save/root-name output/FFF-transient-out-%10.6f
; ##### Settings for Transient simulation : #####
; Set the physical time step size
/solve/set/time-step 0.0001
; Set the max number of iterations per time step:
/solve/set/max-iterations-per-time-step 20
; Set the number of iterations for which convergence monitors are reported:
/solve/set/reporting-interval 1
; ##### End of settings for Transient simulation #####
; Initialize using the hybrid initialization method:
/solve/initialize/hyb-initialization
; Perform unsteady iterations for a specified number of time steps:
/solve/dual-time-iterate 1000 ,
; Write final case and data output files:
/file/write-case-data FFF-transient-out
; Write simulation report to file (optional):
/report/summary y Report_Transient_Simulation.txt
; Exit fluent (required to requeue):
exit
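Since a series of simulations can share one case file and differ only in the journal, journal files can also be generated from a script. The sketch below prints a steady-state journal built only from TUI commands that appear in the samples above; make_journal and the file names passed to it are hypothetical.

```shell
#!/bin/bash
# Sketch only: make_journal is a hypothetical helper that prints a journal to stdout.
# Arguments: input case/data name, output data name, number of iterations.
make_journal() {
    local infile="$1" outfile="$2" iters="$3"
    cat <<EOF
; generated journal - steady simulation
/file/cff-files no
/file/read-case-data $infile
/solve/iterate $iters
/file/confirm-overwrite n
/file/write-data $outfile
exit
EOF
}

make_journal FFF-in FFF-out-500 500
```

Redirecting the output, for example make_journal FFF-in FFF-out-500 500 > sample-500.jou, gives one journal per parameter value, each of which can then be named in a separate job script.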
Ansys CFX
Slurm Scripts
#!/bin/bash
#SBATCH --account=def-group # Specify account name
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
#SBATCH --nodes=1 # Specify number compute nodes (1 or more)
#SBATCH --cpus-per-task=32 # Specify number cores per node (graham 32 or 44, cedar 32 or 48, beluga 40)
#SBATCH --mem=0 # Do not change (allocate all memory per compute node)
#SBATCH --ntasks-per-node=1 # Do not change
#module load StdEnv/2016 # Applies to: graham, cedar, beluga
#module load ansys/2020R2 # Or older module versions
module load StdEnv/2020 # Applies to: graham, cedar, beluga, narval
module load ansys/2021R1 # Or newer module versions
NNODES=$(slurm_hl2hl.py --format ANSYS-CFX)
# other options may be appended to the following command line as needed
cfx5solve -def YOURFILE.def -start-method "Intel MPI Distributed Parallel" -par-dist $NNODES
#!/bin/bash
#SBATCH --account=def-group # Specify account name
#SBATCH --time=00-03:00 # Specify time limit dd-hh:mm
#SBATCH --ntasks-per-node=4 # Specify number cores (narval up to 64)
#SBATCH --mem=8G # Specify 0 when using all cores
#SBATCH --nodes=1 # Do not change
module load StdEnv/2020 # Applies to: narval, graham, cedar
module load ansys/2020R2 # Or version 2019R3
# other options may be appended to the following command line as needed
cfx5solve -def YOURFILE.def -start-method "Open MPI Local Parallel" -part $SLURM_CPUS_ON_NODE
Note: You may get the following error in your output file: /etc/tmi.conf: No such file or directory. It does not seem to affect the computation.
Workbench
Before submitting a job to the queue (sbatch script-wbpj.sh), 1) specify the name of your YOURPROJECT.wbpj file in one of the slurm scripts below and 2) initialize the project by opening it in ANSYS Workbench as described in the Graphical Use section: click File -> Open to load your project, start Mechanical by double-clicking Setup or Solution for the Analysis System in the main display window, click File -> Clear Generated Data -> Yes, click File -> Save Project, click File -> Close Mechanical and finally click File -> Exit.
To avoid writing the solution when a job successfully completes on a cluster, remove ;Save(Overwrite=True) from the last line of your slurm script. Doing this makes it easier to run multiple test jobs (for scaling purposes when changing ntasks), since the initialized solution will not be overwritten each time and therefore does not need to be re-initialized between jobs. Alternatively, a copy of the initialized YOURPROJECT.wbpj file and YOURPROJECT_files sub-directory can be saved prior to submitting each job and then restored after the solution is written. For APDL based simulations submitted under the legacy StdEnv/2016 environment, nodes=1 may be either removed from the slurm script or changed to be greater than 1 to permit computations across multiple nodes.
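The alternative of saving and restoring the initialized project can be sketched with two shell functions. The names backup_project and restore_project and the backup directory name are hypothetical; both functions are meant to be run from the directory that contains the .wbpj file and its _files sub-directory.

```shell
#!/bin/bash
# Sketch only: helper names and the backup location are assumptions.
# backup_project NAME BACKUPDIR: copy NAME.wbpj and NAME_files into BACKUPDIR.
backup_project() {
    local name="$1" backup="$2"
    mkdir -p "$backup" && cp -a "$name.wbpj" "${name}_files" "$backup"/
}

# restore_project NAME BACKUPDIR: replace the current project with the saved copy.
restore_project() {
    local name="$1" backup="$2"
    rm -rf "${name}_files"
    cp -a "$backup/$name.wbpj" "$backup/${name}_files" .
}

# Example:
#   backup_project YOURPROJECT initialized-backup    # before sbatch
#   restore_project YOURPROJECT initialized-backup   # after the job has written its solution
```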
Slurm Scripts
#!/bin/bash
#SBATCH --account=def-account
#SBATCH --time=00-03:00 # Time (DD-HH:MM)
#SBATCH --mem=16G # Total Memory (set to 0 for all node memory)
#SBATCH --ntasks=4 # Number of cores
#SBATCH --nodes=1 # Do not change
##SBATCH --exclusive # Uncomment for scaling testing
##SBATCH --constraint=broadwell # Applicable to graham or cedar
module load StdEnv/2020 ansys/2021R2 # OR newer ansys modules (DMP not supported on narval for 2021R2)
MEMPAR=0 # Set to 0 for SMP or 1 for DMP (distributed memory parallel)
rm -fv *_files/.lock
MWFILE=~/.mw/Application\ Data/Ansys/`basename $(find $EBROOTANSYS/v* -maxdepth 0 -type d)`/SolveHandlers.xml
sed -re "s/(.AnsysSolution>+)[a-zA-Z0-9]*(<\/Distribute.)/\1$MEMPAR\2/" -i "$MWFILE"
sed -re "s/(.Processors>+)[a-zA-Z0-9]*(<\/MaxNumber.)/\1$SLURM_NTASKS\2/" -i "$MWFILE"
sed -i "s!UserConfigured=\"0\"!UserConfigured=\"1\"!g" "$MWFILE"
export KMP_AFFINITY=disabled
export I_MPI_HYDRA_BOOTSTRAP=ssh
runwb2 -B -E "Update();Save(Overwrite=True)" -F YOURPROJECT.wbpj
#!/bin/bash
#SBATCH --account=def-account
#SBATCH --time=00-03:00 # Time (DD-HH:MM)
#SBATCH --mem=16G # Total Memory (set to 0 for all node memory)
#SBATCH --ntasks=4 # Number of cores
#SBATCH --nodes=1 # Do not change
module load StdEnv/2016 ansys/2019R3 # Do not change
MEMPAR=1 # Do not change
rm -fv *_files/.lock
MWFILE=~/.mw/Application\ Data/Ansys/`basename $(find $EBROOTANSYS/v* -maxdepth 0 -type d)`/SolveHandlers.xml
sed -re "s/(.AnsysSolution>+)[a-zA-Z0-9]*(<\/Distribute.)/\1$MEMPAR\2/" -i "$MWFILE"
sed -re "s/(.Processors>+)[a-zA-Z0-9]*(<\/MaxNumber.)/\1$((SLURM_NTASKS-1))\2/" -i "$MWFILE"
export KMP_AFFINITY=balanced
export I_MPI_HYDRA_BOOTSTRAP=ssh
export PATH=/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin:$PATH
runwb2 -B -E "Update();Save(Overwrite=True)" -F YOURPROJECT.wbpj
Mechanical
The input file can be generated from within your interactive Workbench Mechanical session by clicking Solution -> Tools -> Write Input Files, then specifying File name: YOURAPDLFILE.inp and Save as type: APDL Input Files (*.inp). APDL jobs can then be submitted to the queue by running the sbatch script-name.sh command.
Slurm Scripts
The ANSYS modules given in each of the following scripts have been tested on graham and should work without issue (uncomment one). Once the scripts have been tested on other clusters (in particular narval) they will be updated if required.
#!/bin/bash
#SBATCH --account=def-account # Specify your account
#SBATCH --time=00-03:00 # Specify time (DD-HH:MM)
#SBATCH --mem=16G # Specify memory for all cores
#SBATCH --ntasks=8 # Specify number of cores (1 or more)
#SBATCH --nodes=1 # Specify one node (do not change)
unset SLURM_GTIDS
module load StdEnv/2016
#module load ansys/19.1
#module load ansys/19.2
#module load ansys/2019R2
#module load ansys/2019R3
#module load ansys/2020R1
module load ansys/2020R2
mapdl -smp -b nolist -np $SLURM_NTASKS -dir $SLURM_TMPDIR -I YOURAPDLFILE.inp
rm -rf results-*
mkdir results-$SLURM_JOB_ID
cp -a --no-preserve=ownership $SLURM_TMPDIR/* results-$SLURM_JOB_ID
#!/bin/bash
#SBATCH --account=def-account # Specify your account
#SBATCH --time=00-03:00 # Specify time (DD-HH:MM)
#SBATCH --mem=16G # Specify memory for all cores
#SBATCH --ntasks=8 # Specify number of cores (1 or more)
#SBATCH --nodes=1 # Specify one node (do not change)
unset SLURM_GTIDS
module load StdEnv/2020
#module load ansys/2021R2
#module load ansys/2022R1
module load ansys/2022R2
mapdl -smp -b nolist -np $SLURM_NTASKS -dir $SLURM_TMPDIR -I YOURAPDLFILE.inp
rm -rf results-*
mkdir results-$SLURM_JOB_ID
cp -a --no-preserve=ownership $SLURM_TMPDIR/* results-$SLURM_JOB_ID
#!/bin/bash
#SBATCH --account=def-account # Specify your account
#SBATCH --time=00-03:00 # Specify time (DD-HH:MM)
#SBATCH --mem-per-cpu=2G # Specify memory per core
#SBATCH --ntasks=8 # Specify number of cores (2 or more)
##SBATCH --nodes=2 # Specify number of nodes (optional)
##SBATCH --ntasks-per-node=4 # Specify cores per node (optional)
unset SLURM_GTIDS
module load StdEnv/2016
#module load ansys/2019R3
module load ansys/2020R1
export I_MPI_HYDRA_BOOTSTRAP=ssh; export KMP_AFFINITY=compact
mapdl -dis -mpi intelmpi -b nolist -np $SLURM_NTASKS -dir $SLURM_TMPDIR -I YOURAPDLFILE.inp
rm -rf results-*
mkdir results-$SLURM_JOB_ID
cp -a --no-preserve=ownership $SLURM_TMPDIR/* results-$SLURM_JOB_ID
#!/bin/bash
#SBATCH --account=def-account # Specify your account
#SBATCH --time=00-03:00 # Specify time (DD-HH:MM)
#SBATCH --mem-per-cpu=2G # Specify memory per core
#SBATCH --ntasks=8 # Specify number of cores (2 or more)
##SBATCH --nodes=2 # Specify number of nodes (optional)
##SBATCH --ntasks-per-node=4 # Specify cores per node (optional)
unset SLURM_GTIDS
module load StdEnv/2020
module load ansys/2022R2
mapdl -dis -mpi openmpi -b nolist -np $SLURM_NTASKS -dir $SLURM_TMPDIR -I YOURAPDLFILE.inp
rm -rf results-*
mkdir results-$SLURM_JOB_ID
cp -a --no-preserve=ownership $SLURM_TMPDIR/* results-$SLURM_JOB_ID
ANSYS allocates 1024 MB total memory and 1024 MB database memory by default for APDL jobs. These values can be manually specified (or changed) by adding the arguments -m 1024 and/or -db 1024 to the mapdl command line in the above Slurm scripts. When using a remote institutional license server with multiple ANSYS licenses, it may be necessary to add arguments such as -p aa_r or -ppf anshpc. As always, perform detailed scaling tests before running production jobs to ensure the optimal number of cores and the minimum amount of memory are specified in your Slurm scripts. The single node (SMP, Shared Memory Parallel) scripts will typically perform better than the multinode (DIS, Distributed Memory Parallel) scripts and should therefore be used whenever possible. To help avoid compatibility issues, the ansys module loaded in your Slurm script should ideally match the version used to generate the input file:
[gra-login2:~/ansys/mechanical/demo] cat YOURAPDLFILE.inp | grep version
! ANSYS input file written by Workbench version 2019 R3
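The version check can also be scripted. The following is a minimal sketch, assuming the input file contains a header line in the "Workbench version YYYY Rn" form shown in the example; the function name inp_to_module is hypothetical, and older numbering schemes such as 19.2 are not handled.

```shell
#!/bin/bash
# Hypothetical helper: print the ansys module name matching the Workbench
# version recorded in an APDL input file header, for example
#   ! ANSYS input file written by Workbench version 2019 R3
# prints ansys/2019R3. Prints nothing if no such header line is found.
inp_to_module() {
    sed -n 's|.*Workbench version \([0-9][0-9]*\) *\(R[0-9][0-9]*\).*|ansys/\1\2|p' "$1" | head -n 1
}
```

For example, inp_to_module YOURAPDLFILE.inp prints the module name to compare against the module load line in your Slurm script.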
Ansys EDT
Ansysedt can be run interactively in batch (non-gui) mode by first starting an salloc session with options salloc --time=3:00:00 --tasks=8 --mem=16G --account=def-account and then copying and pasting the full ansysedt command given in the last line of script-local-cmd.sh, being sure to manually specify $YOUR_AEDT_FILE.
Slurm Scripts
Ansys Electronic Desktop jobs may be submitted to a cluster queue with the sbatch script-name.sh command using either of the following single node Slurm scripts. At the time of this writing the scripts had only been tested on graham and therefore may be updated in the future as required to support other clusters. Before using them, specify the simulation time, memory and number of cores, and replace YOUR_AEDT_FILE with your input file name. A full listing of ansysedt command line options can be obtained by starting ansysedt in graphical mode with the commands ansysedt -help or ansysedt -Batchoptionhelp to obtain scrollable graphical popups.
#!/bin/bash
#SBATCH --account=account # Specify your account (def or rrg)
#SBATCH --time=00-01:00 # Specify time (DD-HH:MM)
#SBATCH --mem=16G # Specify memory (set to 0 to use all compute node memory)
#SBATCH --ntasks=8 # Specify cores (beluga 40, cedar 32 or 48, graham 32 or 44, narval 64)
#SBATCH --nodes=1 # Request one node (Do Not Change)
module load StdEnv/2020
module load ansysedt/2021R2
# Copy in a test example (comment out the next line when using your own input file):
cp -f $EBROOTANSYSEDT/AnsysEM21.2/Linux64/Examples/HFSS/Antennas/TransientGeoRadar.aedt .
# Specify input file such as:
YOUR_AEDT_FILE="TransientGeoRadar.aedt"
# Remove previous output:
rm -rf $YOUR_AEDT_FILE.* ${YOUR_AEDT_FILE}results
# ---- do not change anything below this line ---- #
echo -e "\nANSYSLI_SERVERS= $ANSYSLI_SERVERS"
echo "ANSYSLMD_LICENSE_FILE= $ANSYSLMD_LICENSE_FILE"
echo -e "SLURM_TMPDIR= $SLURM_TMPDIR on $SLURMD_NODENAME\n"
export KMP_AFFINITY=disabled
ansysedt -monitor -UseElectronicsPPE -ng -distributed -machinelist list=localhost:1:$SLURM_NTASKS \
-batchoptions "TempDirectory=$SLURM_TMPDIR HPCLicenseType=pool HFSS/EnableGPU=0" -batchsolve $YOUR_AEDT_FILE
#!/bin/bash
#SBATCH --account=account # Specify your account (def or rrg)
#SBATCH --time=00-01:00 # Specify time (DD-HH:MM)
#SBATCH --mem=16G # Specify memory (set to 0 to allocate all compute node memory)
#SBATCH --ntasks=8 # Specify cores (beluga 40, cedar 32 or 48, graham 32 or 44, narval 64)
#SBATCH --nodes=1 # Request one node (Do Not Change)
module load StdEnv/2020
module load ansysedt/2021R2
# Copy in a test example (comment out the next line when using your own input file):
cp -f $EBROOTANSYSEDT/AnsysEM21.2/Linux64/Examples/HFSS/Antennas/TransientGeoRadar.aedt .
# Specify input filename such as:
YOUR_AEDT_FILE="TransientGeoRadar.aedt"
# Remove previous output:
rm -rf $YOUR_AEDT_FILE.* ${YOUR_AEDT_FILE}results
# Specify options filename:
OPTIONS_TXT="Options.txt"
# Write sample options file
rm -f $OPTIONS_TXT
cat > $OPTIONS_TXT <<EOF
\$begin 'Config'
'TempDirectory'='$SLURM_TMPDIR'
'HPCLicenseType'='pool'
'HFSS/EnableGPU'=0
\$end 'Config'
EOF
# ---- do not change anything below this line ---- #
echo -e "\nANSYSLI_SERVERS= $ANSYSLI_SERVERS"
echo "ANSYSLMD_LICENSE_FILE= $ANSYSLMD_LICENSE_FILE"
echo -e "SLURM_TMPDIR= $SLURM_TMPDIR on $SLURMD_NODENAME\n"
export KMP_AFFINITY=disabled
ansysedt -monitor -UseElectronicsPPE -ng -distributed -machinelist list=localhost:1:$SLURM_NTASKS \
-batchoptions $OPTIONS_TXT -batchsolve $YOUR_AEDT_FILE
Graphical Use
ANSYS programs may be run interactively in gui mode on cluster Compute Nodes or graham VDI Nodes.
Compute Nodes
ANSYS can be run interactively on a single cluster compute node for up to 24 hours. This approach is ideal for testing large simulations since all cores and memory can be requested with salloc as described in TigerVNC. Once connected with vncviewer, any of the following program versions can be started after loading the required modules as shown below. The vertical bar | notation is used to separate the various ANSYS commands.
Fluids
module load StdEnv/2020 ansys/2022R1, or
module load StdEnv/2020 ansys/2021R2, or
module load StdEnv/2020 ansys/2021R1
unset SESSION_MANAGER
fluent -mpi=intel | cfx5
- Note: ansys/2022R2 fluent not currently working on compute nodes
- ------------------------------------------------------------------------------------
module load StdEnv/2016 ansys/2020R2 (or older versions)
fluent | cfx5
Workbench
module load StdEnv/2020 ansys/2022R1, or
module load StdEnv/2020 ansys/2021R2
runwb2
- Note: ansys/2022R2 runwb2 not currently working on compute nodes
- To run in parallel, untick the Distributed box in the Solve panel and specify a value for Cores equal to the salloc cpus value that you specified minus 1.
- NOTE: Most pulldown menus and icons do not respond as one might expect when hovering over or single clicking them. The following bullets provide a workaround to give equivalent control when working within the Workbench gui on cluster compute nodes:
- o To access the pulldown menus for the items in the left side project tree, left click once to select the item, then double right click.
- o To access the pulldown menus for the upper icons with tiny down arrows, left click once to select the item, then double left click. The tiny down arrow for the Solve icon, which provides access to the Solve Process settings, is an exception. To access it, instead click the tiny angled arrow found in the lower right corner of the Solve panel.
- o While the pull downs found in the left side menu that typically default to Program Controlled work as expected when single clicked for various Ansys Systems, the left side window is typically too narrow to show the full text of all pulled down items. This can be rectified by dragging the vertical bar of the left window to make it wider. Doing so will also make the associated drop down arrows visible. The left side window can be restored to its default layout by following the procedure in the previous bullet to select Home -> Layout -> Reset Layout.
- o If you are having trouble scrolling down to click File -> Close Mechanical, or have minimized it and want to restore it to view, then respectively single right click -> close or single left click the Workbench button found along the bottom of the tigervnc gui frame to reselect it.
- o If Mechanical will not start, completely close Ansys as described in the previous bullet, then exit and restart your salloc session. The cause may be a previous ansys session that did not exit cleanly within the same salloc session.
- ------------------------------------------------------------------------------------
module load StdEnv/2016 ansys/2019R3 (not available on narval)
export PATH=$EBROOTNIXPKGS/bin:$PATH
runwb2
- o Single right click to access left side project tree pulldown menus
- o Single left click for the upper pulldown menus with tiny down arrows
- o Tick distributed box, specify number Cores = salloc value minus 1
Ansys EDT
module load CcEnv StdEnv/2020 ansysedt/2021R2
rm -rf ~/.mw (optionally force First-time configuration)
ansysedt
Ensight
module load StdEnv/2020 ansys/2022R2; A=222; B=5.12.6, or
module load StdEnv/2020 ansys/2022R1; A=221; B=5.12.6, or
module load StdEnv/2020 ansys/2021R2; A=212; B=5.12.6, or
module load StdEnv/2020 ansys/2021R1; A=211; B=5.12.6, or
module load StdEnv/2016 ansys/2020R2; A=202; B=5.12.6, or
module load StdEnv/2016 ansys/2020R1; A=201; B=5.10.1, or
module load StdEnv/2016 ansys/2019R3; A=195; B=5.10.1
export LD_LIBRARY_PATH=$EBROOTANSYS/v$A/CEI/apex$A/machines/linux_2.6_64/qt-$B/lib
ensight -X
Note: ansys/2022R2 ensight is lightly tested on compute nodes. Please let us know if you find any problems using it.
SSH Considerations
Some ANSYS gui programs can be run remotely on a cluster compute node by X forwarding over ssh to your local desktop. Unlike VNC, this approach is untested and unsupported since it relies on a properly set up X display server for your particular operating system OR the selection, installation and configuration of a suitable X client emulator package such as MobaXterm. Most users will find interactive response times unacceptably slow for basic menu tasks, let alone more complex tasks such as those involving graphics rendering. Startup times for gui programs can also be very slow depending on your internet connection. For example, in one test it took 40 minutes to fully start ansysedt over ssh, while starting it with vncviewer required only 34 seconds. Despite the potential slowness, connecting over ssh to run gui programs may still be of interest if your only goal is to open a simulation and perform some basic menu operations or run some calculations. The basic steps are given here as a starting point:
1) ssh -Y username@graham.computecanada.ca
2) salloc --x11 --time=1:00:00 --mem=16G --cpus-per-task=4 [--gpus-per-node=1] --account=def-mygroup
3) once connected to a compute node, try running xclock. If the clock appears on your desktop then proceed to load the desired ansys module and try running the program.
VDI Nodes
ANSYS programs can be run for up to 24 hours on graham VDI Nodes using a maximum of 8 cores and 128GB memory. The VDI System provides gpu OpenGL acceleration and is therefore ideal for tasks that benefit from high performance graphics. One might use VDI to create or modify simulation input files, post-process data or visualize simulation results. To get started, log in to gra-vdi.computecanada.ca with TigerVNC, then open a new terminal window and start one of the following supported program versions as shown below. The vertical bar | notation is used to separate the various ANSYS commands.
Fluids
module load SnEnv ansys/2022R2; fluent
module load CcEnv StdEnv/2020 ansys/2022R2; cfx5
- Note: icemcfd 2022R2 not usable pending license server upgrade
- ------------------------------------------------------------------------------------
module load CcEnv StdEnv/2020 ansys/2022R1, or
module load CcEnv StdEnv/2020 ansys/2021R2, or
module load CcEnv StdEnv/2020 ansys/2021R1
unset SESSION_MANAGER
fluent | cfx5 | icemcfd
- Note: If icemcfd crashes on startup click Settings -> Display -> X11
- ------------------------------------------------------------------------------------
module load CcEnv StdEnv/2016 ansys/2020R2, or
module load CcEnv StdEnv/2016 ansys/2020R1, or
module load CcEnv StdEnv/2016 ansys/2019R3
export HOOPS_PICTURE=opengl
fluent | cfx5 | icemcfd
Workbench
module load SnEnv ansys/2022R2; runwb2
- ------------------------------------------------------------------------------------
module load CcEnv StdEnv/2020 ansys/2022R1, or
module load CcEnv StdEnv/2020 ansys/2021R2
runwb2
- ------------------------------------------------------------------------------------
module load SnEnv ansys/2021R1, or
module load SnEnv ansys/2020R2
runwb2
- ------------------------------------------------------------------------------------
module load CcEnv StdEnv/2016 ansys/2020R1, or
module load CcEnv StdEnv/2016 ansys/2019R3
export PATH=$EBROOTNIXPKGS/bin:$PATH
runwb2
- NOTE: The pull downs in the left side menu that typically default to Program Controlled do NOT work when various Ansys Systems are loaded, such as Steady-State Thermal, Explicit Dynamics and so forth. Trying to use the pull downs repeatedly on gra-vdi may crash Ansys in such a way that "pkill -9 ansys" should be run afterward to completely clear any lingering license server connections before restarting Workbench. As a workaround, quit out of gra-vdi and then open your project under Workbench on a compute node within an salloc session. Make the desired pulldown changes, then click Home -> Save Project -> Close Mechanical, File -> Exit, then also exit your salloc session. Now return to gra-vdi, where the changed settings should appear when you reopen the project under Workbench. This message will be updated if a fix for this issue is found.
Ansys EDT
module load CcEnv StdEnv/2020 ansysedt/2021R2
rm -rf ~/.mw (optional: force first-time configuration)
ansysedt
Ensight
module load SnEnv ansys/2022R2, or
module load SnEnv ansys/2022R1, or
module load SnEnv ansys/2021R2, or (see NOTE below)
module load SnEnv ansys/2021R1, or (see NOTE below)
module load SnEnv ansys/2020R2, or (see NOTE below)
module load SnEnv ansys/2020R1, or
module load SnEnv ansys/2019R3, or
module load SnEnv ansys/2019R2
ensight
- NOTE: Please be aware that versions 2021R2, 2021R1 and 2020R2 have a compatibility issue on gra-vdi that affects how some tick boxes and menu tabs in the left hand panels update when clicked. As a workaround, try dragging all panels that are overlaid (and thus have selector tabs) outside the main window so they are separate (and thus no longer have selector tabs). If a panel becomes blurred, click the tiny maximize button in its upper right hand corner (to the left of the tiny x button) to refresh it. If you still experience this problem then 1) use a different version not impacted by this issue, or 2) use the same version on a compute node. If none of these solutions work then open a problem ticket and let us know.
Site Specific Usage
Sharcnet License
The SHARCNET Ansys license is free for academic use by any Alliance researcher on any Alliance system. The installed software does not have any solver or geometry limits. It may only be used for the purpose of Publishable Academic Research. The license was upgraded from CFD to MCS (Multiphysics Campus Solution) in May of 2020 and includes the following ANSYS products: HF, EM, Electronics HPC, Mechanical and CFD as described here. Neither LS-DYNA nor Lumerical is included. In July of 2021 an additional 1024 anshpc licenses were added to the previous 512 pool. Before running large parallel jobs, scaling tests should be run for any given simulation. Parallel jobs that do not achieve at least 50% cpu utilization may be flagged by the system for a follow-up by our support team.
As of Dec 2023 each researcher can run 4 jobs using a total of 252 anshpc (plus 4 anshpc per job). Thus any of the following uniform job size combinations are possible: one 256 core job, two 130 core jobs, three 88 core jobs, or four 67 core jobs, according to (252 + 4*num_jobs) / num_jobs. Since the best parallel performance is usually achieved by using all cores on packed compute nodes (aka full nodes), one can determine the number of full nodes by dividing the total anshpc cores by the compute node size. For example, consider graham, which has many 32 core (broadwell) and some 44 core (cascade) compute nodes. The maximum number of nodes that could be requested when running various size jobs on 32 core nodes would be 256/32=8, 130/32=~4, 88/32=~2 or 67/32=~2 to run 1, 2, 3 or 4 simultaneous jobs respectively. To express this in equation form, for a given compute node size on any cluster, the number of compute nodes can be calculated as (252 + 4*num_jobs) / (num_jobs*cores_per_node), rounded down. The total cores to request is then this whole number of nodes multiplied by cores_per_node.
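The arithmetic in this paragraph can be reproduced with shell integer math. The following is a minimal sketch; the function names are hypothetical, and the constants 252 and 4 are the Dec 2023 limits quoted above, which may change.

```shell
#!/bin/bash
# Hypothetical helpers reproducing the anshpc arithmetic described above.
# Bash integer division truncates, which provides the "round down" step.

# Cores available to each of num_jobs equally sized jobs.
cores_per_job() {
    local num_jobs="$1"
    echo $(( (252 + 4 * num_jobs) / num_jobs ))
}

# Whole (full) compute nodes available to each job.
full_nodes_per_job() {
    local num_jobs="$1" cores_per_node="$2"
    echo $(( (252 + 4 * num_jobs) / (num_jobs * cores_per_node) ))
}

# Total cores to request per job when only full nodes are used.
cores_to_request() {
    local num_jobs="$1" cores_per_node="$2"
    echo $(( $(full_nodes_per_job "$num_jobs" "$cores_per_node") * cores_per_node ))
}
```

For example, cores_per_job 2 gives 130, full_nodes_per_job 2 32 gives 4, and cores_to_request 2 32 gives 128, matching the graham figures in the text.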
The SHARCNET Ansys license is made available on a first-come, first-served basis. Should an unusually large number of ANSYS jobs be submitted on a given day, some jobs could fail on startup if insufficient licenses are available. These events have however become very rare given the recent increase in anshpc licenses. If your research requires more licenses than SHARCNET can provide, then a dedicated researcher license may be purchased (and hosted) on an ANSYS license server at your local institution. Researchers can purchase a license directly from Simutech, where an extra 20% country-wide uplift fee must be added if the cluster where the license will be used is not co-located at your institution (for example the graham cluster at Waterloo). To use a local institutional ansys license server, reconfigure the ~/.licenses/ansys.lic file on the cluster(s) where you want to use it.
License Server File
To use the Sharcnet ANSYS license configure your ansys.lic file as follows:
[gra-login1:~/.licenses] cat ansys.lic
setenv("ANSYSLMD_LICENSE_FILE", "1055@license3.sharcnet.ca")
setenv("ANSYSLI_SERVERS", "2325@license3.sharcnet.ca")
Query License Server
To show the number of licenses in use by your username and the total in use by all users, run:
ssh graham.computecanada.ca
module load ansys
lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -a | grep "Users of\|$USER"
If you discover any licenses unexpectedly in use by your username (usually due to ansys not exiting cleanly on gra-vdi), then connect to the node where it is running, open a terminal window and run the following command to terminate the rogue processes: pkill -9 -e -u $USER -f "ansys", after which your licenses should be freed. Note that gra-vdi consists of two nodes (gra-vdi3 and gra-vdi4) which researchers are randomly placed on when connecting to gra-vdi.computecanada.ca with tigervnc. Therefore it is necessary to specify the full hostname (gra-vdi3.sharcnet.ca or gra-vdi4.sharcnet.ca) when connecting with tigervnc to ensure you log in to the correct node before running pkill.
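The "Users of" lines printed by the lmstat query can be condensed into one line per feature. This is a minimal sketch that assumes the usual FlexNet layout "Users of FEATURE: (Total of N licenses issued; Total of M licenses in use)"; the function name lmstat_summary is hypothetical, and the field positions would need adjusting if your lmstat output differs.

```shell
#!/bin/bash
# Hypothetical helper: read lmstat output on stdin and print
# "<feature> <in use> <issued>" for each "Users of" line.
lmstat_summary() {
    awk '/^Users of / {
        feature = $3
        sub(/:$/, "", feature)       # strip the trailing colon
        issued = $6                  # number after the first "Total of"
        inuse  = $11                 # number after the second "Total of"
        printf "%s %d %d\n", feature, inuse, issued
    }'
}
```

It could be used as lmutil lmstat -c $ANSYSLMD_LICENSE_FILE -a | lmstat_summary once the ansys module is loaded.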
Local VDI Modules
When using gra-vdi, researchers have the choice of loading ANSYS modules from our global environment (after loading CcEnv) or loading ANSYS modules installed locally on the machine itself (after loading SnEnv). The local modules may be of interest as they include some Ansys programs and versions not yet supported by our environment for graphics use on gra-vdi or the clusters. When starting programs from local Ansys modules, users can select the CMC license server or accept the default Sharcnet License server. Presently the settings from ~/.licenses/ansys.lic are not used by the local Ansys modules except when starting runwb2, where they will override the default Sharcnet License server settings. Suitable usage of Ansys programs on gra-vdi includes: running a single test job interactively with up to 8 cores and/or 128G ram, creating or modifying simulation input files, and post-processing or visualizing data.
ansys Modules
- Connect to gra-vdi.computecanada.ca with TigerVNC
- Open a new terminal window and load a module:
module load SnEnv ansys/2021R2, or
module load SnEnv ansys/2021R1, or
module load SnEnv ansys/2020R2, or
module load SnEnv ansys/2020R1, or
module load SnEnv ansys/2019R3
- Start an ANSYS program by issuing one of the following:
runwb2|fluent|cfx5|icemcfd|apdl
- Press y then enter to accept the conditions
- Press enter to accept the n option and use the SHARCNET license server by default (in the case of runwb2, ~/.licenses/ansys.lic will be used if present, otherwise ANSYSLI_SERVERS and ANSYSLMD_LICENSE_FILE will be used if set in your environment, for example to some other remote license server). If you change n to y and hit enter, the CMC license server will be used.
where cfx5 from the third step above provides the option to start the following components:
1) CFX-Launcher (cfx5 -> cfx5launch)
2) CFX-Pre (cfx5pre)
3) CFD-Post (cfdpost -> cfx5post)
4) CFX-Solver (cfx5solve)
ansysedt Modules
- Connect to gra-vdi.computecanada.ca with TigerVNC
- Open a new terminal window and load a module:
module load SnEnv ansysedt/2021R2, or
module load SnEnv ansysedt/2021R1
- Start the ANSYS Electromagnetics Desktop program by typing the following command:
ansysedt
- Press y then enter to accept the conditions.
- Press enter to accept the n option and use the SHARCNET license server by default (note that ~/.licenses/ansysedt.lic will be used if present, otherwise ANSYSLI_SERVERS and ANSYSLMD_LICENSE_FILE will be used if set in your environment, for example to some other remote license server). If you change n to y and hit enter then the CMC license server will be used.
License feature preferences previously set up with anslic_admin are no longer supported following the recent SHARCNET license server update (Sept 9, 2021). If a license problem occurs, try removing the ~/.ansys directory in your home account to clear the settings. If problems persist, please contact our Technical support and provide the contents of your ~/.licenses/ansys.lic file.
Additive Manufacturing
To get started, configure your ~/.licenses/ansys.lic file to point to a license server that has a valid ANSYS Mechanical License. This must be done on all systems where you plan to run the software.
Enable Additive
To enable ANSYS Additive Manufacturing in your project do the following 3 steps:
Start Workbench
- start workbench as described in the Graphical Use - WORKBENCH section found above.
Install Extension
- click Extensions -> Install Extension
- specify the following /path/to/AdditiveWizard.wbex then click Open:
/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ansys/2019R3/v195/aisol/WBAddins/MechanicalExtensions/AdditiveWizard.wbex
Load Extension
- click Extensions -> Manage Extensions and tick Additive Wizard
- click the ACT Start Page tab X to return to your Project tab
Run Additive
Gra-vdi
A user can run a single ANSYS Additive Manufacturing job on gra-vdi with up to 16 cores as follows:
- Start Workbench on Gra-vdi as described above in Enable Additive
- click File -> Open and select test.wbpj then click Open
- click View -> reset workspace if you get a grey screen
- start Mechanical, Clear Generated Data, tick Distributed, specify Cores
- click File -> Save Project -> Solve
Check utilization:
- open another terminal and run:
top -u $USER **OR** ps u -u $USER | grep ansys
- kill rogue processes from previous runs:
pkill -9 -e -u $USER -f "ansys|mwrpcss|mwfwrapper|ENGINE"
Please note that rogue processes can persistently tie up licenses between gra-vdi login sessions or cause other unusual errors when trying to start gui programs on gra-vdi. Although rare, rogue processes can occur if an ansys gui session (fluent, workbench, etc) is not cleanly terminated by the user before vncviewer is terminated either manually or unexpectedly, for instance due to a transient network outage or hung filesystem. If the latter is to blame then the processes may not be killable until normal disk access is restored.
Cluster
Project preparation:
Before submitting a newly uploaded Additive project to a cluster queue (with sbatch scriptname), certain preparations must be done. To begin, open your simulation with the Workbench gui (as described in the Enable Additive section above) in the same directory that your job will be submitted from, and then save it again. Be sure to use the same ansys module version that will be used for the job. Next, create a Slurm script (as explained in the Cluster Batch Job Submission - WORKBENCH section above). To perform parametric studies, change Update() to UpdateAllDesignPoints() in the Slurm script. Determine the optimal number of cores and memory by submitting several short test jobs. To avoid needing to manually clear the solution and recreate all the design points in Workbench between each test run, either 1) change Save(Overwrite=True) to Save(Overwrite=False), or 2) save a copy of the original YOURPROJECT.wbpj file and corresponding YOURPROJECT_files directory. Optionally, create and then manually run a replay file on the cluster in the respective test case directory between each run, noting that a single replay file can be used in different directories by opening it in a text editor and changing the internal FilePath setting.
module load ansys/2019R3
rm -f test_files/.lock
runwb2 -R myreplay.wbjn
Resource utilization:
Once your additive job has been running for a few minutes a snapshot of its resource utilization on the compute node(s) can be obtained with the following srun command. Sample output corresponding to an eight core submission script is shown next. It can be seen that two nodes were selected by the scheduler:
[gra-login1:~] srun --jobid=myjobid top -bn1 -u $USER | grep R | grep -v top
  PID USER   PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22843 demo   20   0 2272124 256048  72796 R  88.0  0.2   1:06.24 ansys.e
22849 demo   20   0 2272118 256024  72822 R  99.0  0.2   1:06.37 ansys.e
22838 demo   20   0 2272362 255086  76644 R  96.0  0.2   1:06.37 ansys.e
  PID USER   PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 4310 demo   20   0 2740212 271096 101892 R 101.0  0.2   1:06.26 ansys.e
 4311 demo   20   0 2740416 284552  98084 R  98.0  0.2   1:06.55 ansys.e
 4304 demo   20   0 2729516 268824 100388 R 100.0  0.2   1:06.12 ansys.e
 4305 demo   20   0 2729436 263204 100932 R 100.0  0.2   1:06.88 ansys.e
 4306 demo   20   0 2734720 431532  95180 R 100.0  0.3   1:06.57 ansys.e
Scaling tests:
After a job completes, its "Job Wall-clock time" can be obtained from seff myjobid. Using this value, scaling tests can be performed by submitting short test jobs with an increasing number of cores. If the Wall-clock time decreases by ~50% when the number of cores is doubled then additional cores may be considered.
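The comparison between two test jobs can be put into numbers. The following is a minimal sketch: the function name scaling_report is hypothetical, and the wall-clock times are assumed to have been converted to seconds by hand from the seff output.

```shell
#!/bin/bash
# Hypothetical helper: compare a baseline job with a larger job.
# Arguments: baseline seconds, baseline cores, new seconds, new cores.
# speedup    = baseline time / new time
# efficiency = speedup / (new cores / baseline cores)
scaling_report() {
    local t_base="$1" n_base="$2" t_new="$3" n_new="$4"
    awk -v tb="$t_base" -v nb="$n_base" -v tn="$t_new" -v nn="$n_new" 'BEGIN {
        speedup = tb / tn
        efficiency = speedup / (nn / nb)
        printf "speedup=%.2f efficiency=%.2f\n", speedup, efficiency
    }'
}
```

For example, scaling_report 1000 4 500 8 describes a job whose wall-clock time halved when the cores were doubled, which is the ~50% decrease mentioned above (speedup 2, efficiency 1). An efficiency well below 1 suggests the extra cores are not being used effectively.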
Online Documentation
The full ANSYS documentation for versions back to 19.2 can be accessed by following these steps:
- connect to gra-vdi.computecanada.ca with tigervnc as described in VDI Nodes
- if Firefox browser or ANSYS Workbench are open then close them now
- start Firefox by clicking: Applications -> Internet -> Firefox
- open a new terminal window by clicking: Applications -> System Tools -> Mate Terminal
- start Workbench by typing the following in your terminal: module load CcEnv StdEnv ansys; runwb2
- in the upper Workbench menu bar click: Help -> ANSYS Workbench Help
- the Ansys documentation page should immediately appear in Firefox
- at this point Workbench is no longer needed, so close it by clicking the Unsaved Project - Workbench tab located along the bottom frame (doing this will bring Workbench into focus) and then click File -> Exit
- in the top middle of the Ansys documentation page click the word HOME located just left of API DOCS
- now scroll down and you should see a list of ANSYS product icons and/or Alphabetical Ranges
- select a product to view the documentation for, such as Fluent. The documentation for the latest release version will be displayed by default. Change the documentation version by clicking the Release Year R pull down located above and just to the right of the Ansys documentation page search bar.
- to search for documentation corresponding to a different ANSYS product click HOME again