Gaussian

See also Gaussian error messages.

Gaussian is a computational chemistry application produced by Gaussian, Inc.

Limitations

We currently support Gaussian only on Graham and Cedar.

Cluster/network parallel execution of Gaussian, also known as "Linda parallelism", is not supported on any of our national systems. Only "shared-memory multiprocessor parallel execution" is supported; therefore, no Gaussian job can use more than a single compute node.

License agreement

In order to use Gaussian you must agree to certain conditions. Please contact support with a copy of the following statement:

  1. I am not a member of a research group developing software competitive to Gaussian.
  2. I will not copy the Gaussian software, nor make it available to anyone else.
  3. I will properly acknowledge Gaussian Inc. and the Alliance in publications.
  4. I will notify the Alliance of any change in the above acknowledgement.

If you are a sponsored user, your sponsor (PI) must also have such a statement on file with us.

Once we have received your statement (and your sponsor's, if applicable), we will grant you access to Gaussian.

Running Gaussian on Graham and Cedar

The gaussian module is installed on Graham and Cedar. To check which versions are available, use the module spider command as follows:

[name@server ~]$ module spider gaussian

For module commands, please see Using modules.
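
For example, once module spider has listed the available versions, you can get the details for a particular version and then load it; the version names below are the ones mentioned on this page and may change over time:

[name@server ~]$ module spider gaussian/g16.c01    # show details and prerequisites for this version
[name@server ~]$ module load gaussian/g16.c01      # load Gaussian 16 into your environment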


Job submission

The national clusters use the Slurm scheduler; for details about submitting jobs, see Running jobs.

Since only the "shared-memory multiprocessor" parallel version of Gaussian is supported, your jobs can use only one node and up to the maximum cores per node: 48 on Cedar and 32 on Graham. If your jobs are limited by the amount of available memory on a single node, be aware that there are a few nodes at each site with more than the usual amount of memory. Please refer to the pages Cedar and Graham for the number and capacity of such nodes.
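
If your calculation needs more memory than a regular node offers, requesting a larger --mem value in the job script is normally enough for Slurm to place the job on one of the larger-memory nodes. The resource lines below are only an illustration (the 500G figure is a placeholder, not a documented node size); check the node tables linked above for the actual capacities:

#SBATCH --nodes=1             # Gaussian can use only a single node
#SBATCH --cpus-per-task=32    # a whole node on Graham (48 on Cedar)
#SBATCH --mem=500G            # more than a regular node provides, so the job is placed on a large-memory node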

Besides your input file (in our example name.com), you have to prepare a job script to define the compute resources for the job; both input file and job script must be in the same directory.
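
For reference, a minimal name.com could look like the sketch below. The water geometry and the B3LYP/6-31G(d) route line are placeholders only; the important point is that %mem and %nprocs are consistent with the job script (here %mem is about half of --mem=16G and %nprocs matches --cpus-per-task=16):

File : name.com

%chk=name.chk
%mem=8000mb
%nprocs=16
#p b3lyp/6-31g(d) opt

water geometry optimization (placeholder example)

0 1
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
(one blank line)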

There are two options for running your Gaussian job on Graham and Cedar; they differ in where the default runtime files are stored and in the size of job they can handle.

G16 (G09, G03)

This option saves the default runtime files (unnamed .rwf, .inp, .d2e, .int, .skr files) to /scratch/username/jobid/. If the job does not finish, or fails for whatever reason, those files stay there and you can locate the .rwf file later to restart the calculation.

The following example is a G16 job script:

Note that for consistency we use the same base name for each file, changing only the extension (name.sh, name.com, name.log).

File : mysub.sh

#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --mem=16G             # memory, roughly 2 times %mem defined in the input name.com file
#SBATCH --time=02-00:00       # expected run time (DD-HH:MM)
#SBATCH --cpus-per-task=16    # No. of cpus for the job as defined by %nprocs in the name.com file
module load gaussian/g16.c01
G16 name.com            # G16 command, input: name.com, output: name.log


To use Gaussian 09 or Gaussian 03, simply change module load gaussian/g16.c01 to gaussian/g09.e01 or gaussian/g03.d01, and change G16 to G09 or G03. You can adjust --mem, --time and --cpus-per-task to match your job's compute-resource requirements.
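
For example, based on the script above, a Gaussian 09 job would change only these two lines:

module load gaussian/g09.e01
G09 name.com            # G09 command, input: name.com, output: name.log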

g16 (g09, g03)

This option saves the default runtime files (unnamed .rwf, .inp, .d2e, .int, .skr files) temporarily in $SLURM_TMPDIR (/localscratch/username.jobid.0/) on the compute node where the job was scheduled. The scheduler removes these files when the job is done, whether it succeeded or not. If you do not expect to need the .rwf file to restart the job at a later time, you can use this option.

/localscratch is ~800G shared by all jobs running on the same node. If your job files could be close to or larger than that size, use the G16 (G09, G03) option instead.
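
If you are unsure how much local scratch your job actually uses, you can check from within the job script or an interactive session; this is a quick sketch using standard commands, with $SLURM_TMPDIR pointing at your job's directory under /localscratch:

df -h $SLURM_TMPDIR     # free space remaining on the node's local scratch
du -sh $SLURM_TMPDIR    # space currently used by your job's runtime files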

The following example is a g16 job script:

File : mysub.sh

#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --mem=16G             # memory, roughly 2 times %mem defined in the input name.com file
#SBATCH --time=02-00:00       # expected run time (DD-HH:MM)
#SBATCH --cpus-per-task=16    # No. of cpus for the job as defined by %nprocs in the name.com file
module load gaussian/g16.c01
g16 < name.com                # g16 command, input: name.com, output: slurm-<jobid>.out by default
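
If you prefer the Gaussian output to go to name.log instead of the default Slurm output file, you can redirect it explicitly in the script, in the same way as the interactive example further below:

g16 < name.com >& name.log    # redirect both output and errors to name.log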


Submit the job

sbatch mysub.sh
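
You can then monitor the job with the usual Slurm commands, for example:

squeue -u $USER    # your jobs that are queued or running
sacct -j <jobid>   # accounting record once the job has finished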

Interactive jobs

You can run an interactive Gaussian job on Graham and Cedar for testing purposes. It is not good practice to run interactive Gaussian jobs on a login node; instead, start an interactive session on a compute node with salloc. For example, to request one hour, 8 CPUs and 10 GB of memory for a Gaussian test, go to the directory containing your input file and use the salloc command:

[name@server ~]$ salloc --time=1:0:0 --cpus-per-task=8 --mem=10g

Then use either

[name@server ~]$ module load gaussian/g16.c01
[name@server ~]$ G16 g16_test2.com    # G16 saves runtime files (.rwf, etc.) to /scratch/yourid/jobid/


or

[name@server ~]$ module load gaussian/g16.c01
[name@server ~]$ g16 < g16_test2.com >& g16_test2.log &   # g16 saves runtime files to /localscratch/yourid/
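
When the test is finished, end the interactive session to release the allocation:

[name@server ~]$ exit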

Restart jobs

Gaussian jobs can always be restarted from the previous rwf file.

Geometry optimization can be restarted from the chk file as usual. One-step computations, such as analytic frequency calculations (including properties like ROA and VCD with ONIOM), CCSD and EOM-CCSD calculations, NMR, Polar=OptRot, and CID, CISD, CCD, QCISD and BD energies, can be restarted from the rwf file.

To restart a job from a previous rwf file, you need to know the location of that rwf file from your previous run.

The restart input is simple: first, set %rwf to the path of the previous rwf file; second, change the route line to #p restart; finally, leave a blank line at the end.

A sample restart input looks like this:

File : restart.com

%rwf=/scratch/yourid/jobid/name.rwf
%NoSave
%chk=name.chk
%mem=5000mb
%nprocs=16
#p restart
(one blank line)
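
To run the restart, submit it with the same style of job script as before. The sketch below assumes the G16 option, so that the new runtime files are again kept under /scratch; restart.sh and restart.log are hypothetical names:

File : restart.sh

#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --mem=10G             # roughly 2 times the %mem defined in restart.com
#SBATCH --time=02-00:00       # expected run time (DD-HH:MM)
#SBATCH --cpus-per-task=16    # must match %nprocs in restart.com
module load gaussian/g16.c01
G16 restart.com               # input: restart.com, output: restart.log

Then submit it with sbatch restart.sh.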


Examples

An example input file and the run scripts *.sh can be found in /opt/software/gaussian/version/examples/, where version is g03.d10, g09.e01, or g16.b01.
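
For example, to list the Gaussian 16 examples and copy them into your current directory (path as given above):

[name@server ~]$ ls /opt/software/gaussian/g16.b01/examples/
[name@server ~]$ cp /opt/software/gaussian/g16.b01/examples/* .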

Notes

  1. NBO7 is included in the g16.c01 version only; both the nbo6 and nbo7 keywords will run NBO7 in g16.c01 (see the sketch after this list).
  2. NBO6 is available in the g09.e01 and g16.b01 versions.
  3. You can watch a recorded webinar/tutorial: Gaussian16 and NBO7 on Graham and Cedar
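
As an illustration only, an NBO analysis is requested through the route line of the Gaussian input (for instance name.com shown earlier); the pop=nbo7 spelling below is an assumption based on Gaussian's population-analysis keywords and should be checked against the Gaussian documentation for your version. Per note 1, pop=nbo6 would also invoke NBO7 under g16.c01:

#p b3lyp/6-31g(d) pop=nbo7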

Errors

Some of the error messages produced by Gaussian have been collected, with suggestions for their resolution. See Gaussian error messages.