Running jobs: Difference between revisions

reposition a paragraph
Line 580: Line 580:
==== Jobs inherit environment variables ==== <!--T:128-->
By default, a job inherits the environment variables of the shell from which it was submitted. The [[Using modules|module]] command, which makes software packages available, sets and changes environment variables. These changes propagate to any job submitted from that shell and can prevent the job from loading modules if prerequisites are missing. It is therefore best to put the line <code>module purge</code> in your job script before loading the modules it needs; this gives every job submission a consistent starting state and keeps changes made in your shell from affecting your jobs.
<!--T:152-->
Inheriting environment settings from the submitting shell can sometimes lead to hard-to-diagnose problems. If you wish to suppress this inheritance, use the <code>--export=none</code> directive when submitting jobs.
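A job script that follows this advice might look like the fragment below. It is only a sketch: the time limit, the module name <code>python</code>, and the program name are placeholders chosen for illustration, so substitute whatever your job actually needs.

```shell
#!/bin/bash
#SBATCH --time=00:10:00     # placeholder time limit

# Start from a clean module state, regardless of what was loaded in the
# shell that ran sbatch.
module purge

# Load only what this job needs (placeholder module name).
module load python

python myscript.py          # placeholder program
```

Assuming the script is saved as <code>job.sh</code> (a hypothetical name), submitting it with <code>sbatch --export=none job.sh</code> would additionally suppress inheritance of the submitting shell's environment.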


==== Job hangs / no output ==== <!--T:165-->
Line 585: Line 588:
<!--T:166-->
Sometimes a submitted job writes nothing to its log file for an extended period and appears to hang. A common reason is the aggressive buffering performed by the Slurm scheduler, which aggregates many output lines before flushing them to the log file. Often the output file is written only after the job completes. To monitor the progress of a running job through an output file, redirect its output to a separate file. For example, in your job script you could run a Python program as <code>python myscript.py > mylog.txt</code>. If you also want the output to appear in the Slurm log, use the <code>tee</code> command instead: <code>python myscript.py | tee mylog.txt</code>.
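The two redirection patterns described above can be tried outside of Slurm with the following minimal sketch. The function <code>run_steps</code> is a hypothetical stand-in for a real program such as <code>python myscript.py</code>, and the log file names are arbitrary.

```shell
#!/bin/bash
# Hypothetical stand-in for a real program such as "python myscript.py":
# it prints three progress lines.
run_steps() {
    for i in 1 2 3; do
        echo "step $i done"
    done
}

# Pattern 1: send the program's output to a separate file only.
run_steps > mylog.txt

# Pattern 2: write the output to a file and also pass it on to standard
# output, which in a job is what ends up in the Slurm log.
run_steps | tee mylog_tee.txt
```

While a real job is running, <code>tail -f mylog.txt</code> follows the file as it grows, assuming the file is on a filesystem visible from where you run the command.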
== Job status and priority == <!--T:103-->