Running jobs: Difference between revisions

(mention how to see job output more frequently for batch jobs)
By default a job will inherit the environment variables of the shell where the job was submitted. The [[Using modules|module]] command, which is used to make various software packages available, changes and sets environment variables. Changes will propagate to any job submitted from the shell and thus could affect the job's ability to load modules if there are missing prerequisites. It is best to include the line <code>module purge</code> in your job script before loading all the required modules to ensure a consistent state for each job submission and avoid changes made in your shell affecting your jobs.


==== Job hangs / no output ====


Sometimes a submitted job writes no output to the log file for an extended period of time and appears to be hanging. A common reason for this is the aggressive buffering performed by the Slurm scheduler, which aggregates many output lines before flushing them to the log file. Often the output file is written only after the job completes. If you wish to monitor the progress of a running job through an output file, redirect its output to a separate file. For example, in your job script you could run a Python program as <code>python myscript.py > mylog.txt</code>. If you also want the output to appear in the Slurm log, use the <code>tee</code> command instead: <code>python myscript.py | tee mylog.txt</code>.
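If you control the program itself, a related option is to have it write its own progress file and flush each line as it is produced. The following is only a minimal sketch of that idea, not a recommended or official method: the function name <code>run</code>, the step messages and the default file name <code>mylog.txt</code> are illustrative assumptions, and the sleep call stands in for real work. It assumes that part of the delay comes from the program's own output buffering, which explicit flushing addresses.

```python
import time


def run(steps=3, log_path="mylog.txt", delay=0.0):
    """Write one progress line per step to log_path and to stdout.

    Hypothetical stand-in for a real workload; names and messages are
    illustrative only.
    """
    # buffering=1 selects line buffering in text mode, so each completed
    # line is handed to the operating system instead of waiting in a
    # large in-process buffer.
    with open(log_path, "w", buffering=1) as log:
        for step in range(1, steps + 1):
            time.sleep(delay)  # placeholder for the actual computation
            line = f"step {step}/{steps} done"
            log.write(line + "\n")
            # flush=True pushes the same line to stdout immediately,
            # so it can also reach the Slurm log.
            print(line, flush=True)
    return log_path


if __name__ == "__main__":
    run()
```

While the job is running, you could then follow the progress file from a login node with <code>tail -f mylog.txt</code>.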