Talk:Running jobs
Revision as of 19:25, 11 July 2017
What's a job?
The requirement for --account=project-name-cpu is still under discussion and may change before release.
- Ross Dickson (talk) 13:55, 3 April 2017 (UTC)
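For anyone trying the flag out, a minimal sketch of a submission script follows; the account name project-name-cpu is the placeholder under discussion above, and the other values are purely illustrative:

 #!/bin/bash
 #SBATCH --account=project-name-cpu  # accounting group; naming still under discussion
 #SBATCH --time=0:30:0               # illustrative wall-clock limit
 #SBATCH --ntasks=1                  # illustrative resource request
 echo "Hello from job $SLURM_JOB_ID"

Submit with sbatch jobscript.sh; sacct can later confirm which account the job was charged to.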
Memory Management
- According to Kamil it is advisable for users to express their job's memory request in multiples of 1000 MB instead of GB (i.e. 1000 MB ~ 1 GB). This leaves some RAM for the OS when a job reaches the memory limit on base nodes: e.g. --ntasks=32 --mem-per-cpu=4000M will fit on a base node with 128G of RAM (32 x 4000M = 128000M, a bit under the full 128G), while --ntasks=32 --mem-per-cpu=4G claims all 128G and so requires a large-memory node (see the sketch after this list).
- At least on Graham, one needs to use --mem=xxxG to request a large-memory node; with --mem-per-cpu=yyyG one can only ever get base nodes.
- Oliver Stueker (talk) 19:25, 11 July 2017 (UTC)
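Putting the two points above together, a sketch of a base-node request, assuming the 128G base nodes described above (the executable name is hypothetical):

 #!/bin/bash
 #SBATCH --ntasks=32
 #SBATCH --mem-per-cpu=4000M  # 32 x 4000M = 128000M: fits a 128G base node with RAM to spare for the OS
 #SBATCH --time=1:00:0        # illustrative wall-clock limit
 srun ./my_program            # hypothetical executable
 
 # A per-node request such as '#SBATCH --mem=256G' would instead be needed
 # (at least on Graham, per the note above) to land on a large-memory node.

By contrast, --mem-per-cpu=4G (i.e. 4096M per task) would total 131072M across 32 tasks and push the job onto a large-memory node.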
External links
This slide deck from LLNL is old (2004), covers some architectural aspects probably not of interest to most users, and has some LLNL-local information that will not apply at CC (e.g. FIFO scheduling). Not recommended.
- Ross Dickson (talk) 15:36, 23 January 2017 (UTC)
All the SchedMD videos I've looked at so far are just slide talks, mostly directed at administrators rather than users. Job submission commands come up in Introduction to SLURM, Part 3.
- Ross Dickson (talk) 14:35, 27 January 2017 (UTC)
Doug P also suggests https://sites.google.com/a/case.edu/hpc-upgraded-cluster/slurm-cluster-commands from CWRU, specifically on moving from Torque to SLURM.
- Ross Dickson (talk) 13:07, 7 March 2017 (UTC)