Allocations and compute scheduling/en: Difference between revisions

Updating to match new version of source page
Line 2: Line 2:




''Parent page: [[Job scheduling policies]]''
<i>Parent page: [[Job scheduling policies]]</i>


= What is an allocation? =
= What is an allocation? =


'''An allocation is an amount of resources that a research group can target for use for a period of time, usually a year.''' This amount is either a maximum amount, as is the case for storage, or an average amount of usage over the period, as is the case for shared resources like computation cores.
<b>An allocation is an amount of resources that a research group can target for use for a period of time, usually a year.</b> This amount is either a maximum amount, as is the case for storage, or an average amount of usage over the period, as is the case for shared resources like computation cores.


Allocations are usually made in terms of core years, GPU years, or storage space. Storage allocations are the most straightforward to understand: research groups will get a maximum amount of storage that they can use exclusively throughout the allocation period. Core year and GPU year allocations are more difficult to understand because these allocations are meant to capture average use throughout the allocation period---typically meant to be a year---and this use will occur across a set of resources shared with other research groups.
Allocations are usually made in terms of core years, GPU years, or storage space. Storage allocations are the most straightforward to understand: research groups will get a maximum amount of storage that they can use exclusively throughout the allocation period. Core year and GPU year allocations are more difficult to understand because these allocations are meant to capture average use throughout the allocation period---typically meant to be a year---and this use will occur across a set of resources shared with other research groups.
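As a rough illustration of what "average use" means for a core-year allocation, the following sketch converts core years to core hours and computes the average number of cores a group has held over an elapsed period. The function names and the 365-day year are assumptions made for this example; they are not part of the CCDB or of any scheduler.

```python
# Illustrative arithmetic only; names and the 365-day year are assumptions.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def core_years_to_core_hours(core_years):
    """Convert an allocation in core years to the equivalent core hours."""
    return core_years * HOURS_PER_YEAR


def average_cores_used(core_hours_used, hours_elapsed):
    """Average number of cores in use over the elapsed part of the period."""
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    return core_hours_used / hours_elapsed


# A 10 core-year allocation is a target of 10 cores on average,
# which amounts to 87,600 core hours over a full year.
print(core_years_to_core_hours(10))
# A group that used 7,200 core hours in the first 720 hours (30 days)
# has averaged 10 cores, which matches a 10 core-year target.
print(average_cores_used(7200, 720))
```

A group whose average is above its target has not done anything wrong; the comparison only indicates where its usage stands relative to the allocation.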
Line 16: Line 16:
== Viewing group usage of compute resources ==
== Viewing group usage of compute resources ==


[[File:Select view group usage edit.png|thumb|Navigation to ''View Group Usage'']]
[[File:Select view group usage edit.png|thumb|Navigation to <i>View Group Usage</i>]]
Information on the usage of compute resources by your groups can be found by logging into the CCDB and navigating to ''My Account > View Group Usage''.
Information on the usage of compute resources by your groups can be found by logging into the CCDB and navigating to <i>My Account > View Group Usage</i>.
<br clear=all>
<br clear=all>


Line 24: Line 24:


The first tab bar offers these options:
The first tab bar offers these options:
: '''By Compute Resource''': cluster on which jobs are submitted;  
: <b>By Compute Resource</b>: cluster on which jobs are submitted;  
: '''By Resource Allocation Project''': projects to which jobs are submitted;
: <b>By Resource Allocation Project</b>: projects to which jobs are submitted;
: '''By Submitter''': user that submits the jobs;
: <b>By Submitter</b>: user that submits the jobs;
: '''Storage usage''' is discussed in [[Storage and file management]].  
: <b>Storage usage</b> is discussed in [[Storage and file management]].  


=== Usage by compute resource===
=== Usage by compute resource===
Line 36: Line 36:


[[File:Ccdb_view_use_by_compute_resource_monthly.png|thumb|Usage by compute resource with monthly breakdown]]
[[File:Ccdb_view_use_by_compute_resource_monthly.png|thumb|Usage by compute resource with monthly breakdown]]
From the ''Extra Info'' column of the usage table ''Show monthly usage'' can be clicked to display a further breakdown of the usage by month for the specific cluster row in the table. By clicking ''Show submitter usage'', a similar breakdown is displayed for the specific users submitting the jobs on the cluster.
From the <i>Extra Info</i> column of the usage table <i>Show monthly usage</i> can be clicked to display a further breakdown of the usage by month for the specific cluster row in the table. By clicking <i>Show submitter usage</i>, a similar breakdown is displayed for the specific users submitting the jobs on the cluster.
<br clear=all>
<br clear=all>


Line 42: Line 42:
===Usage by resource allocation project===
===Usage by resource allocation project===
[[File:Ccdb view use by compute resource monthly proj edit.png|thumb|Usage by Resource Allocation Project with monthly breakdown]]
[[File:Ccdb view use by compute resource monthly proj edit.png|thumb|Usage by Resource Allocation Project with monthly breakdown]]
Under this tab, a third tab bar displays the RAPIs (Resource Allocation Project Identifiers) for the selected allocation year. The tables contain detailed information for each allocation project and the resources used by the projects on all of the clusters. The top of the page summarizes information such as the account name (e.g. def-, rrg- or rpp-*, etc), the project title and ownership, as well as allocation and usage summaries.
Under this tab, a third tab bar displays the RAPIs (Resource Allocation Project Identifiers) for the selected allocation year. The tables contain detailed information for each allocation project and the resources used by the projects on all of the clusters. The top of the page summarizes information such as the account name (e.g. def-, rrg- or rpp-*, etc.), the project title and ownership, as well as allocation and usage summaries.
<br clear=all>
<br clear=all>


Line 53: Line 53:
== What happens if my group overuses my CPU or GPU allocation? ==
== What happens if my group overuses my CPU or GPU allocation? ==


Nothing bad.  Your CPU or GPU allocation is a target level, i.e., a target number of CPUs or GPUs.  If you have jobs waiting to run, and competing demand is low enough, then the scheduler may allow more of your jobs to run than your target level.  The only consequence of this is that succeeding jobs of yours ''may'' have lower priority for a time while the scheduler prioritizes other groups which were below their target.  You are not prevented from submitting or running new jobs, and the time-average of your usage should still be close to your target, that is, your allocation.
Nothing bad.  Your CPU or GPU allocation is a target level, i.e., a target number of CPUs or GPUs.  If you have jobs waiting to run, and competing demand is low enough, then the scheduler may allow more of your jobs to run than your target level.  The only consequence of this is that succeeding jobs of yours <i>may</i> have lower priority for a time while the scheduler prioritizes other groups which were below their target.  You are not prevented from submitting or running new jobs, and the average of your usage over time should still be close to your target, that is, your allocation.


It is even possible that you could end a month or even a year having run more work than your allocation would seem to allow, although this is unlikely given the demand on our resources.
It is even possible that you could end a month or even a year having run more work than your allocation would seem to allow, although this is unlikely given the demand on our resources.
Line 59: Line 59:
=How does scheduling work?=
=How does scheduling work?=


Compute-related resources granted by core-year and GPU-year allocations require research groups to submit what are referred to as “jobs” to a “scheduler”. A job is a combination of a computer program (an application) and a list of resources that the application is expected to use. The [[What is a scheduler?|scheduler]] is a program that calculates the priority of each job submitted and provides the needed resources based on the priority of each job and the available resources.
Compute-related resources granted by core-year and GPU-year allocations require research groups to submit what are referred to as <i>jobs</i> to a <i>scheduler</i>. A job is a combination of a computer program (an application) and a list of resources that the application is expected to use. The [[What is a scheduler?|scheduler]] is a program that calculates the priority of each job submitted and provides the needed resources based on the priority of each job and the available resources.


The scheduler uses prioritization algorithms to meet the allocation targets of all groups and it is based on a research group’s recent usage of the system as compared to their allocated usage on that system. The past of the allocation period is taken into account but the most weight is put on recent usage (or non-usage). The point of this is to allow a research group that matches their actual usage with their allocated amounts to operate roughly continuously at that level. This smooths resource usage over time across all groups and resources, allowing for it to be theoretically possible for all research groups to hit their allocation targets.
The scheduler uses prioritization algorithms to meet the allocation targets of all groups and it is based on a research group’s recent usage of the system as compared to their allocated usage on that system. The past of the allocation period is taken into account but the most weight is put on recent usage (or non-usage). The point of this is to allow a research group that matches their actual usage with their allocated amounts to operate roughly continuously at that level. This smooths resource usage over time across all groups and resources, allowing for it to be theoretically possible for all research groups to hit their allocation targets.
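The idea that recent usage carries the most weight can be sketched with an exponentially decaying sum. This is only an illustration: the text above does not specify the weighting actually used by the scheduler, so the decay form, the function name and the half-life value are all assumptions.

```python
# Hypothetical weighting scheme; the real scheduler's formula and
# half-life are not given in this page.
def decayed_usage(daily_core_hours, half_life_days=7.0):
    """Sum daily usage so that older days count for less.

    daily_core_hours is ordered oldest first; the last entry is the most
    recent day and is counted at full weight. Usage that is
    half_life_days old is counted at half weight.
    """
    newest = len(daily_core_hours) - 1
    total = 0.0
    for day, usage in enumerate(daily_core_hours):
        age_in_days = newest - day
        total += usage * 0.5 ** (age_in_days / half_life_days)
    return total


# The same 100 core hours weigh more when they were used recently.
print(decayed_usage([100, 0], half_life_days=1.0))  # used yesterday
print(decayed_usage([0, 100], half_life_days=1.0))  # used today
```

Under a scheme like this, a group that stops submitting jobs sees its weighted usage shrink over time, which is consistent with the description that non-usage is also taken into account.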
Line 65: Line 65:
=How does resource use affect priority?=
=How does resource use affect priority?=


The overarching principle governing the calculation of priority on Alliance's national clusters is that compute-based jobs are considered in the calculation based on the resources that others are prevented from using and not on the resources actually used.
The overarching principle governing the calculation of priority on our national clusters is that compute-based jobs are considered in the calculation based on the resources that others are prevented from using and not on the resources actually used.


The most common example of unused cores contributing to a priority calculation occurs when a submitted job requests multiple cores but uses fewer cores than requested when run. The usage that will affect the priority of future jobs is the number of cores requested, not the number of cores the application actually used. This is because the unused cores were unavailable to others to use during the job.
The most common example of unused cores contributing to a priority calculation occurs when a submitted job requests multiple cores but uses fewer cores than requested when run. The usage that will affect the priority of future jobs is the number of cores requested, not the number of cores the application actually used. This is because the unused cores were unavailable to others to use during the job.
Line 81: Line 81:
Cedar and Graham are considered to provide 4GB per core, since this corresponds to the most common node type in those clusters, making a core equivalent on these systems a core-memory bundle of 4GB per core. Niagara is considered to provide 4.8GB of memory per core, making a core equivalent on it a core-memory bundle of 4.8GB per core. Jobs are charged in terms of core equivalent usage at the rate of 4 or 4.8 GB per core, as explained above.  See Figure 1.
Cedar and Graham are considered to provide 4GB per core, since this corresponds to the most common node type in those clusters, making a core equivalent on these systems a core-memory bundle of 4GB per core. Niagara is considered to provide 4.8GB of memory per core, making a core equivalent on it a core-memory bundle of 4.8GB per core. Jobs are charged in terms of core equivalent usage at the rate of 4 or 4.8 GB per core, as explained above.  See Figure 1.
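The core-equivalent charge can be sketched as the larger of the requested cores and the requested memory expressed in core-memory bundles. This is a minimal sketch based on the 4GB and 4.8GB per core figures above; the function and parameter names are hypothetical and do not correspond to any scheduler command or API.

```python
# Minimal sketch; names are hypothetical, bundle sizes come from the text.
def core_equivalents(requested_cores, requested_memory_gb, gb_per_core=4.0):
    """Core equivalents charged for a job.

    The charge is based on what the job requests (and so keeps others
    from using), not on what the application actually consumes. It is
    the maximum of the core count and the memory divided by the
    per-core memory of the bundle.
    """
    memory_in_bundles = requested_memory_gb / gb_per_core
    return max(requested_cores, memory_in_bundles)


# With a 4GB bundle (Cedar, Graham):
print(core_equivalents(2, 8))   # cores and memory both amount to 2
print(core_equivalents(1, 8))   # memory dominates
print(core_equivalents(4, 4))   # cores dominate
# With a 4.8GB bundle (Niagara):
print(core_equivalents(1, 9.6, gb_per_core=4.8))
```

A job that requests one core and 8GB is therefore counted like a two-core job on a 4GB-per-core system, because the extra memory it holds is unavailable to the core that would otherwise be paired with it.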


Allocation target tracking is straightforward when requests to use resources on the clusters are made entirely of core and memory amounts that can be portioned only into complete equivalent cores. Things become more complicated when jobs request portions of a core equivalent because it is possible to have many points counted against a research group’s allocation, even when they are using only portions of core equivalents. In practice, the method used by Alliance to account for system usage solves problems about fairness and perceptions of fairness but unfortunately the method is not initially intuitive.
Allocation target tracking is straightforward when requests to use resources on the clusters are made entirely of core and memory amounts that can be portioned only into complete equivalent cores. Things become more complicated when jobs request portions of a core equivalent because it is possible to have many points counted against a research group’s allocation, even when they are using only portions of core equivalents. In practice, the method used by the Alliance to account for system usage solves problems about fairness and perceptions of fairness but unfortunately the method is not initially intuitive.


Research groups are charged for the maximum number of core equivalents they take from the resources. Assuming a core equivalent of 1 core and 4GB of memory:
Research groups are charged for the maximum number of core equivalents they take from the resources. Assuming a core equivalent of 1 core and 4GB of memory:
Line 106: Line 106:
[[File:GPU_and_a_half_(memory).png|frame|center|Figure 7 - 1.5 GPU equivalents, based on memory.]] <br clear=all>
[[File:GPU_and_a_half_(memory).png|frame|center|Figure 7 - 1.5 GPU equivalents, based on memory.]] <br clear=all>


== Ratios: GPU / CPU Cores / System-memory ==
== Ratios: GPU / CPU cores / System memory ==
Alliance systems have the following GPU-core-memory bundle characteristics:
Alliance systems have the following GPU-core-memory bundle characteristics:
* [[Béluga/en#Node_Characteristics|Béluga]]:
* [[Béluga/en#Node_Characteristics|Béluga]]: