Allocations and compute scheduling

The performance of GPUs has increased dramatically in recent years and continues to do so. Until RAC 2023 we treated all GPU models as equivalent to each other for allocation purposes. This caused problems both in the allocation process and while running jobs, so in the 2024 RAC year we introduced the <i>reference GPU unit</i>, or <b>RGU</b>, to rank all GPU models in production and alleviate these problems. In the 2025 RAC year we must also deal with the new complexity introduced by [[Multi-Instance GPU|multi-instance GPU technology]].


Because roughly half of our users primarily use single-precision floating-point operations ([https://en.wikipedia.org/wiki/Single-precision_floating-point_format FP32]), while the other half use half-precision floating-point operations ([https://en.wikipedia.org/wiki/Half-precision_floating-point_format FP16]), and a significant portion of all users are constrained by the amount of memory on the GPU, we chose the following evaluation criteria and corresponding weights to rank the different GPU models:
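As a minimal Python sketch of how weighted criteria like these could be combined into a single RGU value: each model's score on a criterion is its performance normalized to a chosen reference GPU, and the RGU is the weighted sum. The 20% memory weight, the dictionary keys, and the function name here are assumptions of this illustration, not definitions from this page.

```python
# Hypothetical illustration of a weighted RGU calculation.
# Each score is a GPU model's performance on one criterion, normalized
# so that the reference GPU model scores exactly 1.0 on every criterion.
# The FP32/FP16 weights match the table; the memory weight is assumed.
WEIGHTS = {"fp32": 0.4, "fp16": 0.4, "memory": 0.2}

def rgu(scores: dict[str, float]) -> float:
    """Weighted sum of normalized per-criterion scores."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# By construction, the reference GPU itself is worth exactly 1 RGU.
reference = {"fp32": 1.0, "fp16": 1.0, "memory": 1.0}
print(rgu(reference))  # 1.0

# A model twice as fast at FP32, three times as fast at FP16,
# with the same memory, would score 0.4*2 + 0.4*3 + 0.2*1 = 2.2 RGU.
faster = {"fp32": 2.0, "fp16": 3.0, "memory": 1.0}
print(rgu(faster))  # 2.2
```

A weighted sum like this keeps the ranking simple: doubling a model's performance on a criterion raises its RGU in proportion to that criterion's weight.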


{| class="wikitable" style="margin: auto;"
! scope="col"| Weight
|-
! scope="row"| FP32 score <small>(with dense matrices on regular GPU cores)</small>
| 40%
|-
! scope="row"| FP16 score <small>(with dense matrices on <em>Tensor Cores</em>)</small>
| 40%
|-