Allocations and compute scheduling



<!--T:47-->
Because roughly half of our users primarily use single-precision floating-point operations ([https://en.wikipedia.org/wiki/Single-precision_floating-point_format FP32]), the other half use half-precision floating-point operations ([https://en.wikipedia.org/wiki/Half-precision_floating-point_format FP16]), and a significant portion of all users are constrained by the amount of memory on the GPU, we chose the following evaluation criteria and corresponding weights to rank the different GPU models:
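The weighted ranking described above can be sketched as a simple weighted sum. This is an illustrative sketch only: the function name, the memory weight of 20% (inferred as the remainder after the two 40% performance weights), and the normalization to a 0–1 scale are assumptions, not part of the official scoring implementation.

```python
# Hypothetical sketch of the weighted GPU ranking; weights for FP32 and
# FP16 come from the table below, the memory weight is assumed.
WEIGHTS = {"fp32": 0.40, "fp16": 0.40, "memory": 0.20}

def gpu_score(fp32: float, fp16: float, memory: float) -> float:
    """Combine per-criterion scores (each normalized to 0..1, e.g. relative
    to the best GPU model in the comparison) into a single rank score."""
    return (WEIGHTS["fp32"] * fp32
            + WEIGHTS["fp16"] * fp16
            + WEIGHTS["memory"] * memory)
```

A model that leads in every criterion would score 1.0; one with half the FP16 throughput of the leader but equal FP32 and memory would score 0.40 + 0.20 + 0.20 = 0.80 under these assumed weights.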


<!--T:48-->
! scope="col"| Weight
|-
! scope="row"| FP32 score <small>(with dense matrices on regular GPU cores)</small>
| 40%
|-
! scope="row"| FP16 score <small>(with dense matrices on <em>Tensor Cores</em>)</small>
| 40%
|-