<!--T:47-->
Because roughly half of our users primarily use single-precision floating-point operations ([https://en.wikipedia.org/wiki/Single-precision_floating-point_format FP32]), the other half use half-precision floating-point operations ([https://en.wikipedia.org/wiki/Half-precision_floating-point_format FP16]), and a significant portion of all users are constrained by the amount of memory on the GPU, we chose the following evaluation criteria and corresponding weights to rank the different GPU models:
<!--T:48-->
! scope="col"| Weight | ! scope="col"| Weight | ||
|- | |- | ||
! scope="row"| FP32 score | ! scope="row"| FP32 score <small>(with dense matrices on regular GPU cores)</small> | ||
| 40% | | 40% | ||
|- | |- | ||
! scope="row"| FP16 score | ! scope="row"| FP16 score <small>(with dense matrices on <em>Tensor Cores</em>)</small> | ||
| 40% | | 40% | ||
|- | |- |
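As a rough illustration of how these weights combine into a single ranking value, the sketch below computes a weighted sum of normalized per-criterion scores. Only the two 40% weights come from the table above; the remaining memory weight, the criterion names, the function name, and the sample numbers are illustrative assumptions, not the actual scoring tooling.

<syntaxhighlight lang="python">
# Minimal sketch (assumptions noted below): combine normalized per-criterion
# scores into one weighted ranking value for a GPU model.

def weighted_gpu_score(scores, weights):
    """Return the weighted sum of per-criterion scores (each in the 0-1 range)."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

weights = {
    "fp32_dense": 0.40,   # FP32 score, dense matrices on regular GPU cores (from the table)
    "fp16_dense": 0.40,   # FP16 score, dense matrices on Tensor Cores (from the table)
    "gpu_memory": 0.20,   # assumed weight for the memory criterion (not shown above)
}

# Example scores, normalized against the best GPU model in each category (assumption).
example_scores = {"fp32_dense": 0.75, "fp16_dense": 0.60, "gpu_memory": 1.00}

print(f"Weighted score: {weighted_gpu_score(example_scores, weights):.2f}")
</syntaxhighlight>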