= Available configurations = <!--T:6-->
As of July 30, 2024, only Narval has a few A100 nodes configured with MIG.
While there exist [https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#a100-profiles many possible MIG configurations and profiles], only the following two profiles have been implemented on selected GPUs:
* <code>3g.20gb</code>
* <code>4g.20gb</code>

<!--T:7-->
The profile name describes the size of the GPU instance.
For example, a <code>3g.20gb</code> instance has 20 GB of GPU RAM and offers 3/8 of the computing performance of a full A100-40gb GPU. Using a less powerful MIG profile has a lower impact on your allocation and priority than requesting a full GPU.
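Once a job with a MIG instance is running, you can check from inside the job which instance was assigned to it. As a minimal sketch, listing the visible devices with <code>nvidia-smi</code> should show the MIG instance rather than a full GPU:
<pre>
# Run inside a job that was allocated a MIG instance; a 3g.20gb profile
# appears as a "MIG 3g.20gb" device under its parent A100 GPU.
nvidia-smi -L
</pre>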

<!--T:8-->
On Narval, the recommended maximum number of cores and amount of system memory per GPU instance are:
* <code>3g.20gb</code>: maximum 6 cores and 62 GB
* <code>4g.20gb</code>: maximum 6 cores and 62 GB

<!--T:9-->
To request a GPU instance of a certain profile, your job submission must include a <code>--gres</code> parameter:
* <code>3g.20gb</code>: <code>--gres=gpu:a100_3g.20gb:1</code>
* <code>4g.20gb</code>: <code>--gres=gpu:a100_4g.20gb:1</code>
Note: for the job scheduler on Narval, the prefix <code>a100_</code> is required before the profile name.
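As an illustration, a minimal job script requesting one <code>3g.20gb</code> instance together with the recommended maximum of 6 cores and 62 GB of system memory could look like the sketch below; the account name and the final command are placeholders to adapt to your own work:
<pre>
#!/bin/bash
#SBATCH --account=def-someuser       # placeholder: use your own account
#SBATCH --gres=gpu:a100_3g.20gb:1    # one 3g.20gb MIG instance (note the a100_ prefix)
#SBATCH --cpus-per-task=6            # recommended maximum of 6 cores per instance
#SBATCH --mem=62G                    # recommended maximum of 62 GB per instance
#SBATCH --time=01:00:00

python my_script.py                  # placeholder workload
</pre>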

= Examples = <!--T:10-->