MIG is only compatible with Linux distributions that support CUDA 11/R450 or higher. It’s also a good idea to use the NVIDIA Datacenter Linux driver version 450.80.02 or higher. MIG is not supported on any Windows Pro or Server OS.
The new Multi-Instance GPU (MIG) functionality allows NVIDIA Ampere-based GPUs (such as the NVIDIA A100) to be securely partitioned into up to seven independent GPU Instances for CUDA applications, giving different users separate GPU resources for optimal GPU use.
This functionality is especially useful for workloads that do not fully utilize the GPU’s compute capacity, where users may want to run several tasks in parallel to get the most out of the GPU.
The following prerequisites and software versions are highly recommended when using supported GPUs in MIG mode.
MIG works only on supported GPUs and systems, with the following software:
- CUDA 11 with NVIDIA driver 450.80.02 or later
- A Linux distribution supported by CUDA 11
If you are using Kubernetes or containers, make sure to use:
- Version 2.5.0 or later of the NVIDIA Container Toolkit (nvidia-docker2)
- v0.7.0 or newer of the NVIDIA K8s Device Plugin
- Version 0.2.0 or later of NVIDIA gpu-feature-discovery
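The driver requirement above can be checked with a short script. This is a minimal sketch, assuming the driver version string has already been read (for example from the "Driver Version" field that nvidia-smi prints). The helper names parse_version and meets_minimum are hypothetical and not part of any NVIDIA tool.

```python
# Minimum driver version for MIG, as stated in the prerequisites above.
MIN_DRIVER = "450.80.02"


def parse_version(version):
    """Turn a dotted version string such as '450.80.02' into (450, 80, 2)."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(installed, minimum=MIN_DRIVER):
    """Return True if the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # 510.39.01 is the driver version that appears in the nvidia-smi output in this post.
    print(meets_minimum("510.39.01"))
```

Comparing integer tuples rather than raw strings avoids mistakes such as "450.9" sorting after "450.80".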
NVIDIA Management Library (NVML) APIs or the nvidia-smi command-line interface can be used to manage MIG programmatically. Some of the nvidia-smi output in the following examples may be cropped for brevity to highlight the relevant sections of interest.
Enable or Disable MIG
Before enabling or disabling MIG, whether on all GPUs or on a specific GPU, first enable persistence mode on all GPUs.
nvidia-smi -pm 1
root@server:~# nvidia-smi -pm 1
Enabled persistence mode for GPU 00000000:31:00.0.
Enabled persistence mode for GPU 00000000:CA:00.0.
All done.
Then enable MIG by using the following command.
root@server:~# nvidia-smi -mig 1
root@server:~# nvidia-smi
Tue Feb 22 11:47:45 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 510.39.01    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100 80G...  On   | 00000000:31:00.0 Off |                   On |
| N/A   34C    P0    41W / 300W |      0MiB / 81920MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100 80G...  On   | 00000000:CA:00.0 Off |                   On |
| N/A   34C    P0    43W / 300W |      0MiB / 81920MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  No MIG devices found                                                       |
+-----------------------------------------------------------------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
To disable MIG, use the following command.
nvidia-smi -mig 0
root@server:~# nvidia-smi -mig 0
Disabled MIG Mode for GPU 00000000:31:00.0
Disabled MIG Mode for GPU 00000000:CA:00.0
All done.
To enable or disable MIG on a specific GPU, pass its index with the -i option.
nvidia-smi -i 3 -mig 1
nvidia-smi -i 3 -mig 0
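The enable and disable commands above differ only in the -mig value and the optional -i index, so they are easy to generate from a script. This is a minimal sketch: the function names mig_mode_command and set_mig_mode are hypothetical, and set_mig_mode assumes that nvidia-smi is on the PATH and that the script runs with sufficient privileges.

```python
import subprocess


def mig_mode_command(enable, gpu_index=None):
    """Build the nvidia-smi argument list that enables or disables MIG mode."""
    args = ["nvidia-smi"]
    if gpu_index is not None:
        # -i restricts the command to one GPU; without it, all GPUs are affected.
        args += ["-i", str(gpu_index)]
    args += ["-mig", "1" if enable else "0"]
    return args


def set_mig_mode(enable, gpu_index=None):
    """Run the command and return its output (assumes driver and root access)."""
    result = subprocess.run(
        mig_mode_command(enable, gpu_index),
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call set_mig_mode() to actually run it.
    print(" ".join(mig_mode_command(True, gpu_index=3)))
```

Building the argument list separately from running it makes the command easy to inspect or log before it touches the GPU.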
List all GPU Instance Profiles
root@server:~# nvidia-smi mig -lgip
+-----------------------------------------------------------------------------+
| GPU instance profiles:                                                      |
| GPU   Name             ID    Instances   Memory     P2P    SM    DEC   ENC  |
|                              Free/Total   GiB              CE    JPEG  OFA  |
|=============================================================================|
|   0  MIG 1g.10gb       19     7/7        9.50       No     14     0     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.10gb+me    20     1/1        9.50       No     14     1     0   |
|                                                             1     1     1   |
+-----------------------------------------------------------------------------+
|   0  MIG 2g.20gb       14     3/3       19.50       No     28     1     0   |
|                                                             2     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 3g.40gb        9     2/2       39.25       No     42     2     0   |
|                                                             3     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 4g.40gb        5     1/1       39.25       No     56     2     0   |
|                                                             4     0     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 7g.80gb        0     1/1       78.75       No     98     5     0   |
|                                                             7     1     1   |
+-----------------------------------------------------------------------------+
|   1  MIG 1g.10gb       19     7/7        9.50       No     14     0     0   |
|                                                             1     0     0   |
+-----------------------------------------------------------------------------+
|   1  MIG 1g.10gb+me    20     1/1        9.50       No     14     1     0   |
|                                                             1     1     1   |
+-----------------------------------------------------------------------------+
|   1  MIG 2g.20gb       14     3/3       19.50       No     28     1     0   |
|                                                             2     0     0   |
+-----------------------------------------------------------------------------+
|   1  MIG 3g.40gb        9     2/2       39.25       No     42     2     0   |
|                                                             3     0     0   |
+-----------------------------------------------------------------------------+
|   1  MIG 4g.40gb        5     1/1       39.25       No     56     2     0   |
|                                                             4     0     0   |
+-----------------------------------------------------------------------------+
|   1  MIG 7g.80gb        0     1/1       78.75       No     98     5     0   |
|                                                             7     1     1   |
+-----------------------------------------------------------------------------+
Create GPU instances on a MIG-enabled GPU
root@server:~# nvidia-smi mig -cgi 19,19,19,19,19,19,19 -i 3
Successfully created GPU instance ID 13 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 11 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 12 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 7 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 8 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 9 on GPU 0 using profile MIG 1g.5gb (ID 19)
Successfully created GPU instance ID 10 on GPU 0 using profile MIG 1g.5gb (ID 19)
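When GPU instances are created from a script, the instance IDs reported in the output above are usually needed for later steps. This is a minimal sketch that extracts them with a regular expression. It assumes the message wording matches the output shown in this post, and the function name parse_created_gpu_instances is hypothetical.

```python
import re

# Matches lines such as:
#   Successfully created GPU instance ID 13 on GPU 0 using profile MIG 1g.5gb (ID 19)
CREATED_GI = re.compile(
    r"Successfully created GPU instance ID\s+(\d+)\s+on GPU\s+(\d+)\s+"
    r"using profile\s+(MIG \S+)\s+\(ID\s+(\d+)\)"
)


def parse_created_gpu_instances(output):
    """Return one dict per GPU instance reported in the -cgi output."""
    return [
        {"gi_id": int(gi), "gpu": int(gpu), "profile": name, "profile_id": int(pid)}
        for gi, gpu, name, pid in CREATED_GI.findall(output)
    ]


if __name__ == "__main__":
    sample = (
        "Successfully created GPU instance ID 13 on GPU 0 using profile MIG 1g.5gb (ID 19)\n"
        "Successfully created GPU instance ID 11 on GPU 0 using profile MIG 1g.5gb (ID 19)\n"
    )
    for instance in parse_created_gpu_instances(sample):
        print(instance)
```

The extracted gi_id values can then be passed to the -gi option of later nvidia-smi mig commands.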
Create Compute instances on a MIG-enabled GPU
root@server:~# nvidia-smi mig -cci -i 3
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 7 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 8 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 9 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 10 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 11 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 12 using profile MIG 1g.5gb (ID 0)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 13 using profile MIG 1g.5gb (ID 0)
View MIG Instances
root@server:~# nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    7   0   0  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    8   0   1  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    9   0   2  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0   10   0   3  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0   11   0   4  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0   12   0   5  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0   13   0   6  |      1MiB /  4864MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /  8191MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
root@server:~# nvidia-smi -L
GPU 0: A100-PCIE-40GB (UUID: GPU-0069414c-9f30-41f9-d5d8-87890423f0c4)
  MIG 1g.5gb Device 0: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/7/0)
  MIG 1g.5gb Device 1: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/8/0)
  MIG 1g.5gb Device 2: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/9/0)
  MIG 1g.5gb Device 3: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/10/0)
  MIG 1g.5gb Device 4: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/11/0)
  MIG 1g.5gb Device 5: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/12/0)
  MIG 1g.5gb Device 6: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/13/0)
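The MIG device UUIDs listed by nvidia-smi -L can be given to a CUDA application through the CUDA_VISIBLE_DEVICES environment variable so that it runs on one specific MIG device. This is a minimal sketch that parses the listing and builds that environment setting. It assumes the line layout shown above, and the function names parse_mig_devices and env_for_device are hypothetical.

```python
import re

# Matches MIG device lines such as:
#   MIG 1g.5gb Device 0: (UUID: MIG-GPU-0069414c-.../7/0)
MIG_LINE = re.compile(r"MIG\s+(\S+)\s+Device\s+(\d+):\s+\(UUID:\s+(MIG-[^)\s]+)\)")


def parse_mig_devices(listing):
    """Return (profile, device_index, uuid) for every MIG device in the listing."""
    return [(profile, int(index), uuid)
            for profile, index, uuid in MIG_LINE.findall(listing)]


def env_for_device(uuid):
    """Environment fragment that restricts a CUDA process to one MIG device."""
    return {"CUDA_VISIBLE_DEVICES": uuid}


if __name__ == "__main__":
    sample = (
        "GPU 0: A100-PCIE-40GB (UUID: GPU-0069414c-9f30-41f9-d5d8-87890423f0c4)\n"
        "  MIG 1g.5gb Device 0: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/7/0)\n"
        "  MIG 1g.5gb Device 1: (UUID: MIG-GPU-0069414c-9f30-41f9-d5d8-87890423f0c4/8/0)\n"
    )
    for profile, index, uuid in parse_mig_devices(sample):
        print(index, profile, env_for_device(uuid))
```

The returned dictionary can be merged into os.environ or passed as the env argument of subprocess.run when launching the workload.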
List GPU Instances
root@server:~# nvidia-smi mig -lgi
List Compute Instances
root@server:~# nvidia-smi mig -lci
+-------------------------------------------------------+
| Compute instances:                                    |
| GPU     GPU       Name             Profile   Instance |
|       Instance                       ID        ID     |
|         ID                                            |
|=======================================================|
|   0      7       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0      8       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0      9       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0     11       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0     12       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0     13       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
|   0     14       MIG 1g.5gb           0         0     |
+-------------------------------------------------------+
Destroy Compute Instances
root@server:~# nvidia-smi mig -dci -i 3
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 7
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 8
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 9
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 10
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 11
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 12
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 13
Destroy GPU Instances
root@server:~# nvidia-smi mig -dgi -i 3
Successfully destroyed GPU instance ID 7 from GPU 0
Successfully destroyed GPU instance ID 8 from GPU 0
Successfully destroyed GPU instance ID 9 from GPU 0
Successfully destroyed GPU instance ID 10 from GPU 0
Successfully destroyed GPU instance ID 11 from GPU 0
Successfully destroyed GPU instance ID 12 from GPU 0
Successfully destroyed GPU instance ID 13 from GPU 0
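Teardown follows the order shown above: compute instances are destroyed first, then the GPU instances that contained them. This is a minimal sketch that produces that command sequence for one GPU, optionally followed by disabling MIG mode. The function name teardown_commands is hypothetical, and the sketch only builds the argument lists without running them.

```python
def teardown_commands(gpu_index, disable_mig=False):
    """Return the nvidia-smi commands that remove all MIG instances on one GPU.

    Compute instances are destroyed before the GPU instances that contain
    them, matching the order used in this post.
    """
    gpu = str(gpu_index)
    commands = [
        ["nvidia-smi", "mig", "-dci", "-i", gpu],  # destroy compute instances
        ["nvidia-smi", "mig", "-dgi", "-i", gpu],  # then destroy GPU instances
    ]
    if disable_mig:
        # Optionally switch the GPU back out of MIG mode afterwards.
        commands.append(["nvidia-smi", "-i", gpu, "-mig", "0"])
    return commands


if __name__ == "__main__":
    for command in teardown_commands(3, disable_mig=True):
        print(" ".join(command))
```

Each list can be handed to subprocess.run on a host where nvidia-smi is installed.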
Getting MIG Help
root@server:~# nvidia-smi mig -h

mig -- Multi Instance GPU management.

Usage: nvidia-smi mig [options]

Options include:
[-h | --help]: Display help information.
[-i | --id]: Enumeration index, PCI bus ID or UUID. Provide comma separated values for more than one device.
[-gi | --gpu-instance-id]: GPU instance ID. Provide comma separated values for more than one GPU instance.
[-ci | --compute-instance-id]: Compute instance ID. Provide comma separated values for more than one compute instance.
[-lgip | --list-gpu-instance-profiles]: List supported GPU instance profiles. Option -i can be used to restrict the command to run on a specific GPU.
[-lgipp | --list-gpu-instance-possible-placements]: List possible GPU instance placements in the following format, {Start}:Size. Option -i can be used to restrict the command to run on a specific GPU.
[-C | --default-compute-instance]: Create compute instance with the default profile when used with the option to create a GPU instance (-cgi).
[-cgi | --create-gpu-instance]: Create GPU instances for the given profile tuples. A profile tuple consists of a profile name or ID and an optional placement specifier, which consists of a colon and a placement start index. Provide comma separated values for more than one profile tuple. Option -i can be used to restrict the command to run on a specific GPU.
[-dgi | --destroy-gpu-instance]: Destroy GPU instances. Options -i and -gi can be used individually or combined to restrict the command to run on a specific GPU or GPU instance.
[-lgi | --list-gpu-instances]: List GPU instances. Option -i can be used to restrict the command to run on a specific GPU.
[-lcip | --list-compute-instance-profiles]: List supported compute instance profiles. Options -i and -gi can be used individually or combined to restrict the command to run on a specific GPU or GPU instance.
[-cci | --create-compute-instance]: Create compute instance for the given profile name or IDs. Provide comma separated values for more than one profile. If no profile name or ID is given, then the default* compute instance profile ID will be used. Options -i and -gi can be used individually or combined to restrict the command to run on a specific GPU or GPU instance.
[-dci | --destroy-compute-instance]: Destroy compute instances. Options -i, -gi and -ci can be used individually or combined to restrict the command to run on a specific GPU or GPU instance or compute instance.
[-lci | --list-compute-instances]: List compute instances. Options -i and -gi can be used individually or combined to restrict the command to run on a specific GPU or GPU instance.
Do you also have a cheat sheet on how MIG works in ESXi?
I am having a hard time making it work on ESXi.
The VMs don’t power on after I attach a vGPU GRID profile to them.
Hi, have you installed the GRID software in ESXi? Make sure the VM runs a Linux-based OS such as Ubuntu. MIG is not supported on Windows. MIG also needs to be enabled on the GPU card, such as the A100.
Hi, secondly, MIG cannot be enabled or configured in ESXi itself. Just enable the MIG-compatible GPU and attach it to the VM. MIG is then enabled inside the VM using nvidia-smi. Also make sure the NVIDIA license is installed in ESXi for using the GPU.