Step 0: Introduction
This guide explains how to assign an NVIDIA GPU to an LXC container to enable GPU-accelerated tasks. Using LXC containers offers advantages over running directly on the host, such as better separation of services, easier network management, and improved overall security. Let’s begin.
Step 1: Install base packages
To enable GPU passthrough in an LXC container, the first step is to ensure your host system is properly prepared. Begin by installing the packages needed to compile and manage out-of-tree kernel modules, along with the headers for your running kernel, since DKMS cannot build the NVIDIA module without them. On the LXC host, run the following commands:
apt-get update
apt-get install build-essential dkms linux-headers-$(uname -r)
(The kernel headers package name varies by distribution; linux-headers-$(uname -r) is the Debian/Ubuntu convention.) These packages provide the foundational tools required to build and maintain out-of-tree kernel modules, which are necessary for the NVIDIA driver installation process.
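As a quick sanity check, you can confirm the toolchain and matching kernel headers are in place before running the installer (a minimal sketch; the headers path shown is the usual Debian/Ubuntu location):

```shell
# Confirm the compiler, DKMS, and headers for the running kernel are present.
command -v gcc  >/dev/null || echo "gcc missing"
command -v dkms >/dev/null || echo "dkms missing"
# DKMS builds against this directory; it is provided by the headers package.
[ -d "/lib/modules/$(uname -r)/build" ] || echo "kernel headers missing for $(uname -r)"
```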
Step 2: Installing the NVIDIA driver
Next, download the appropriate NVIDIA driver for your specific GPU model. In this example, we’re using an NVIDIA Tesla P40 and will install version 570.86.15 of the driver:
wget https://us.download.nvidia.com/tesla/570.86.15/NVIDIA-Linux-x86_64-570.86.15.run
chmod +x NVIDIA-Linux-x86_64-570.86.15.run
./NVIDIA-Linux-x86_64-570.86.15.run --dkms
Make sure to run the installer with the --dkms flag to ensure the NVIDIA kernel modules are automatically rebuilt whenever the kernel is updated.
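Once the installer finishes, it is worth verifying that the module built and the GPU is visible before going any further (these commands assume the installation above succeeded):

```shell
# The GPU should be listed by the driver's management tool.
nvidia-smi -L
# DKMS should report the module as installed for the running kernel.
dkms status | grep -E 'nvidia.*installed'
```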
Step 3: Disable the nouveau module
Next, it’s important to prevent the open-source nouveau driver from loading, as it can conflict with the proprietary NVIDIA driver. To do this, blacklist nouveau by adding it to a modprobe blacklist:
nano /etc/modprobe.d/blacklist.conf
# NVIDIA: Blacklist open-source nouveau driver
blacklist nouveau
After blacklisting the driver, regenerate the initramfs to ensure the changes take effect at boot:
update-initramfs -u
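After the next reboot you can confirm the blacklist took effect; if lsmod shows no nouveau entry, the module is no longer loaded:

```shell
# An empty grep result means nouveau was successfully blacklisted.
if lsmod | grep -q '^nouveau'; then
    echo "nouveau is still loaded - re-check the blacklist and initramfs"
else
    echo "nouveau is not loaded"
fi
```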
Step 4: Configure udev rules
To ensure proper permissions are set for NVIDIA device files at boot, define custom udev rules. This step is important to allow non-root processes (such as those inside LXC containers) to access the GPU devices.
Create a new udev rule file:
nano /etc/udev/rules.d/70-nvidia.rules
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 666 /dev/nvidia-uvm*'"
After saving the file, reboot the system to apply the changes:
shutdown -r now
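Once the host is back up, you can check that the udev rules did their job: every NVIDIA device node should be world-readable and world-writable (a small sketch using awk over the ls output):

```shell
# All /dev/nvidia* nodes should show mode crw-rw-rw- after the udev rules run.
ls -l /dev/nvidia* | awk '
    $1 != "crw-rw-rw-" { print "unexpected mode:", $0; bad = 1 }
    END { exit bad }' && echo "permissions look correct"
```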
Step 5: Configure GPU access from the LXC container
After the host system has rebooted and the NVIDIA driver is properly initialized, you can proceed to configure your LXC container. You can either create a new container or modify an existing one. If you’re using LXC profiles, this setup can be applied globally to containers that need GPU access.
To allow the container access to the NVIDIA devices, you’ll need to define the correct cgroup device rules based on the major device numbers. First, check the major and minor numbers of the GPU device nodes:
ls -l /dev/nvidia*
Example output:
crw-rw-rw- 1 root root 195, 0 Feb 24 17:12 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Feb 24 17:12 /dev/nvidiactl
crw-rw-rw- 1 root root 509, 0 Feb 24 17:12 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 Feb 24 17:12 /dev/nvidia-uvm-tools
From this output, you can see that the NVIDIA driver uses major device numbers 195 and 509 on this host. Note that 195 is the fixed major number registered to the NVIDIA driver, but the nvidia-uvm major number is assigned dynamically and may differ on your system, so always check your own output. These values will be used to grant access within the container via cgroup device rules in the container config or LXC profile.
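If you prefer to read the major numbers programmatically rather than from the ls listing, stat can print them; note that GNU stat's %t format emits the major number in hexadecimal, so it needs a conversion (a sketch; the device paths assume the driver has already created the nodes):

```shell
# Print each NVIDIA device node's major number in decimal.
# On GNU coreutils, stat -c '%t' prints the character-device major in hex.
for dev in /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm; do
    printf '%s: major %d\n' "$dev" "0x$(stat -c '%t' "$dev")"
done
```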
To enable GPU access inside your LXC container, you need to update the container’s configuration to both allow access to the required device nodes via cgroups and mount them inside the container.
Edit the container configuration using:
EDITOR=nano lxc config edit <CONTAINERNAME>
Within the config: section, add the following raw.lxc block:
config:
  raw.lxc: |
    lxc.cgroup2.devices.allow = c 1:* rwm
    lxc.cgroup2.devices.allow = c 5:* rwm
    lxc.cgroup2.devices.allow = c 195:* rwm
    lxc.cgroup2.devices.allow = c 509:* rwm
    lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
    lxc.mount.entry = /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
    lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
    lxc.mount.entry = /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
This configuration explicitly permits the container to access all necessary NVIDIA character devices (with the major numbers 195 and 509 taken from the output above; substitute your own values if they differ), and bind-mounts the corresponding device files from the host into the container.
After saving the configuration, restart the container to apply the changes:
lxc restart <CONTAINERNAME>
Step 6: Install NVIDIA driver inside the LXC container
With GPU device access configured, the next step is to prepare the container environment by installing the NVIDIA driver, but only the user-space components, as the kernel modules are already loaded on the host.
Start by entering the container shell:
lxc exec <CONTAINERNAME> -- bash
Inside the container, install essential build tools:
apt-get install build-essential
Now, run the NVIDIA driver installer inside the container, but skip kernel module installation using the --no-kernel-module flag. Use the same installer version as on the host (download or copy the same .run file into the container first), since the user-space libraries must match the kernel module version:
chmod +x NVIDIA-Linux-x86_64-570.86.15.run
./NVIDIA-Linux-x86_64-570.86.15.run --no-kernel-module
You can safely ignore any warnings related to the absence of kernel headers or module build failures. Those are expected in a containerized environment where the host handles the kernel-level integration. The goal here is simply to install the NVIDIA user-space libraries (e.g. libnvidia-ml, libcuda, etc.) so that applications inside the container can interface with the GPU.
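To confirm the user-space installation worked, you can check from inside the container that the dynamic linker can resolve the NVIDIA libraries:

```shell
# Refresh the linker cache, then look for the NVIDIA user-space libraries.
ldconfig
ldconfig -p | grep -E 'libnvidia-ml|libcuda' || echo "NVIDIA libraries not found"
```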
Once the NVIDIA user-space driver is installed inside the container, exit the container environment:
exit
Then, perform a full system reboot to ensure that all host and container configurations, including device bindings, udev rules, and driver installations, are cleanly initialized:
shutdown -r now
Step 7: Use your GPU in LXC
There you have it. A straightforward guide to passing an NVIDIA GPU into an LXC container. With the right host setup and driver installation inside the container, you can enjoy GPU acceleration in lightweight, isolated environments. Perfect for AI, rendering, or any GPU-powered tasks. Happy containerizing!
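As a final check (run from the host; the container name is a placeholder as in the earlier steps), nvidia-smi inside the container should now list the GPU:

```shell
# If the passthrough works, nvidia-smi lists the GPU (e.g. a Tesla P40).
lxc exec <CONTAINERNAME> -- nvidia-smi -L | grep -E '^GPU [0-9]+:' \
    && echo "GPU visible inside the container"
```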