Step 0: Introduction

This guide explains how to assign an NVIDIA GPU to an LXC container to enable GPU-accelerated tasks. Using LXC containers offers advantages over running directly on the host, such as better separation of services, easier network management, and improved overall security. Let’s begin.

Step 1: Install base packages

To enable GPU passthrough in an LXC container, the first step is to ensure your host system is properly prepared. Begin by installing essential packages that will allow you to compile and manage kernel modules. On the LXC host, run the following command:

apt-get install build-essential dkms

These packages provide the foundational tools required to build and maintain out-of-tree kernel modules, which are necessary for the NVIDIA driver installation process.
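
Depending on your distribution, DKMS also needs the headers for the running kernel in order to build the module. On a stock Debian or Ubuntu host that is typically:

apt-get install linux-headers-$(uname -r)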

Step 2: Installing the NVIDIA driver

Next, download the appropriate NVIDIA driver for your specific GPU model. In this example, we’re using an NVIDIA Tesla P40 and will install version 570.86.15 of the driver:

wget https://us.download.nvidia.com/tesla/570.86.15/NVIDIA-Linux-x86_64-570.86.15.run
chmod +x NVIDIA-Linux-x86_64-570.86.15.run
./NVIDIA-Linux-x86_64-570.86.15.run --dkms

Make sure to run the installer with the --dkms flag to ensure the NVIDIA kernel modules are automatically rebuilt if the kernel is updated.

Step 3: Disable the nouveau module

Next, it’s important to prevent the open-source nouveau driver from loading, as it can conflict with the proprietary NVIDIA driver. To do this, blacklist nouveau by adding it to a modprobe blacklist:

nano /etc/modprobe.d/blacklist.conf
# NVIDIA: Blacklist open-source nouveau driver
blacklist nouveau

After blacklisting the driver, regenerate the initramfs to ensure the changes take effect at boot:

update-initramfs -u
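
After the next reboot you can confirm the change took effect; the following command should produce no output:

lsmod | grep nouveau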

Step 4: Configure udev rules

To ensure proper permissions are set for NVIDIA device files at boot, define custom udev rules. This step is important to allow non-root processes (such as those inside LXC containers) to access the GPU devices.

Create a new udev rule file:

nano /etc/udev/rules.d/70-nvidia.rules
# NVIDIA: create the /dev/nvidia* device nodes when the nvidia module loads and make them world-accessible
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
# NVIDIA: create the /dev/nvidia-uvm* device nodes via nvidia-modprobe and make them world-accessible
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 666 /dev/nvidia-uvm*'"

After saving the file, reboot the system to apply the changes:

shutdown -r now
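
After the reboot, you can confirm the driver is loaded and the GPU is visible on the host:

nvidia-smi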

Step 5: Configure GPU access from the LXC container

After the host system has rebooted and the NVIDIA driver is properly initialized, you can proceed to configure your LXC container. You can either create a new container or modify an existing one. If you’re using LXC profiles, this setup can be applied globally to containers that need GPU access.
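
For instance, the configuration shown in this step can live in a dedicated profile instead of a single container's config (the profile name gpu is illustrative):

lxc profile create gpu
EDITOR=nano lxc profile edit gpu
lxc profile add <CONTAINERNAME> gpu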

To allow the container access to the NVIDIA devices, you’ll need to define the correct cgroup device rules based on the major device numbers. First, check the major and minor numbers of the GPU device nodes:

ls -l /dev/nvidia*

Example output:

crw-rw-rw- 1 root root 195,   0 Feb 24 17:12 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Feb 24 17:12 /dev/nvidiactl
crw-rw-rw- 1 root root 509,   0 Feb 24 17:12 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509,   1 Feb 24 17:12 /dev/nvidia-uvm-tools

From this output, you can see that the major device numbers used by the NVIDIA driver are 195 and 509. These values will be used to grant access within the container via cgroup device rules in the container config or LXC profile.
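
Note that 195 is the registered major for the core NVIDIA devices, while the nvidia-uvm major (509 here) is assigned dynamically and may differ on your system. You can cross-check the values against the kernel's device list:

grep -i nvidia /proc/devices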

To enable GPU access inside your LXC container, you need to update the container’s configuration to both allow access to the required device nodes via cgroups and mount them inside the container.

Edit the container configuration using:

EDITOR=nano lxc config edit <CONTAINERNAME>

Within the config: section, add the following raw.lxc block:

config:
    raw.lxc: |
        lxc.cgroup2.devices.allow = c 1:* rwm
        lxc.cgroup2.devices.allow = c 5:* rwm
        lxc.cgroup2.devices.allow = c 195:* rwm
        lxc.cgroup2.devices.allow = c 509:* rwm
        lxc.mount.entry = /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
        lxc.mount.entry = /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
        lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
        lxc.mount.entry = /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

This configuration permits the container to access the NVIDIA character devices (majors 195 and 509 from the output above) as well as the standard character devices on majors 1 and 5 (/dev/null, /dev/zero, /dev/tty, and so on), and bind-mounts the NVIDIA device files from the host into the container. If your nvidia-uvm major differs from 509, use the value from your own system.

After saving the configuration, restart the container to apply the changes:

lxc restart <CONTAINERNAME>
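
Once the container is back up, the device nodes should be visible inside it:

lxc exec <CONTAINERNAME> -- ls -l /dev/nvidia*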

Step 6: Install NVIDIA driver inside the LXC container

With GPU device access configured, the next step is to prepare the container environment by installing the NVIDIA driver, but only the user-space components: the kernel module is already loaded on the host, and the libraries installed inside the container must match that same driver version.

Start by entering the container shell:

lxc exec <CONTAINERNAME> bash

Inside the container, install essential build tools:

apt-get install build-essential
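
The installer file is not present inside the container yet, so fetch the same driver version again (or push the .run file from the host with lxc file push before entering the shell):

wget https://us.download.nvidia.com/tesla/570.86.15/NVIDIA-Linux-x86_64-570.86.15.run
chmod +x NVIDIA-Linux-x86_64-570.86.15.run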

Now, run the NVIDIA driver installer inside the container, but skip kernel module installation using the --no-kernel-module flag:

./NVIDIA-Linux-x86_64-570.86.15.run --no-kernel-module

You can safely ignore any warnings related to the absence of kernel headers or module build failures. Those are expected in a containerized environment where the host handles the kernel-level integration. The goal here is simply to install the NVIDIA user-space libraries (e.g. libnvidia-ml, libcuda, etc.) so that applications inside the container can interface with the GPU.

Once the NVIDIA user-space driver is installed inside the container, exit the container environment:

exit

Then reboot the host so that all host and container configuration, including device bindings, udev rules, and driver installations, comes up cleanly:

shutdown -r now

Step 7: Use your GPU in LXC

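A quick check from the host shows whether the container can see the GPU; the Tesla P40 and the installed driver version should appear in the output:

lxc exec <CONTAINERNAME> -- nvidia-smi
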
There you have it. A straightforward guide to passing an NVIDIA GPU into an LXC container. With the right host setup and driver installation inside the container, you can enjoy GPU acceleration in lightweight, isolated environments. Perfect for AI, rendering, or any GPU-powered tasks. Happy containerizing!
