virt-man GPU Passthrough IOMMU and Kernel Woes - Debian/Liquorix
by obobskivich from LinuxQuestions.org on (#6QXK2)
Sorry for the maybe not so specific title - kind of down a rabbit hole on this one.
Objective: I want to use virt-man to host a Windows 7 VM on my workstation with a GPU passed through to it for 3D acceleration in Windows.
System specifics:
Core i9-10900
ASUS STRIX Z590E
GPU 1 ('host GPU'): Radeon 6900XT ('Navi 21')
GPU 2 ('VM GPU'): Radeon Vega Frontier ('Vega 10')
Host OS: SparkyLinux 8
VM OS: Windows 7 Ultimate x64
Where I've gone:
I worked through these guides and threads, as well as the project documentation:
https://www.youtube.com/watch?v=3yhwJxWSqXI
https://www.youtube.com/watch?v=GbhUBQdMoJg
https://www.debugpoint.com/kvm-share...windows-guest/
https://askubuntu.com/questions/1472...oup-not-viable
https://stackoverflow.com/questions/...an-iommu-group
https://mathiashueber.com/windows-vi...hrough-ubuntu/
What works:
- Windows 7 is installed, updated, and working in the 'normal' VM with no GPU passthrough, and it's provisioned (4 CPUs + 32GB of RAM)
- virt-man is installed
- All hardware is installed and has been separately tested in host OS
- All of the vfio steps from the Chris Titus guide were verified and working, and the Vega was showing vfio-pci as its kernel driver in lspci -nnv
Problems I have run into:
- On this motherboard, both of the PCIe slots are in IOMMU group 1, which causes virt-man/QEMU to stall when it tries to start the VM, because only one of the two GPUs in that group is being passed through
- I was not able to get the WinFSP file sharing to work between host and guest, because I couldn't find a version of the virtio-guest-tools compatible with Windows 7 that would enable virt-fs. I've worked around this with the 'pass USB device' option, basically sneaker-netting a thumb drive that I've left plugged in between the host and the VM
The IOMMU grouping is the bigger blocking issue. From some reading, this is likely a firmware limitation of this motherboard, but it can be worked around with the ACS kernel patch. I learned about an alternative kernel (Liquorix) that is supposed to be built with the ACS patches maintained in its tree, so I installed the Debian version of it following the project documentation (https://liquorix.net/), and the install itself worked.
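For anyone wanting to compare groupings, here is a minimal sketch of how the IOMMU groups can be listed. It assumes the kernel exposes them under /sys/kernel/iommu_groups (the usual sysfs location when the IOMMU is active); the function name and the optional root argument are just illustrative, not part of any guide linked above.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_addresses]} read from the sysfs IOMMU tree.

    An empty dict means the directory is missing, which usually indicates
    the IOMMU is disabled or not exposed by the running kernel.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        # Each group is a numbered directory holding one entry per device.
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        print(f"IOMMU group {number}: {' '.join(devices)}")
```

If both GPUs (and their HDMI audio functions) still appear under the same group number after a kernel change, the grouping has not been split.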
The machine boots fine, uname -r now displays:
Code: 6.10.11-1-liquorix-amd64
Problems:
- The Vega no longer shows as a vfio device. None of the modprobe.d or initramfs files were modified or removed by the Liquorix install, but they appear to be completely ignored: even after updating the initramfs and rebooting, the Vega loads amdgpu again instead of vfio-pci. I don't know how to rectify this.
- IOMMU output is less verbose under this kernel, but the IOMMU still appears to be active. I'm not sure whether other parameters need to be added to GRUB_CMDLINE_LINUX_DEFAULT to change this; I have left the intel_iommu and iommu parameters as before.
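To make the first symptom easier to reproduce, here is a small sketch that reports which kernel driver is bound to a PCI device by reading the driver symlink in sysfs, the same information lspci shows as "Kernel driver in use". The address 0000:02:00.0 is a placeholder, not the Vega's real address on this machine, and the function name is made up for illustration.

```python
import os


def bound_driver(pci_address, root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(root, pci_address, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or no such device
    # The symlink points at the driver's directory; its basename is the driver name.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    address = "0000:02:00.0"  # placeholder: substitute the address shown by lspci -D
    print(f"{address}: {bound_driver(address)}")
```

For passthrough the expected result is vfio-pci; amdgpu means the host driver claimed the card first.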
Open questions/what I'm asking for help with
1) What would be the best mechanism to revert the liquorix install if I wanted to do that?
2) How do I get the new kernel to honor the vfio settings from the CTT guide?
3) If we can sort out #2, how do I enable the ACS patch via boot options in GRUB, and hopefully solve the grouping problem that way?
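Related to #3, here is a rough sketch for checking which of the relevant parameters the running kernel was actually booted with, by parsing /proc/cmdline. The name pcie_acs_override is the parameter used by the out-of-tree ACS override patch; whether this particular Liquorix build accepts it is an assumption I have not confirmed. The helper names are illustrative only.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def report(cmdline, wanted=("intel_iommu", "iommu", "pcie_acs_override")):
    """Return {name: value} for the wanted parameters; None means absent."""
    params = kernel_params(cmdline)
    return {name: params.get(name) for name in wanted}


if __name__ == "__main__":
    with open("/proc/cmdline") as handle:
        for name, value in report(handle.read()).items():
            print(f"{name}: {'<absent>' if value is None else value}")
```

A parameter that is set in /etc/default/grub but shows as absent here has not reached the boot entry that was actually used, for example because the GRUB configuration was not regenerated.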
Let me know if I have left any detail out - this has been an afternoon of tinkering and I'm asking for help having hit a wall, but may have forgotten some detail or other after staring at this for hours.