VMware vSphere / ESXi 7.0 GA work-around for GPU passthrough issues including disabled-after-reboot bug and UI bug
When I was upgrading my primary Supermicro SuperServer Workstation / Datacenter, I ran into some strange problems with getting passthrough working. What would happen is that I'd get everything squared away, and boot my Windows 10 VM with my AMD Radeon 7750 GPU successfully passed through, as I've been doing for many years, see:
- What fits in any home virtualization lab, has 8 Xeon cores, 6 drives, 128 GB memory, and 3 4K outputs from a Windows 10 VM? Your new Supermicro SuperServer Workstation!
Jul 15 2015
I went all-in with my ESXi 7.0 upgrade, so in my most crucial (but backup-protected) Windows 10 VM, I also upgraded the virtual hardware to version 17 and updated VMware Tools. After a reboot of my ESXi host, I noticed my Windows 10 VM wouldn't boot. The reason turned out to be that my passthrough settings weren't persisting through reboots. This was nerve-wracking, as I had work the next morning and had to figure out a way to get things working again without falling back to 6.7U3 and/or restoring 1.8TB of data from backups.
Thankfully, I found a work-around for my new ESXi 7.0 install. Be warned: it's pretty wonky, but it's quick and easy. It's not permanent though; you have to do this after every ESXi host reboot. If you've found a better way around this, by all means drop a comment below to let us all know!
Note that I currently have no valid way of reporting such bugs to VMware; I'm still working on that. vSphere 7.0 is a new dot-zero release that came out on April 2nd, and with no listing on the VMware Compatibility Guide, at least not yet, opening a per-incident ticket isn't an option. I tried! I'll explain all that in another article soon.
Meanwhile, after a few dozen attempts and reboots, I found a workaround. I published a video of it back on April 9, 2020, and this article will hopefully help others as well. Over 500 folks have seen that video already, so unfortunately, I suspect I'm not alone with this issue. I hope the next patch release fixes it; I've also posted the issue to the VMTN forum.
- In vSphere Client or ESXi Host Client, set both of your AMD GPU devices (video & audio) to passthrough
- Reboot the server
- After the reboot, if you use the ESXi Host Client and notice the Passthrough status shows "Enabled/Needs reboot" instead of "Active", toggle both AMD devices off and then on again. You'll now see them both as active, with no reboot required
- Now you can start your VM that uses the PCI device
- If you find your mappings are wrong and your VM still won't start, remove the PCI devices from the VM and then re-add them. This is covered in more detail in the video below.
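For anyone who would rather script the off-then-on toggle than click through the Host Client after every reboot, here is a minimal Python sketch that only builds and prints the command lines. It assumes the `esxcli hardware pci pcipassthru set` namespace that ships with ESXi 7.0, and it assumes that toggling from the command line has the same effect as toggling in the UI, which I have not verified. The PCI addresses and the `toggle_commands` helper are hypothetical placeholders; substitute the addresses that `esxcli hardware pci pcipassthru list` reports for your GPU's video and audio functions.

```python
from typing import List

# Hypothetical PCI addresses for the GPU's video and audio functions.
# Replace these with the addresses your own host reports.
GPU_FUNCTIONS = ["0000:03:00.0", "0000:03:00.1"]


def toggle_commands(device_ids: List[str]) -> List[List[str]]:
    """Build esxcli calls that disable passthrough on every device,
    then re-enable it on every device (off for both, then on for both)."""
    commands = []
    for enable in ("false", "true"):
        for dev in device_ids:
            commands.append([
                "esxcli", "hardware", "pci", "pcipassthru", "set",
                "--device-id=" + dev,
                "--enable=" + enable,
                "--apply-now",  # assumption: applies without a host reboot
            ])
    return commands


if __name__ == "__main__":
    # Print the commands for review rather than executing them.
    for cmd in toggle_commands(GPU_FUNCTIONS):
        print(" ".join(cmd))
```

The sketch prints the commands instead of running them, so you can review each line and paste it into an ESXi SSH session yourself. Treat it as a starting point for experimentation, not a confirmed fix.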
All vSphere 7 articles.
All vSphere 7 videos.
- Solved: Upgrade from 6.7 to 7.0 and unsupported hardware
Apr 10 2020 by zwbee at VMware Technology Network Forums
I went to Host/Manage/Hardware and the HBA was showing up with the correct name. However, its passthrough setting had been switched to inactive. I toggled it to active, but there was an error in addition to the usual "Reboot required" message. I figured it didn't work, but I tried rebooting anyway. After reboot, the device now showed as passthrough active. Promising!