AMD Ryzen's VMware ESXi 6.5 PSOD workaround is to slow system by 30%, currently not a compelling Intel Xeon D alternative

Posted by Paul Braren on Apr 10 2017 (updated on Jul 31 2017) in
  • Virtualization
  • ESXi
  • HomeLab

Jul 31 2017 Update - For AMD's EPYC CPUs, this issue has been fixed in VMware ESXi 6.5 Update 1; more detail below.



    AMD's recent release of an entirely new CPU family that claims to be a true Intel competitor is big news. It's been many years since we've had a splash from AMD, which is part of why there's been considerable excitement about AMD Ryzen™ and The "Zen" Core Architecture built with AMD SenseMI Technology:

    • Pure Power
    • Precision Boost
    • Extended Frequency Range
    • Neural Net Prediction
    • Smart Prefetch

    Historically, AMD CPUs have been a bit more affordable than Intel for a given performance category, at the cost of some efficiency. Ryzen watt burn compared to Xeon D seems to be no exception, for workloads where you value multithreaded performance (juggling many VMs easily) more than pure GHz (occasional use gaming/single-user workstation). Generally, for virtualization labs left running a mix of VMs 24x7, the more cores with their own cache and communication lanes that still run efficiently at idle, the merrier.

    Home labs and mixed CPU clusters

    Mixing AMD and Intel in a home lab cluster means no VMware vMotion, at least not while the VM is running. So if you're thinking of going with multiple cluster nodes, and you want to move VMs from server to server, it's generally best to go all in with just Intel CPUs or just AMD CPUs, ideally of similar vintage and capability within each CPU family. This way, you can avoid having to dumb the whole vSphere cluster down to the oldest CPU family in the bunch, using EVC if you have it.
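
    To picture what "dumbing down" a cluster means, here is a minimal sketch in Python. It is only a simplified model, not how vSphere implements EVC: real EVC uses predefined baselines per CPU vendor and does not bridge AMD and Intel. The host names and feature sets are hypothetical, chosen just to show that VMs end up seeing only the features every host shares.

```python
# Simplified model of an EVC-style baseline: VMs in the cluster are shown
# only the CPU features that every host has in common.
# Host names and feature sets are hypothetical, for illustration only.

def common_baseline(hosts):
    """Return the set of CPU features shared by every host in the cluster."""
    if not hosts:
        return set()
    return set.intersection(*hosts.values())

def masked_features(hosts, baseline):
    """For each host, list the features that would be hidden from VMs."""
    return {name: sorted(features - baseline) for name, features in hosts.items()}

cluster = {
    "older-host": {"sse4_2", "avx"},
    "newer-host": {"sse4_2", "avx", "avx2"},
}

baseline = common_baseline(cluster)
hidden = masked_features(cluster, baseline)

print(sorted(baseline))  # features every VM can rely on
print(hidden)            # features each host gives up for compatibility
```

    In this toy cluster the newer host gives up its extra feature so that running VMs can move in either direction, which is the trade-off to weigh when mixing CPU generations.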

    Of course, you could also avoid vMotion entirely, migrating only when VMs are powered-off instead. But where's the fun in that? Note that since 2012, with shared-nothing vMotion, you no longer need to have an external shared datastore to enjoy vMotions and migrations. If you have vCenter or VCSA installed and configured, you have Live Migration of Workloads, moving running VMs between servers easily, even if you only have local storage inside each system:

    vSphere vMotion allows you to move an entire running virtual machine from one physical server to another, without downtime. The virtual machine retains its network identity and connections, ensuring a seamless migration process. Transfer the virtual machine's active memory and precise execution state over a high-speed network, allowing the virtual machine to switch from running on the source vSphere host to the destination vSphere host. This entire process takes less than two seconds on a gigabit Ethernet network. This capability is possible over virtual Switches, vCenter Servers, and even long distances.

    Patrick's Ryzen article at STH

    I believe many TinkerTry readers following my home lab tinkering are likely to enjoy this new article by Patrick Kennedy, founder of Serve The Home, where he goes through how he convinced his Ryzen system and VMware ESXi 6.5 to get along with one another. There's a short excerpt below, but you really should read the whole thing:

    • AMD Ryzen “Working” With VMware ESXi 6.5

      Apr 08 2017 by Patrick Kennedy at STH

      Since we already covered Debian based Ubuntu as well as CentOS and how to fix their crashes with Ryzen and get those systems working, we instead changed our attention to VMware ESXi 6.5. The current AMD Ryzen 7 1700 is a low cost 8 core 16 thread option. Especially given the fact that for $329 you get a CPU and heatsink, you can get a $99 motherboard, and add up to 64GB RAM for around $400. That means that with an inexpensive case and PSU for around $900 you can have an 8 core 16 thread system with 64GB of RAM making it one of the best value home server systems around. The question we had was whether VMware ESXi 6.5 would work with AMD Ryzen 7 chips.
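
    As a quick sanity check on the figures in that excerpt, here is a rough tally in Python. The prices are the ones quoted by STH. Treating "around $900" as the total for the whole build is my reading of the excerpt, and the RAM price is approximate, so the result is a ballpark only.

```python
# Prices as quoted in the STH excerpt; the RAM figure is "around $400",
# so every total here is approximate.
cpu_with_heatsink = 329
motherboard = 99
ram_64gb = 400
quoted_total = 900  # assumption: "around $900" covers the whole build

core_parts = cpu_with_heatsink + motherboard + ram_64gb
case_and_psu_budget = quoted_total - core_parts

print(f"CPU + motherboard + RAM: ${core_parts}")
print(f"Left for case and PSU:   ${case_and_psu_budget}")
```

    Under those assumptions, the case and PSU need to fit in well under $100, and storage is not part of the quoted figures at all.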

    Closing thoughts

    This AMD issue goes well beyond the PSODs that Skull Canyon Intel NUCs had, where William Lam guided folks to simply turn off the Thunderbolt Controller in the BIOS during install. It also goes well beyond the missing network driver issue Kaby Lake Intel NUCs have, with the workaround explained by Florian Grehl here. These fairly simple workarounds don't degrade performance.

    I genuinely hope that AMD takes VMware ESXi Hypervisor testing more seriously in the future, as competitive pressure that keeps Intel on its toes would generally do us all some good. As it stands today, Xeon D still presents a more compelling choice for home virtualization lab enthusiasts, especially since Xeon D has been capable of 128GB of RAM since launch, while Ryzen currently tops out at 64GB. From a VMware ESXi support perspective, Xeon D is on the VMware HCL, and AMD Ryzen is not. And with Ryzen, you also don't get the 4 Intel network ports designed right into most Xeon D systems, 2 of them 10GbE. So even if Ryzen and VMware do eventually square away this PSOD issue, these other factors will still need to be considered when shopping.

    What about the higher-end NUCs? Not as many cores, really just mobile (laptop) CPUs in a tiny package with one NIC port and no IPMI/iKVM. Good for lighter workloads, especially if the 32GB limit won't be holding you back. See detailed cache and core count comparison between the Skull Canyon NUC and all the Supermicro Xeon D systems here. The new Kaby Lake Core i7 NUC replaces the Skull Canyon model as the closest-to-Xeon-D NUC available, with a slight speed boost, but it still has the 32GB max, single NIC, and mobile processor limitations.

    Unfortunately, the first wave of AMD Ryzen reviews and comments out there also seems to indicate Ryzen isn't going to exert much market pressure as a VMware ESXi platform for home labs. At least not yet.

    Others' Ryzen+ESXi observations

    Video

    "AMD Ryzen with VMware ESXi 6.5 - Ultimate Homelab? We Tried It" ServeTheHomeVideo YouTube Channel, Mar 05 2017.
    "SenseMI: True Intelligence built into your AMD Ryzen 7" AMD YouTube Channel, Mar 02 2017.

    Jul 31 2017

    Thanks to TinkerTry reader Seneca Pierson leaving this comment, we've been informed that AMD EPYC 7XX1 Series support has arrived with the release of ESXi 6.5 Update 1, seen on the VMware Compatibility Guide here, currently appearing as pictured below.

    VMware Compatibility Guide - EPYC, as of Jul 31 2017.


    See also at TinkerTry

    may-2017-intel-and-amd-datacenter-and-workstation-cpu-announcements
    compare
