No future for booting ESXi from USB in VMware home labs, using max endurance microSD for ESXi 7.0U3 now, SATA/SATADOM or M.2 later

Posted by Paul Braren on Oct 26 2021 (updated on Nov 18 2021) in
  • ESXi
  • HowTo
  • Virtualization
  • VMUGAdvantage
  • vSphere7
  • sxi-system-storage-changes

    It's been well over a year since we learned that vSphere 7.0 was making big changes to the filesystem layout of the hypervisor. It's gotten even tougher with 7.0 Update 2, which brought more writes and more troubles; see also the recent articles about the use of USB devices to boot ESXi by William Lam and Duncan Epping. Unlike in the data center, compact servers, NUCs, and Mac minis are popular in the home lab, but they tend to have few onboard storage options. If you don't have a lot of SATA, M.2, or PCIe slots open for the hypervisor, what do you do to get away from low endurance USB? This article is about what I did to mitigate risk by moving away from cheap flash drives for my 8 core and 12 core SuperServer Bundles. Hopefully some of what I learned helps you plan your next steps too.

    TLDR

    I seamlessly moved from a USB flash drive to a USB-attached microSD card based on TLC flash, without having to re-install or reconfigure ESXi. Exactly how is explained below. I might move to SATADOM later, if I'm willing to give up one of the 6 SATA ports on my SuperServer Bundle. For future home lab servers, I'll be avoiding USB altogether, perhaps going with M.2 SATA, to be prepared for the upcoming vSphere 8. The fewer the watts left running 24x7, the better, so using a 2.5" SATA SSD just doesn't seem right to me.

    Warning - Cloning a boot device and then booting two hosts from that cloned image simultaneously is a bad idea, which seems to be why cloning isn't officially supported at all by VMware. Read Statement about supportability of cloning ESXi boot devices for deployments [84280], with thanks to TinkerTry commenter Hans Boot, who pointed it out in the comments below this article. Proceed at your own risk.

    As for me, I won't be opening any support tickets on this system, and will soon be rebuilding ESXi from scratch, likely on SATADOM for now.

    WHY

    For the VMUG Boston UserCon presentation that I'm co-presenting on October 27, fellow IT Professional Matt Kozloski put together this list of articles about changes in 7.0 Update 2:

    • kb.vmware.com/s/article/83963
      Bootbank cannot be found at path '/bootbank' errors being seen after upgrading to ESXi 7.0 U2 (83963)

      “USB devices have a small queue depth and due to a race condition in the ESXi storage stack, some I/O operations might not get to the device. Such I/Os queue in the ESXi storage stack and ultimately time out.”

    Queues are one of the ways developers deal with race conditions!

    There were issues related to USB going way back to 6.0, when VMware began advising folks to load the VMware Tools image in RAM to prevent frequent read operations on the SD media:

    • kb.vmware.com/s/article/2149257
      High frequency of read operations on VMware Tools image may cause SD card corruption (2149257)

    • kb.vmware.com/s/article/83782
      ToolsRamdisk option is not available with ESXi 7.0.x releases (83782)

      Of note, this fix was added [back?] into vSphere 7.0 U3.

    But wait, there's more! I'll add these to your reading list:

    • blogs.vmware.com/vsphere/2021/09/esxi-7-boot-media-consideration-vmware-technical-guidance.html
      ESXi 7 Boot Media Considerations and VMware Technical Guidance
      Sep 30 2021 by vSphere Team at VMware vSphere Blog

      Historically, SD cards or USB devices have been chosen to free up device bays and lower the cost of installing ESXi hosts. Such devices, however, have lower endurance and exhibit reliability and issues over time. SD cards and USB drives may also exhibit performance issues and may not tolerate high-frequency read-write operations. We are now witnessing boot-related problems more frequently with ESXi 7.x with the hosts using SD cards or USB drives as boot media. This blog post will outline such issues in detail and provide the technical guidance to mitigate the same.
      ...
      Looking forward, the need for ESXi hosts to support other VMware or 3rd party solutions is ever-increasing. Therefore, the need for a more reliable, flexible, and high-performing storage device for ESXi 7.x system storage is a necessity.
      ...
      ESX-OSData partition must be created on a high endurance persistent storage device as there is an increase in IO requests sent to the ESX-OSData partition. The increased IO request is a result of multiple factors that have been introduced with ESXi 7.x such as:

      • Increased number of probe requests sent to check the device state, making sure they continue to service IO requests.
      • Scheduled scripts to backup system state, timestamp slightly contribute to the increased IO requests.
      • Also, more features and solutions store their configuration state on ESX-OSData, thus requiring it to be installed on a high-endurance, locally attached persistent storage device.
    • kb.vmware.com/s/article/2004784
      Installing ESXi on a supported USB flash drive or SD flash card (2004784)

      This article provides instructions for installing ESXi on a USB drive or SD flash card.

    • kb.vmware.com/s/article/82515
      Boot device guidance for low endurance media(vSphere and vSAN) (82515)

      What is supported for upgrades?
      For all upgrades ESXi 7.0 onwards, we continue to support existing boot devices.
      We still allow upgrading on USB, however an extra disk with the USB device is recommended.
      Having a low endurance device or a USB device might increase the risk of wear-out and might have unreliable outcomes.
      In addition, having a low performance device will affect some features in future ESXi versions that may depend on performance.
      We strongly recommend upgrading to 32GB or higher high-endurance and high-performance devices for upgrades and new installations.

      Important: If you install ESXi on M.2 or other non-USB low-end flash media, delete the VMFS datastore on the device immediately after installation. For more information on removing VMFS datastores, see the vSphere Storage documentation.

      What is the plan to continue support for USB/SD boot for vSphere 7.0?
      Apart from bootbanks being larger, 7.x behavior is still the same as 6.x where things go to RAM disk if there's no high performance storage available
      The scratch partition can be configured elsewhere (eg. NFS) for logging, but isn't an ideal scenario
      As for supporting USB/SD devices, they can continue to use those devices, but they need to supply a secondary high-quality device. You can do this from the ESXi UI which allows them to delete datastores that are not in use or partedUtil to manually do it. For more information, please refer: > Using the partedUtil command line utility on ESXi and ESX

    Fun, eh? Are you getting the hint that USB boot media's days are done?

    EXPERIENCE

    I've personally experienced only one SanDisk Ultra Fit 32GB USB 3.0 Flash Drive failure in the past 6 years of using about two dozen of them in my home lab, with an average of 2 ESXi hosts left running 24x7 while booted from USB. My 24x7 rig's needs are generally modest. I don't run vSAN full time, and I only have one to three dozen VMs on each host, with just a handful or two of the VMs booted at any one time. So it's very possible my USB flash drives aren't seeing a whole lot of writes.

    With well over a thousand SuperServer Bundles shipped with these drives since October of 2015, I've yet to receive reports of any failures of the SanDisk Ultra Fit 32GB drives that shipped with them. That doesn't mean they don't fail; it just hasn't been a big problem as far as I can tell, since I usually hear about most issues that Wiredzone's customers encounter, and I have hundreds of comments below my various related articles. These popular and versatile single-power-supply servers are typically used in home labs rather than in production, but there are many home labs that are really more like home data centers, driven quite hard 24x7. One extreme example is Citrix Guru Carl Webster, who is running a dozen Bundles in his home!

    With the writing on the wall as far as how much longer I can or should run USB based media in my home lab, the hunt was on for a workaround and/or solution. Here was my approach.

    First, I wanted to see if there was a low effort, safe, and easy way of moving to more resilient boot media. The idea was to take the imaging technique from my article:

    clone-esxi-with-usb-image-tool

    Combine this with a much more resilient flash storage device that is also affordable and compact. I set out to find a drive that was small enough to allow the front door of the SuperServer to (almost) close all the way, but with a much higher endurance rating than the original general purpose SanDisk Ultra Fit 32GB. This wasn't hard, since I had just researched exactly this sort of thing when seeking out the tiniest, lightest device for reliable Dashcam/Sentry Mode recording of 4 video streams whenever I'm driving or parked. That card is constantly abused.

    Same goes for my hypervisor. I'd like to not have to worry about my boot device failing, at least long enough to make it to vSphere/ESXi 8.0, whenever that arrives. Despite reading that I might get USB device in use for ESXi warnings in vSphere Client, I have yet to see any, even with VCSA 7.0U3a/ESXi 7.0U3. But I still wanted to put this issue to rest, and not bump into some big ugly warnings about my use of USB while recording videos of the next big vSphere patch or update.

    So if you also have your ESXi installed on a 32GB flash drive, here's my work-around that buys me some time, making a boot device failure far less likely than it would be if I did nothing with my 5 year old systems. If you are booting from a 64GB or larger flash drive, you'll need to image to a microSD card of the same or larger capacity.

    The beauty of my approach detailed below is that you don't have to have VCSA or vLCM capabilities. It should "just work" with any USB based ESXi home lab, very much in the same spirit as my related article:

    SanDisk-IMG_0743
    Notice the packaging claims "Up to 15,000 hours of continuous video recording." Got to love those "up to" weasel words, eh?

    This little project cost me a total of $11.89 plus $12.99 = $24.88. Not bad, and the whole process of moving my ESXi only took about an hour of my time from start to finish, mostly unattended, only requiring a few minutes of mouse clicks. The process has very little likelihood of a user error, and it's very safe, since you can always put the original USB Flash drive back in should things somehow go sideways at any point.

    I'm not saying the work-around below is for everybody. But at least for Bundle owners, it seems to be a decent way to get some peace-of-mind, avoiding the dreaded inability to boot someday at a most inopportune moment.

    Warning - This is an article about home labs, it's not about production environments. You are doing this completely unsupported ESXi transplant at your own risk. Also, this is a consumer flash device, and as such, there's no published TBW (Terabytes Written) rating to be found, and no DWPD (Drive Writes Per Day) or other endurance rating either. Contrast that with the Intel SSD D3-S4510 M.2 SATA, intended for things like the Dell BOSS-2 module running ESXi for production workloads.

    sandisk-max-endurance-uhs-i-microsd

    Here's the info that Western Digital publishes about the SanDisk MAX ENDURANCE microSD™ Card:

    Built to capture up to 120,000 hours (over 13 years) of footage for your home security cameras or dash cam, the SanDisk® MAX ENDURANCE microSD™ card is engineered not only for continuous recording and re-recording, but also for continuous peace of mind for years to come.

    Designed to last, this microSD card can withstand a variety of extreme weather conditions because it’s temperature-proof, waterproof, shockproof, and X-ray proof. With capacities of up to 256GB, you can record and store more Full HD or 4K videos. And, with read speeds up to 100MB/s, you’ll be able to spend less time transferring and backing up video footage, and more time living life.
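
    To put the "up to" hour figures in perspective, here is a minimal Python sketch that converts continuous recording hours into years. It assumes a 365-day year, and the helper name `hours_to_years` is mine, not anything from SanDisk. The 15,000 hour figure is the one on the packaging pictured above, and the 120,000 hour figure is the one in the Western Digital text. Why the two differ isn't stated in the material quoted here.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap days (assumption)


def hours_to_years(hours: float) -> float:
    """Convert hours of continuous 24x7 operation into years."""
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    claims = [
        ("packaging claim", 15_000),
        ("Western Digital product text", 120_000),
    ]
    for label, hours in claims:
        years = hours_to_years(hours)
        print(f"{label}: {hours:,} hours is about {years:.1f} years of 24x7 use")
```

    Keep in mind these are video recording claims, not a TBW rating, so they say nothing precise about how long the card will survive ESXi's write pattern.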

    reverse-engineering-and-analysis-of-sandisk-high-endurance-microsdxc-card

    How about trying to figure out which exact NAND flash is used for this microSD device? Well, that turns into quite the rabbit hole. Check out this incredible deep dive, Reverse-engineering and analysis of SanDisk High Endurance microSDXC card:

    TL;DR – The SanDisk High Endurance cards use SanDisk/Toshiba 3D TLC Flash. It took way, way more work than it should have to figure this out (thanks for nothing, SanDisk!).
    In contrast, the SanDisk MAX Endurance uses the same 3D TLC in pMLC (pseudo-multi-level cell) mode.

    At least it's not QLC, so I'm good with it.

    PREREQUISITES

    • A Windows system to run USB Image Tool on, with one available USB Type A port.

    BUY

    Step 1

    Purchase one of each:
    This is not a sponsored post. The Amazon Associates links below may earn the author fees, at no cost to readers.

    If your ESXi is installed on a larger USB flash drive, just choose a larger SanDisk Max Endurance card at the same shopping link below. Make sure what you buy has the same or slightly larger capacity than the drive you have, otherwise the imaging process will fail.

    B084CJLNM4

    B07G5JV2B5
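
    The size warning above boils down to a simple byte comparison: a raw image can only be restored onto a device with at least as many bytes as the image. Here's a rough Python sketch of that check. The function names and the byte counts are made up for illustration (they are not measurements of any real drive), and you'd need to get the real numbers from your own image file size and the disk properties of the new card.

```python
def image_fits(image_bytes: int, target_bytes: int) -> bool:
    """A raw image only fits if the target has at least as many bytes."""
    return target_bytes >= image_bytes


def describe(image_bytes: int, target_bytes: int) -> str:
    """Return a human-readable verdict for a planned restore."""
    diff = target_bytes - image_bytes
    if diff >= 0:
        return f"OK: target has {diff:,} bytes to spare"
    return f"Too small: target is {-diff:,} bytes short"


if __name__ == "__main__":
    # Hypothetical capacities, NOT measured from any real product.
    source_image = 30_752_000_000     # image taken from a "32GB" flash drive
    same_size_card = 30_752_000_000   # a "32GB" card with identical capacity
    smaller_device = 30_000_000_000   # a "32GB" device that formats smaller

    print(describe(source_image, same_size_card))
    print(describe(source_image, smaller_device))
```

    The point of the second case is that two devices both sold as "32GB" can still differ by enough bytes to make a restore fail.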

    CLONING STEP-BY-STEP

    Step 2

    Shut down your ESXi host.

    Step 3

    Remove your USB flash drive.

    Step 4

    Use USB Image Tool to create an image of your ESXi host's USB flash drive.
    This is done by temporarily inserting the USB flash drive into a Windows system then running USB Image Tool. Step-by-step instructions at TinkerTry here, or watch my video below. There may be similar imaging tools on Mac, but that's not something I've tested. When you're done, eject the USB flash drive.

    USB Image Tool for Windows easily backs up and restores complete VMware ESXi installed on USB or SD
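
    Since the saved image file is your only copy of the ESXi install once the original drive is set aside, it may be worth recording a checksum of it before restoring. Below is a minimal Python sketch that computes a SHA-256 hash of a file in chunks. This is my own suggestion and not part of USB Image Tool's workflow, and the file name `esxi-usb-backup.img` is a hypothetical placeholder for whatever you named your image.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in 1 MiB chunks so a multi-GB image never sits in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    image = Path("esxi-usb-backup.img")  # hypothetical file name
    if image.exists():
        print(f"{image.name}: {sha256_of_file(image)}")
    else:
        print(f"{image} not found, point this at your own image file")
```

    Run it once right after creating the image and again before restoring. If the two hashes match, the file hasn't changed in between.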

    Step 6

    Install the new microSD card into your new microSD reader, then insert the assembly into your Windows system.

    Step 7

    Use USB Image Tool to restore an image of your ESXi host to your new microSD.

    Step 8

    Insert your newly imaged microSD assembly into the same USB Type A port of the ESXi host where the USB Flash drive previously was.

    Step 9

    Power your ESXi host back up.
    Your vSphere/VCSA and your ESXi won't even notice anything has changed! Since you're using the same USB slot as before, you shouldn't have to change your boot order in your BIOS. If it does fail to boot, be sure to verify that your BIOS boot order is set correctly.

    I had hoped to use a USB to SATA to SATADOM adapter that I rigged up for an easy migration, but sadly, I couldn't get it to work. It turns out that even if I had, the image restore would have failed anyway, since my 32GB SATADOM module formats to a capacity slightly less than the source SanDisk Ultra Fit 32GB. The filesystem layout that ESXi creates on USB also differs from what it creates on a SATA flash drive, so such a kludge really wasn't a good idea in the first place.

    ESXi-on-USB-Flash--TinkerTry
    `df -h` - ESXi on SanDisk Ultra Fit 32GB
    ESXi-on-microSD--TinkerTry
    `df -h` - ESXi on SanDisk Max Endurance microSD 32GB

    Step 10 (Eventually, maybe)

    At some point, I plan to do a fresh install of ESXi 7.0 Update 3 onto my SATADOM module, which finally arrived after months of backorder (supply chain). Doing so will mean I'll have to dedicate the yellow (powered) SATA port on my Supermicro SuperServer motherboard to this duty, giving up the existing wired connection to one of my two internal 2.5" drive bays. It also means I won't have the ease of imaging my ESXi prior to upgrades. So I'm not feeling particularly rushed to make the move to SATADOM, given it's highly likely my microSD will last just fine well past whenever ESXi 8.0 arrives in 2022 or beyond. But if what happened to this guy happens to me, at least I have a SATADOM module on hand.


    Nov 04 2021 Update

    VMware vSAN - Oct 14 2021 - vSAN Quick Questions - What type of boot device should be used with vSAN?

    Nov 18 2021 Update

    I might also consider re-using an old, small SATA SSD for my ESXi install in the future, see also Hans Boot's comment below. Article title updated accordingly.


    See also at TinkerTry

    more-reliable-booting-of-esxi-7-from-microsd

    clone-esxi-with-usb-image-tool

    nice-little-usb-flash-drive-choice-for-that-esxi-in-your-home-lab

    See also

    considerations-for-future-vsphere-homelabs-due-to-upcoming-removal-of-sd-card-usb-support-for-esxi

    booting-esxi-from-sd-usb-devices-time-to-reconsider-when-buying-new-hardware

    vsphere-7-0-u1-u2-important-upgrade-notice