VMworld 2020 Recap: VMware vSphere 7.0U1 advancements: SmartNICs (aka DPUs) are going to be big!

Posted by Paul Braren on Oct 1 2020 (updated on Oct 16 2020) in
  • ESXi
  • Virtualization
  • VMworld
    TLDR version - VMware vSphere with Tanzu allows vSphere Enterprise Plus customers to support Kubernetes workloads without requiring the full VMware Cloud Foundation stack. Customers won't have to deploy NSX and vSAN to get Kubernetes workloads, and deployment time should be around an hour. Note that a full VCF deployment is still preferred for maximum scalability. Also, Project Monterey provides a forward-looking glimpse into why VMware ESXi support for ARM exists (SmartNICs / DPUs). Meanwhile, home-lab tinkerers can give ESXi on ARM a try now using the VMware Fling on their Raspberry Pi, or even on their Nintendo Switch!


    duncan-virtual-events-my-thoughts

    While it's disappointing we couldn't be together in person at VMworld 2020, VMware did a great job given the circumstances, and it's a very good thing for reducing social inequality that price wasn't a barrier to attending VMworld this year. Yes, this was by far the most inclusive VMworld ever: $0 to register and attend the vast majority of the event, versus the thousands of dollars it cost in all prior years. It's also good that IT Professionals, even those just starting out or from faraway places, are able to invest in themselves by setting aside a few hours for 3 days, joining online to dive into the content, and listening to replays at a later date as needed. And of course, attending online has a much lower carbon footprint, a sustainability and inclusion angle that Duncan Epping also takes at Yellow Bricks.

    vmware-lets-recap-vmworld-2020-day-2-the-challenges-facing-our-time
    Fireside Chat: The Extraordinary Events of 2020, NYC Business News Anchor Hope King and VMware CEO Pat Gelsinger

    Based on much of what Pat Gelsinger said in his VMworld 2020 keynote and roundtable, it's clear that the goals of reducing social inequity, increasing sustainability, and being a family-friendly flexible employer are on his mind these days. So it's consistent with these themes that Pat mentioned a majority of VMware employees won't be required to return to a VMware office, even after this pandemic is over. Rather than rely on my paraphrasing, hear what Pat actually said about these topics by listening to his compelling CEO Fireside Chat VI3353 Watch On Demand, once available. Tonight, the button doesn't seem to be working quite yet. His ethical outlook is refreshing and timely. You can read a bit more about this special event in VMware's post Let’s Recap – VMworld 2020 Day 2: The Challenges Facing Our Time.

    Now let's have a quick look back at my opening paragraph from the last time I was able to attend VMworld:

    My VMworld 2018 US experience
    I had a wonderful week, and I managed to squeeze in the time to record 9 vendor visit videos too! They're mostly focused on what pertains to the home lab enthusiast of course, and the growing IoT/Edge Computing trend is likely to benefit all of us, both at work, and at home. Big jumps forward with vSAN configuration and support too!

    In contrast, I'm sure VMworld 2020 was good for many, but I confess I was only able to attend a few sessions this time, and I'm still working on catching up with the session replays. I was a bit surprised that some of the sessions I wanted to get into were actually full, perhaps because of limits on concurrent viewers in Zoom? I admit I didn't register far enough in advance, so that's mostly on me, and many of the sessions I missed are available for replay.

    This article is focused on what's new that caught most of my attention, at least so far. It's an admittedly biased selection of announcements of interest from either a home lab sysadmin perspective or of most relevance to my work with enterprise customers at my day job. There's quite a bit of overlap, and I'll need some more time to ponder what's been announced this week before I'm able to synthesize any deeper observations.

    I don't want to harp on what's missing when you can't attend in person: presenting on a stage with interactive feedback during Q&A, and meeting appreciative readers and customers while working the booths. The COVID-19 situation we're all in together demands a different approach of course, and VMware certainly did a fine job of it, considering.

    vSphere 7.0 Update 1 brings many updates to functionality; it's not just a patch. It has better monster VM support, for example. Details at:

    vsphere-7-update-1-unprecedented-scalability
    • vSphere 7 Update 1 – Unprecedented Scalability
      Sep 24 2020 by Jatin Purohit

      In vSphere 7 update 1, the total number of ESXi hosts in a vSphere Cluster is now increased to 96 hosts compared to 64 hosts in a previous release. Starting from vSphere 7 Update 1, you can run up to 10000 VMs in a vSphere cluster compared to 6400 VMs in vSphere 7.
      ...
      Note: vSAN cluster continues to support 64 hosts.


    Keep in mind that just because you can do it doesn't necessarily mean you should do it.
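
    To make those limits concrete, here is a minimal Python sketch that checks a planned cluster against the numbers quoted above (96 hosts and 10000 VMs per vSphere 7 Update 1 cluster, with vSAN clusters staying at 64 hosts). The names ClusterPlan and check_plan are my own hypothetical helpers for illustration, not part of any VMware tool or API, and the figures are only the ones cited in the excerpt.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the VMware blog excerpt quoted above.
# The constant names are illustrative, not VMware identifiers.
VSPHERE_7_U1_MAX_HOSTS = 96
VSPHERE_7_U1_MAX_VMS = 10_000
VSAN_MAX_HOSTS = 64


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned vSphere cluster."""
    hosts: int
    vms: int
    uses_vsan: bool = False


def check_plan(plan: ClusterPlan) -> List[str]:
    """Return a list of problems; an empty list means the plan fits the quoted limits."""
    problems = []
    # A vSAN cluster is still capped at 64 hosts, per the note in the excerpt.
    host_limit = VSAN_MAX_HOSTS if plan.uses_vsan else VSPHERE_7_U1_MAX_HOSTS
    if plan.hosts > host_limit:
        problems.append(f"{plan.hosts} hosts exceeds the limit of {host_limit}")
    if plan.vms > VSPHERE_7_U1_MAX_VMS:
        problems.append(f"{plan.vms} VMs exceeds the limit of {VSPHERE_7_U1_MAX_VMS}")
    return problems


if __name__ == "__main__":
    print(check_plan(ClusterPlan(hosts=80, vms=9000)))
    print(check_plan(ClusterPlan(hosts=80, vms=9000, uses_vsan=True)))
```

    The same 80-host plan passes as a plain vSphere cluster but fails once vSAN is enabled, which is exactly the kind of detail that's easy to miss when reading the headline numbers.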

    My VMworld 2020 learnings, so far

    Let's get started! If you wish to jump ahead, I've built a Table of Contents for you.

    Table of Contents

    Server Hardware Technology Advancements

    Release date

    See also

    Server Hardware Technology Advancements

    VMware

    There is a whole lot to unpack here, and this first blog post by VMware is just a taste.

    announcing-project-monterey-redefining-hybrid-cloud-architecture
    • Announcing Project Monterey – Redefining Hybrid Cloud Architecture
      Sep 29 2020 by Kit Colbert at VMware Blogs

      ...
      This year at VMworld 2020, we are announcing a continuation of the rearchitecture started by Project Pacific, this time focused on hardware architecture. We call this effort Project Monterey.
      ...


      What we see is that these new apps are using more and more of server CPU cycles. Now traditionally, the industry has relied on the CPU for everything – application business logic, processing network packets, specialized work such as 3D modeling, and more. But as app requirements for compute have continued to grow, hardware accelerators including GPUs, FPGAs, specialized NICs have been developed for processing workloads that could be offloaded from the CPU. By leveraging these accelerators, organizations can improve performance for the offloaded activities and free up CPU cycles for core app processing work.
      ...
      Introducing Project Monterey
      We are introducing Project Monterey, a new technology preview, to solve these exact challenges. Project Monterey is a rearchitecture of VCF from the hardware up to support all the new requirements of modern applications enabled by Project Pacific. It leverages a new hardware technology called SmartNIC to deliver maximum performance, zero-trust security, and simplified operations to VCF deployments. More amazingly, by leveraging SmartNIC, Project Monterey extends VCF to support bare metal operating systems and applications! And of course, it delivers this across all the locations VCF runs today – data center, edge, and cloud – reducing TCO across the board. In order to realize Project Monterey, we are partnering with a broad set of SmartNIC vendors and server OEMs to deliver an integrated solution to customers.

      Project-Monterey-What-is-a-smartNIC-1024x390
      "What is a SmartNIC?" image above is from the VMware blog post. You should read the whole article.

      ...
      Evolving VCF Architecture
      Project Monterey is a redesign and rethinking of VCF to take advantage of these disruptive hardware capabilities. Fundamentally we are moving functionality that used to run on the core CPU complex to the SmartNIC CPU complex:

      Project-Monterey-new-VCF-architecture-1024x408

      ...

    Now let's go for a completely different approach to understand all this, from more of a hardware perspective, and where the industry is heading.

    STH

    Patrick Kennedy from STH dives deep here, see

    what-is-a-dpu-a-data-processing-unit-quick-primer

    Here we have Patrick going deep into the weeds into what a Data Processing Unit or DPU is, and what it can do from a broader, industry-wide perspective. Let's have a look at some teaser excerpts that help clarify, but I encourage you to read the whole article.

    Recently we saw one of the more momentous shifts in marketing. NVIDIA Networking, formerly Mellanox, shifted the naming of its SmartNICs from “IPUs” to “DPUs” or Data Processing Units. It seems like the industry is moving to call these networking devices that function as mini-servers themselves into DPUs, so it is time for a primer. In this piece, we are going to discuss what is a DPU. We are then going to discuss a few of the companies we are covering in the space to show how there is a common feature set.
    ...
    While many other features may be present on a DPU, the common features of DPU chips seem to be:
    ...
    Runs its own OS separate from a host system (commonly Linux, but the subject of VMware Project Monterey ESXi on Arm as another example)
    ...

    OK, with that groundwork now laid, here's what really caught my attention. It's going to be a big deal! Patrick continues:

    For storage, one can present the DPU in a host system as a standard NVMe device, but then on the DPU manage a NVMeoF solution where the actual target storage can be located on other servers in other parts of the data center. Likewise, since the DPUs usually have PCIe root capability, NVMe SSDs can be directly connected to the DPU and then exposed over the networking fabric for other nodes in the data center, all without traditional host servers.

    I know, right? Think of what this could mean for vSAN? NSX?

    Now you're ready to also read or watch:

    vmware-project-monterey-esxi-on-arm-on-dpu
    • VMware Project Monterey ESXi on Arm on DPU
      Sep 29 2020 by Cliff Robinson at STH

      VMware is bringing an AWS Nitro-like feature to VMware ecosystems in the future. AWS Nitro is really a class-leading design that started in AWS around 2013. Although this is a roadmap feature, and still not at GA, it is the direction that VMware has announced it is moving.
      ...
      With Project Monterey, we had a DPU/ SmartNIC that can run various VMware services. These include NSX for networking and security, vSAN for data storage, and host management.
      ...

    Dell Technologies

    And finally, given Dell co-engineers VxRail with VMware and sells VCF solutions, it's nice to see this co-announcement that starts to tie it all together, at least for me:

    dell-technologies-teams-with-vmware-on-project-monterey
    • Dell Technologies Teams with VMware on Project Monterey
      Sep 29 2020 by Paul Perez at Dell Technologies Direct2DellEMC

      Next-Generation Infrastructure for HCI and Beyond
      In computing, architecture is all about finding balance by taking advantage of cheap, plentiful resources and maximizing utilization of scarce, expensive ones to optimize yield, performance and cost per computation of applications.

      Project Monterey which VMware introduced earlier today creates a new type of disaggregation and, therefore, composability options to balance those resources. Modern applications and operating environments require modern infrastructure. At Dell Technologies, we are reimagining next-generation building blocks to enable closer cooperation between the future VMware Cloud Foundation (VCF) infrastructure overlays and our future infrastructure underlays.
      ...
      In hyperconverged systems, like our industry-leading VxRail offering co-developed with VMware, infrastructure and application VMs or containers co-reside on relatively coarse common hardware and contend for resources. As we introduce hyper-composability, we will develop finely disaggregated infrastructure expressly enhanced for composability and therefore tightly integrated and optimized by both soft- and hard-offload capabilities to SmartNICs and/or computational storage.

      Customers will benefit from the simplicity afforded in VMware’s infrastructure overlay and the flexibility of having tailored hardware infrastructure in Dell Technologies’ underlay with no waste relative to workload demands.

      We have already demonstrated joint working prototypes internally and have committed to deliver offers to the market. Stay tuned!

    Virtually Speaking Podcast

    171

    I found listening to this overview podcast episode really helped:

    It features Kit Colbert, VP & CTO at VMware for CPBU (Cloud Platform Business Unit), and Lee Caswell, VP of Marketing at VMware for CPBU.

    Kit Colbert, at this spot:

    It's been this nice sort of kind of cadence of announcements over the last few years, I feel like. We were really excited last year when we announced Project Pacific, and this ability to deeply integrate Kubernetes into vSphere, and just all the benefits that could bring to our customer base, to IT Ops and Developers. And then you know releasing Project Pacific first as VMware Cloud Foundation with Tanzu, and then just a couple of weeks ago as vSphere with Tanzu, and again more options on how to consume that. So then you know I feel like we're following up this year with sort of the next steps of those announcements, so if you think about Pacific as really rethinking the consumption layer of vSphere and kind of a fundamental rearchitecture there to enable that, what we announced this year with Project Monterey is very complementary in the sense that we're rethinking the hardware architecture. So the way to think about this is Pacific kind of enables these next generation applications to run much more seamlessly more natively on top of vSphere and VCF. What that does is creates unique hardware challenges and constraints. You see these apps driving more network utilization and you see things like the need for specialized accelerators, you see a lot of operational complexity for managing all that. Security issues happening there. So Monterey is really focused on rethinking the underlying hardware architecture to better support all these new apps...

    Pete Flecha:

    And how does it do that, at a high level?

    Kit:

    ... Fundamentally what Monterey leverages is a new technology called SmartNIC, and so SmartNIC is like its name says it's a NIC, except that it has a general purpose CPU on it. So what this means is that we can now take a lot of the processing that we had to do on the core server CPUs on the x86 side and move this off to the SmartNIC. And so this is things like Network IO processing, Storage IO processing, a lot of the security that we do. And so you know very concretely what we're doing is in fact running ESXi on this SmartNIC. So this is kind of mind blowing right? So now there's actually 2 copies of ESX on a single physical server. And you know we actually demo'd this last year, we had a SmartNIC demo that Greg Lavender did at the end of the keynote, I think the Day 1 keynote, at VMworld. And so we showed the kind of a single physical server with I think it had 4 SmartNICs in it, so you had a 5 ESXi host cluster on a single physical host, and that was just to kind of demo it. So what we're coming around back with this year is saying hey, we are actually taking it a step further, and looking at here's what the architecture looks like, here's what the use cases look like, here's what the benefits look like. So, because we have SmartNICs there, it allows us to really rethink our ESXi architecture, where we can move NSX, vSAN, host management functionalities that used to be done on the core CPU server, server CPU I should say, out to the SmartNIC. That is beneficial for many reasons, it allows it to be much faster there on the SmartNIC, it's right there in line with the IO traffic, and it frees up the CPU cycles for your applications...Now we're not seeing it for all customers, we're really seeing it for our largest customers. Large enterprises, financials, telcos, they're the ones that are moving up to 50G and 100G type architectures.
    ...
    It also has some server management capabilities. So for instance, it can you know because it's running there on the network link, it can you know do PXE booting to that server...it can deliver whatever guest OS image you want, it's not a guest OS, it's a bare metal OS...So by default of course we will have the ESXi image that we deliver to the x86 that boots up ESX. But there's no reason we couldn't support Windows or Linux either...We can now deliver bare metal operating systems and manage those within ESX, within vSphere, within VCF. Now of course they're like VMs in many ways that you can lifecycle manage them, and this sort of stuff. You can't vMotion them yet, you can't do everything, because there's still that bare metal component, but we can deliver NSX and storage capabilities...we can deliver all the networking capabilities and the security capabilities there as well...So now it starts opening up bare metal support on top of vSphere on top of VCF, something I think a lot of people didn't think was possible. But because of these hardware innovations, we can do...
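
    To help me picture what Kit describes, here is a toy Python model of it: one ESXi instance on the x86 CPU plus one per SmartNIC, with NSX, vSAN, and host management moving to the SmartNIC when one is present. This is only a sketch of the idea as I understand it from the podcast. The Server class and its methods are hypothetical names I made up, and nothing here reflects an actual VMware API or how Project Monterey is implemented.

```python
from dataclasses import dataclass
from typing import Dict

# Services Kit names as moving from the server CPU out to the SmartNIC.
OFFLOADABLE_SERVICES = ["NSX", "vSAN", "host management"]


@dataclass
class Server:
    """Hypothetical model of one physical server with zero or more SmartNICs."""
    smartnics: int = 0

    def esxi_instances(self) -> int:
        # One ESXi on the x86 CPU complex, plus one on each SmartNIC,
        # matching the demo described in the podcast.
        return 1 + self.smartnics

    def placement(self) -> Dict[str, str]:
        # Without a SmartNIC, everything stays on the x86 CPU.
        target = "SmartNIC" if self.smartnics > 0 else "x86 CPU"
        return {service: target for service in OFFLOADABLE_SERVICES}


if __name__ == "__main__":
    demo = Server(smartnics=4)
    print(demo.esxi_instances())
    print(demo.placement())
```

    With 4 SmartNICs, the model reports 5 ESXi instances, the same "5 ESXi host cluster on a single physical host" that Kit mentions from the earlier demo.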

    Pete asks here:

    What does that mean to the customer, so what do they need to actually take advantage of this? (Project Monterey)

    Kit:

    So we're just at the beginning here, so what we announced officially was the technical preview. So we have a lot of the stuff in house, and we're kind of working on it, continuing to evolve it, making it ready for production, working closely with our hardware partners to make it a reality. Customers can't get it yet through the tech preview, but we're working closely with customers to get their feedback on the design...We want to hear your feedback, what are the key use cases for you.

    For me personally, I'm immediately brought back to recent conversations I've had with just such a large customer of mine who was seeking out FPGA support in their HCI solution. This is exciting.

    Video

    STH - Sep 29 2020 - What is a DPU - A Quick STH Primer to the New Processor
    Lee Caswell, VP of Marketing at CPBU in VMware, goes on theCUBE to explain Project Monterey in just 2 minutes 16 seconds.

    VMware vSphere 7.0 Update 1 Release Date

    I don't think a date has been announced yet, but it's likely sometime this month, see also what I wrote here:

    vmware-vsphere-7-update-1-announced

    Oct 02 2020 Update

    Closing Thoughts

    It seems that using ARM processors for VM workloads instead of Intel or AMD isn't really the focus here. That said, the impact on the tech used to run the datacenter of the future is intriguing to me, and there will be plenty to learn. I found the PXE boot and native OS deployment abilities of these SmartNICs most intriguing, bringing back fond and not-so-fond memories of my work trying to automate the deployment of a thousand blades at a large bank using RapidFlash. The possibilities for large, enterprise-focused OEMs like Dell, HPE, and Lenovo are very interesting to me as well.

    It also seems that I have a whole lot more reading and viewing to do of the content catalog.

    Meanwhile, I've refined the overall flow and organization of the above article a bit, and added these closing thoughts. I'm sure many more thoughts will be hitting me in the days to come, as I catch up on podcasts while doing housework and lawnmowing this weekend...


    Oct 16 2020 Update

    Prepended TLDR section.


    See also at TinkerTry


    See also

    Many vSphere 7.0 Update 1 articles at VMware Blogs | VMware vSphere Blog under the vsphere-1 tag.

    One of them:

    performance-improvements-in-vsan-7-u1
    • Performance Improvements in vSAN 7 U1
      Sep 30 2020 by Pete Koehler at VMware Blogs | Virtual Blocks

      With the announcement of vSAN 7 U1, VMware introduces several enhancements that improve the performance of vSAN and VMware Cloud Foundation (VCF) environments powered by vSAN. How much of an improvement? This will vary depending on the configuration and workloads, but VMware estimates customers may see up to 30% improvement in performance when comparing vSAN 7 U1 to vSAN 6.7 U3. Let’s look at how this is achieved.
      ...

    Now, better handling of "Monster VMs"!

    whats-new-vmc-on-dell-emc-sept-rel
    • What’s New In the Latest Release of VMware Cloud on Dell EMC – VMworld Edition
      Sep 29 2020 by Ken Smith at VMware Blogs | VMware vSphere Blog

      ...
      The September release of VMware Cloud on Dell EMC has advanced its support of popular compliance and security standards to now support ISO27001, ISO27017, ISO27018, SOC 2 Type1, CSA, CCPA, and GDPR compliance standards.
      ...
      VMware Cloud on Dell EMC is introducing our 4th node instance with this release. This new node, X1d.xLarge, is a dual socket powerhouse supporting 96 virtual CPUs, twice as much RAM as our previously announced instance for memory hungry workloads, and a whopping 3 times the storage to handle storage intensive applications.
      ...


    esxionarm_is_real_and_vmware

    vsan-7-0-u1-file-services-with-smb-and-nfs-support-demo