Tag: technology

  • Designing VMware Cloud Foundation 9.1: The 31 Decisions You Need to Make

    Every VCF deployment starts the same way: someone hands you a blank whiteboard and says design it. The problem is that VCF 9.1 is a broad platform, and without a structured approach it is easy to make decisions out of order, miss dependencies, or find out three phases in that an early choice locked you into something you did not intend.

    Broadcom organizes the VCF 9.1 design process into nine phases covering 31 distinct decisions. This post walks through each phase, what the decisions are, and why they matter in practice. If you are using the VCF Designer tool, this maps directly to the decision schema it uses.

    Phase 1: Starting Point and Profile

    Before touching any configuration, you need two things nailed down: the design blueprint and the scope.

    The Design Blueprint is your baseline deployment profile. Broadcom defines several: single site minimal, single site, multi-site single region, multi-region, and others covering application and security modernization. This is less a technical decision than a business one, and it sets the complexity ceiling for everything that follows.

    Scope and Use Cases is where you gate the rest of the design. VCF 9.1 can cover private cloud IaaS, Kubernetes via Supervisor, Private AI Foundation, vDefend lateral security, VCF Edge, and disaster recovery. What you check here enables or disables options in later phases. Do not mark something in scope unless there is a real requirement behind it.
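
    The gating described here can be sketched as a simple lookup. This is an illustrative sketch, not the VCF Designer schema: the `SCOPE_GATES` mapping and both function names are hypothetical, and pairing Kubernetes via Supervisor with the vSphere Supervisor Model is an assumption. The vDefend, VCF Edge, and Private AI pairings follow the later phases of this post.

```python
# Hypothetical mapping from Phase 1 scope items to the later decisions they gate.
SCOPE_GATES = {
    "Kubernetes via Supervisor": ["vSphere Supervisor Model"],  # assumed pairing
    "vDefend lateral security": ["Lateral Security with vDefend"],
    "VCF Edge": ["VCF Edge Model"],
    "Private AI Foundation": [
        "Private AI Foundation Platform Model",
        "Private AI Foundation Compute Model",
    ],
}


def enabled_decisions(in_scope):
    """Return the gated decisions switched on by the selected scope items."""
    enabled = []
    for item, decisions in SCOPE_GATES.items():
        if item in in_scope:
            enabled.extend(decisions)
    return enabled


def skipped_decisions(in_scope):
    """Return the gated decisions that stay disabled for this scope."""
    active = set(enabled_decisions(in_scope))
    return [
        decision
        for decisions in SCOPE_GATES.values()
        for decision in decisions
        if decision not in active
    ]


scope = {"VCF Edge"}
print(enabled_decisions(scope))
print(skipped_decisions(scope))
```

    Writing the scope down this way makes it obvious which later decisions disappear when a use case has no real requirement behind it.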

    Phase 2: Fleet-Level Decisions

    The VCF Fleet Deployment Model defines how the fleet is laid out. A single VCF instance is the most common for customers starting out or running a standalone private cloud. A connected fleet with multiple instances comes into play when you have multiple sites or organizational boundaries that require separate management planes.

    The VCF Fleet Sizing Model covers appliance sizing: Small, Medium, HA Medium, Large, and HA Large. Sizing here is not about your workload VMs. It is about the management plane itself. Undersizing the fleet appliances is one of the most common mistakes in early VCF deployments.

    Phase 3: Consumption Decisions

    This phase covers how cloud consumers interact with the platform. Five decisions, and they are tightly interconnected.

    The VCF Automation Model decides whether VCF Automation is deployed and in what topology. If your organization needs self-service provisioning or catalog-driven deployments, you need this. If not, skip it. Running it just because it is available adds operational overhead without benefit.

    The Network Consumption Model is one of the most consequential decisions in the entire design. VLAN, NSX Overlay Segments, VPC, or Transit Gateway. This drives downstream decisions on edge clusters, load balancers, and how workloads connect. Get this wrong and you are rearchitecting the network mid-project.

    Workload Connectivity and Load Balancer Model follow from the network consumption choice. For load balancing, NSX Native covers most use cases. Avi (VCF Advanced LB) is needed when you require full L7 with advanced policies, SSL offload, or WAF capabilities.
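
    The load balancer rule of thumb above fits in a few lines. This is a minimal sketch: the feature keys and the function name are made up for illustration, and a real design would weigh more than these three triggers.

```python
# Hypothetical feature keys for the requirements that call for Avi.
ADVANCED_LB_FEATURES = {"advanced_l7_policies", "ssl_offload", "waf"}


def choose_load_balancer(required_features):
    """Pick Avi when any advanced feature is required, else NSX Native."""
    if ADVANCED_LB_FEATURES & set(required_features):
        return "Avi (VCF Advanced LB)"
    return "NSX Native"


print(choose_load_balancer(["l4_vip"]))
print(choose_load_balancer(["l4_vip", "waf"]))
```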

    Phase 4: Operations Decisions

    Six decisions covering management services, management networking, operations tooling, logging, network observability, and recovery.

    The VCF Management Services Model defines availability for SDDC Manager, vCenter, and NSX Manager. Standard vs. Highly Available. For production environments, the answer is almost always HA. The cost of an HA management plane is small compared to the cost of a failed SDDC Manager during a critical operation.

    The VCF Management Network Model determines whether management components share a VLAN, use isolated VLANs per component, or run on NSX segments. NSX segments require NSX to be up before management components can communicate, which creates a chicken-and-egg risk during recovery scenarios. Plan this carefully.
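
    One way to picture the chicken-and-egg risk is as a cycle in a start-up dependency graph. The sketch below is illustrative only: the component names and dependency edges are assumptions made for the example, not a description of how VCF actually orders recovery. It uses `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def recovery_order(depends_on):
    """Return a valid start-up order, or None if the dependencies are circular."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Assumed layout 1: management components sit on a plain VLAN.
vlan_model = {
    "NSX Manager": {"management VLAN"},
    "vCenter": {"management VLAN"},
    "SDDC Manager": {"management VLAN"},
}

# Assumed layout 2: the same components sit on an NSX segment,
# and the segment itself needs NSX Manager to be up.
nsx_segment_model = {
    "NSX Manager": {"NSX segment"},
    "vCenter": {"NSX segment"},
    "SDDC Manager": {"NSX segment"},
    "NSX segment": {"NSX Manager"},
}

print(recovery_order(vlan_model) is not None)  # an order exists
print(recovery_order(nsx_segment_model))       # None: circular dependency
```

    If a layout like the second one is on the table, the recovery plan needs an explicit way to break the cycle.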

    The VCF Recovery Option aligns to your RPO and RTO requirements. Backup and restore, component-level recovery, and instance-level recovery each have different complexity and cost profiles. Define your recovery requirements before choosing this, not after.

    Phase 5: Security and Compliance

    Identity Broker and SSO decisions define how users authenticate to VCF components. Most enterprise environments will federate to Active Directory or an external IdP. Plan this early since it affects every component that needs authentication.

    vDefend Lateral Security only applies if it was included in scope in Phase 1. If deployed, the Security Services Platform adds distributed IDS/IPS and east-west traffic inspection.

    Phase 6: Virtual Infrastructure

    Seven decisions covering domains, clusters, networking, and storage. This is where the design gets concrete.

    The VCF Domain Model defines your management and workload domain topology. Single-AZ with one management plus one workload domain is the most common starting point. Stretched (multi-AZ) adds complexity but is required for metro HA.

    The Storage Model is one of the decisions with the most downstream impact. VCF 9.1 supports vSAN OSA, vSAN ESA, NFS, VMFS on Fibre Channel, iSCSI, and NVMe variants. vSAN ESA is the recommended path for new deployments using compatible hardware. If you are connecting to an existing SAN or NAS, the external storage options apply.

    NSX Manager topology and NSX Edge Cluster decisions define the control plane and data plane for your overlay network. Edge cluster sizing depends on the volume and type of north-south traffic. A shared NSX Manager cluster across domains reduces overhead. Dedicated per domain gives you blast radius isolation.

    Phase 7: Physical Infrastructure

    One decision: the Network Fabric Model. Routed VLAN fabric, Leaf-Spine VXLAN underlay, or EVPN-VXLAN fabric. This needs to be made in coordination with the network team. The fabric model affects how VLANs are extended across the environment and how the NSX overlay integrates with the underlay. EVPN-VXLAN provides the most flexibility for multi-site and stretched cluster scenarios.

    Phase 8: Optional Workload Capabilities

    VCF Edge and Private AI Foundation, both conditional on Phase 1 scope. For VCF Edge, single-host is suitable for small remote sites where HA is not required. Three-host provides local HA at the edge.

    For Private AI Foundation, the compute model selection depends heavily on the type of workloads. Training workloads typically want full GPU passthrough or MIG. Inference workloads can often share via vGPU.
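
    As a starting point for that conversation, the guidance above can be captured as a lookup. This is a rough sketch with hypothetical names. It lists only the options named in this post, and the "typically" and "often" in the guidance mean the result is a shortlist, not a verdict.

```python
# Candidate compute models per workload type, taken from the guidance above.
COMPUTE_MODEL_CANDIDATES = {
    "training": ["full GPU passthrough", "MIG"],
    "inference": ["vGPU"],
}


def candidate_compute_models(workload_type):
    """Return the shortlist of compute models for a workload type."""
    key = workload_type.strip().lower()
    if key not in COMPUTE_MODEL_CANDIDATES:
        raise ValueError(f"unknown workload type: {workload_type!r}")
    return COMPUTE_MODEL_CANDIDATES[key]


print(candidate_compute_models("Training"))
print(candidate_compute_models("inference"))
```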

    Phase 9: Closeout

    Two workflow tasks, not configuration decisions, though the index counts them as steps 30 and 31. First, reconcile every decision made in Phases 1 through 8 against the Broadcom VCF Design Library to confirm alignment with supported patterns. Second, translate the finalized design into the VCF Planning and Preparation Workbook, which is the actual input consumed by the VCF Installer during bring-up. A clean design that does not translate into a properly completed workbook will cause bring-up failures. Budget time for this step.

    The Full Decision Index

    Step  Phase    Decision
    1     Phase 1  Design Blueprint
    2     Phase 1  Scope and Use Cases
    3     Phase 2  VCF Fleet Deployment Model
    4     Phase 2  VCF Fleet Sizing Model
    5     Phase 3  VCF Automation Model
    6     Phase 3  vSphere Supervisor Model
    7     Phase 3  Network Consumption Model
    8     Phase 3  Workload Connectivity Model
    9     Phase 3  Load Balancer Model
    10    Phase 4  VCF Management Services Model
    11    Phase 4  VCF Management Network Model
    12    Phase 4  VCF Operations Model
    13    Phase 4  Log Management Model
    14    Phase 4  VCF Operations for Networks Model
    15    Phase 4  VCF Recovery Option
    16    Phase 5  Identity Broker Model
    17    Phase 5  VCF Single Sign-On Model
    18    Phase 5  Lateral Security with vDefend
    19    Phase 6  VCF Domain Model
    20    Phase 6  vSphere Cluster Model
    21    Phase 6  Distributed Switch Model
    22    Phase 6  Storage Model
    23    Phase 6  NSX Manager and Control Plane Model
    24    Phase 6  NSX Edge Cluster Model
    25    Phase 6  Virtual Network Appliance Cluster Model
    26    Phase 7  Network Fabric Model
    27    Phase 8  VCF Edge Model
    28    Phase 8  Private AI Foundation Platform Model
    29    Phase 8  Private AI Foundation Compute Model
    30    Phase 9  Reconcile Against Broadcom Design Library
    31    Phase 9  Produce the Planning and Preparation Workbook
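
    If you want to track progress against the index, it is small enough to keep as plain data. The sketch below transcribes the index and adds two helpers; the helper names and the `choices` dictionary are hypothetical, not part of any Broadcom tool.

```python
from collections import Counter

# (step, phase, decision) rows transcribed from the index above.
DECISION_INDEX = [
    (1, 1, "Design Blueprint"),
    (2, 1, "Scope and Use Cases"),
    (3, 2, "VCF Fleet Deployment Model"),
    (4, 2, "VCF Fleet Sizing Model"),
    (5, 3, "VCF Automation Model"),
    (6, 3, "vSphere Supervisor Model"),
    (7, 3, "Network Consumption Model"),
    (8, 3, "Workload Connectivity Model"),
    (9, 3, "Load Balancer Model"),
    (10, 4, "VCF Management Services Model"),
    (11, 4, "VCF Management Network Model"),
    (12, 4, "VCF Operations Model"),
    (13, 4, "Log Management Model"),
    (14, 4, "VCF Operations for Networks Model"),
    (15, 4, "VCF Recovery Option"),
    (16, 5, "Identity Broker Model"),
    (17, 5, "VCF Single Sign-On Model"),
    (18, 5, "Lateral Security with vDefend"),
    (19, 6, "VCF Domain Model"),
    (20, 6, "vSphere Cluster Model"),
    (21, 6, "Distributed Switch Model"),
    (22, 6, "Storage Model"),
    (23, 6, "NSX Manager and Control Plane Model"),
    (24, 6, "NSX Edge Cluster Model"),
    (25, 6, "Virtual Network Appliance Cluster Model"),
    (26, 7, "Network Fabric Model"),
    (27, 8, "VCF Edge Model"),
    (28, 8, "Private AI Foundation Platform Model"),
    (29, 8, "Private AI Foundation Compute Model"),
    (30, 9, "Reconcile Against Broadcom Design Library"),
    (31, 9, "Produce the Planning and Preparation Workbook"),
]


def decisions_per_phase(index=DECISION_INDEX):
    """Count how many steps fall in each phase."""
    return dict(Counter(phase for _, phase, _ in index))


def unresolved(choices, index=DECISION_INDEX):
    """List decisions with no recorded choice (a hypothetical closeout check)."""
    return [name for _, _, name in index if not choices.get(name)]


print(decisions_per_phase())
# {1: 2, 2: 2, 3: 5, 4: 6, 5: 3, 6: 7, 7: 1, 8: 3, 9: 2}
print(len(unresolved({"Design Blueprint": "single site"})))
# 30
```

    The per-phase counts match the five, six, and seven decisions called out for Phases 3, 4, and 6. Running `unresolved` before filling in the workbook is a cheap way to catch a decision that was never actually made.
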
  • Insights from a Nutanix Migration Specialist

    My work life as an IT specialist has always been quite varied.

    I spent part of my time installing traditional datacenter infrastructure, some of my time implementing cybersecurity solutions, and bits and pieces here and there, working on projects with a number of different technology vendors.

    But over the past 18 months, my main focus has been migrating customers’ virtualization environments to Nutanix.

    The timing lines up with some big shakeups in the tech industry, as well as the continued growth of hyperconverged infrastructure (HCI). I heard my customers worry that support quality would decline for their existing environments, or that innovation might stall. In reality, what my customers have mostly seen is severe sticker shock on their renewal bills—partly due to inflation that has hit all sectors, but also due to dramatic changes to vendor licensing agreements. 

    Some customers have seen 3x, 5x, or even 10x increases in their virtualization costs, practically overnight. These are customers that have been with a vendor for 15 or 20 years, in many cases, and many had come to view their virtualization environments as something of a commodity with a stable pricing structure. But changes to licensing agreements have upended this stability. Before, customers could mostly purchase individual product licenses as needed, but they’re now being funneled into bundled packages with add-on features they don’t want and can’t use.

    Some large enterprises are able to absorb these new costs. But for others—especially small and medium-sized companies—the impact to their business is comparable to tripling their rent, or adding a zero to their monthly utility bills. These smaller customers also find themselves in a poor negotiating position with tech giants. 

    For example, we recently worked with Norfolk Public Schools in Virginia to migrate to Nutanix. The district was facing an eye-popping 680% cost increase if it stayed with its previous provider. Moving to Nutanix under a five-year licensing agreement instead saved it approximately $2 million.

    For customers like Norfolk Public Schools, the numbers of the new virtualization landscape simply don’t add up. And for the first time, many of these organizations are willing to seriously consider a change.

    Even non-technical people can understand the anxiety that comes with switching technology platforms. (Think of how rarely people change to a phone with a different operating system.) Most of my customers never even considered switching from their existing virtualization provider until recently. After all, virtualization is a foundational technology that supports their entire business. Many system administrators have built their careers and expertise around the environment they know, developed their own workflows around its interface and capabilities, and integrated their entire application environment with that platform.

    Most importantly, businesses have come to rely on the stability of their virtualization environments to keep their mission-critical systems up and running. So, it’s understandable why many approach a change with a degree of trepidation. They want to know whether their applications will work the same way, how much downtime to expect, and whether their teams will need extensive retraining.

    However, once customers make the move, they tend to find that Nutanix infrastructure provides everything they need, often in a more intuitive way, and at roughly what they were paying before the market shifts of the past couple of years. During the pre-sales process, I sit with customers to walk them through the Nutanix interface. We spend much of this time exploring the equivalent functionality between the platforms, which is mostly a matter of learning new terminology for familiar features.

    At Norfolk Public Schools, we conducted site assessments, installed and configured new hardware, configured the Nutanix platform, and migrated more than 400 virtual machines—all in just over a month. The cutover to the new operating environment was seamless, and the district saw immediate improvements in performance and reliability.

    For most organizations, the migration is just as painless. Some clients prefer to migrate in small batches of just a few virtual machines, while others are ready to move hundreds of virtual machines over a single weekend. The actual cutover process for each virtual machine takes only about five to ten minutes—comparable to the standard maintenance window for most security patches. Post-migration, customers typically notice improved performance (mostly due to new hardware). In addition to the cost savings, many also cite Nutanix’s simplified disaster recovery capabilities as a major benefit of the move.

    After we start the migration, I can see the anxiety on my customers’ faces melt away, replaced by relief. Recently, one even started laughing. “This is so amazing!” he kept repeating. “This is so easy!”