Integrating NSX-V and NSX-T backed workloads Using VeloCloud SD-WAN Solution

In my previous blog, I talked about the fundamentals of VMware SD-WAN technology by VeloCloud, focusing on its architecture and use cases.
In this blog, I will discuss the design I used to connect NSX-backed workloads across different sites/branches and clouds using VMware VeloCloud SD-WAN technology.

My setup consists of the following:

1. NSX-V lab living on Site A. It is backed by a vCenter plus management and compute ESXi clusters. Workloads are virtual machines.
2. NSX-T lab, backed by a management and compute ESXi cluster in addition to a KVM cluster. Workloads are VMs along with Kubernetes (K8s) containers.
3. VMware Cloud on AWS SDDC instance with vCenter/ESXi/VMs.
Because SD-WAN is agnostic of the technology running in the data centers, you can add any cloud/data center/branch to the list above, such as native AWS EC2 instances, Azure, GCP, Alibaba Cloud, or simply any private/public cloud workloads.
Also note that NSX Data Center is not a prerequisite for connecting sites/branches using SD-WAN by VeloCloud.

The Architectural Design of the end Solution:

velo_design

 

The design above shows a small portion of the full picture, in which I can connect “n” number of sites/branches/clouds using SD-WAN technology.
I started by connecting my Site 1 in San Jose, which happens to be backed by NSX-V, with my other DC in San Francisco, which is backed by NSX-T.
Traffic egressing to the Internet, i.e. non-SD-WAN traffic (green), will go via the NSX ESG at the NSX-V site and via the Tier-0 at the NSX-T site.
Branch-to-branch traffic (purple) will ingress/egress via the VeloCloud Edge (VCE) on each site.

NSX-V Site:

In the San Jose site, I peered the NSX-V ESG with the VCE using eBGP. I already had an iBGP neighborship between the NSX DLR and the ESG. The transit boundary shown in blue in the image below is established by deploying an NSX logical switch attached to the NSX-V ESG, the NSX-V DLR and the VCE.
I redistributed the routes learned via iBGP on the ESG to the VCE using eBGP.
Now VCE_SanJose knows about the subnets/workloads residing south of the DLR.
I filtered the default route (default originate) that the ESG learned from its upstream so that it is not distributed to VCE_SanJose, as I don’t want to advertise my default route to other branches/sites/clouds.
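
To make the filtering step concrete, here is a minimal Python sketch of the logic, not an actual NSX configuration. The prefixes and the `ibgp-dlr`/`upstream` labels are hypothetical values I made up for illustration; the only point is that routes learned from the DLR are passed on to the VCE while the default route is held back.

```python
from ipaddress import ip_network

DEFAULT_ROUTE = ip_network("0.0.0.0/0")

# Hypothetical ESG routing table: (prefix, where the ESG learned it from).
esg_routes = [
    ("0.0.0.0/0", "upstream"),       # default route from the upstream network
    ("172.16.10.0/24", "ibgp-dlr"),  # workload subnet south of the DLR
    ("172.16.20.0/24", "ibgp-dlr"),  # another workload subnet
]


def advertise_to_vce(routes):
    """Return the prefixes redistributed to the VCE: iBGP-learned routes only,
    and never the default route."""
    return [
        prefix
        for prefix, source in routes
        if source == "ibgp-dlr" and ip_network(prefix) != DEFAULT_ROUTE
    ]


advertised = advertise_to_vce(esg_routes)
print(advertised)
```

With this filter in place, other branches only ever learn the San Jose workload subnets and keep using their own local exit for Internet traffic.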

Low level design on the NSX-V Site:

 

velo_design_2

Based on the above,

Internet/Non-SD-WAN traffic path will be as follows

VM1–>DLR–>ESG–> Internet/VLAN

SD-WAN traffic path:

VM1–>DLR–>VCE–> Internet. (Note that the VCE could have multiple ISP or MPLS links that leverage Dynamic Multipath Optimization, known as DMPO.)
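
One way to picture how the two paths coexist is ordinary longest-prefix-match routing: the remote site's subnets are more specific than the default route, so they win for branch-bound traffic. The sketch below is only an illustration under that assumption; the routing table contents (`10.20.0.0/16` as the remote site's range) are hypothetical and not taken from my lab.

```python
from ipaddress import ip_address, ip_network


def next_hop(routing_table, destination):
    """Pick the next hop whose prefix is the longest match for the destination."""
    dest = ip_address(destination)
    matches = [
        (ip_network(prefix), hop)
        for prefix, hop in routing_table.items()
        if dest in ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific prefix (largest prefix length) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical view from the San Jose DLR.
dlr_table = {
    "0.0.0.0/0": "ESG",             # Internet / non-SD-WAN traffic
    "10.20.0.0/16": "VCE_SanJose",  # assumed remote-site range learned via SD-WAN
}

print(next_hop(dlr_table, "10.20.5.9"))  # branch-bound traffic
print(next_hop(dlr_table, "8.8.8.8"))    # Internet-bound traffic
```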

The VCE will build tunnels to the VMware VeloCloud hosted Gateways (VCGs) and to the Orchestrator (VCO). The VeloCloud Gateways act as the VCEs’ distributed control plane, so the VCEs learn about all other branches’ routes via the updates those VCGs send over (refer to Image 1 to help you follow the path).

 

Now that we are done configuring the San Jose site, let’s configure the San Francisco NSX-T data center.

 

NSX-T Site:

A new Tier-0 uplink will be connected to an NSX-T Geneve logical switch. This transit logical switch will also be connected to one of the VCE’s interfaces as a downlink.

On the NSX-T Tier-0 and the VCE, we will build an eBGP neighborship over the transit logical switch. The VCE will hence know about the routes advertised from the Tier-0.

Note that in NSX-T, Tier-1 routers automatically plumb all their routes towards the Tier-0.

Now that the VCE knows about the San Francisco routes, it will advertise them to the VCG, which again is hosted somewhere on the Internet by VMware VeloCloud.

Low level design on the NSX-T Site:

 

velo_design_3

 

Internet/Non-SD-WAN traffic path will be as follows:

VM1–>Tier-1–>Tier-0–> Internet/VLAN

SD-WAN traffic path:

VM1–>Tier-1–>Tier-0–> VCE–> Internet.

Note that the VCE could have multiple ISP or MPLS links that leverage Dynamic Multipath Optimization, known as DMPO.

 

The VCE will build tunnels to the VMware VeloCloud hosted Gateways (VCGs) and to the Orchestrator (VCO). The Gateways act as the VCEs’ control plane, so the VCEs learn about all other branches’ routes via the VCG.

 

Now the San Jose and San Francisco workloads know how to reach each other via SD-WAN.
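
To illustrate the control-plane role of the gateways, here is a toy Python model of the route exchange. It is a sketch under my own assumptions and not how the VCG is implemented: the `Gateway` class, the name `VCE_SanFrancisco` and the prefixes are all hypothetical. Each edge advertises its local prefixes to the gateway, and the gateway tells each edge about the prefixes of every other edge.

```python
class Gateway:
    """Toy stand-in for a VCG acting as a route exchange point."""

    def __init__(self):
        self.routes = {}  # edge name -> set of prefixes behind that edge

    def advertise(self, edge, prefixes):
        """Record the prefixes an edge learned from its local site."""
        self.routes[edge] = set(prefixes)

    def updates_for(self, edge):
        """Return {prefix: remote edge} for everything the edge does not own."""
        updates = {}
        for other, prefixes in self.routes.items():
            if other == edge:
                continue
            for prefix in prefixes:
                updates[prefix] = other
        return updates


vcg = Gateway()
vcg.advertise("VCE_SanJose", ["172.16.10.0/24"])       # hypothetical NSX-V subnet
vcg.advertise("VCE_SanFrancisco", ["10.20.0.0/16"])    # hypothetical NSX-T range

print(vcg.updates_for("VCE_SanJose"))
print(vcg.updates_for("VCE_SanFrancisco"))
```

Adding another branch in this model is just one more `advertise` call, which mirrors why adding sites to the real design does not require touching the existing ones.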

 

Summary

 

The magic of SD-WAN is that we can add “n” number of sites, with or without NSX, and connect them via L3 seamlessly. For instance, I can connect 50 branches to those 2 DCs by deploying a VCE at each branch.

We can also use DMPO to improve the quality of service of the traffic destined to branches. Business policies can also be enforced using the VCE.

 

vCloud Director Cross-VDC Design with Cross VC NSX

 

VMware vCloud Director 9.5 is packed with a lot of great new features, and one of the most significant additions is the introduction of Cross-VDC networking.

In prior vCD releases, Cloud Providers couldn’t use the universal constructs that NSX introduced in the cross-VC NSX architecture and could not benefit from the use cases that cross-VC NSX targets and solves.

Compatibility with cross-VC NSX is finally there starting with vCD 9.5, as vCD now supports the universal constructs that NSX creates. This is great news for Cloud Providers who are looking to target those use cases.

In this blog I will address the use cases for leveraging cross-VC NSX inside a vCD Virtual Data Center (VDC) and the design we are proposing to integrate with NSX.

This blog was a joint effort with my SE peer Daniel Paluszek and Abhinav Mishra from the vCD engineering staff.

 

What are the use cases of stretching the network across VCs/Sites?

 

  1. Resource Pooling: Logical networking and security across multiple vCenters make it possible to access and pool resources from multiple vCenter domains. Resources are no longer isolated by vCenter and/or vCD boundaries, which leads to better utilization and fewer idle hosts.
  2. Workload Mobility

 

workload mobility

 

Since logical networking can span multiple vCenter domains and multiple sites:

  • Cross-VC NSX allows for enhanced workload mobility across Active-Active data centers
  • Workloads can now be moved between vCenter domains/sites/Org VDCs on demand. A practical use case example would be when there is a data center migration/upgrade activity.

 

3. Disaster Recovery

 

 

Blog1st

 

Cross-VDC networking will help tenants and providers continue operations in case of a partial or complete network failure. Workloads on Site-A can leverage the Tenant-X-Org-VDC Edge on Site-B if the Tenant-X-Org-VDC Edge on Site-A fails.

Moreover, during an Internet failure on Site-A, all tenant workloads on Site-A will use the Provider Edges on Site-B to exit to the Internet provided on Site-B.

 

 

High Level Design Architecture for vCD 9.5+ and NSX

 

2ndd

 

The goal of this high-level design is to provide optimal availability of network services at the Provider and Tenant layers. We must adhere to Cross-vCenter NSX best practices, so do note that we presume you are familiar with the guidance in the NSX Cross VC Design Guide.

In this suggested design, we have two layers of NSX:

  1. Tenant layer within vCloud Director (auto provisioned by vCD)
  2. Provider Managed layer (provisioned natively in NSX)

 

The goal is to provide high availability between the two sites while meeting the stated requirements of Cross-VDC networking.

 

First NSX layer (Tenant Layer): This layer is controlled and provisioned by vCD. vCD will extend the tenant networks across sites by stretching their respective logical switches (Universal Logical Switches). The Tenant Universal Distributed Logical Router (UDLR) will be auto-provisioned by vCD and will do the required routing for the tenant’s workloads residing on different L2 domains. The tenant’s Active Edge Services Gateway (ESG), or Tenant-<X>-OrgVDC-Site-A, will terminate all tenant services such as NAT/FW/DHCP/VPN/LB and will essentially be the North/South entry/exit point for workloads residing in the tenant’s respective OrgVDC on each site.

 

Tenant layer

 

We suggest deploying the Tenant UDLR in Active/Standby (passive) mode, where all Tenant A workload traffic, whether on Site-A or Site-B, will egress from the Tenant-A-OrgVDC-Site-A Edge.

The rationale behind Active/Standby mode is to maintain the stateful services running on the tenant’s ESG and to keep explicit control of the ingress traffic, which will also help with failure considerations. (More details on fail-over scenarios in my next blog.)

 

Tenant-B will have a flipped A/S design, where Site A is the passive (standby) site while Site B is active for Tenant B workloads.

 

tenant2

 

Tenant B workload traffic, whether on Site-A or Site-B, will egress from the Tenant-B-OrgVDC-Site-B Edge.

Making different tenants active on different sites helps us distribute network traffic across sites and thus benefit from resource pooling and better utilization of the available data centers.
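
The flipped Active/Standby assignment can be sketched in a few lines of Python. This is only a model of the intent, not anything vCD runs: the `ACTIVE_SITE` mapping and the `egress_edge` helper are hypothetical, and the edge names simply follow the Tenant-X-OrgVDC-Site-Y pattern used above. The workload's own site is deliberately not an input, because in this design egress depends only on which site is active for the tenant.

```python
SITES = ("Site-A", "Site-B")

# Hypothetical mapping: which site hosts the active edge for each tenant.
ACTIVE_SITE = {"Tenant-A": "Site-A", "Tenant-B": "Site-B"}


def egress_edge(tenant, failed_edges=frozenset()):
    """Return the edge a tenant's traffic egresses from.

    The active site's edge is preferred; the standby site's edge is used
    only when the active edge is marked as failed.
    """
    active = ACTIVE_SITE[tenant]
    standby = next(site for site in SITES if site != active)
    for site in (active, standby):
        edge = f"{tenant}-OrgVDC-{site}"
        if edge not in failed_edges:
            return edge
    return None  # both edges are down


print(egress_edge("Tenant-A"))
print(egress_edge("Tenant-B"))
print(egress_edge("Tenant-A", failed_edges={"Tenant-A-OrgVDC-Site-A"}))
```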

 

 

Second NSX layer (Provider Layer): This layer is the provider-controlled NSX layer and will be configured/managed natively in NSX, outside/north of vCD.

 

3rddd

 

Each Tenant ESG (Tenant-X-Org-VDC-Edge) will peer externally on a ULS with the pre-provisioned Provider UDLR. This transit interface will be on VXLAN, or in other words simply another pre-provisioned Universal Logical Switch per tenant. That way we can scale up to 1000 tenants, as the UDLR supports up to 1000 logical interfaces (LIFs).

In this high-level design, we will be utilizing an Active/Active state with local egress mode at the Provider Layer (Provider UDLR). Therefore, local traffic will egress at its respective local site. With this configuration, a UDLR Control VM will be deployed on each site.

We also suggest enabling ECMP on the Provider UDLR and peering with up to 8 ESGs spread equally across sites.

The Site-A Provider Primary Control VM will peer with ESGs 1 to 4 (green) on Site 1 with a higher BGP weight, and with ESGs 5 to 8 (green) on Site 2 with a lower BGP weight. This is achievable because E1-E8 (green) connect to the same stretched Universal Logical Switch.

Similarly, the Provider Secondary Control VM on Site 2 will peer with up to 8 ESGs. ESGs 1 to 4 (blue) on Site 2 will have a higher BGP weight when peering with the Secondary Control VM, while ESGs 5 to 8 (blue) on Site 1 will peer with a lower BGP weight.
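
The effect of the weights can be illustrated with a small Python sketch. It assumes the usual BGP behavior that a higher weight is preferred and that ECMP spreads traffic over the equally preferred next hops; the weight values 60 and 30 and the `ecmp_next_hops` helper are hypothetical and only there to show the fail-over from the local ESGs to the remote ones.

```python
# Hypothetical weights as seen by the Site-A Primary Control VM:
# local ESGs (E1-E4) get the higher weight, remote ESGs (E5-E8) the lower one.
primary_peers = {
    "E1": 60, "E2": 60, "E3": 60, "E4": 60,
    "E5": 30, "E6": 30, "E7": 30, "E8": 30,
}


def ecmp_next_hops(peers, down=frozenset()):
    """Return the ESGs used for egress: every reachable peer that shares
    the highest weight among the peers that are still up."""
    alive = {name: weight for name, weight in peers.items() if name not in down}
    if not alive:
        return []
    best = max(alive.values())
    return sorted(name for name, weight in alive.items() if weight == best)


print(ecmp_next_hops(primary_peers))
print(ecmp_next_hops(primary_peers, down={"E1", "E2", "E3", "E4"}))
```

As long as at least one local ESG is up, traffic stays on the local site; the remote ESGs only take over once all higher-weight peers are gone.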

 

The Provider UDLR will reach the Tenant ESGs’ uplinks via directly connected routes. This is where the public IPs will be floating. There is no need for any kind of static/dynamic routes between the Provider UDLR and the Tenant ESGs. The reason is that the Provider UDLR will advertise its directly connected routes to the Provider Edges upstream via the BGP adjacency that has already been formed, while the Tenant ESG will simply NAT the public IPs to the workloads that need to be published.

 

Note: For high availability, the default route (default originate) would be advertised to the Provider ESGs from the upstream physical network. This helps with fail-over to the secondary site when the upstream Internet switches are down.

 

A big thank you to my peer Yannick Meillier, who inspired the idea of peering the Provider UDLR Control VMs with a set of Provider ESGs spread across sites to achieve high availability in case of an upstream failure at any given site.

 

In my next blog, I will discuss in depth the packet life of the above design along with failure and fail-over scenarios.