VX LAN@nettrain

Some of the key takeaways from the presentation are optimizing the location of L2 and L3 gateways, leveraging L3 VXLAN services as the main service, designing the underlay and overlay hierarchically, and linking provisioning of the overlay to the host orchestration system.

Some of the main reasons for using VXLAN include providing flexible overlay networks, mobility, segmentation, scaling by reducing core state, flexibility and programmability, and having a robust underlying fabric.

The main components of a VXLAN overlay network are the overlay control plane, VXLAN encapsulation, VXLAN tunnel end points (VTEPs), the underlay network, and edge devices called network virtualization edges (NVEs).

VXLAN Design and Deployment
Marian Klas, Systems Engineer
Agenda
• Why VXLAN?
• VXLAN Fundamentals
• Underlay Deployment Considerations
• Overlay Deployment Considerations
• Summary and Conclusion

Trend: Flexible Data Center Fabrics
Create virtual networks on top of an efficient IP network

• Mobility
• Segmentation + Policy
• Scale
• Automated & Programmable

Physical
• Full Cross Sectional BW
• L2 + L3 Connectivity

[Figure: hosts (VMs) attached to a physical fabric, with virtual networks layered on top]

Use VXLAN to Create DC Fabrics: Physical + Virtual
VXLAN Fundamentals

Why Overlays?
Seek well integrated best in class Overlays and Underlays

Robust Underlay/Fabric
• High Capacity Resilient Fabric
• Intelligent Packet Handling
• Programmable & Manageable

Flexible Overlay Virtual Network
• Mobility: Track end-point attach at edges
• Segmentation
• Scale: Reduce core state
  - Distribute and partition state to network edge
• Flexibility/Programmability
  - Reduced number of touch points
Overlay Taxonomy
• Service = Virtual Network Instance (VNI)
• Identifier = VN Identifier (VNID)
• NVE = Network Virtualization Edge
• VTEP = VXLAN Tunnel End-Point

[Figure: hosts (end-points) attach to edge devices (NVEs). VTEPs on the edge devices exchange encapsulated traffic across the underlay network. The overlay control plane runs between the edge devices; the underlay control plane runs within the underlay network.]
VXLAN is an Overlay Encapsulation

Data Plane Learning
Flood and Learn over a multidestination distribution tree joined by all edge devices

Protocol Learning
Advertise hosts in a protocol amongst edge devices

[Figure: an overlay control plane on top of the VXLAN encapsulation]
VXLAN Packet Structure
Ethernet in IP with a shim for scalable segmentation

Outer MAC Header | Outer IP Header | Outer UDP Header | VXLAN Header | Original Layer 2 Frame | FCS

Outer MAC Header: 14 Bytes (4 Bytes Optional)
- Dest. MAC Address (48 bits): Next-Hop MAC Address
- Src. MAC Address (48 bits): Src VTEP MAC Address
- VLAN Type 0x8100 (16 bits, optional)
- VLAN ID Tag (16 bits, optional)
- Ether Type 0x0800 (16 bits)

Outer IP Header: 20 Bytes
- IP Header Misc. Data (72 bits)
- Protocol 0x11 (UDP) (8 bits)
- Header Checksum (16 bits)
- Source IP (32 bits) and Dest. IP (32 bits): Src and Dst addresses of the VTEPs

Outer UDP Header: 8 Bytes
- Source Port (16 bits): Hash of the inner L2/L3/L4 headers of the original frame. Enables entropy for ECMP load balancing in the network (tunnel entropy).
- VXLAN Port (16 bits): UDP 4789
- UDP Length (16 bits)
- Checksum 0x0000 (16 bits)

VXLAN Header: 8 Bytes
- VXLAN Flags RRRRIRRR (8 bits)
- Reserved (24 bits)
- VNI (24 bits): Allows for 16M possible segments (large scale segmentation)
- Reserved (8 bits)

Original Layer 2 Frame: Ethernet Payload

50 (54) Bytes of overhead
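The field widths on the packet structure slide can be checked with a few lines of Python. This is a minimal sketch rather than a full encapsulation: it packs only the 8-byte VXLAN header (flags, reserved, VNI, reserved) and adds up the header sizes listed on the slide. The function names are illustrative and not taken from any library.

```python
import struct

VXLAN_UDP_PORT = 4789          # destination UDP port shown on the slide
VXLAN_FLAG_VNI_VALID = 0x08    # the "I" bit in the RRRRIRRR flags byte

# Header sizes in bytes, as listed on the slide.
OUTER_MAC, VLAN_TAG, OUTER_IP, OUTER_UDP, VXLAN_HDR = 14, 4, 20, 8, 8


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte header: flags(8) reserved(24) VNI(24) reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags in the top byte. Second word: VNI in the top 24 bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Return the VNI from a VXLAN header, checking that the I flag is set."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag not set; VNI is not valid")
    return word1 >> 8


def encapsulation_overhead(outer_vlan_tag: bool = False) -> int:
    """Bytes added in front of the original frame by the outer headers."""
    total = OUTER_MAC + OUTER_IP + OUTER_UDP + VXLAN_HDR
    return total + (VLAN_TAG if outer_vlan_tag else 0)


if __name__ == "__main__":
    header = pack_vxlan_header(30000)
    print(header.hex(), unpack_vni(header))
    print(encapsulation_overhead(), encapsulation_overhead(outer_vlan_tag=True))
    print(2 ** 24, "possible segments")
```

The overhead figure is why an underlay MTU larger than the host MTU is needed; the SVI examples later in the deck use mtu 9216.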
Data Plane Learning
Dedicated Multicast Distribution Tree per VNI

[Figure: VTEPs with attached Web and DB VMs send PIM Joins for multicast groups 239.1.1.1 and 239.2.2.2, one group per VNI. A second view shows the resulting tree between the VTEPs serving VM1, VM2 and VM3.]
Data Plane Learning
Learning on Broadcast Source - ARP Request Example

[Figure: VM1 sends an ARP request. VTEP 1 encapsulates it with outer IP A → G and the multicast tree delivers it to the VTEPs serving VM2 and VM3. Each receiving VTEP learns the mapping MAC VM 1 → IP Addr VTEP 1.]


Data Plane Learning
Learning on Unicast Source - ARP Response Example

[Figure: VM2 answers with an ARP response, sent as unicast VTEP 2 → VTEP 1. VTEP 1 learns the mapping MAC VM 2 → IP Addr VTEP 2; VTEP 2 already holds MAC VM 1 → IP Addr VTEP 1.]
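The two ARP examples can be summarised as a small simulation. This is a sketch of the flood-and-learn idea only, under the assumption that each VTEP keeps a table from inner source MAC to remote VTEP IP. The class, the dictionary-based "packet" and the addresses are hypothetical and stand in for real encapsulation.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Vtep:
    """Toy VTEP that learns remote MAC addresses from decapsulated traffic."""

    def __init__(self, ip, group):
        self.ip = ip              # VTEP source address used in the outer header
        self.group = group        # multicast group assigned to the VNI
        self.mac_table = {}       # inner MAC -> remote VTEP IP

    def encapsulate(self, src_mac, dst_mac):
        # Known unicast goes straight to the remote VTEP; broadcast and
        # unknown destinations are flooded to the VNI's multicast group.
        outer_dst = self.mac_table.get(dst_mac, self.group)
        return {"outer_src": self.ip, "outer_dst": outer_dst,
                "src_mac": src_mac, "dst_mac": dst_mac}

    def decapsulate(self, packet):
        # Data plane learning: remember which VTEP the inner source sits behind.
        self.mac_table[packet["src_mac"]] = packet["outer_src"]
        return packet["src_mac"], packet["dst_mac"]


if __name__ == "__main__":
    vtep1 = Vtep("10.0.0.1", "239.1.1.1")
    vtep2 = Vtep("10.0.0.2", "239.1.1.1")

    arp_request = vtep1.encapsulate("vm1", BROADCAST_MAC)   # flooded to the group
    vtep2.decapsulate(arp_request)                          # VTEP 2 learns VM1

    arp_response = vtep2.encapsulate("vm2", "vm1")          # unicast to VTEP 1
    vtep1.decapsulate(arp_response)                         # VTEP 1 learns VM2

    print(arp_request["outer_dst"], arp_response["outer_dst"])
    print(vtep1.mac_table, vtep2.mac_table)
```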
VXLAN Evolution

Multicast Independent
• Head-end replication enables unicast-only mode
• Control Plane provides dynamic VTEP discovery

Protocol Learning prevents floods
• Workload MAC addresses learnt by VXLAN NVEs
• Advertise L2/L3 address-to-VTEP association information in a protocol

External Connectivity
• VXLAN HW Gateways to other encaps/networks
• VXLAN HW Gateway redundancy
• Enable hybrid overlays

IP Services
• VXLAN Routing
• Distributed IP Gateways
Building your IP Network – Routing Protocols; IS-IS
Underlay
• IS-IS – what was this CLNS?
- Independent of IP (CLNS)
- Well suited for routed interfaces/ports
- No SPF calculation on link change; only if the topology changes
- Fast re-convergence
- Not everyone is familiar with it

[Figure: leaves V1, V2 and V3 attached to the routed underlay]
Building your IP Network – Routing Protocols; iBGP
Underlay
• iBGP + IGP = The Routing Protocol Combo
• IGP for underlay topology & reachability (e.g. IS-IS, OSPF)
• iBGP for VTEP (loopback) reachability
• iBGP route-reflector for simplification and scale
• Requires two routing protocols
• Separates Links (IGP) from VTEPs (iBGP)
• End-Host information is still in iBGP, but in a different address-family

[Figure: leaves V1, V2 and V3 peer over iBGP with four route reflectors (RR)]
Multicast Enabled Underlay

• May use PIM-ASM or PIM-BiDir (Different hardware has different capabilities)


Nexus 7K Nexus 9K CSR 1000V
N1KV Nexus 3K Nexus 5K/6K ASR9K
with F3 LC Standalone ASR1K
PIM-ASM PIM-ASM
PIM-ASM & Bidir-PIM PIM-ASM &
Mcast mode IGMP v2/v3 (Bidir – Bidir-PIM (Bidir –
Bidir-PIM (ASM –Future) Bidir-PIM
Future) Future)

• Cisco Nexus 9000 Series Switches (supports ASM/SSM/Ingress-Replication for VXLAN VTEP support) and Cisco Nexus 5600 Series Switches (supports only PIM BIDIR for VXLAN VTEP support) can be part of the same VXLAN EVPN fabric but cannot share the same Layer-2 VNI.
• Spine and Aggregation switches make good RP locations in Clos and traditional topologies respectively
• Reserve a range of multicast channels to service the overlay and optimize for diverse VNIs
• In Clos topologies with lean spine, using multiple RPs across the multiple spines and mapping different VNIs to different RPs will provide a simple load balancing measure
• Design the multicast underlay for a network overlay; host VTEPs will simply leverage this network.
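One way to picture the "reserved range" and "different VNIs to different RPs" bullets is the following sketch. It is an illustration under stated assumptions, not a recommended allocation scheme: the base group 239.1.0.0, the range size of 256 and the RP names are all made up, and each RP is assumed to serve one contiguous block of the reserved range.

```python
import ipaddress


def group_for_vni(vni, base="239.1.0.0", size=256):
    """Map a VNI onto one group inside a reserved multicast range."""
    offset = vni % size                      # several VNIs may share a group
    return str(ipaddress.IPv4Address(base) + offset)


def rp_for_vni(vni, rps, size=256):
    """Pick the RP whose contiguous block of the range contains the VNI's group."""
    offset = vni % size
    return rps[offset * len(rps) // size]


if __name__ == "__main__":
    spines = ["spine1", "spine2"]            # hypothetical RP names
    for vni in (30000, 30001, 30128):
        print(vni, group_for_vni(vni), rp_for_vni(vni, spines))
```

Splitting the range into one block per RP mirrors how an RP is normally associated with a group range, so VNIs whose groups fall in different blocks build their trees through different spines.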
Multicast Enabled Underlay
Host Overlay to Hybrid Overlay
• Host Overlay VTEPs join multicast groups as hosts using IGMP reports
• Host overlays will work over an L2 underlay; ensure IGMP snooping is in place to scope the reach of multicast
• A multicast enabled L3 underlay is the better option as it enables a hybrid overlay (host and network VTEPs)
• Ensure that the first hop router for the host in the underlay is configured to service the IGMP reports from the host VTEP

[Figure: a SW VTEP joining through IGMP and a HW VTEP joining through PIM, both serving VNI 6000]
Multi-Pathing and Entropy
• Symmetric Underlay Network topologies facilitate ECMP routing:
  - Multi-path load balancing
  - Fast re-convergence on link failures
• Polarization: Encapsulated flows appear as a single flow which hashes to a single path
• Entropy in the encapsulation header to depolarize tunnels
  - Variable UDP source port in VXLAN outer header
  - Underlay must support ECMP hashing on L4 port numbers

[Figure: two NV-edges connected across multiple equal-cost underlay paths]
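The entropy mechanism above can be sketched as follows. The hash function (CRC32 from zlib) and the port range 49152-65535 are assumptions chosen for illustration; real VTEPs use their own hardware hash. The point is only that the outer UDP source port is derived from the inner headers, so different inner flows between the same pair of VTEPs can take different ECMP paths.

```python
import zlib

# Dynamic/private port range, assumed here as the pool for source ports.
PORT_MIN, PORT_MAX = 49152, 65535


def vxlan_source_port(src_mac, dst_mac, src_ip, dst_ip, proto, sport, dport):
    """Derive the outer UDP source port from the inner L2/L3/L4 headers."""
    fields = (src_mac, dst_mac, src_ip, dst_ip, proto, sport, dport)
    key = "|".join(str(f) for f in fields).encode()
    return PORT_MIN + zlib.crc32(key) % (PORT_MAX - PORT_MIN + 1)


if __name__ == "__main__":
    # Same VTEP pair, different inner TCP source ports.
    for inner_sport in range(40000, 40005):
        port = vxlan_source_port("00:00:00:00:00:01", "00:00:00:00:00:02",
                                 "11.11.11.10", "98.98.98.10", 6,
                                 inner_sport, 443)
        print(inner_sport, "->", port)
```

Packets of one inner flow always get the same source port, which keeps them on one path and avoids reordering.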
Overlay Deployment Considerations
Type of Overlay Service

Layer 2 Overlays
• Emulate a LAN segment
• Transport Ethernet Frames (IP and non-IP)
• Single subnet mobility (L2 domain)
• Exposure to open L2 flooding
• Useful in emulating physical topologies

Layer 3 Overlays
• Abstract IP based connectivity
• Transport IP Packets
• Full mobility regardless of subnets
• Contain network related failures (floods)
• Useful in abstracting connectivity and policy

Hybrid L2/L3 Overlays offer the best of both domains


Building your VTEP (VXLAN Tunnel End-Point)
Overlay
Configuration Example
# Features & Globals
feature bgp
feature nv overlay                  # Enables VTEP (only required on Leaf or Border)
nv overlay evpn                     # Enables EVPN Control-Plane in BGP

# Spine (S 1)

# Leaf (V 1)
interface nve1                      # Configure the VTEP interface
  source-interface loopback0        # Use a Loopback for Source Interface
  host-reachability protocol bgp    # Enable BGP for Host reachability

*Simplified BGP configuration; would have 4 BGP peers (RR). IGP not shown.

[Figure: leaves V1, V2 and V3 peer over iBGP with four route reflectors (RR)]
Building your Overlay Control-Plane
Overlay
Configuration Example
# Features & Globals
feature bgp
feature nv overlay
nv overlay evpn                     # Enables EVPN Control-Plane in BGP

# Spine (S 1)
router bgp 65500
  router-id 10.10.10.S1
  address-family ipv4 unicast
  address-family l2vpn evpn
  neighbor 10.10.10.V1 remote-as 65500
    update-source loopback0
    address-family l2vpn evpn       # Activate L2VPN EVPN under each BGP neighbor
      send-community both
      route-reflector-client

# Leaf (V 1)
router bgp 65500
  router-id 10.10.10.V1
  address-family ipv4 unicast
  neighbor 10.10.10.S1 remote-as 65500
    update-source loopback0
    address-family l2vpn evpn
      send-community both           # Send Extended BGP Community to distribute EVPN route attributes

*Simplified BGP configuration; would have 4 BGP peers (RR). IGP not shown.

[Figure: leaves V1, V2 and V3 peer over iBGP with four route reflectors (RR)]
Extend your VLAN to VXLAN
Overlay
• Mapping an IEEE 802.1Q VLAN ID to a VXLAN VNI
• VLAN to VNI configuration on a per-switch basis
• VLAN becomes “Switch Local Identifier”
• VNI becomes “Network Global Identifier”
• 4k VLAN limitation per-Switch does still apply
• 4k Network limitation has been removed
• Dependent on VLAN Space!

Multi-Tenancy Lite (MT-Lite)
[Figure: ethernet → VLAN → VNI → vxlan]

Configuration Example
# Features
feature vn-segment-vlan-based

# VLAN to VNI mapping (MT-Lite)
vlan 43
  vn-segment 30000                  # VLAN to Layer-2 VNI mapping

# Activate Layer-2 VNI for EVPN
evpn
  vni 30000 l2                      # Enables EVPN Control-Plane for Layer-2 Services
    rd auto
    route-target import auto
    route-target export auto

# Activate Layer-2 VNI on VTEP
interface nve1
  source-interface loopback0
  host-reachability protocol bgp
  member vni 30000                  # Enables Layer-2 VNI on VTEP and suppress ARP
    mcast-group 239.239.239.100     # Alternative is to use “ingress-replication protocol bgp”
    suppress-arp
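The "switch local" versus "network global" distinction can be illustrated with a small model. This is a sketch with hypothetical names: each switch keeps its own VLAN-to-VNI table, so two switches may use different VLAN IDs for the same VNI, while each switch is still limited to the 12-bit VLAN space.

```python
class SwitchVlanMap:
    """Per-switch VLAN to VNI table: the VLAN is local, the VNI is global."""

    def __init__(self, name):
        self.name = name
        self.vlan_to_vni = {}

    def map_vlan(self, vlan_id, vni):
        if not 1 <= vlan_id <= 4094:          # usable 802.1Q VLAN IDs
            raise ValueError("VLAN ID out of range")
        if not 0 < vni < 2 ** 24:             # VNI is a 24-bit identifier
            raise ValueError("VNI out of range")
        if vlan_id in self.vlan_to_vni or vni in self.vlan_to_vni.values():
            raise ValueError("VLAN or VNI already mapped on this switch")
        self.vlan_to_vni[vlan_id] = vni

    def vni_for(self, vlan_id):
        return self.vlan_to_vni[vlan_id]


if __name__ == "__main__":
    leaf1, leaf2 = SwitchVlanMap("V1"), SwitchVlanMap("V2")
    leaf1.map_vlan(43, 30000)     # matches the slide: vlan 43 -> vn-segment 30000
    leaf2.map_vlan(143, 30000)    # hypothetical: a different local VLAN, same segment
    print(leaf1.vni_for(43) == leaf2.vni_for(143))
```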
Distributed Gateway Function in L3 Overlays
Traditional L2 - centralised L2/L3 boundary
• Always bridge, route only at an aggregation point
• Large amounts of state converge
• Scale problem for large # of L2 segments
• Traditional L2 and L2 overlays

L2/L3 fabric (or overlay)
• Always route (at the leaves), bridge when necessary
• Distribute and disaggregate necessary state
• Optimal scalability
• Enhanced forwarding and L3 overlays

[Figure: virtual and physical hosts (App/OS) shown twice: below a centralised L3 boundary, and below an L2/L3 fabric where the L3 boundary sits at the edge]
Distributed IP Anycast Gateway
The same “Anycast” SVI IP/MAC is used at all VTEPs/ToRs
A host will always find its SVI anywhere it moves

[Figure: an L3 fabric with a VXLAN L3 Gateway on every leaf. Each VM sees the same SVI IP address and MAC wherever it attaches (IP: 10.1.1.1, MAC: 0000.dead.beef).]
Distributed IP Anycast Gateway
Detailed View
• Consistent Anycast SVI IP / MAC address at all leaves
• VLAN-IDs are locally significant

[Figure: two VTEPs connected across the underlay / IP core inside an L3 fabric of VXLAN L3 Gateways. Each VTEP has an L3 GWY with SVI A and SVI B and an L2 GWY. VLAN A' on one side and VLAN A on the other both map to VNI A; VLAN B' and VLAN B both map to VNI B.]
Distributed IP Anycast Gateway*
Configuration example

[Figure: leaves V1, V2 and V3; Host A in VLAN 43 and Host Y in VLAN 55]

*Requires EVPN Control-Plane.


Routing in VXLAN – define the resources
Overlay
Configuration Example for VRF-A

# Define VLAN for VRF routing instance
vlan 2500
  vn-segment 50000                  # VLAN to Layer-3 VNI mapping

# Define SVI for VRF routing instance
interface Vlan2500                  # VLAN to Layer-3 VNI mapping
  no shutdown
  mtu 9216
  vrf member VRF-A
  ip forward                        # ip forward required for prefix-based routing

# VRF configuration for “customer” VRF
vrf context VRF-A                   # VRF context definition: VNI, Route-Distinguisher, Route-Targets, IPv4 and/or IPv6
  vni 50000
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

[Figure: ethernet → VRF-A → VNI 50000 → vxlan]
Distributed IP Anycast Gateway*
Overlay
Configuration Example for “BLUE” (V1 & V3)
# Features & Globals
feature interface-vlan
fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF

# VLAN to VNI mapping (MT-Lite)
vlan 43
  vn-segment 30000

# Anycast Gateway MAC, inherited by any interface (SVI) using “fabric forwarding”
fabric forwarding anycast-gateway-mac 0002.0002.0002

# Distributed IP Anycast Gateway (SVI)
interface vlan 43
  no shutdown
  vrf member VRF-A
  ip address 11.11.11.1/24 tag 12345
  fabric forwarding mode anycast-gateway

Configuration Example for “RED” (V1-3)
# Features & Globals
feature interface-vlan
fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF

# VLAN to VNI mapping (MT-Lite)
vlan 55
  vn-segment 30001

# Anycast Gateway MAC, inherited by any interface (SVI) using “fabric forwarding”
fabric forwarding anycast-gateway-mac 0002.0002.0002

# Distributed IP Anycast Gateway (SVI)
interface vlan 55
  no shutdown
  vrf member VRF-A
  ip address 98.98.98.1/24 tag 12345
  fabric forwarding mode anycast-gateway

*Requires EVPN Control-Plane. VRF and BGP configuration not shown
Routing in VXLAN – configure the routing
Overlay
Configuration Example for VRF-A

# Activate Layer-3 VNI on VTEP
interface nve1
  source-interface loopback0
  host-reachability protocol bgp
  member vni 30000
    mcast-group 239.239.239.100
    suppress-arp
  member vni 50000 associate-vrf    # Enables Layer-3 VNI on VTEP and associate it to VRF

# Route-Map for Redistribute Subnet
route-map REDIST-SUBNET permit 10
  match tag 12345

# Control-Plane configuration for VRF (Tenant)
router bgp 65500
  …
  vrf VRF-A                         # VRF/Tenant definition within Overlay Control-Plane
    address-family ipv4 unicast
      advertise l2vpn evpn
      redistribute direct route-map REDIST-SUBNET
      maximum-paths ibgp 2

[Figure: ethernet → VRF-A → VNI 50000 → vxlan]
Host Subnet Redistribution
Overlay
• Host “A” is a silent Host
  - Not known via ARP/IP
• How can Host “Y” reach Host “A”?
  - Host “A” and “Y” are in different VLAN/Subnet
  - Route for Host “A”-Subnet will be advertised by V1 and V2
  - Host “Y” will reach either V1 or V2 based on ECMP
  - From V1 or V2, Host “A” can be reached via Layer-2 Segment.

[Figure: leaves V1, V2 and V3, with a leaf announcing “I know Subnet A”; Host A in VLAN 43 and Host Y in VLAN 55]
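The reachability argument for the silent host is a longest-prefix-match argument, which the following sketch makes explicit. The routing table is hypothetical: it assumes the 11.11.11.0/24 subnet from the earlier SVI example is advertised by V1 and V2, and adds a /32 host route for one host that is already known.

```python
import ipaddress


def lookup(routes, destination):
    """Return the next-hop VTEPs of the longest matching prefix, or []."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hops) for net, hops in routes.items() if addr in net]
    if not matches:
        return []
    # The most specific prefix wins; several next hops mean ECMP.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    routes = {
        # Subnet route redistributed by the leaves that own the SVI.
        ipaddress.ip_network("11.11.11.0/24"): ["V1", "V2"],
        # Host route for an end-point that has already been learned.
        ipaddress.ip_network("11.11.11.20/32"): ["V1"],
    }
    print(lookup(routes, "11.11.11.10"))   # silent host: falls back to the subnet route
    print(lookup(routes, "11.11.11.20"))   # known host: uses the host route
```

A silent host has no host route, so traffic follows the subnet route to either leaf, which then resolves the host on the Layer-2 segment.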
VXLAN Hardware VTEP Redundancy (vPC)
Southbound Connectivity
• VXLAN vPC Domain Configuration
• Configure VXLAN specific vPC Peer-Link Configuration
• Extend the IP Interface (Loopback) configuration for the VTEP
• Secondary IP address (anycast) is used as the anycast VTEP address
• Both vPC VTEP switches need to have the identical secondary IP address configured under the loopback interface

[Figure: vPC VTEP pair V4 and V5 with Classic Ethernet southbound to Host D in VNI 30000]
VXLAN Hardware Gateway Redundancy (vPC)
Southbound Connectivity
vPC VTEP Configuration Example for (V4-5)
# VLAN to VNI mapping (MT-Lite)
vlan 55
  vn-segment 30000

# VTEP IP Interface; Source/Destination for all VXLAN Encapsulated Traffic.
# - Primary IP address is used for Orphan Hosts
# - Secondary IP is for vPC Hosts (same IP on both vPC Peers)
interface loopback0
  ip address 10.10.10.V/32
  ip address 10.10.10.VAnycast/32 secondary    # Add Secondary IP to VTEP Loopback

# VTEP configuration using Loopback as source.
# Destination Group for VNI 30000 is “239.239.239.100”
interface nve1
  source-interface loopback0
  host-reachability protocol bgp
  member vni 30000
    mcast-group 239.239.239.100
    suppress-arp
  member vni 50000 associate-vrf

[Figure: Host D in VNI 30000 attached to V4 and V5. V5: interface loopback0, ip address 10.10.10.5/32, ip address 10.10.10.99/32 secondary. V4: interface loopback0, ip address 10.10.10.4/32, ip address 10.10.10.99/32 secondary.]
VXLAN Hardware Gateway Redundancy (vPC)
Not to Forget!
vPC VTEP Configuration Example for (V4-5)
# VPC Domain Configuration
vpc domain 99
  peer-switch
  peer-keepalive destination V4-mgmt source V5-mgmt
  peer-gateway                      # peer-gateway needs to be enabled so that vPC VTEP switches can forward traffic for each other’s router MAC address
  ip arp synchronize

# VPC Peer-Link
interface port-channelXX
  switchport mode trunk
  vpc peer-link

# VPC Domain Routing Adjacency
interface Vlan3999                  # Routed Interface (SVI) for routing adjacency across VPC Peer-Link
  no shutdown
  ip address 10.254.254.1/30
  ip router ospf 1 area 0.0.0.0
  ip ospf network point-to-point
  ip pim sparse-mode

[Figure: Host D in VNI 30000 attached to V4 and V5. V5: interface loopback0, ip address 10.10.10.5/32, ip address 10.10.10.99/32 secondary. V4: interface loopback0, ip address 10.10.10.4/32, ip address 10.10.10.99/32 secondary.]
Folded Clos Topology
Providing Topology Symmetry

[Figure: a spine layer above an L3 fabric of leaves and border leaves, each acting as a VXLAN L2/3 Gateway; the border leaves connect to the WAN/DCI]

• Fully Symmetric, BW rich topology, Optimized for East-West traffic
• Lean Spine does not do any VXLAN termination/gateway
• Access to other networks through border leaf block
VXLAN/EVPN Fabric External Routing
Overlay
• The Border Leaf/Spine provides Layer-2 and Layer-3 connectivity to the external network
• Flexible routing protocol options for external routing
• Today, VRF-lite allows a VRF to be extended outside of the fabric
• With Nexus 7000/7700 and F3, other options are coming

[Figure: leaves V1, V2 and V3 and a border leaf (VBL) connecting the fabric to the WAN]
VXLAN/EVPN Fabric External Routing
Overlay

VRFs for external routing need to exist on the Border Leaf

Interface-Type Options:
• Physical Routed Ports
• Sub-Interfaces
• VLAN SVIs over Trunk Ports

Peering Interface can be in Global or Tenant VRF

[Figure: border leaf (VBL) carrying VRF A, VRF B and VRF C towards the WAN; leaves V1, V2 and V3]
VXLAN/EVPN Fabric External Routing (eBGP)
Overlay
Border Leaf Configuration Example
# Sub-Interface Configuration
interface Ethernet1/1
  no switchport

interface Ethernet1/1.10
  mtu 9216
  encapsulation dot1q 10
  vrf member VRF-A
  ip address 10.254.254.1/30

# eBGP Configuration
router bgp 65500
  vrf VRF-A
    address-family ipv4 unicast
      advertise l2vpn evpn          # Advertise external learned routes into EVPN (Route-Type 5)
      aggregate-address 10.0.0.0/8 summary-only
    neighbor 10.254.254.2 remote-as 65599
      update-source Ethernet1/1.10
      address-family ipv4 unicast

*Ensure that non-necessary routes are not advertised towards the External Network

[Figure: border leaf (VBL) with VRF A, VRF B and VRF C peering with the WAN edge router in AS# 65599; leaves V1, V2 and V3]
VXLAN/EVPN Fabric External Routing (eBGP)
Overlay

Edge Router Configuration Example
# Interface Configuration
interface Ethernet1/1
  vrf member VRF-A
  ip address 10.254.254.2/30

# eBGP Configuration
router bgp 65599
  vrf VRF-A
    address-family ipv4 unicast
    neighbor 10.254.254.1 remote-as 65500    # Border Leaf peering address
      update-source Ethernet1/1
      address-family ipv4 unicast

[Figure: border leaf (VBL) with VRF A, VRF B and VRF C peering with the WAN edge router in AS# 65599; leaves V1, V2 and V3]
MPLS IP-VPN & VXLAN – in NX-OS 7.3
L3 Handoff – Border PE
[Figure: an L3 fabric of VXLAN L3 Gateways (N5600, N7K, N9K) connects through two N7K Border PE nodes, each also a VXLAN L3 Gateway, to the P routers of the MPLS/IP core (OTV transport)]
Summary and Conclusion

Summary recommendations & takeaways
• Optimize the location of L2 and L3 GWYs to optimize routing and minimize failure exposure
• Leverage L3 VXLAN services enabled by control protocols as the main service and L2 extensions as the exception
• Design the underlay with the VXLAN overlay in mind
• Design the network hierarchically: both the underlay as well as the overlay
• L3 Gateways are key to a sound overlay design
• A combination of pull protocols and push protocols may render optimal scale and resiliency
• Link the provisioning of the overlay and scoping of VNIs to the host orchestration system for optimal scale
Thank you
