SmartNIC OCP 2016


SmartNIC: Accelerating Azure’s Network with FPGAs on OCS servers

Daniel Firestone
Principal Tech Lead and Software Development Manager
Azure Networking Datapath Team
Summary
• Azure Scale

• Cloud Networking Today: Agility with Software Defined Networking

• Hardware acceleration needed in the 40G+ era

• The industry has relied on ASICs, but ASICs aren’t agile enough

• Solution: FPGA-based SmartNIC

• Demo!
Microsoft Azure
App Services: cloud services, caching, identity, service bus, media, mobile services, web apps, integration, hpc, analytics
Data Services: SQL database, HDInsight, table, blob storage
Infrastructure Services: virtual machines, virtual network, vpn, traffic manager, cdn
[Figure: timeline labeled 2013, 2014, 2015, and Coming Soon…]
                      2010            2016
Compute Instances     100K            Millions
Azure Storage         10’s of PB      Exabytes
Datacenter Network    10’s of Tbps    Pbps
How Do We Build Software
Networks in the Cloud?
SDN: Building the right abstractions for Scale
Abstract by separating management, control, and data planes
Example: ACLs
Management plane: Create a tenant
Control plane: Plumb these tenant ACLs to these switches
Data plane: Apply these ACLs to these flows
[Diagram: REST APIs -> Management Plane -> Controller (Control Plane) -> Virtual Switch]
Data plane needs to apply per-flow policy to millions of VMs
How do we apply billions of flow policy actions to packets?
Virtual Filtering Platform (VFP)
Azure’s SDN Dataplane
• Acts as a virtual switch inside Hyper-V VMSwitch
• Provides core SDN functionality for Azure networking services, including:
• Address Virtualization for VNET
• VIP -> DIP Translation for SLB
• ACLs, Metering, and Security Guards
• Uses programmable rule/flow tables to perform per-packet actions
• Supports all Azure dataplane policy at 40GbE+ with offloads
[Diagram: two VMs, each with a vNIC, attached to VMSwitch; VFP layers inside VMSwitch: ACLs, Metering, Security; VNET; SLB (NAT); NIC]
Flow Tables are the right abstraction for the Host
• VMSwitch exposes a typed Match-Action-Table API to the controller
• One table per policy
• Key insight: Let controller tell the switch exactly what to do with which packets (e.g. encap/decap), rather than trying to use existing abstractions (Tunnels, …)

[Diagram: Tenant Description and VNet Description feed the Controller, which sends VNet Routing Policy, NAT Endpoints, and ACLs to VFP on Node: 10.4.1.5, between the NIC and Blue VM1 (10.1.1.2)]

VNET
Flow            Action
TO: 10.2/16     Encap to GW
TO: 10.1.1.5    Encap to 10.5.1.7
TO: !10/8       NAT out of VNET

LB NAT
Flow            Action
TO: 79.3.1.2    DNAT to 10.1.1.2
TO: !10/8       SNAT to 79.3.1.2

ACLS
Flow            Action
TO: 10.1.1/24   Allow
10.4/16         Block
TO: !10/8       Allow
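
The three tables on this slide can be read as ordered match-action lists keyed on the destination address. The sketch below is only an illustration of that idea, not VFP's actual API: the `Rule` class, the `lookup` helper, first-match-wins ordering, and treating the "10.4/16" entry as a destination match are all assumptions made for the example. The slide's shorthand prefixes (for instance "10.2/16") are written out in full, and "!10/8" is modeled as a negated match.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One match-action row (hypothetical type, for illustration only)."""
    prefix: str           # destination prefix, e.g. "10.2.0.0/16"
    action: str           # action label copied from the slide
    negate: bool = False  # True models "!10/8": match anything outside the prefix

    def matches(self, dst: str) -> bool:
        inside = ipaddress.ip_address(dst) in ipaddress.ip_network(self.prefix)
        return inside != self.negate


# One table per policy, with the rows taken from the slide.
TABLES = {
    "VNET": [
        Rule("10.2.0.0/16", "Encap to GW"),
        Rule("10.1.1.5/32", "Encap to 10.5.1.7"),
        Rule("10.0.0.0/8", "NAT out of VNET", negate=True),
    ],
    "LB NAT": [
        Rule("79.3.1.2/32", "DNAT to 10.1.1.2"),
        Rule("10.0.0.0/8", "SNAT to 79.3.1.2", negate=True),
    ],
    "ACLS": [
        Rule("10.1.1.0/24", "Allow"),
        Rule("10.4.0.0/16", "Block"),  # assumed to match on destination
        Rule("10.0.0.0/8", "Allow", negate=True),
    ],
}


def lookup(dst: str) -> dict:
    """Return the first matching action in each table, or None if no row matches."""
    return {
        name: next((r.action for r in rules if r.matches(dst)), None)
        for name, rules in TABLES.items()
    }


if __name__ == "__main__":
    for dst in ("10.1.1.5", "8.8.8.8", "10.4.1.5"):
        print(dst, lookup(dst))
```

A real datapath would also match on direction, source address, ports, and protocol; the point here is only that each policy is its own table and the controller states the exact action per match.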


This worked well at 1GbE, ok at
10GbE… what about 40GbE+?
Traditional Approach to Scale: ASICs
• We’ve worked with network ASIC vendors over the years to accelerate
many functions, including:
• TCP offloads: Segmentation, checksum, …
• Steering: VMQ, RSS, …
• Encapsulation: NVGRE, VXLAN, …
• Direct NIC Access: DPDK, PacketDirect, …
• RDMA
• Is this a long-term solution?
Host SDN Scale Challenges in Practice
• Hosts are Scaling Up: 1G -> 10G -> 40G -> 50G -> 100G
• Reduces COGS of VMs (more VMs per host) and enables new workloads
• Need the performance of hardware to implement policy without CPU
• Not enough to just accelerate to ASICs – need to move entire stacks to HW

• Need to support new scenarios: BYO IP, BYO Topology, BYO Appliance
• We are always pushing richer semantics to virtual networks
• Need the programmability of software to be agile and future-proof –
12-18 month ASIC cycle + time to roll new HW is too slow
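
To make the "performance of hardware" point concrete, the sketch below works out the worst-case per-packet time budget at each link speed listed above. These figures are not from the slides; they assume minimum-size 64-byte Ethernet frames plus the standard 20 bytes of preamble and inter-frame gap on the wire, and `packet_budget` is a hypothetical helper name.

```python
# Standard Ethernet framing: 8-byte preamble/SFD + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64


def packet_budget(link_gbps: float, frame_bytes: int = MIN_FRAME_BYTES):
    """Return (packets per second, nanoseconds per packet) at line rate."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    pps = link_gbps * 1e9 / bits_on_wire
    ns_per_packet = bits_on_wire / link_gbps  # bits / (Gbit/s) = nanoseconds
    return pps, ns_per_packet


for gbps in (1, 10, 40, 50, 100):
    pps, ns = packet_budget(gbps)
    print(f"{gbps:>3}G: {pps / 1e6:7.2f} Mpps, {ns:7.2f} ns per packet")
```

Under these assumptions a 40G link leaves under 17 ns per minimum-size packet, which is a small budget in which to apply several layers of policy in software on a general-purpose core.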

How do we get the performance of hardware with programmability of software?
Our Solution – Azure SmartNIC
• HW is needed for scale, perf, and COGS at 40G+
• 12-18 month ASIC cycle + time to roll new HW is too slow
• To compete and react to new needs, we need agility – SDN
• SmartNIC combines agility of SDN with speed+COGS of HW
Bump in the Wire: Reconfigurable FPGA + NIC ASIC
Roll out Hardware as we do Software
[Diagram: Blade containing CPU and SmartNIC (NIC ASIC and FPGA), connected to ToR]
SmartNIC Design
• Use an FPGA for reconfigurable functions
• FPGAs are already used in Bing
• Roll out Hardware as we do software
• Programmed using Generic Flow Tables
• Language for programming SDN to hardware
• Uses connections and structured actions as primitives
• SmartNIC can also do Crypto, QoS, storage acceleration, and more…
[Diagram: Blade containing CPU and SmartNIC (NIC ASIC and FPGA), connected to ToR]
2015 FPGA Deployments:
40G Bump in the Wire SmartNIC FPGA Mezz
All new Azure Compute servers ship with FPGAs!
[Photos: Server Blade; FPGA board; OCS Blade with NIC and FPGA]
[Diagram labels: two CPUs joined by QPI; DRAM (x3); FPGA on Gen3 2x8 with QSFP ports; NIC on Gen3 x8 with QSFP; 40Gb/s links between NIC, FPGA, and Switch; Option Card Mezzanine Connectors; Tray Backplane]
SmartNIC - Accelerating SDN
[Diagram labels, top to bottom]
ARM APIs -> Controllers -> VFP APIs

VFP layers (Rule -> Action):
SLB Decap: * -> Decap
SLB NAT: * -> DNAT
VNET: * -> Rewrite
ACL: * -> Allow
Metering: * -> Meter

Transposition Engine
VFP Flow: 1.2.3.1->1.3.4.1, 62362->80
Action: Decap, DNAT, Rewrite, Meter

VMSwitch: GFT Offload Engine, GFT Offload API (NDIS), First Packet
VM: SR-IOV (Host Bypass)

SmartNIC (50G): GFT Table, Crypto, RDMA, QoS
SmartNIC Flow: 1.2.3.1->1.3.4.1, 62362->80
Action: Decap, DNAT, Rewrite, Meter
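
One way to read this slide: the first packet of a flow walks every VFP layer in software, the per-layer actions are composed into a single flow entry, and that entry is installed in the SmartNIC's flow table so later packets skip the host. The sketch below illustrates that caching pattern under stated assumptions. `FlowKey`, `slow_path`, and `OffloadedFlowTable` are hypothetical names, every layer uses the slide's wildcard rule, and dropping "Allow" from the composed action is an assumption made to match the action list shown on the slide.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Exact-match flow key, mirroring '1.2.3.1->1.3.4.1, 62362->80'."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


# Layers in the slide's order; each wildcard rule ("*") yields one action.
LAYERS = [
    ("SLB Decap", lambda key: "Decap"),
    ("SLB NAT", lambda key: "DNAT"),
    ("VNET", lambda key: "Rewrite"),
    ("ACL", lambda key: "Allow"),
    ("Metering", lambda key: "Meter"),
]


def slow_path(key: FlowKey) -> tuple:
    """Walk every layer once and compose the actions into one entry."""
    actions = [rule(key) for _, rule in LAYERS]
    # Assumption: "Allow" does not transform the packet, so it is left out
    # of the composed action, as in the slide's flow entry.
    return tuple(a for a in actions if a != "Allow")


class OffloadedFlowTable:
    """Stand-in for the hardware flow table: exact-match key -> composed action."""

    def __init__(self):
        self.flows = {}
        self.slow_path_hits = 0

    def process(self, key: FlowKey) -> tuple:
        actions = self.flows.get(key)
        if actions is None:           # first packet: go through the software layers
            self.slow_path_hits += 1
            actions = slow_path(key)
            self.flows[key] = actions  # install so later packets hit the table
        return actions


if __name__ == "__main__":
    table = OffloadedFlowTable()
    key = FlowKey("1.2.3.1", "1.3.4.1", 62362, 80)
    print(table.process(key))    # first packet takes the slow path
    print(table.process(key))    # second packet is served from the table
    print(table.slow_path_hits)
```

The design choice this illustrates is that policy stays expressed as layered rules in software, while the hardware only needs to hold exact-match flows with precomputed actions.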
Scenario: Virtual Network Encryption
• SmartNIC can dial encrypted virtual network tunnels (over VXLAN)
for each tenant
• Provides E2E security and privacy against actors inside the
network fabric
• Line Rate Encryption at 40Gbps
[Diagram: two Hosts, each with VMs behind a SmartNIC, connected through the Fabric]
Demo: SmartNIC Encryption
SmartNIC Gen2: Now at 50GbE!

NIC ASIC and FPGA on one Board


Conclusion
• The cloud will continue to scale, and we will continue to add new
networking features and scenarios

• ASICs can’t keep up with rate of change -> more pressure on FPGAs

• Ability to change our minds later is the strongest technology we have…

Want to help lead the reconfigurable computing revolution in the cloud? We’re Hiring!

[email protected]
