DPDK Packet Processing IA Overview Presentation
Intel Data Plane Development Kit (Intel DPDK) Overview: Packet Processing on Intel Architecture
December 2012
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. 
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm All information provided related to future Intel products and plans is preliminary and subject to change at any time, without notice. All dates provided are subject to change without notice. Intel may make changes to specifications and product descriptions at any time, without notice. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps. Celeron, Intel, Intel logo, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel SpeedStep, Intel XScale, Itanium, Pentium, Pentium Inside, VTune, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Intel Active Management Technology requires the platform to have an Intel AMT-enabled chipset, network hardware and software, as well as connection with a power source and a corporate network connection. With regard to notebooks, Intel AMT may not be available or certain capabilities may be limited over a host OS-based VPN or when connecting wirelessly, on battery power, sleeping, hibernating or powered off. For more information, see http://www.intel.com/technology/iamt. 
Enhanced Intel SpeedStep Technology for specified units of this processor available Q2/06. See the Processor Spec Finder at http://processorfinder.intel.com or contact your Intel representative for more information 64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS, operating system, device drivers and applications enabled for Intel 64 architecture. Performance will vary depending on your hardware and software configurations. Consult with your system vendor for more information. No computer system can provide absolute security under all conditions. Intel Trusted Execution Technology is a security technology under development by Intel and requires for operation a computer system with Intel Virtualization Technology, an Intel Trusted Execution Technology-enabled processor, chipset, BIOS, Authenticated Code Modules, and an Intel or other compatible measured virtual machine monitor. In addition, Intel Trusted Execution Technology requires the system to contain a TPMv1.2 as defined by the Trusted Computing Group and specific software for some uses. See http://www.intel.com/technology/security/ for more information. Hyper-Threading Technology (HT Technology) requires a computer system with an Intel Pentium 4 Processor supporting HT Technology and an HT Technology-enabled chipset, BIOS, and operating system. Performance will vary depending on the specific hardware and software you use. See www.intel.com/products/ht/hyperthreading_more.htm for more information including details on which processors support HT Technology. Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. 
Please check with your application vendor. Intel AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute the instructions in the correct sequence. AES-NI is available on select Intel processors. For availability, consult your reseller or system manufacturer. For more information, see Intel Advanced Encryption Standard Instructions (AES-NI) * Other names and brands may be claimed as the property of others. Other vendors are listed by Intel as a convenience to Intel's general customer base, but Intel does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of these devices. This list and/or these devices may be subject to change without notice. Copyright 2012, Intel Corporation. All rights reserved.
Agenda
1. Intel's Packet Processing Motivation and Value Proposition
2. Overview of Intel DPDK
TRANSFORMING COMMUNICATIONS
Intel Architecture + Ecosystem + Standards
Market segments: wireless base station, wireless infrastructure, intelligent edge, routers and switches, media processing, network appliances, network security
[Diagram labels: Next Generation; Packet Processing (NPU/ASIC); Signal Processing (DSP); Multiple Opportunities]
Intel addresses total cost of ownership (TCO) and time to market (TTM) concerns with a single-architecture, multi-workload IA capability, allied to an industry-leading cadence of process and microarchitectural advancements (Tick-Tock model).
Optimized data plane software solutions will help unleash the potential of the IA platform.
Intel silicon and software advances are proactively addressing this problem statement, making high-performance packet processing on IA a reality.
[Figure: packets per second (PPS) versus packet size, 64 to 1504 bytes. Annotations on the chart: 835 ns, 1670 cycles, 2505 cycles; 135 cycles, 201 cycles.]
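The time and cycle annotations on this chart appear consistent with a simple line-rate budget. The sketch below is a worked example under stated assumptions, not a reconstruction of how the slide was produced: it assumes a 10 Gb/s link, 20 bytes of per-frame wire overhead (preamble, start delimiter, inter-frame gap), and core clocks of 2 GHz and 3 GHz. All function names are hypothetical. Under those assumptions a 1024-byte frame occupies the wire for about 835 ns (about 1670 cycles at 2 GHz, 2505 at 3 GHz), and a 64-byte frame leaves roughly 134 cycles at 2 GHz and 201 at 3 GHz.

```python
# Per-packet time and cycle budget at line rate.
# Assumptions (not stated on the slide): 10 Gb/s link, 20 bytes of wire
# overhead per frame (7 B preamble + 1 B start delimiter + 12 B inter-frame gap).

LINK_BPS = 10e9            # assumed 10 GbE line rate, in bits per second
WIRE_OVERHEAD_BYTES = 20   # assumed per-frame overhead on the wire


def packet_time_ns(frame_bytes: int) -> float:
    """Time one frame occupies the wire, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / LINK_BPS * 1e9


def cycle_budget(frame_bytes: int, core_ghz: float) -> float:
    """CPU cycles one core can spend per packet while keeping up with line rate."""
    return packet_time_ns(frame_bytes) * core_ghz


def packets_per_second(frame_bytes: int) -> float:
    """Maximum packet rate of the link for a given frame size."""
    return 1e9 / packet_time_ns(frame_bytes)


if __name__ == "__main__":
    for size in (64, 1024):
        print(
            f"{size:5d} B: {packet_time_ns(size):7.1f} ns, "
            f"{cycle_budget(size, 2.0):7.1f} cycles @ 2 GHz, "
            f"{cycle_budget(size, 3.0):7.1f} cycles @ 3 GHz, "
            f"{packets_per_second(size) / 1e6:6.2f} Mpps"
        )
```

The budget shrinks linearly with frame size, which is why small-packet workloads are the hard case: every cache miss or interrupt has to fit into a window of little more than a hundred cycles.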
[Figure: Performance in Mpps (y-axis 10 to 80) by year and platform]
- 2006: DP Intel Xeon Processor LV, 2 x 2 Core, 2.0 GHz, 667 MHz FSB
- 2007: DP Intel Xeon Processor E5345, 2 x 4 Core, 2.33 GHz, 1333 MHz FSB
- 2008: DP Intel Xeon Processor E5410, 2 x 4 Core, 2.33 GHz, 1333 MHz FSB
- 2011: 1S Intel Xeon Processor E5-2600 (B0 stepping), 1 x 8 Core, 2.0 GHz, PCIe* Gen2
A standard off-the-shelf IA platform can deliver very high packet rates. The performance jump can be attributed to the core, the memory architecture (integrated memory controller, iMC), and Intel DPDK.
Software & Hardware performance enhancements over the next 2-3 years
Core
PCIe* Gen 2
Platform
Acceleration
AES-NI instruction
[Diagram: AdvancedTCA* system with Processor, Switch, and Packet Processing blocks. Legend: Intel Provided, Intel DPDK.]
Typically, the control plane and data plane are on different boards.
The value proposition of Intel DPDK is workload consolidation: it provides the framework and performance to run NPU workloads on IA cores.
[Diagram labels: Control Plane; Data Plane; Customer Application (x3); Linux Kernel; Platform Hardware]
Run-time Environment: low-overhead, run-to-completion model optimized for the fastest possible data plane performance
The Intel DPDK is a great starting point for customers, and for the industry in general, for delivering breakthrough packet processing performance.
Component Overviews
- EAL Memory Management Overview
- Queue/Ring Overview
- Buffer Management Overview
- Flow Classification Overview
Memory Usage
- The basic unit for runtime object allocation is the memory zone.
- Zones contain rings, pools, LPM routing tables, or any other performance-critical structures.
- Zones are always backed by huge page (2 MB/1 GB page) memory.
[Diagram: objects, the memory zones that hold them, and the backing memory]
- Ring RX_RING_0 in Memory Zone RG_RX_RING_0
- Ring TX_RING_0 in Memory Zone RG_TX_RING_0
- Memory Pool mbuf_pool in Memory Zone MP_mbuf_pool
- Memory Segment 0, Memory Segment 1, ... Memory Segment N, each built from 2 MB pages
Memory Pool
Example pools: Pkt Buffers (60K x 2 KB buffers); Events (2K x 100 B buffers); Events (2K x 100 B buffers)
Multi-producer/multi-consumer safe
- Pools are based on Intel DPDK rings, so they are multi-producer and multi-consumer safe.
- No locking; the rings use compare-and-swap (CAS) instructions.
- Pools can also be used in multi-process environments.
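The packet buffer pool size on this slide gives a feel for why memory zones are backed by huge pages. The following is a rough worked example, assuming "60K 2K buffers" means 60 x 1024 buffers of 2 KB each and ignoring any per-buffer metadata; the helper name `pages_needed` is hypothetical. It counts how many pages, and therefore how many distinct address translations, are needed to cover the pool at each page size.

```python
# Rough count of pages needed to back a packet buffer pool.
# Assumption: "60K 2K buffers" = 60 * 1024 buffers of 2 KiB each,
# with per-buffer headers and alignment padding ignored.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(total_bytes: int, page_bytes: int) -> int:
    """Number of pages of the given size needed to cover total_bytes."""
    return -(-total_bytes // page_bytes)  # ceiling division


POOL_BYTES = (60 * KIB) * (2 * KIB)  # 120 MiB under the assumption above

if __name__ == "__main__":
    for label, page in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB), ("1 GiB", GIB)):
        print(f"{label} pages: {pages_needed(POOL_BYTES, page)}")
```

With 4 KiB pages the pool spans tens of thousands of pages, far more than a data TLB can hold at once, while 2 MiB pages cover the same pool with a few dozen translations.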
[Diagram labels: Processor 0; Data Plane; Intel DPDK on C1, C2, C3, C4; two 10G ports]
- Not every customer is expected to use the Intel DPDK flow classification; it is more of a showcase that demonstrates how to optimize for IA.
- Classification is very customer-specific; each customer/segment has different needs.
- Router implementations typically use longest-prefix match (LPM).
- Security implementations need to identify individual flows and can use flow classification.
- The flow classification API is designed to take advantage of current and future hardware-based flow classification capabilities.
- The Intel 82599 10GbE Ethernet Controller implements flow classification (limited in the number of flows).
- Future chipsets are expected to implement an extensive classifier.
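To make the longest-prefix-match idea concrete, here is a minimal IPv4 sketch. It is not the Intel DPDK LPM implementation or its API: the class `LpmTable` and its methods are hypothetical, and the per-prefix-length dictionaries are chosen for readability rather than speed. The lookup tries the longest prefixes first and returns the next hop of the first match.

```python
import ipaddress


class LpmTable:
    """Toy IPv4 longest-prefix-match table (illustrative only)."""

    def __init__(self):
        # One dict per prefix length: {masked network address: next hop}
        self._by_len = {}

    def add(self, cidr: str, next_hop) -> None:
        net = ipaddress.IPv4Network(cidr)  # raises ValueError if host bits are set
        self._by_len.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop

    def lookup(self, addr: str):
        ip = int(ipaddress.IPv4Address(addr))
        # Try the most specific (longest) prefixes first.
        for plen in sorted(self._by_len, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = self._by_len[plen].get(ip & mask)
            if hop is not None:
                return hop
        return None  # no route matched


if __name__ == "__main__":
    table = LpmTable()
    table.add("0.0.0.0/0", "default")
    table.add("10.0.0.0/8", "port-1")
    table.add("10.1.0.0/16", "port-2")
    for dst in ("10.1.2.3", "10.2.0.1", "192.168.1.1"):
        print(dst, "->", table.lookup(dst))
```

A production lookup structure would trade memory for a bounded number of memory accesses per lookup; the point here is only the matching rule that routers apply.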
Initialization
- Initialize memory zones and pools
- Initialize devices and device queues
- Start the packet forwarding application
Polling
- Poll the devices' RX queues and receive packets in bursts
- Allocate new RX buffers from per-queue memory pools to stuff into descriptors
- Transmit the received packets from RX
- Free the buffers used to store the packets
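The polling steps can be modelled in a few lines. This is a pure-Python simulation, not Intel DPDK code: `BufferPool`, `rx_burst`, and `poll_once` are hypothetical names, integer ids stand in for packet buffers, and the burst size of 4 is only illustrative. It shows the bookkeeping of one loop iteration: receive a burst, replenish the RX descriptors from the pool, transmit, and return the used buffers.

```python
from collections import deque

BURST = 4  # packets handled per poll; an illustrative value


class BufferPool:
    """Fixed-size pool of buffers; integer ids stand in for real buffers."""

    def __init__(self, count: int):
        self._free = deque(range(count))

    def alloc(self):
        return self._free.popleft() if self._free else None

    def free(self, buf) -> None:
        self._free.append(buf)

    def available(self) -> int:
        return len(self._free)


def rx_burst(rx_queue: deque, max_pkts: int) -> list:
    """Take up to max_pkts (buffer, payload) pairs from the RX queue without blocking."""
    burst = []
    while rx_queue and len(burst) < max_pkts:
        burst.append(rx_queue.popleft())
    return burst


def poll_once(rx_queue: deque, rx_descriptors: list, tx_log: list, pool: BufferPool) -> int:
    """One iteration of the polling loop; returns the number of packets forwarded."""
    burst = rx_burst(rx_queue, BURST)
    for buf, payload in burst:
        fresh = pool.alloc()           # new RX buffer for the descriptor ring
        if fresh is not None:
            rx_descriptors.append(fresh)
        tx_log.append(payload)         # "transmit" the received packet
        pool.free(buf)                 # release the buffer that held it
    return len(burst)


if __name__ == "__main__":
    pool = BufferPool(8)
    rx_queue = deque((pool.alloc(), f"pkt{i}") for i in range(6))
    descriptors, sent = [], []
    while poll_once(rx_queue, descriptors, sent, pool):
        pass
    print(sent, pool.available(), len(descriptors))
```

The loop never sleeps or waits for an interrupt; an empty poll simply returns zero and the caller polls again.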
Overcoming The Challenge of Achieving 80 Mpps (and More) Per CPU Socket
Challenge: Memory and PCIe* access is very slow compared to CPU operation.
Solution: Process packets in batches (e.g. 4 packets at a time) to minimize external memory and PCIe* bandwidth. Prefer a single write over read-modify-write transactions, and a single read over multiple reads.
Challenge: Data is often not near the CPU when the CPU needs it, so the CPU waits.
Solution: For memory access, use hardware- or software-controlled prefetching and align data structures to the cache line size (64 bytes) to minimize external memory and PCIe* bandwidth, since all external memory accesses are in cache line increments. For PCIe* access, use Data Direct I/O (available on the Intel Xeon processor E5 product family) to read data directly into cache.
Challenge: The system cannot keep up with the number of interrupts for packet Rx.
Solution: Switch from an interrupt-driven network device driver to a polled-mode driver.
Challenge: The out-of-the-box Linux* scheduler adds too much task-switching overhead.
Solution: Bind a single software thread to a logical core. Use CPU core isolation and thread affinities for a 1:1 mapping of software threads to hardware threads.
Challenge: Intel PTU indicates that page table entries are constantly evicted (D-TLB thrashing).
Solution: Use 2 MB or 1 GB huge pages in Linux* to reduce TLB misses.
The challenge can be overcome with smart programming and hardware assists.
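Binding a software thread to a logical core can be tried from user space. The sketch below uses Python's standard `os.sched_setaffinity`, which is available on Linux but not on every platform; the helper name `pin_current_thread` is hypothetical. It shows only the affinity half of the recommendation. Keeping other tasks off the chosen core (for example with the Linux `isolcpus` boot parameter) is a separate system configuration step that is not shown.

```python
import os


def pin_current_thread(core: int):
    """Pin the caller to one logical core and return the resulting CPU set.

    Returns None on platforms where os.sched_setaffinity is not available.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {core})   # 0 means "the calling process/thread"
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)     # cores we are currently allowed to use
    core = min(allowed)                   # pick one of them for the demo
    print("pinned to:", pin_current_thread(core))
    os.sched_setaffinity(0, allowed)      # restore the original affinity
```

With one pinned thread per isolated core, the scheduler has nothing to switch to on that core, which is the 1:1 mapping the slide describes.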
NUMA
[Diagram labels: Processor 0 and Processor 1 linked by QPI; 10 GbE ports attached over PCIe; Rx/Tx paths; Intel DPDK on Physical Core 0 and Physical Core 1; App A, App B, App C; RSS Mode]
Run to Completion model
- I/O and application workload can be handled on a single core
- I/O can be scaled over multiple cores
Pipeline model
- I/O application disperses packets to other cores
- Application work is performed on other cores
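One way I/O scales over multiple cores is by hashing each flow to a queue, so that every core sees complete flows and can run them to completion. The sketch below illustrates that distribution step only. It is an assumption-laden model: CRC32 from Python's `zlib` stands in for the hash (RSS hardware uses a Toeplitz hash with a configurable key), and `queue_for_flow` and `distribute` are hypothetical names.

```python
import zlib


def queue_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                   num_queues: int) -> int:
    """Map a flow to a queue index; CRC32 is a stand-in for the NIC's hash."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def distribute(packets, num_queues: int):
    """Spread (src_ip, dst_ip, src_port, dst_port, payload) tuples over queues.

    Packets of one flow always land on the same queue, in arrival order.
    """
    queues = [[] for _ in range(num_queues)]
    for pkt in packets:
        queues[queue_for_flow(*pkt[:4], num_queues)].append(pkt)
    return queues


if __name__ == "__main__":
    pkts = [("10.0.0.1", "10.0.0.2", 1000 + (i % 3), 80, i) for i in range(12)]
    for index, queue in enumerate(distribute(pkts, 4)):
        print(index, [p[4] for p in queue])
```

In the pipeline model the same mapping could be done in software by the I/O core, which then hands each queue to a worker core.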
Intel Performance Tuning Utility (Intel PTU) offers specific tuning advice
[Figure: Performance in Mpps (y-axis 50 to 200) for three configurations]
- 2011: 1S Intel Xeon E5-2658 processor, C1 stepping, 1 x 8 Core, 2.1 GHz, PCIe Gen2
- 2012: 2S Intel Xeon E5-2658 processors, C1 stepping, 2 x 8 Core, 2.1 GHz, PCIe Gen2
- 2012: 2S Intel Xeon E5-2658 processors, C1 stepping, 2 x 8 Core, 2.1 GHz, PCIe Gen3 (estimate only)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Intel DPDK
Core 1 - n
Ecosystem Provided: Intel DPDK integrated into commercial solutions and Intel DPDK services offerings.
For more information about ecosystem solutions, visit www.intel.com/go/dpdk
Intel DPDK
www.intel.com/go/dpdk
Your one-stop shop for:
- Documentation and articles, white papers, podcasts
- Ecosystem information and articles
Examples
See the video "Intel Data Plane Development Kit (Intel DPDK)", found on the EDC site at www.intel.com/go/dpdk under Video: Intel Data Plane Development Kit (Intel DPDK).
Summary
- Intel DPDK enables the multi-workload, single-architecture potential of IA by making it extremely competitive for packet processing workloads.
- The enabling software is distributed under a flexible, cost-free licensing model for maximum customer usability.
- Fully featured and supported IA data plane software solutions are available via Intel's lead ecosystem partners.