Huawei E9000 Server Network Technology White Paper


Huawei E9000 Server

V100R001

Network Technology White Paper

Issue 04
Date 2019-03-22

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2019. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

Huawei and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: http://e.huawei.com


About This Document

Purpose
This document describes network technologies of the Huawei E9000 server and typical
networking suggestions and configurations for connecting to other vendors' switches in
various application scenarios. This document helps you quickly understand E9000 network
features and improve deployment efficiency.
For more information about E9000 network features, see the switch module white papers,
configuration guides, and command references of the E9000 server.

Intended Audience
This document is intended for:
- Marketing engineers
- Technical support engineers
- Maintenance engineers

Symbol Conventions
The symbols that may be found in this document are defined as follows.

DANGER: Indicates an imminently hazardous situation which, if not avoided, will result in death or serious injury.

WARNING: Indicates a potentially hazardous situation which, if not avoided, could result in death or serious injury.

CAUTION: Indicates a potentially hazardous situation which, if not avoided, may result in minor or moderate injury.

NOTICE: Indicates a potentially hazardous situation which, if not avoided, could result in equipment damage, data loss, performance deterioration, or unanticipated results. NOTICE is used to address practices not related to personal injury.

NOTE: Calls attention to important information, best practices, and tips. NOTE is used to address information not related to personal injury, equipment damage, and environment deterioration.

Change History
Issue  Date        Description
04     2019-03-22  This issue is the fourth official release. Added contents related to the CX930, MZ532, and MZ731.
03     2017-05-31  This issue is the third official release.
02     2016-03-03  This issue is the second official release.
01     2015-01-01  This issue is the first official release.


Contents

About This Document.....................................................................................................................ii


1 E9000 Overview..............................................................................................................................1
2 Switch Modules............................................................................................................................. 3
2.1 Compatibility Between Mezzanine Cards and Switch Modules.................................................................................... 4
2.2 Mapping Between Mezzanine Cards and Switch Modules............................................................................................ 8
2.3 Networking Assistant Tool............................................................................................................................................. 9

3 Network Technologies............................................................................................................... 11
3.1 iStack............................................................................................................................................................................ 11
3.1.1 iStack Overview.........................................................................................................................................................11
3.1.2 Technical Advantages................................................................................................................................................12
3.1.2.1 Simplified Configuration and Management........................................................................................................... 12
3.1.2.2 Control Planes in 1+1 Redundancy........................................................................................................................ 12
3.1.2.3 Link Backup........................................................................................................................................................... 12
3.1.3 Basic Concepts.......................................................................................................................................................... 13
3.1.3.1 Role.........................................................................................................................................................................13
3.1.3.2 Stack Domain......................................................................................................................................................... 13
3.1.3.3 Stack Member ID....................................................................................................................................................13
3.1.3.4 Stack Priority.......................................................................................................................................................... 13
3.1.3.5 Physical Stack Member Port...................................................................................................................................14
3.1.3.6 Stack Port................................................................................................................................................................14
3.1.4 Basic Principles......................................................................................................................................................... 14
3.1.4.1 Stack Setup............................................................................................................................................................. 14
3.1.4.2 Removing Stack Members......................................................................................................................................15
3.1.4.3 Stack Split...............................................................................................................................................................15
3.1.4.4 DAD........................................................................................................................................................................16
3.1.4.5 Fast Upgrade...........................................................................................................................................................16
3.1.5 Local Preferential Forwarding...................................................................................................................................16
3.1.6 Configuration Instance.............................................................................................................................................. 17
3.2 LACP............................................................................................................................................................................ 19
3.2.1 LACP Overview........................................................................................................................................................ 19
3.2.2 Basic Concepts.......................................................................................................................................................... 19
3.2.2.1 Link Aggregation, LAG and LAI........................................................................................................................... 20


3.2.2.2 Member Interfaces and Links................................................................................................................................. 20


3.2.2.3 Active/Inactive Interfaces and Links...................................................................................................................... 20
3.2.2.4 Upper Threshold of the Active Interface Quantity................................................................................................. 20
3.2.2.5 Lower Threshold of the Active Interface Quantity.................................................................................................20
3.2.2.6 Minimum Number of Local Active Links.............................................................................................................. 20
3.2.3 Link Aggregation Load Balancing Mode.................................................................................................................. 21
3.2.4 Link Aggregation Working Modes............................................................................................................................ 22
3.2.4.1 Manual Link Aggregation...................................................................................................................................... 23
3.2.4.2 LACP Link Aggregation........................................................................................................................................ 23
3.2.5 Comparison Between Huawei Eth-Trunks and Cisco Port Channels........................................................................24
3.3 M-LAG......................................................................................................................................................................... 24
3.3.1 Introduction to M-LAG............................................................................................................................................. 24
3.3.2 Basic Concepts.......................................................................................................................................................... 25
3.3.3 Implementation.......................................................................................................................................................... 26
3.3.3.1 Dual Homing a Switch to an Ethernet Network..................................................................................................... 27
3.3.3.2 Dual Homing a Switch to an IP Network............................................................................................................... 32
3.3.4 Link Aggregation Working Modes............................................................................................................................ 36
3.3.4.1 Connecting a Server in Dual-homing Mode...........................................................................................................36
3.3.4.2 Multi-level M-LAG................................................................................................................................................ 37
3.4 NIC Teaming................................................................................................................................................................ 38
3.4.1 Overview................................................................................................................................................................... 38
3.4.2 NIC Teaming in Windows......................................................................................................................................... 38
3.4.3 Bonding in Linux.......................................................................................................................................................40
3.4.4 NIC Teaming in vSphere........................................................................................................................................... 41
3.4.4.1 vSphere Virtual Network Components...................................................................................................................41
3.4.4.2 vSphere Virtual Switches........................................................................................................................................42
3.4.5 Mapping Between NIC Teaming and Switch Modules............................................................................................. 45
3.5 FC Technology..............................................................................................................................................................46
3.5.1 Basic Concepts.......................................................................................................................................................... 46
3.5.2 Working Principles.................................................................................................................................................... 49
3.5.3 FCF............................................................................................................................................................................ 50
3.5.4 NPV........................................................................................................................................................................... 50
3.5.5 Zone........................................................................................................................................................................... 51
3.6 DCB.............................................................................................................................................................................. 53
3.6.1 DCB Overview.......................................................................................................................................................... 53
3.6.2 PFC............................................................................................................................................................................ 53
3.6.3 ETS............................................................................................................................................................................ 54
3.6.4 DCBX........................................................................................................................................................................ 56
3.6.5 Configuration Instance.............................................................................................................................................. 57
3.7 FCoE............................................................................................................................................................................. 58
3.7.1 FCoE Overview......................................................................................................................................................... 58
3.7.2 Basic Concepts.......................................................................................................................................................... 58


3.7.2.1 ENode..................................................................................................................................................................... 59
3.7.2.2 FCoE Virtual Link.................................................................................................................................................. 59
3.7.2.3 FIP.......................................................................................................................................................................... 59
3.7.2.4 FCoE VLAN...........................................................................................................................................................59
3.7.3 FCoE Packet Format..................................................................................................................................................60
3.7.4 FIP............................................................................................................................................................................. 60
3.7.4.1 FIP VLAN Discovery............................................................................................................................................. 61
3.7.4.2 FIP FCF Discovery................................................................................................................................................. 62
3.7.4.3 FIP FLOGI and FDISC...........................................................................................................................................62
3.7.5 FCoE Virtual Link Maintenance................................................................................................................................62
3.7.6 FIP Snooping............................................................................................................................................................. 63
3.7.7 Configuration Instance.............................................................................................................................................. 63
3.8 Smart Link.................................................................................................................................................................... 63
3.8.1 Background................................................................................................................................................................64
3.8.2 Smart Link Basic Concepts....................................................................................................................................... 65
3.8.2.1 Smart Link Group................................................................................................................................................... 65
3.8.2.2 Master Port............................................................................................................................................................. 66
3.8.2.3 Slave Port................................................................................................................................................................66
3.8.2.4 Control VLAN........................................................................................................................................................ 66
3.8.2.5 Flush Packet............................................................................................................................................................66
3.8.3 Smart Link Working Mechanism...............................................................................................................................67
3.8.4 Configuration Instance.............................................................................................................................................. 68
3.9 Monitor Link.................................................................................................................................................................68
3.9.1 Monitor Link Overview.............................................................................................................................................68
3.9.1.1 Uplink Ports............................................................................................................................................................ 69
3.9.1.2 Downlink Ports....................................................................................................................................................... 69
3.9.2 Monitor Link Working Mechanism........................................................................................................................... 69
3.9.3 Configuration Instance.............................................................................................................................................. 70
3.10 Configuration Restoration.......................................................................................................................................... 70

4 Networking Applications.......................................................................................................... 72
4.1 Ethernet Networking.....................................................................................................................................................72
4.1.1 Stack Networking...................................................................................................................................................... 72
4.1.2 Smart Link Networking............................................................................................................................................. 73
4.1.3 STP/RSTP Networking..............................................................................................................................................74
4.1.4 Monitor Link Networking......................................................................................................................................... 74
4.2 Networking with Cisco Switches (PVST+)..................................................................................................................75
4.2.1 Cisco PVST+ Protocol...............................................................................................................................................75
4.2.2 Processing of PVST+ BPDUs................................................................................................................................... 77
4.2.3 Standard MSTP..........................................................................................................................................................77
4.2.4 Difference Between Cisco and Huawei MSTPs........................................................................................................ 77
4.2.5 Interconnection Scheme............................................................................................................................................ 78
4.2.5.1 Smart Link Group Interconnecting with Cisco PVST+ Network...........................................................................78


4.2.5.2 IEEE Standard Protocol (Root Bridge on the Cisco PVST+ Network Side)......................................................... 79
4.3 Cisco vPC Interoperability........................................................................................................................................... 79
4.4 FCoE Converged Networking...................................................................................................................................... 81
4.4.1 CX311 FCoE Converged Networking....................................................................................................................... 81
4.4.1.1 CX311 Switch Module........................................................................................................................................... 81
4.4.1.2 Default Configuration of the CX311...................................................................................................................... 83
4.4.1.3 Connecting MX510s to Storage Devices................................................................................................................87
4.4.1.4 Connecting MX510s to FC Switches..................................................................................................................... 87
4.4.1.5 MX510 Link Load Balancing and Failover............................................................................................................88
4.4.1.6 CX311s in Stacking Mode......................................................................................................................................89
4.4.2 Connecting CX310s to Cisco Nexus 5000 Series Switches...................................................................................... 90
4.4.3 Connecting CX310s to Brocade VDX6700 Series Switches.................................................................................... 94
4.4.4 CX320 Converged Networking................................................................................................................................. 95
4.4.4.1 CX320 Converged Switch Module.........................................................................................................................95
4.4.4.2 FCF Networking..................................................................................................................................................... 96
4.4.4.3 NPV Networking.................................................................................................................................................. 101
4.5 FC Networking........................................................................................................................................................... 106
4.5.1 Multi-Plane Switch Module.....................................................................................................................................106
4.5.1.1 Overview.............................................................................................................................................................. 106
4.5.1.2 MZ910 Port Types................................................................................................................................................ 107
4.5.1.3 MX210/MX510 Working Modes......................................................................................................................... 108
4.5.2 Connecting MX210s/MX510s to Storage Devices................................................................................................. 109
4.5.3 Connecting MX210s/MX510s to FC Switches....................................................................................................... 110
4.5.3.1 MX210 Load Balancing and Failover.................................................................................................................. 110

A Acronyms and Abbreviations................................................................................................ 111


1 E9000 Overview

The Huawei E9000 converged architecture blade server (E9000 for short) is a powerful new-generation infrastructure platform that integrates computing, storage, switching, and management. It delivers high availability, high computing density, high energy efficiency, and high midplane bandwidth, as well as low network latency and intelligent management and control. It also allows elastic configuration and flexible expansion of computing and storage resources and supports application acceleration. It supports eight full-width compute nodes, 16 half-width compute nodes, or 32 child compute nodes, and half-width and full-width compute nodes can be combined flexibly to meet various service needs. Each compute node provides two or four CPUs and up to 48 DIMMs, supporting a maximum memory capacity of 6 TB.

Figure 1-1 E9000 chassis

The E9000 provides flexible computing and I/O expansion. The compute nodes support the next three generations of Intel CPUs. The E9000 supports Ethernet, Fibre Channel (FC), InfiniBand (IB), and Omni-Path Architecture (OPA) switching. The supported network ports range from mainstream GE and 10GE ports to 40GE ports and future 100GE ports. The
E9000 houses four switch modules at the rear of the chassis to support Ethernet, FC, IB, and
OPA switching networks. The switch modules provide powerful data switching capability and
rich network features. The compute nodes can provide standard PCIe slots. A full-width
compute node supports a maximum of six PCIe cards.


2 Switch Modules

The E9000 chassis provides four rear slots for installing pass-through and switch modules that
support GE, 10GE, 40GE, 8G/16G FC, 40G/56G IB, 100G OPA, and evolution towards
100GE, 100G IB, and 200G OPA. The switch modules provide powerful data switching
capabilities.

Figure 2-1 Installation positions of switch modules

The four switch module slots are numbered 1E, 2X, 3X, and 4E from left to right. The chassis
midplane provides eight pairs of links (10GE or 40GE ports) to connect switch modules in
slots 1E and 4E and switch modules in slots 2X and 3X. The switch modules can be stacked
or cascaded through the links. Figure 2-2 shows the switch module slots and their
interconnections on the midplane.


Figure 2-2 Slot numbers and connections between switch modules

If switch modules in slots 1E and 4E or those in slots 2X and 3X need to be stacked and local
preferential forwarding is enabled (default setting) for the Eth-Trunk, the midplane ports can
provide the bandwidth required for stacking. No stacking cable is required.

NOTE

The midplane does not provide links for interconnecting CX915 or CX111 switch modules. To stack
CX915 or CX111 switch modules, use optical or electrical cables to connect the 10GE ports on the panels.

2.1 Compatibility Between Mezzanine Cards and Switch Modules


2.2 Mapping Between Mezzanine Cards and Switch Modules
2.3 Networking Assistant Tool

2.1 Compatibility Between Mezzanine Cards and Switch Modules
The E9000 supports Ethernet, FC, IB, and OPA switch modules. These switch modules can be
used with different mezzanine cards to build different types of networks with multiple planes
at different rates. Table 2-1 lists the switch modules and compatible mezzanine cards.

Table 2-1 Switch modules and compatible mezzanine cards

GE switch module
- CX110
  Downlink: 16 x (2 x GE) + 2 x 40GE
  Uplink: 12 x GE + 4 x 10GE
  Compatible mezzanine cards: MZ110 (4 x GE)
- CX111
  Downlink: 16 x (2 x GE)
  Uplink: 12 x GE + 4 x 10GE
  Compatible mezzanine cards: MZ110 (4 x GE)

Converged switch module
- CX310
  Downlink: 16 x (2 x 10GE) + 40GE
  Uplink: 16 x 10GE
  Compatible mezzanine cards: MZ510 (2 x 10GE converged network adapter (CNA)), MZ512 (2 x (2 x 10GE) CNA)
- CX311
  Downlink: 16 x (2 x 10GE) + 40GE
  Uplink: 16 x 10GE + 8 x 8G FC
  Compatible mezzanine cards: MZ510 (2 x 10GE CNA), MZ512 (2 x (2 x 10GE) CNA)
- CX312
  Downlink: 16 x (2 x 10GE) + 40GE
  Uplink: 24 x 10GE
  Compatible mezzanine cards: MZ311 (2 x 10GE), MZ510 (2 x 10GE CNA), MZ512 (2 x (2 x 10GE) CNA)
- CX320
  Downlink: 16 x (2 x 10GE) + 40GE
  Uplink: 8 x 10GE + 2 x 40GE + 2 x (4 x 8G FC/4 x 10GE)
  Compatible mezzanine cards: MZ510 (2 x 10GE CNA), MZ512 (2 x (2 x 10GE) CNA)
- CX710
  Downlink: 16 x 40GE
  Uplink: 8 x 40GE
  Compatible mezzanine cards: MZ710 (2 x 40GE), MZ731 (2 x 100GE, reduced to 40GE)

Multi-plane switch module
- CX911
  Ethernet switching plane: Downlink: 16 x (2 x 10GE) + 2 x 40GE; Uplink: 16 x 10GE
  FC switching plane (QLogic): Downlink: 16 x 8G FC; Uplink: 8 x 8G FC
  Compatible mezzanine cards: MZ910 (2 x 10GE + 2 x 8G FC/10G Fibre Channel over Ethernet (FCoE))
- CX912
  Ethernet switching plane: Downlink: 16 x (2 x 10GE) + 2 x 40GE; Uplink: 16 x 10GE
  FC switching plane (Brocade): Downlink: 16 x 8G FC; Uplink: 8 x 8G FC
  Compatible mezzanine cards: MZ910 (2 x 10GE + 2 x 8G FC/10G FCoE)
- CX915
  Ethernet switching plane: Downlink: 16 x (2 x GE); Uplink: 12 x GE + 4 x 10GE
  FC switching plane (QLogic): Downlink: 16 x 8G FC; Uplink: 8 x 8G FC
  Compatible mezzanine cards: MZ910 (2 x 10GE + 2 x 8G FC/10G FCoE)
- CX916
  Ethernet switching plane: Downlink: 16 x (2 x 10GE); Uplink: 8 x 25GE + 2 x 40GE
  FC switching plane (Brocade): Downlink: 16 x 16G FC; Uplink: 8 x 16G FC
  Compatible mezzanine cards: MZ220 (2 x 16G FC)
- CX920
  10GE switching plane: Downlink: 16 x 10GE + 1 x 40GE; Uplink: 8 x 10GE
  40GE switching plane: Downlink: 16 x 40GE + 2 x 40GE; Uplink: 8 x 40GE
  Compatible mezzanine cards: MZ710 (2 x 40GE), MZ731 (2 x 100GE, reduced to 40GE)
- CX930
  10GE switching plane: Downlink: 16 x (2 x 10GE); Uplink: 8 x 25GE
  100GE switching plane: Downlink: 16 x 100GE; Uplink: 10 x 100GE
  Compatible mezzanine cards: MZ731 (2 x 100GE), MZ532 (4 x 25GE), MZ710 (2 x 40GE), MZ520 (2 x 10GE), MZ522 (4 x 10GE), MZ312 (4 x 10GE)

FC switch module
- CX210
  FC switching plane (Brocade): Downlink: 16 x 8G FC; Uplink: 8 x 8G FC
  Compatible mezzanine cards: MZ910 (2 x 10GE + 2 x 8G FC/10G FCoE)
- CX220
  FC switching plane (Brocade): Downlink: 16 x 8G FC; Uplink: 8 x 8G FC
  Compatible mezzanine cards: MZ220 (2 x 16G FC)

InfiniBand switch module
- CX610
  QDR/FDR InfiniBand switch module: Downlink: 16 x 4X QDR/FDR; Uplink: 18 x QDR/FDR QSFP+
  Compatible mezzanine cards: MZ611 (2 x 56G FDR)
- CX611
  QDR/FDR InfiniBand switch module: Downlink: 16 x 4X QDR/FDR; Uplink: 18 x QDR/FDR QSFP+
  Compatible mezzanine cards: MZ610 (2 x 40G QDR), MZ611 (2 x 56G FDR)
- CX620
  QDR/FDR InfiniBand switch module: Downlink: 16 x 4X QDR/FDR; Uplink: 18 x QDR/FDR QSFP+
  Compatible mezzanine cards: MZ611 (2 x 56G FDR), MZ612 (2 x 56G FDR), MZ620 (2 x 100G EDR), MZ622 (2 x 100G EDR)

OPA switch module
- CX820
  OPA switch module: Downlink: 16 x OPA; Uplink: 20 x OPA QSFP+
  Compatible mezzanine cards: MZ821 (1 x 100G HFI)

Pass-through module
- CX116
  Downlink: 16 x (2 x GE)
  Uplink: 16 x (2 x GE)
  Compatible mezzanine cards: MZ110 (4 x GE)
- CX317
  Downlink: 16 x (2 x 10GE)
  Uplink: 16 x (2 x 10GE)
  Compatible mezzanine cards: MZ510 (2 x 10GE CNA), MZ512 (2 x (2 x 10GE) CNA)
- CX318
  Downlink: 16 x (2 x 10GE)
  Uplink: 16 x (2 x 10GE)
  Compatible mezzanine cards: MZ510 (2 x 10GE CNA), MZ512 (2 x (2 x 10GE) CNA), MZ310 (2 x 10GE), MZ312 (2 x (2 x 10GE)), MZ532 (4 x 25GE, reduced to 10GE)

NOTE

- If the MZ910 is used with a CX210 switch module, the two 10GE ports cannot be used (the CX210 does not provide an Ethernet switching plane).
- If the MZ910 is used with a switch module that provides the QLogic FC switching plane, ports in slots 1 to 12 are FCoE ports and ports in slots 13 to 16 are FC ports. If the MZ910 is used with a switch module that provides the Brocade FC switching plane, ports in slots 1 to 16 are all FC ports.
- If the MZ910 is used with a CX915 switch module, the rate of 10GE ports is automatically reduced to GE.
- The CX916 provides Ethernet switching only if it is installed in slot 2X or 3X and works with V5 compute nodes (such as the CH121 V5 with the two-port 10GE LOM). If the CX916 is installed in slot 1E or 4E, it only provides FC switching and works with the MZ220 NIC.
- The CX930 provides 10GE switching capability only if it is installed in slot 2X or 3X and used with V5 compute nodes (such as the CH121 V5).
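
As a quick way to query Table 2-1 programmatically, the following Python sketch encodes a few rows of the table in a dictionary. Only a handful of entries are shown, and the dictionary and helper function are assumptions for illustration, not a Huawei tool.

COMPATIBLE_MEZZ = {
    "CX110": ["MZ110"],
    "CX310": ["MZ510", "MZ512"],
    "CX311": ["MZ510", "MZ512"],
    "CX710": ["MZ710", "MZ731"],
    "CX916": ["MZ220"],
}

def is_compatible(switch_module: str, mezz_card: str) -> bool:
    # Returns True if Table 2-1 lists the mezzanine card for the switch module.
    return mezz_card in COMPATIBLE_MEZZ.get(switch_module, [])

print(is_compatible("CX310", "MZ510"))   # True
print(is_compatible("CX916", "MZ510"))   # False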


2.2 Mapping Between Mezzanine Cards and Switch Modules
Each half-width compute node provides one or two mezzanine cards in slots Mezz 1 and 2.
Mezz 1 connects to switch module slots 2X and 3X. Mezz 2 connects to switch module slots
1E and 4E. A full-width compute node provides one, two, or four mezzanine cards. Mezz 1
and Mezz 3 (optional) connect to switch module slots 2X and 3X; Mezz 2 and Mezz 4
(optional) connect to switch module slots 1E and 4E.
A mezzanine card connects to switch modules at the rear of the chassis through the midplane.
The network port type (Ethernet, FC, IB, and OPA) and quantity depend on the mezzanine
card type, which must match the switch module type.

Figure 2-3 Connections between mezzanine cards and switch modules

A multi-plane mezzanine card can be in slot Mezz 1 or Mezz 2, and connects to switch
modules in slots 2X and 3X or slots 1E and 4E. An OPA mezzanine card can only be in slot
Mezz 2 because the connected OPA switch module can only be installed in slot 1E. A
mezzanine card other than multi-plane and OPA mezzanine cards provides two or four ports.
A 4-port mezzanine card is connected to two ports of each connected switch module. Figure
2-4 shows the connections between non-OPA mezzanine cards and switch modules.


Figure 2-4 Connections between non-OPA mezzanine cards and switch modules

Figure 2-5 Connections between OPA mezzanine cards and the OPA switch module
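
The slot mapping described in this section can be summarized in a small lookup table. The Python sketch below is illustrative only (the dictionary and helper are assumptions, not part of any Huawei software); it returns the switch module slots that a given mezzanine slot reaches through the midplane.

MEZZ_TO_SWITCH_SLOTS = {
    "Mezz1": ("2X", "3X"),
    "Mezz2": ("1E", "4E"),
    "Mezz3": ("2X", "3X"),   # full-width compute nodes only
    "Mezz4": ("1E", "4E"),   # full-width compute nodes only
}

def switch_slots_for(mezz_slot: str):
    # Look up which switch module slots the mezzanine slot connects to.
    return MEZZ_TO_SWITCH_SLOTS[mezz_slot]

print(switch_slots_for("Mezz1"))   # ('2X', '3X')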

2.3 Networking Assistant Tool


The networking assistant tool helps users query the port mapping between mezzanine cards
and switch modules and determine the network interface card (NIC) number (ethx or vmnicx)
displayed in the OS.


Figure 2-6 Networking assistant tool

Use the tool as follows:

1. Choose the compute node type and mezzanine card type. If the mezzanine card is a
CNA, configure the mezzanine card attributes.
2. Choose the switch module type.
3. Click Show Network Connection.

The mapping between the ports on CNAs and switch modules is displayed, as shown in
Figure 2-7.

Figure 2-7 Network connections


3 Network Technologies

This section describes common technologies used for networking of the E9000 and provides
some configuration instances. The networking technologies include iStack, Link Aggregation
Control Protocol (LACP), NIC teaming, Data Center Bridging (DCB), FCoE, Smart Link, and
Monitor Link.
3.1 iStack
3.2 LACP
3.3 M-LAG
3.4 NIC Teaming
3.5 FC Technology
3.6 DCB
3.7 FCoE
3.8 Smart Link
3.9 Monitor Link
3.10 Configuration Restoration

3.1 iStack

3.1.1 iStack Overview


An E9000 chassis can house two or four switch modules, which are similar to two or four
conventional switches. A mezzanine card on the E9000 provides two or four ports to connect
to two switch modules. Without stacking, users have to log in to the two switch modules separately over SSH or a serial port to configure them, which increases the O&M workload, and have to configure reliability features on each switch module separately to ensure network reliability. For example, if an uplink port of a switch module fails, the fault must be propagated to the NIC in the OS through technologies such as Monitor Link to trigger a link switchover. iStack allows multiple switch modules to be virtualized and stacked as one logical
device without changing the physical network topology. This technology simplifies the
network structure and network protocol deployment, and improves network reliability and
manageability. Figure 3-1 illustrates the iStack technology.


Figure 3-1 Logical structure of a stack system

3.1.2 Technical Advantages

3.1.2.1 Simplified Configuration and Management


iStack allows multiple physical devices to be presented as one logical device. Users can log in
from any member device and uniformly configure and manage all the member devices.

3.1.2.2 Control Planes in 1+1 Redundancy


After being stacked, the switch modules set up control planes in 1+1 redundancy mode. The
master switch module processes services, and the standby switch module functions as a
backup and always synchronizes data with the master switch module. If the master switch
module fails, the standby switch module becomes the master and one of the slave switch
modules becomes standby.

3.1.2.3 Link Backup


iStack supports link aggregation among multiple switch modules. That is, uplink ports of
multiple switch modules can be added to an Eth-Trunk group to improve the uplink
bandwidth and network reliability. With iStack, users do not need to configure fault
association technologies.


Figure 3-2 Link backup of a stack system

3.1.3 Basic Concepts

3.1.3.1 Role
Switch modules that have joined a stack are member switch modules. Each member switch
module in a stack plays one of the following roles:

1. A master switch module manages the entire stack. A stack has only one master switch
module.
2. A standby switch module serves as a backup of the master switch module. A stack has
only one standby switch module.
3. Except the master switch module, all switch modules in a stack are slave switch
modules. The standby switch module also functions as a slave switch module.

3.1.3.2 Stack Domain


A stack domain is a set of switch modules connected by stack links. Only switch modules that
share the same stack domain can be stacked together.

3.1.3.3 Stack Member ID


A member ID uniquely identifies a switch module in a stack and is the same as the slot
number of the switch module. By default, the stack member IDs of switch modules in slots
1E, 2X, 3X, and 4E are 1, 2, 3, and 4 respectively. The stack member IDs can be modified by
running the stack command.

3.1.3.4 Stack Priority


The stack priority determines the role of a member switch module in a role election. A larger
value indicates a higher priority and higher probability of being elected as the master switch
module.


3.1.3.5 Physical Stack Member Port


A physical stack member port is used for stack connections. It forwards service packets or
stack protocol packets between member switch modules.

3.1.3.6 Stack Port


A stack port is a logical port dedicated for stacking and must be bound to a physical member
port. Each member switch module provides two stack ports: stack-port n/1 and stack-port n/2.
n indicates the stack member ID.

3.1.4 Basic Principles

3.1.4.1 Stack Setup


10GE/40GE midplane ports or 10GE panel ports can be used as physical stack ports. 10GE
and 40GE ports cannot be used as stack ports at the same time. Multiple physical stack
member ports can be bound to a stack port to improve the link bandwidth and reliability.
Enabling local preferential forwarding for the Eth-Trunk reduces demands for the stack port
bandwidth. Figure 3-3 shows how a stack is set up.

Figure 3-3 Stack setup

A stack consists of multiple member switch modules, each of which has a specific role. When
a stack is created, the member switch modules send stack competition packets to each other to
elect the master switch module. The remaining switch modules function as slave switch
modules. The master switch module is elected based on the following rules in sequence:
1. Running status: The switch module that has fully started and entered running state
becomes the master switch module.
2. Stack priority: The switch module with the highest stack priority becomes the master
switch module.
3. Software version: The switch module running the latest software version becomes the
master switch module.
4. MAC address: The switch module with the smallest MAC address becomes the master
switch module.
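
The election order can be pictured with a short Python sketch. The field names and the version comparison below are simplified assumptions for illustration; this is not Huawei's implementation.

from dataclasses import dataclass

@dataclass
class Member:
    member_id: int
    running: bool      # has fully started and entered the running state
    priority: int      # stack priority; a larger value wins
    version: tuple     # software version, e.g. (1, 0, 5); a newer version wins
    mac: str           # MAC address; a smaller value wins

def elect_master(members):
    # Python compares tuples element by element, which mirrors the rule sequence:
    # running state, then priority, then version, then the smallest MAC address.
    return min(
        members,
        key=lambda m: (not m.running,
                       -m.priority,
                       tuple(-x for x in m.version),
                       int(m.mac.replace("-", ""), 16)))

members = [Member(2, True, 150, (1, 0, 5), "0004-9f31-d540"),
           Member(3, True, 100, (1, 0, 5), "0004-9f62-1f80")]
print(elect_master(members).member_id)   # 2 (higher stack priority)
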
The master switch module collects stack member information, works out the stack topology,
and synchronizes the topology information to the other member switch modules. If the stack
ID of a slave switch module conflicts with an existing stack ID, the switch module restarts
repeatedly. If the master and slave switch modules use different software versions, the
software version of the master switch module will be synchronized to the slave switch
module, which then restarts and joins the stack again.

The master switch module elects a standby switch module from among the slave switch modules. If the
master switch module fails, the standby switch module takes over all services from the master
switch module. The following conditions of the member switch modules are compared in
sequence until a standby switch module is elected:

1. Stack priority: The switch module with the highest stack priority becomes the standby
switch module.
2. MAC address: The switch module with the smallest MAC address becomes the standby
switch module.

Before a stack is set up, each switch module is independent and has its own IP address. Users
need to manage the switch modules separately. After a stack is set up, the switch modules in
the stack form a logical entity, and users can use a single IP address to manage and maintain
all the member switch modules. The IP address and MAC address of the master switch
module when the stack is set up for the first time are used as the IP address and MAC address
of the stack.

3.1.4.2 Removing Stack Members


A member switch module may exit from a stack. The impact of the exit depends on the role of
the member switch module.

1. If a master switch module exits, the standby switch module becomes the master, updates
the stack topology, and designates a new standby switch module.
2. If the standby switch module exits, the master switch updates the stack topology and
designates a new standby switch module.
3. If a slave switch module exits, the master switch module updates the stack topology.
4. If both the master and standby switch modules exit, all slave switch modules restart and set up a
new stack.

3.1.4.3 Stack Split


If the stacking cable is faulty, the stack splits into multiple stacks. See Figure 3-4. After a
stack splits, multiple stacks with the same configuration may be generated. As a result, IP
address and MAC address conflicts cause network faults.

Figure 3-4 Stack split


3.1.4.4 DAD
Dual-active detection (DAD) is a protocol used to detect a stack split, handle conflicts, and
take recovery actions to minimize the impact of a stack split on services. A DAD link directly
connecting stacked switches can detect dual-active switches, as shown in Figure 3-5.

Figure 3-5 DAD

After a stack splits, the switch modules exchange competition packets and compare the
received competition packets with the local ones. The switch module that wins the
competition becomes the master switch, remains active, and continues forwarding service
packets. A switch module that loses the competition shuts down all
service ports except the reserved ones, enters the recovery state, and stops forwarding service
packets. The master switch is determined based on the following DAD competition rules in
sequence:
1. Stack priority: The switch module with the highest stack priority becomes the master
switch.
2. MAC address: The switch module with the smallest MAC address becomes the master
switch.
When the faulty stack links recover, the stack in the recovery state restarts, the shutdown ports go up again, and the entire stack system recovers.

3.1.4.5 Fast Upgrade


Fast upgrade allows the member switch modules in a stack to be upgraded without interrupting services, minimizing the impact of upgrades on services. During a fast upgrade of stacked switching planes, the standby switch module restarts and transitions to the new version first, while the master switch module continues to forward data during the restart.
- If the upgrade fails, the standby switch module restarts and rolls back to the source version.
- If the upgrade is successful, the standby switch module becomes the new master switch module and starts to forward data. Then the original master switch module restarts, gets upgraded, and serves as the standby switch module.
The fast stack upgrade command is stack upgrade fast.

3.1.5 Local Preferential Forwarding


In a stack system, after an Eth-Trunk interface spanning switch modules is configured, traffic is sent through a link selected based on the configured load balancing mode. Because the Eth-Trunk consists of ports on different switch modules, some traffic is forwarded between switch modules over the stack links. Enabling local preferential forwarding for the Eth-Trunk interface reduces the traffic forwarded between switch modules, as shown in Figure 3-6.

Figure 3-6 Local preferential forwarding

After local preferential forwarding is enabled, outgoing traffic is preferentially forwarded through the local ports of the switch module that receives the traffic. If the number of active Eth-Trunk links on a switch module is less than the configured minimum number of local active links, local preferential forwarding is automatically disabled and traffic is sent through a link selected from all Eth-Trunk member interfaces. Local preferential forwarding is enabled by default after an Eth-Trunk interface is created.
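
The behavior can be sketched in a few lines of Python. The port names, the flow hash, and the function itself are illustrative assumptions, not Huawei's forwarding code; the sketch only shows the decision described above.

def pick_egress_port(flow_hash, trunk_ports, ingress_member, min_local_active):
    # trunk_ports maps each stack member ID to its active Eth-Trunk member ports.
    local_ports = trunk_ports.get(ingress_member, [])
    if len(local_ports) >= min_local_active:
        # Enough local active links: hash only among local ports, so the frame
        # is not forwarded to another switch module over the stack link.
        candidates = local_ports
    else:
        # Too few local active links: fall back to all Eth-Trunk member ports.
        candidates = [p for ports in trunk_ports.values() for p in ports]
    return candidates[flow_hash % len(candidates)]

trunk = {2: ["10GE2/17/1", "10GE2/17/2"], 3: ["10GE3/17/1", "10GE3/17/2"]}
print(pick_egress_port(7, trunk, ingress_member=2, min_local_active=1))  # 10GE2/17/2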

3.1.6 Configuration Instance


The CX310s in slots 2X and 3X are stacked using a 40GE link provided by the chassis
midplane, as shown in Figure 3-7.

Figure 3-7 CX310 stacking network

# Run the reset saved-configuration command to restore the default configuration of the
switch module in slot 2X and restart the switch module.
<HUAWEI>reset saved-configuration
The action will delete the saved configuration in the device. The
configuration will be erased to reconfigure.Continue? [Y/N]:Y
Warning: Now clearing the configuration in the device.......
begin synchronize configuration to SMM ...
slot 2: upload configuration to SMM successfully.

Info: Succeeded in clearing the configuration in the device.


<HUAWEI>reboot fast


# Run the reset saved-configuration command to restore the default configuration of the
switch module in slot 3X and restart the switch module.
<HUAWEI>reset saved-configuration
The action will delete the saved configuration in the device. The
configuration will be erased to reconfigure.Continue? [Y/N]:Y
Warning: Now clearing the configuration in the device.
begin synchronize configuration to SMM ...
slot 3: upload configuration to SMM successfully.

Info: Succeeded in clearing the configuration in the device.


<HUAWEI>reboot fast

# Set the default stack member ID to 2, domain ID to 10, and priority to 150 for the CX310 in
slot 2X.
<HUAWEI> system-view
[~HUAWEI] sysname CX310_2
[*HUAWEI] commit
[~CX310_2] stack
[*CX310_2-stack] stack member 2 priority 150
[*CX310_2-stack] stack member 2 domain 10
[*CX310_2-stack] quit
[*CX310_2] commit

# Add service port 40GE 2/18/1 of the CX310 in slot 2X to stack port 2/1.
[~CX310_2] interface 40GE 2/18/1
[*CX310_2-40GE2/18/1] port mode stack
[*CX310_2-40GE2/18/1] quit
[*CX310_2] commit
[~CX310_2] interface stack-port 2/1
[*CX310_2-Stack-Port2/1] port member-group interface 40GE 2/18/1
[*CX310_2-Stack-Port2/1] quit
[*CX310_2] commit

# Set the default stack member ID to 3, domain ID to 10, and priority to 100 for the CX310 in
slot 3X.
<HUAWEI> system-view
[~HUAWEI] sysname CX310_3
[*HUAWEI] commit
[~CX310_3] stack
[*CX310_3-stack] stack member 3 priority 100
[*CX310_3-stack] stack member 3 domain 10
[*CX310_3-stack] quit
[*CX310_3] commit

# Add service port 40GE 3/18/1 of the CX310 in slot 3X to stack port 3/1.
[~CX310_3] interface 40GE 3/18/1
[*CX310_3-40GE3/18/1] port mode stack
[*CX310_3-40GE3/18/1] quit
[*CX310_3] commit
[~CX310_3] interface stack-port 3/1
[*CX310_3-Stack-Port3/1] port member-group interface 40GE 3/18/1
[*CX310_3-Stack-Port3/1] quit
[*CX310_3] commit

# Enable the 40GE ports interconnecting slots 2X and 3X.


[~CX310_2] interface 40GE 2/18/1
[*CX310_2-40GE2/18/1] undo shutdown
[*CX310_2-40GE2/18/1] quit
[*CX310_2] commit
[~CX310_2] quit
<CX310_2> save
[~CX310_3] interface 40GE 3/18/1
[*CX310_3-40GE3/18/1] undo shutdown
[*CX310_3-40GE3/18/1] quit
[*CX310_3] commit
[~CX310_3] quit
<CX310_3> save

NOTE

The switch modules in slots 2X and 3X must be in the same stack domain. After the lower-priority switch module (in slot 3X) is configured, it restarts automatically and a stack system is created. If the stack system configuration is correct, run the save command to save the configuration.
(If the switch module in slot 2X is not the master switch after the first master competition, run the
reboot command to restart the stack system. Then, the system will select the switch module in slot 2X
as the master switch based on the priority.)

# Change the system name and view the stack system information.
[CX310_2] sysname CX310_C
[~CX310_2] commit
[~CX310_C] display stack
---------------------------------------------------------------------
MemberID Role MAC Priority Device Type Bay/Chassis
---------------------------------------------------------------------
2 Master 0004-9f31-d540 150 CX310 2X/1
3 Standby 0004-9f62-1f80 100 CX310 3X/1
---------------------------------------------------------------------
[~CX310_C] quit
<CX310_C> save

3.2 LACP

3.2.1 LACP Overview


LACP allows multiple physical ports to be bound as a logical port to increase the link
bandwidth without upgrading hardware. In addition, LACP uses the link backup mechanism
to increase link transmission reliability. LACP has the following advantages:

1. Increased bandwidth: The bandwidth of a link aggregation group (LAG) is the sum of
bandwidth of its member ports. The maximum number of LAG members is 16.
2. High reliability: If an active link fails, traffic of the link is switched to other active links,
improving the LAG reliability.
3. Load balancing: In a LAG, traffic is evenly distributed among active member links.

On a switch module, a LAG is called an Eth-Trunk. All member ports of an Eth-Trunk must be of the same type and use the default configuration. Figure 3-8 shows a LAG of switch modules.

Figure 3-8 LAG
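
Load balancing works by hashing each flow to one active member link so that packets of a flow stay in order on the same link. The toy Python hash below is only an illustration of the idea, not the switch algorithm; the real options are the load balancing modes described in 3.2.3.

def select_member_link(dst_ip, dst_port, active_links):
    # Toy hash: XOR the destination IP octets with the L4 port, then pick a link.
    h = dst_port
    for octet in dst_ip.split("."):
        h ^= int(octet)
    return active_links[h % len(active_links)]

links = ["10GE1/17/1", "10GE1/17/2", "10GE1/17/3"]
print(select_member_link("192.168.10.21", 443, links))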

3.2.2 Basic Concepts


3.2.2.1 Link Aggregation, LAG and LAI


Link aggregation allows multiple physical ports to be combined as a logical interface to
increase the bandwidth and reliability. A LAG is a logical link consisting of multiple Ethernet
links bound together. Each LAG corresponds to a unique logical interface, which is called a link aggregation interface (LAI) or Eth-Trunk interface.

3.2.2.2 Member Interfaces and Links


Physical ports that constitute an Eth-Trunk interface are member interfaces. A link
corresponding to a member interface is called a member link.

3.2.2.3 Active/Inactive Interfaces and Links


The member interfaces in a LAG are classified into active and inactive interfaces. An active
interface transmits data; an inactive interface does not. Links connected to active interfaces are
called active links, and links connected to inactive interfaces are called inactive links.

3.2.2.4 Upper Threshold of the Active Interface Quantity


The upper threshold of the active interface quantity is set to improve network reliability while
maintaining sufficient bandwidth. After the number of active links reaches the upper threshold,
the number of active interfaces in an Eth-Trunk remains unchanged even if more member interfaces
are added to the Eth-Trunk. The additional member links are set to Down and serve as backup links.

For example, an Eth-Trunk has eight links, and each link provides a bandwidth of 1 Gbit/s. If
the maximum bandwidth required is 5 Gbit/s, you can set the upper threshold to 5. As a result,
the other three member links automatically enter the backup state to improve the network
reliability.

3.2.2.5 Lower Threshold of the Active Interface Quantity


The objective of setting the lower threshold of the active interface quantity is to ensure the
minimum bandwidth. When the number of active links is lower than the lower threshold, the
Eth-Trunk interface goes Down.

For example, if each physical link provides a bandwidth of 1 Gbit/s and the minimum
bandwidth required is 2 Gbit/s, you can set the lower threshold to 2 or a larger value.
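
As an illustration, the two thresholds could be set on an Eth-Trunk as in the following sketch. The
max active-linknumber and least active-linknumber commands and the port numbers are assumptions
based on common Huawei CLI usage rather than taken from this document, so verify the exact syntax
against the switch module command reference.

# Assumed example: allow at most 5 active links and require at least 2 active links on Eth-Trunk 1.
<HUAWEI> system-view
[~HUAWEI] interface eth-trunk 1
[*HUAWEI-Eth-Trunk1] mode lacp static
[*HUAWEI-Eth-Trunk1] max active-linknumber 5
[*HUAWEI-Eth-Trunk1] least active-linknumber 2
[*HUAWEI-Eth-Trunk1] quit
[*HUAWEI] commit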

3.2.2.6 Minimum Number of Local Active Links


Local preferential forwarding is enabled for Eth-Trunks by default, so that Eth-Trunks do not
forward traffic between switch modules. To ensure the forwarding bandwidth, users can set
the minimum number of local active links. If the number of active links of each switch
module in the Eth-Trunk is smaller than the minimum number of local active links, local
preferential forwarding will be disabled automatically.


3.2.3 Link Aggregation Load Balancing Mode


To improve bandwidth utilization, traffic from the Eth-Trunk is sent through different physical
member links to achieve load balancing. The switch modules support the following load
balancing modes:
1. dst-ip: load balancing based on the destination IP address. In this mode, the system
obtains the specified four-bit value from each of the destination IP address and the TCP
or UDP port number in outbound packets to perform the Exclusive-OR calculation, and
then selects an outbound port from the Eth-Trunk table based on the calculation result.
2. dst-mac: load balancing based on the destination MAC address. In this mode, the system
obtains the specified four-bit value from each of the destination MAC address, VLAN
ID, Ethernet type, and inbound port information to perform the Exclusive-OR
calculation, and then selects an outbound port from the Eth-Trunk table based on the
calculation result.
3. src-ip: load balancing based on the source IP address. In this mode, the system obtains
the specified four-bit value from each of the source IP address and the TCP or UDP port
number in inbound packets to perform the Exclusive-OR calculation, and then selects an
outbound port from the Eth-Trunk table based on the calculation result.
4. src-mac: load balancing based on the source MAC address. In this mode, the system
obtains the specified four-bit value from each of the source MAC address, VLAN ID,
Ethernet type, and inbound port information to perform the Exclusive-OR calculation,
and then selects an outbound port from the Eth-Trunk table based on the calculation
result.
5. src-dst-ip: load balancing based on the Exclusive-OR result of the source and
destination IP addresses. In this mode, the system performs the Exclusive-OR calculation
between the Exclusive-OR results of the dst-ip and src-ip modes, and then selects an
outbound port from the Eth-Trunk table based on the calculation result.
6. src-dst-mac: load balancing based on the Exclusive-OR result of the source and
destination MAC addresses. In this mode, the system obtains the specified four-bit value
from each of the source MAC address, destination MAC address, VLAN ID, Ethernet
type, and inbound port information to perform the Exclusive-OR calculation, and then
selects an outbound port from the Eth-Trunk table based on the calculation result.
7. enhanced: The system selects outbound ports for different packets based on an enhanced
load balancing mode.
The enhanced mode is the default load balancing mode. In enhanced mode, the outbound port is
selected based on the packet type, as described in the following table. A configuration sketch
follows the table.

Table 3-1 Default configuration of the enhanced mode


Inbound Packet Type         Default Load Balancing Mode               Configurable Load Balancing Fields
IPv4 packets                src-ip+dst-ip+l4-src-port+l4-dst-port     src-ip/dst-ip/l4-src-port/l4-dst-port/protocol
IPv6 packets                src-ip+dst-ip+l4-src-port+l4-dst-port     src-ip/dst-ip/l4-src-port/l4-dst-port/protocol
MPLS packets                top-label+2nd-label                       top-label/2nd-label/dst-ip/src-ip
Other Layer 2 (L2) packets  src-mac+dst-mac                           src-mac/dst-mac/src-interface/eth-type
TRILL packets               Ingress nodes: the inner src-mac and      src-mac/dst-mac/src-ip/dst-ip/src-interface/
                            dst-mac modes for L2 packets; the         l4-src-port/l4-dst-port/protocol/eth-type
                            src-ip, dst-ip, l4-src-port, and          (ingress nodes only; the mode on transit and
                            l4-dst-port modes for L3 packets.         egress nodes cannot be configured)
                            Transit/egress nodes: the same modes
                            as ingress nodes.

Remarks:
- The load balancing mode varies with the packet type and is irrelevant to the packet forwarding process.
- For L2 packets, the system identifies the type of the Layer 3 packet carried in the Ethernet frame. For example,
  if an IPv4 packet is identified, the load balancing mode configured for IPv4 packets is applied even if Layer 2
  forwarding is required. If the packet is not an IPv4, IPv6, or MPLS packet, the system applies the L2-specific
  load balancing modes (src-mac, dst-mac, src-interface, and eth-type).
- Load balancing for TRILL packets on transit and egress nodes can be configured only when load-balance enhanced
  profile profile-name is used.
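
For illustration only, a non-default load balancing mode could be selected on an Eth-Trunk as
sketched below. The load-balance command under the Eth-Trunk view is assumed from the mode names
listed above and is not taken from this document; check the switch module command reference for
the exact syntax.

# Assumed example: hash on the source and destination IP addresses instead of the default enhanced mode.
[~HUAWEI] interface eth-trunk 1
[*HUAWEI-Eth-Trunk1] load-balance src-dst-ip
[*HUAWEI-Eth-Trunk1] quit
[*HUAWEI] commit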

3.2.4 Link Aggregation Working Modes


3.2.4.1 Manual Link Aggregation


The Eth-Trunk is manually created, and member interfaces are manually added to the Eth-Trunk.
LACP is not involved. In this mode, all active links forward data in load-sharing mode. If an
active link fails, the remaining active links evenly share its traffic.
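
A minimal sketch of a manually aggregated Eth-Trunk, written in the prompt and command style used
by the configuration examples in this document (the Eth-Trunk ID and member port numbers are
placeholders):

# Assumed example: create Eth-Trunk 1 in manual mode and add two member ports.
<HUAWEI> system-view
[~HUAWEI] interface eth-trunk 1
[*HUAWEI-Eth-Trunk1] mode manual
[*HUAWEI-Eth-Trunk1] quit
[*HUAWEI] interface 10GE 2/17/1
[*HUAWEI-10GE2/17/1] eth-trunk 1
[*HUAWEI-10GE2/17/1] quit
[*HUAWEI] interface 10GE 2/17/2
[*HUAWEI-10GE2/17/2] eth-trunk 1
[*HUAWEI-10GE2/17/2] quit
[*HUAWEI] commit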

3.2.4.2 LACP Link Aggregation


LACP, defined in IEEE 802.3ad, implements dynamic link aggregation and de-aggregation. LACP
communicates with the peer end using Link Aggregation Control Protocol Data Units (LACPDUs).
After member ports are added to the Eth-Trunk, they send LACPDUs to inform the peer end of their
system priorities, MAC addresses, port priorities, port numbers, and operational keys (used to
determine whether the peer ports belong to the same LAG and have the same bandwidth). After
receiving this information, the peer end compares it with the information stored on its own ports
and selects the ports that can be aggregated. The devices at both ends then determine the active
ports to be used. IEEE 802.3ad defines the following two priorities:
1. System LACP priority: A smaller value indicates a higher priority. The end with a
higher priority is the proactive end and selects active ports. The end with a lower priority
is the passive end and uses the active links selected by the proactive end.
2. Port LACP priority: indicates the priority of ports in the same Eth-Trunk. A smaller
value indicates a higher priority. A port with a higher priority is preferentially selected as
an active port.
After a port is added to the Eth-Trunk, the port status changes from Down to Up and LACP
protocol negotiation starts. Figure 3-9 shows the LACP process.

Figure 3-9 LACP link aggregation


LACP has two modes: lacp static and lacp dynamic. They handle link negotiation failures in
different ways. In lacp static mode, the Eth-Trunk becomes Down and cannot forward data
after the LACP negotiation fails. In lacp dynamic mode, the Eth-Trunk becomes Down after
the LACP negotiation fails, but member ports inherit Eth-Trunk VLAN attributes and change
to Indep state to independently perform L2 data forwarding.
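
The following sketch shows an Eth-Trunk in lacp static mode together with the system and port LACP
priorities described above. The lacp priority command locations and the values are assumptions
based on common Huawei CLI usage; verify them against the switch module command reference.

# Assumed example: a lower value means a higher priority, so this end becomes the proactive end
# and port 10GE 2/17/1 is preferred as an active link.
<HUAWEI> system-view
[~HUAWEI] lacp priority 100
[*HUAWEI] interface eth-trunk 2
[*HUAWEI-Eth-Trunk2] mode lacp static
[*HUAWEI-Eth-Trunk2] quit
[*HUAWEI] interface 10GE 2/17/1
[*HUAWEI-10GE2/17/1] eth-trunk 2
[*HUAWEI-10GE2/17/1] lacp priority 100
[*HUAWEI-10GE2/17/1] quit
[*HUAWEI] commit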

3.2.5 Comparison Between Huawei Eth-Trunks and Cisco Port Channels

Both Huawei Eth-Trunks and Cisco port channels support the manual load balancing and
LACP modes.

Table 3-2 Comparison of working modes between Huawei Eth-Trunks and Cisco port channels

Working Mode  Eth-Trunk         Port Channel
Manual        mode manual       channel-group 1 mode on
LACP          mode lacp static  channel-group 1 mode active
LACP          mode lacp static  channel-group 1 mode passive

A Huawei Eth-Trunk determines the proactive and passive ends based on system LACP
priorities, that is, the end with a higher system priority is the proactive end and the end with a
lower system priority is the passive end.
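
The negotiation result (the local and partner system priorities, and which member links are
selected as active) can be checked with the display eth-trunk command; the output fields vary with
the software version.

# Check the LACP negotiation result and member port states of Eth-Trunk 2.
<HUAWEI> display eth-trunk 2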

3.3 M-LAG

3.3.1 Introduction to M-LAG


Multichassis Link Aggregation Group (M-LAG) implements link aggregation among multiple
devices. In a dual-active system shown in Figure 3-10, one device is connected to two
devices through M-LAG to achieve device-level link reliability.

Figure 3-10 M-LAG network



As an inter-device link aggregation technology, M-LAG increases link bandwidth, improves link
reliability, and implements load balancing. It has the following advantages:
l High reliability
M-LAG provides device-level link reliability.
l Simplified network and configuration
M-LAG is a horizontal virtualization technology that virtualizes two dual-homed devices
into one device. M-LAG prevents loops on a Layer 2 network and implements
redundancy, without performing laborious spanning tree protocol configuration. M-LAG
greatly simplifies the network and configuration.
l Independent upgrade
The two devices can be upgraded independently. Services are not interrupted while either
device is being upgraded.

3.3.2 Basic Concepts


In Figure 3-11, the user-side device (switch or host) connects to SwitchA and SwitchB
through M-LAG to constitute a dual-active system. Then SwitchA and SwitchB forward
traffic together to ensure network reliability.

Figure 3-11 M-LAG network


Table 3-3 describes the basic concepts of M-LAG.


Table 3-3 Basic concepts of M-LAG


Concept                      Description
M-LAG master device          The device that is configured with M-LAG and is in master state.
M-LAG backup device          The device that is configured with M-LAG and is in slave state.
                             NOTE: Normally, both the master and backup devices forward service traffic.
Peer-link                    A link between the two directly connected devices on which M-LAG is configured. It is
                             used to exchange negotiation packets and to transmit part of the traffic. To ensure the
                             reliability of the peer-link, you are advised to aggregate multiple links.
Peer-link interface          The interfaces at the two ends of a peer-link.
Dual-Active Detection (DAD)  The M-LAG master and backup devices send DAD packets at an interval of 1s over the
                             link of the DAD-enabled interface. DAD is performed when the peer-link fails.
M-LAG member interface       The Eth-Trunks on the M-LAG master and backup devices that are connected to the
                             user-side host or switch. To improve reliability, you are advised to configure link
                             aggregation in LACP mode.

3.3.3 Implementation
The dual-active system that is set up based on M-LAG provides device-level reliability.
Figure 3-12 shows the M-LAG establishment process. The process includes the following
stages:
1. The devices at both ends of the M-LAG periodically send M-LAG negotiation packets
through the peer-link. When receiving M-LAG negotiation packets from the remote end,
the local end determines whether a DFS group ID in the M-LAG negotiation packets is
the same as that on the local end. If they are the same, device pairing is successful.
2. Both devices compare the DFS group priority in M-LAG negotiation packets to
determine the master and slave status. SwitchB (E9000 switch module) is used as an
example. When receiving packets from SwitchA (E9000 switch module), SwitchB
checks and records information about SwitchA, and compares its DFS group priority
with that of SwitchA. If SwitchA has a higher DFS group priority than SwitchB,
SwitchA is the master device and SwitchB is the backup device. If SwitchA and SwitchB
have the same DFS group priority, the device with a smaller MAC address functions as
the master device.
The master and backup states take effect only when a fault occurs; they do not affect traffic
forwarding during normal operation.
3. After the master and backup roles are negotiated, the two devices send M-LAG dual-active
detection packets every second over the dual-active detection link. When the two devices can
receive packets from each other, the dual-active system starts to work.


M-LAG dual-active detection packets are used to detect dual master devices when the
peer-link fails.
– (Recommended) The dual-active detection link sends packets through the MEth management
interface. The IP address of the MEth management interface that is bound to the DFS group
must be reachable. The VRF instance bound to the MEth management interface also isolates
the detection traffic from network-side routing.
– The dual-active detection link can also send packets over a network-side link. If the routing
neighbor relationship between the M-LAG master and backup devices is set up over the
peer-link, dual-active detection packets are sent along the shortest path. When the peer-link
fails, the packets are sent along the second shortest path, and dual-active detection is delayed
by about 0.5s to 1s.
4. The two devices send M-LAG synchronization packets through the peer-link to
synchronize information from each other in real time. M-LAG synchronization packets
include MAC address entries and ARP entries, so a fault of any device does not affect
traffic forwarding.

Figure 3-12 M-LAG setup


After the M-LAG dual-active system is set up, it starts to work, and the M-LAG master and backup
devices load balance traffic. If a link, device, or peer-link fault occurs, M-LAG ensures nonstop
service transmission. The following describes traffic forwarding when M-LAG works properly and
when a fault occurs, using a switch dual-homed to an Ethernet network and to an IP network as
examples.

3.3.3.1 Dual Homing a Switch to an Ethernet Network


The network is normal:
l Unicast traffic from a non-M-LAG member interface (S-1 for example)


Figure 3-13 Unicast traffic from a non-M-LAG member interface


SwitchA forwards the unicast traffic based on the common forwarding process.


l Unicast traffic from an M-LAG member interface

Figure 3-14 Unicast traffic from an M-LAG member interface


SwitchA and SwitchB load balance traffic.


l Multicast or broadcast traffic from a non-M-LAG member interface (S-1 for example)

Figure 3-15 Multicast or broadcast traffic from a non-M-LAG member interface


After receiving multicast traffic, SwitchA forwards traffic to each next hop. The traffic
that reaches SwitchB is not forwarded to S-2 because unidirectional isolation is
configured between the peer-link and M-LAG member interface.


l Multicast or broadcast traffic from an M-LAG member interface

Figure 3-16 Multicast or broadcast traffic from an M-LAG member interface


Multicast or broadcast traffic from S-2 is load balanced between SwitchA and SwitchB.
The following uses the forwarding process on SwitchA as an example.
After receiving multicast traffic, SwitchA forwards traffic to each next hop. The traffic
that reaches SwitchB is not forwarded to S-2 because unidirectional isolation is
configured between the peer-link and M-LAG member interface.
l Unicast traffic from the network side

Figure 3-17 Unicast traffic from the network side


Unicast traffic sent from the network side to M-LAG member interfaces is forwarded to
the dual-active devices by SwitchA and SwitchB in load balancing mode.
Load balancing is not performed for unicast traffic sent from the network side to non-M-
LAG member interfaces. For example, traffic sent to S-1 is directly sent to SwitchA,
which then forwards the traffic to S-1.
l Multicast or broadcast traffic from the network side


Figure 3-18 Multicast or broadcast traffic from the network side


Multicast or broadcast traffic from the network side is load balanced between SwitchA
and SwitchB. The following uses the forwarding process on SwitchA as an example.
SwitchA forwards traffic to each user-side interface. The traffic that reaches SwitchB is
not forwarded to S-2 because unidirectional isolation is configured between the peer-link
and M-LAG member interface.
A fault occurs:
l Uplink fails

Figure 3-19 Uplink fails



When a switch is dual-homed to an Ethernet network, DAD packets are usually transmitted
through the management network. DAD detection on the M-LAG master device is not
affected, the dual-active system is not affected, and the M-LAG master and backup
devices can still forward traffic. Because the uplink of the M-LAG master device fails,
traffic passing through the M-LAG master device is forwarded over the peer-link.
l Downlink Eth-Trunk fails


Figure 3-20 Downlink Eth-Trunk fails



The M-LAG master and backup states remain unchanged, and traffic is switched to the
other Eth-Trunk. The faulty Eth-Trunk becomes Down, and the dual-homing networking
changes into a single-homing networking.
l M-LAG master device fails

Figure 3-21 M-LAG master device fails



The M-LAG backup device becomes the master device and continues forwarding traffic,
and its Eth-Trunk is still in Up state. The Eth-Trunk on the master device becomes
Down, and the dual-homing networking changes into a single-homing networking.
NOTE

If the M-LAG backup device fails, the master and backup states remain unchanged and the Eth-Trunk
of the M-LAG backup device becomes Down. The Eth-Trunk on the master device is still in Up state
and continues forwarding traffic. The dual-homing networking changes into a single-homing
networking.
l Peer-link fails


Figure 3-22 Peer-link fails



On a dual-homing Ethernet, VXLAN, or IP network where M-LAG is deployed, when the peer-link
fails but the DAD status is normal (both devices are still alive), all interfaces on the backup
device except the management interface, peer-link interface, and stack interfaces enter the
Error-Down state. When the faulty peer-link recovers, the M-LAG interfaces in Error-Down state
go Up after 2 minutes by default, and the other physical interfaces in Error-Down state go Up
automatically.
You can run the m-lag unpaired-port suspend command to specify the interfaces that
need to enter the Error-Down state when the peer-link fails, and the interfaces that are
not specified will not enter the Error-Down state.
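
A minimal sketch of this command (the interface number is a placeholder) might look as follows:

# Assumed example: only 10GE 2/17/1 on the backup device enters the Error-Down state if the
# peer-link fails; interfaces that are not specified stay Up.
[~CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] m-lag unpaired-port suspend
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit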

3.3.3.2 Dual Homing a Switch to an IP Network


The network is normal:
l Unicast traffic from a non-M-LAG member interface (S-1 for example)

Figure 3-23 Unicast traffic from a non-M-LAG member interface


SwitchA forwards the unicast traffic based on the common forwarding process.


l Unicast traffic from an M-LAG member interface


Figure 3-24 Unicast traffic from an M-LAG member interface


SwitchA and SwitchB load balance traffic.


l Multicast traffic from a non-M-LAG member interface (S-1 for example)

Figure 3-25 Multicast traffic from a non-M-LAG member interface


The forwarding path of multicast traffic depends on the selected routes.


l Multicast traffic from an M-LAG member interface

Figure 3-26 Multicast traffic from an M-LAG member interface


SwitchA and SwitchB load balance traffic.


NOTE

To ensure normal Layer 3 multicast transmission on the M-LAG, a Layer 3 direct link (line marked in
red) needs to be deployed between the M-LAG master and backup devices.


l Unicast traffic from the network side

Figure 3-27 Unicast traffic from the network side


Unicast traffic sent from the network side to M-LAG member interfaces is forwarded to
the dual-active devices by SwitchA and SwitchB in load balancing mode.
Load balancing is not performed for unicast traffic sent from the network side to non-M-
LAG member interfaces. For example, traffic sent to S-1 is directly sent to SwitchA,
which then forwards the traffic to S-1.
l Multicast traffic from the network side

Figure 3-28 Multicast traffic from the network side


Normally, network-side multicast traffic is forwarded to S-2 through the M-LAG master
device (SwitchA) only. The M-LAG backup device (SwitchB) cannot forward multicast
traffic through M-LAG member interfaces.
NOTE

To ensure normal Layer 3 multicast transmission on the M-LAG, a Layer 3 direct link (line marked in
red) needs to be deployed between the M-LAG master and backup devices. Some multicast traffic can
be forwarded through the Layer 3 link.

A fault occurs:
l Downlink Eth-Trunk fails


Figure 3-29 Downlink Eth-Trunk fails



The M-LAG master and backup states remain unchanged, and traffic is switched to the
other Eth-Trunk. The faulty Eth-Trunk becomes Down, and the dual-homing networking
changes into a single-homing networking.
l M-LAG master device fails

Figure 3-30 M-LAG master device fails



The M-LAG backup device becomes the master device and continues forwarding traffic,
and its Eth-Trunk is still in Up state. The Eth-Trunk on the master device becomes
Down, and the dual-homing networking changes into a single-homing networking.
NOTE

If the M-LAG backup device fails, the master and backup states remain unchanged and the Eth-Trunk
of the M-LAG backup device becomes Down. The Eth-Trunk on the master device is still in Up state
and continues forwarding traffic. The dual-homing networking changes into a single-homing
networking.
l Peer-link fails


Figure 3-31 Peer-link fails



On a dual-homing Ethernet, VXLAN, or IP network where M-LAG is deployed, when the peer-link
fails but the DAD status is normal (both devices are still alive), all interfaces on the backup
device except the management interface, peer-link interface, and stack interfaces enter the
Error-Down state. When the faulty peer-link recovers, the M-LAG interfaces in Error-Down state
go Up after 2 minutes by default, and the other physical interfaces in Error-Down state go Up
automatically.
You can run the m-lag unpaired-port suspend command to specify the interfaces that
need to enter the Error-Down state when the peer-link fails, and the interfaces that are
not specified will not enter the Error-Down state.

3.3.4 M-LAG Networking Modes

3.3.4.1 Connecting a Server in Dual-homing Mode


As shown in Figure 3-32, to ensure reliability, a server is often connected to a network
through link aggregation. If the device connected to the server fails, services are interrupted.
To prevent this problem, a server can connect to a network through M-LAG. That is, deploy
M-LAG between SwitchA and SwitchB and connect the server to SwitchA and SwitchB.
SwitchA and SwitchB load balance traffic. When one device fails, traffic can be rapidly
switched to the other device to ensure nonstop service transmission.

NOTE

The configuration of dual homing a server is the same as common link aggregation configuration. Ensure that
the server and switches use the same link aggregation mode. The LACP mode at both ends is recommended.
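
On the switch side, the M-LAG elements introduced in section 3.3.2 (DFS group, peer-link, and
M-LAG member Eth-Trunk) are configured on both switch modules. The sketch below is illustrative
only: the command syntax follows typical CloudEngine-style M-LAG configuration and is not taken
from this document, and the DFS group priority, DAD source address, Eth-Trunk IDs, and port
numbers are assumptions. Verify the exact commands against the CX switch module configuration
guide.

# Assumed sketch for SwitchA; repeat on SwitchB with its own DFS group priority and DAD source IP.
[~SwitchA] dfs-group 1
[*SwitchA-dfs-group-1] priority 150
[*SwitchA-dfs-group-1] dual-active detection source ip 192.168.10.1
[*SwitchA-dfs-group-1] quit
[*SwitchA] interface eth-trunk 0
[*SwitchA-Eth-Trunk0] mode lacp static
[*SwitchA-Eth-Trunk0] peer-link 1
[*SwitchA-Eth-Trunk0] quit
[*SwitchA] interface eth-trunk 1
[*SwitchA-Eth-Trunk1] mode lacp static
[*SwitchA-Eth-Trunk1] dfs-group 1 m-lag 1
[*SwitchA-Eth-Trunk1] quit
[*SwitchA] commit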


Figure 3-32 Connecting a server in dual-homing mode


3.3.4.2 Multi-level M-LAG


As shown in Figure 3-33, after M-LAG is deployed between SwitchA and SwitchB, M-LAG
is deployed between SwitchC and SwitchD. The two M-LAGs are connected. This
deployment simplifies networking and allows more servers to be connected to the network in
dual-homing mode. Before deploying multi-level M-LAG, configure Virtual Spanning Tree
Protocol (V-STP).

Figure 3-33 Networking of multi-level M-LAG



3.4 NIC Teaming

3.4.1 Overview
NIC teaming allows multiple physical NICs on a server to be bound into one virtual NIC using
software. The server then presents a single NIC to the external network and a single network
connection to applications. After NICs are bonded, data can be sent in load-sharing or
active/standby mode. If one link fails, traffic is switched over (or an active/standby switchover
is performed) to maintain server network reliability.

3.4.2 NIC Teaming in Windows


NIC teaming, also known as load balancing and failover (LBFO), allows multiple network
adapters on a server to be grouped as a team for bandwidth aggregation and traffic failover to
maintain connectivity in the event of a network component failure. It is supported by
Windows Server 2012 and later versions, in both Server Core and full GUI installations. Versions
earlier than Windows Server 2012 usually rely on the teaming tools provided with the NICs; Intel,
Emulex, and Broadcom all provide GUI configuration tools for this purpose.

Figure 3-34 Creating a NIC team

On the GUI, enter a team name, select the NICs to be bound, and set the working mode of the
NIC team.


Figure 3-35 Selecting the NICs to be bound

Working modes of NIC teams are as follows:


1. Static Teaming: manual load balancing mode. The corresponding switch module ports
must be added to the same Eth-Trunk.
2. Switch Independent: the team works independently of the switch (for example, in
active/standby mode). The switch modules do not need to be configured.
3. LACP: After LACP negotiation succeeds, each link periodically exchanges LACPDU heartbeats
to improve link reliability. The corresponding switch module ports must be added to an
Eth-Trunk in lacp static mode (see the switch-side sketch after this list).
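
For the Static Teaming and LACP modes, the internal switch module ports that face the compute
node must be aggregated accordingly (see also Table 3-5). The following hedged sketch uses
placeholder port and Eth-Trunk numbers:

# Assumed example: Windows LACP teaming maps to an Eth-Trunk in lacp static mode.
# For Static Teaming use "mode manual" instead; Switch Independent needs no Eth-Trunk.
[~CX310_2] interface eth-trunk 10
[*CX310_2-Eth-Trunk10] mode lacp static
[*CX310_2-Eth-Trunk10] quit
[*CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] eth-trunk 10
[*CX310_2-10GE2/1/1] quit
[*CX310_2] commit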


Figure 3-36 Setting the working mode of the NIC team

3.4.3 Bonding in Linux


The NIC teaming function provided by Linux is called bonding, which can be configured through a
configuration file or by using a GUI configuration tool. The directory of the configuration file
varies depending on the distribution (for example, on SLES 11 the configuration files are in the
/etc/sysconfig/network/ directory). NIC bonding supports multiple working modes, such as manual
load balancing, active/standby, and LACP. For example, to configure NIC bonding on SLES 11,
perform the following steps:
1. Go to /etc/sysconfig/network/, create the configuration file ifcfg-eth0/ifcfg-eth1 for
NICs eth0 and eth1 respectively, and add the following information to the file (if the file
exists, modify it as follows):


Figure 3-37 Configuration file

2. Go to /etc/sysconfig/network, create the file ifcfg-bonding0, and add the following
information to the file (see the sketch after these steps):

Figure 3-38 Configuration file

3. STARTMODE='onboot' indicates that bonding0 automatically takes effect at system startup.
mode=1 indicates that the bonded network ports work in active/standby mode. miimon=100
indicates that the link status is checked every 100 ms.
4. Run /etc/init.d/network restart to make the bonding configuration take effect.
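
For reference, the content added to ifcfg-bonding0 in steps 2 and 3 typically resembles the
following sketch for SLES 11. The IP address and slave NIC names are placeholders, and the option
names should be checked against the SLES documentation for the version in use.

# /etc/sysconfig/network/ifcfg-bonding0 (assumed example: active/standby bonding of eth0 and eth1)
STARTMODE='onboot'
BOOTPROTO='static'
IPADDR='192.168.100.10/24'
BONDING_MASTER='yes'
BONDING_SLAVE_0='eth0'
BONDING_SLAVE_1='eth1'
BONDING_MODULE_OPTS='mode=1 miimon=100'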

3.4.4 NIC Teaming in vSphere

3.4.4.1 vSphere Virtual Network Components


Figure 3-39 shows vSphere virtual network components.


Figure 3-39 vSphere virtual network components

3.4.4.2 vSphere Virtual Switches


vSphere provides two types of virtual switches: virtual standard switches (VSSs) and virtual
distributed switches (VDSs).

VSS

The VSS running mode is similar to that of a physical Ethernet switch. The VSS detects VMs
that logically connect to its virtual ports and forwards traffic to the correct VM based on the
detection result. Physical Ethernet adapters (also called uplink adapters) can be used to
connect a virtual network and a physical network so as to connect vSphere standard switches
to physical switches. This connection is similar to the connection between physical switches
for creating a large-scale network. Although similar to physical switches, vSphere VSSs do
not provide certain advanced functions of physical switches. Figure 3-40 shows the
architecture of vSphere VSSs.

Figure 3-40 Architecture of vSphere VSSs


VDS
The VDS can be used as a single switch for all associated hosts in a data center, to provide
centralized deployment, management, and monitoring of the virtual network. The
administrator can configure a vSphere distributed switch on a vCenter server. This
configuration will be sent to all hosts associated with the switch, which allows consistent
network configuration when VMs are migrated across hosts. Figure 3-41 shows the
architecture of vSphere VDSs.

Figure 3-41 Architecture of vSphere VDSs

Like the physical network, the virtual network also needs to improve network connection
reliability and single-port bandwidth. NIC teaming is provided for the networking in vSphere
virtual environments. vSphere NIC teaming allows the traffic between the physical and virtual
networks to be shared by some or all members and implements switchovers when a hardware
fault or network interruption occurs. Table 3-4 describes the four types of NIC teaming
provided by VMware vSphere 5.5 VSSs.

Table 3-4 NIC teaming policies


Item                                            Description
Route based on the originating virtual port ID  Choose an uplink based on the virtual port through which traffic is
                                                transmitted to the virtual switch.
Route based on IP hash                          Choose an uplink based on a hash of the source and destination IP
                                                addresses of each packet.
Route based on source MAC hash                  Choose an uplink based on a hash of the source Ethernet MAC address.
Use explicit failover order                     Always use the highest-order uplink, from the list of active adapters,
                                                that passes the failover detection criteria.

The NIC teaming policies can be selected from the Load Balancing drop-down list shown in
Figure 3-42.

Figure 3-42 NIC teaming policies

In addition to NIC teaming policies supported by VSSs, NIC teaming provided by port groups
of VMware vSphere 5.5 VDSs also includes "Route based on physical NIC load", as shown in
Figure 3-43. Only vSphere Enterprise Plus supports distributed switches. vSphere distributed
switches support LACP, as shown in Figure 3-43 and Figure 3-44.


Figure 3-43 NIC teaming policies supported by vSphere distributed switches

Figure 3-44 LACP supported by vSphere distributed switches

For the load balancing algorithms, the fields used as hash inputs in the physical environment
include the source and destination IP addresses, the TCP/UDP port numbers, and the source and
destination MAC addresses. In LACP load balancing mode, the virtual switch and the physical
switch negotiate with each other to determine the forwarding policy. In the vSphere virtual
environment, the hash inputs include the source and destination IP addresses, the source MAC
address, and the virtual switch port ID.

3.4.5 Mapping Between NIC Teaming and Switch Modules


The working mode of a NIC team varies depending on the switch module configuration.
Some modes take effect only after switch modules are correctly configured. The main
working modes of NIC teams are as follows:

1. Switch Independent: The working mode of a NIC team does not depend on a switch.
The NIC teaming works independently. For example, mode = 1/5/6 in Linux, Route
based on the originating virtual port ID, Route based on source MAC hash, and Use
explicit failover order in VMware, and Switch Independent mode in Windows.


2. Manual load balancing: The NIC team distributes traffic over member links by using a hash
algorithm, without any LACP negotiation. For example, mode = 0/2/3 in Linux, Route
based on IP hash in VMware, and Static Teaming in Windows.
3. LACP load balancing: The server NICs and the switch modules negotiate the aggregation
through LACP. Physical links become active only after the LACP negotiation succeeds.

Table 3-5 Mapping between working modes of NIC teams and switch modules
Operating System  NIC Teaming Mode                                     Eth-Trunk Mode
Linux             balance-rr or 0 (default), balance-xor or 2,         mode manual
                  broadcast or 3
Linux             active-backup or 1 (recommended), balance-tlb or 5,  N/A
                  balance-alb or 6
Linux             802.3ad (LACP) or 4 (recommended)                    mode lacp static
VMware            Route based on the originating virtual port ID       N/A
                  (recommended), Route based on source MAC hash,
                  Use explicit failover order
VMware            Route based on the source and destination IP hash    mode manual
VMware            LACP (recommended)                                   mode lacp static
Windows           Switch Independent                                   N/A
Windows           Static Teaming                                       mode manual
Windows           LACP                                                 mode lacp static

3.5 FC Technology
This section describes the working principles of FC technology.

3.5.1 Basic Concepts


As shown in Figure 3-45, FC involves the following concepts: Fabric, FCF, NPV, WWN,
FC_ID, Domain_ID, Area_ID, Port_ID, FC-MAP, Zone, and port roles.


Figure 3-45 FC networking

l Fabric

A fabric is the network topology where servers and storage devices are interconnected
through one or more switches.

l FCF

A Fibre Channel Forwarder (FCF) is a switch that supports both FCoE and FC protocol stacks
and is used to connect to a SAN or LAN. An FCF forwards FCoE packets and encapsulates or
decapsulates them.

l NPV

An N-Port Virtualization (NPV) switch is at the edge of a fabric network and between ENodes
and FCFs, forwarding traffic from node devices to FCF switches.

l WWN

A World Wide Name (WWN) identifies an entity in a fabric network. A WWN is either a
World Wide Node Name (WWNN) that identifies a node device or a World Wide Port Name
(WWPN) that identifies a device port. Each entity in a SAN is assigned with a WWN before
the entity is delivered from the factory.


l FC ID

Figure 3-46 FC_ID format

An FC_ID is an FC address. In a SAN, the FC protocol accesses entities by using their FC


addresses. An FC address uniquely identifies an N-Port of a node device.
l Domain_ID
A domain ID in a SAN uniquely identifies an FC switch. Routing and forwarding among FC
switches are based on domain IDs.
l Area_ID
One or more N_Ports of a node device can be assigned to an area, which is identified by an
area ID.
l Port_ID
A port ID identifies an N_Port.
l FC-MAP

Figure 3-47 Fabric-provided MAC address (FPMA)

FCoE frames are forwarded by using locally unique MAC addresses (unique only in the local
Ethernet subnetwork). Either the FCF assigns a locally unique MAC address to an ENode, or the
ENode specifies its own locally unique MAC address and informs the FCF. In FPMA mode, the FCF
assigns locally unique MAC addresses to ENodes. An FPMA consists of a 24-bit FCoE MAC address
prefix (FC-MAP) followed by the FC_ID.
l Zone
N_Ports are added to different zones so that the N_Ports are isolated. A zone set is a set of
zones. It is a logical control unit between zones and instances and simplifies configurations.
Each instance can have only one activated zone set.
l Port roles


Figure 3-48 Port roles

In a traditional FC network, FC devices interact with each other through FC ports. FC ports
include N-Ports, F-Ports, and NP-Ports.
1. Node port (N_Port): indicates a port on an FC host (server or storage device) and
connects to an FC switch.
2. Fabric port (F_Port): indicates a port on an FC switch and connects to an FC host,
enabling the FC host to access the fabric.
3. N_Port Proxy (NP_Port): indicates a port on an NPV switch and connects to an FCF
switch.

3.5.2 Working Principles


A fabric network formed by FC switches provides data transmission services. The following
introduces the FC SAN communication process by describing how a server accesses a storage
array, as shown in Figure 3-49.

Figure 3-49 FC SAN communication process

1. When servers or storage arrays go online in the fabric network, they request FC switches
to provide services and register with FC switches through fabric login (FLOGI) packets.
2. FC switches allocate FC addresses to servers and storage devices.
3. Servers and storage devices send name service registration requests to FC switches,
which create and maintain the mapping table between FC addresses and WWNs.


4. A server sends session requests to a target node through a port login (PLOGI).
5. After the session is established between a server and a storage device through a PLOGI,
FC data can be transmitted. An FC switch determines the route and forwards data based
on FC addresses of the server and the storage device.

3.5.3 FCF
A Fibre Channel Forwarder (FCF) switch supports both FCoE and FC protocol stacks for
connecting to SAN and LAN environments. In an FC SAN, an FCF is mainly used for
transmitting FC data. An FCF forwards FCoE packets and encapsulates or decapsulates them.

Figure 3-50 FCF network

As shown in Figure 3-50, F_Ports of the FCF directly connect to N_Ports of a server and a
storage array. Each FCF switch has a domain ID. Each FC SAN supports a maximum of 239
domain IDs. Therefore, each FC SAN can contain a maximum of 239 FCF switches.

3.5.4 NPV
A SAN requires a large number of edge switches that directly connect to node devices. N-Port
Virtualization (NPV) switches do not occupy domain IDs, which allows a SAN to scale beyond the
limit of 239 switches.


Figure 3-51 NPV network

As shown in Figure 3-51, an NPV switch is located at the edge of a fabric network and
between node devices and an FCF. The NPV switch uses F_Ports to connect to N_Ports of
node devices and uses NP_Ports to connect to F_Ports of the FCF switch. As a result, the node
devices connect to the fabric network through the NPV switch, which forwards traffic from all
node devices to the core switch.

For a node device, the NPV switch is an FCF switch that provides F_Ports. For an FCF
switch, an NPV switch is a node device that provides N_Ports.

3.5.5 Zone
In an FCoE network, users can use zones to control access between node devices to improve
network security.

l Zone

A zone contains multiple zone members. A node device can join different zones at the same
time. Node devices in the same zone can access each other. Node devices in different zones
cannot access each other. A zone member can be defined in the following ways:

1. Zone alias: After a zone alias joins a zone, the members in the zone alias also join the
zone.
2. FC_ID: indicates an FC address. Node devices in an FC network access each other
through FC addresses.


3. FCoE port: Node devices in an FCoE network interact with each other through FCoE
ports.
4. WWNN (World Wide Node Name): A WWNN is a 64-bit address used to identify a
node device in an FC network.
5. WWPN (World Wide Port Name): A WWPN is a 64-bit address used to identify a port of
a node device in an FC network.

Figure 3-52 Zone network

As shown in Figure 3-52, users can control access between node devices by adding node
devices to different zones. For example, array B can only interact with Server B and Server C,
not Server A.

l Zone set
A zone set contains multiple zones. A zone can join different zone sets at the same time.
Zones in a zone set are valid only after the zone set is activated. Each instance can have only
one activated zone set.

l Zone alias
Applying zone aliases to zone configurations simplifies the configurations. If multiple zone
members need to join multiple zones, you can add the zone members to a zone alias and then
add the zone alias to a zone as a zone member. This avoids adding zone members one by one.


Figure 3-53 Zone alias

As shown in Figure 3-53, if Members C and D both need to join Zones A and B, you can add
Members C and D to Zone Alias A, and then add Zone Alias A to Zones A and B. This
simplifies configurations.

3.6 DCB

3.6.1 DCB Overview


Data Center Bridging (DCB) is a set of enhancements to Ethernet for use in data center
environments. It is defined by the IEEE 802.1 working group. DCB is used to build lossless
Ethernet to meet the quality of service (QoS) requirements of a converged data center
network. Table 3-6 describes three features defined by DCB, that is, PFC, ETS, and DCBX.

Table 3-6 Features defined by DCB

Feature                                  Description
Priority-based Flow Control (PFC)        Implements priority-based flow control on a shared link.
Enhanced Transmission Selection (ETS)    Implements priority-based bandwidth control on a shared link.
Data Center Bridging Exchange (DCBX)     Provides auto-negotiation of PFC and ETS parameters between Ethernet devices.

3.6.2 PFC
PFC is also called Per Priority Pause or Class Based Flow Control (CBFC). It is an
enhancement to the Ethernet Pause mechanism. PFC is a priority-based flow control
mechanism. As shown in Figure 3-54, the transmit interface of Device A is divided into eight
queues of different priorities. The receive interface of Device B is divided into eight buffers.
The eight queues and eight buffers are in one-to-one correspondence. When a receive buffer
on Device B is about to become congested, Device B sends a STOP signal to Device A. Device A stops
sending packets in the corresponding priority queue when receiving the STOP signal.


Figure 3-54 PFC working principle

PFC allows traffic in one or multiple queues to be stopped, which does not affect data
exchange on the entire interface. Data transmission in each queue can be separately stopped or
resumed without affecting other queues. This feature enables various types of traffic to be
transmitted on the same link. The system does not apply the backpressure mechanism to the
priority queues with PFC disabled and directly discards packets in these queues when
congestion occurs.

3.6.3 ETS
The converged data center network bears three types of traffic: inter-process communication
(IPC) traffic, local area network (LAN) traffic, and storage area network (SAN) traffic. The
converged network has high QoS requirements. The traditional QoS cannot meet
requirements of the converged network, whereas Enhanced Transmission Selection (ETS)
uses hierarchical scheduling to guarantee QoS on the converged network. ETS provides two
levels of scheduling: scheduling based on the priority group (PG) and scheduling based on the
priority. Figure 3-55 illustrates how ETS works. On an interface, PG-based scheduling is
performed first, and then priority-based scheduling is performed.


Figure 3-55 ETS scheduling model

A PG is a group of priority queues with the same scheduling attributes. Users can add queues
with different priorities to a PG. PG-based scheduling is called level-1 scheduling. ETS
defines three PGs: PG0 for LAN traffic, PG1 for SAN traffic, and PG15 for IPC traffic.
As defined by ETS, PG0, PG1, and PG15 use priority queue (PQ)+Deficit Round Robin
(DRR). PG15 uses PQ to schedule delay-sensitive IPC traffic. PG0 and PG1 use DRR. In
addition, bandwidth can be allocated to PGs based on actual networking.
As shown in Figure 3-56, the queue with priority 3 carries FCoE traffic and is added to the
SAN group (PG1). Queues with priorities 0, 1, 2, 4, and 5 carry LAN traffic and are added to
the LAN group (PG0). The queue with priority 7 carries IPC traffic and is added to the IPC
group (PG15). The total bandwidth of the interface is 10 Gbit/s. Each of PG1 and PG0 is
assigned 50% of the total bandwidth, that is, 5 Gbit/s.


Figure 3-56 Configure the PG bandwidth

At t1 and t2, all traffic can be forwarded because the total traffic on the interface is within the
interface bandwidth. At t3, the total traffic exceeds the interface bandwidth and LAN traffic
exceeds the given bandwidth. At this time, LAN traffic is scheduled based on ETS parameters
and 1 Gbit/s LAN traffic is discarded.
ETS also provides PG-based traffic shaping. The traffic shaping mechanism limits traffic
bursts in a PG to ensure that traffic in this group is sent out at an even rate.
In addition to PG-based scheduling, ETS also provides priority-based scheduling, that is,
level-2 scheduling, for queues in the same PG. Queues in the same PG support queue
congestion management, queue shaping, and queue congestion avoidance.
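
Using the same commands as the configuration instance in section 3.6.5, the 50/50 bandwidth split
in this example could be expressed as the sketch below. The profile name is a placeholder, and the
priority-group 15 mapping for queue 7 is assumed by analogy with the priority-group 0 syntax shown
in this document; verify it against the switch module command reference.

# Assumed example: queue 3 stays in PG1 (by default), queues 0-2 and 4-5 go to PG0, and queue 7 is
# mapped to PG15. PG0 is given a 50% DRR weight; as in the section 3.6.5 example, the remaining
# weight falls to PG1.
[~CX310_2] dcb ets-profile ets2
[*CX310_2-ets-ets2] priority-group 0 queue 0 to 2 4 to 5
[*CX310_2-ets-ets2] priority-group 15 queue 7
[*CX310_2-ets-ets2] priority-group 0 drr weight 50
[*CX310_2-ets-ets2] quit
[*CX310_2] commit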

3.6.4 DCBX
To implement lossless Ethernet on a converged data center network, both ends of an FCoE
link must have the same PFC and ETS parameter settings. Manual configuration of PFC and
ETS parameters may increase the administrator's workload and cause configuration errors.
DCBX, a link discovery protocol, enables devices at both ends of a link to discover and
exchange DCB configurations. This greatly reduces the administrator's workload. DCBX
provides the following functions:
1. Detects the DCB configurations of the peer device.
2. Detects the DCB configuration errors of the peer device.
3. Configures DCB parameters for the peer device.
DCBX enables DCB devices at both ends to exchange the following DCB configurations:
1. ETS PG information
2. PFC
DCBX encapsulates DCB configurations into Link Layer Discovery Protocol (LLDP) type-
length-values (TLVs) so that devices at both ends of a link can exchange DCB configurations.


3.6.5 Configuration Instance


CNAs on compute nodes connect to an FCoE storage device through CX310s and Cisco N5K
switches, as shown in Figure 3-57. Configure DCB features for the FCoE links between the
CX310s and N5K switches. This section provides only the configuration procedure of the
CX310 in slot 2X. The configuration procedure of the CX310 in slot 3X is the same.

Figure 3-57 DCB application networking

# Configure DCBX functions.


[~CX310_2] lldp enable
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/17/1] quit

# Enable the PFC function of the interface.


[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] dcb pfc enable mode auto
[*CX310_2-10GE2/17/1] quit

# Create an ETS profile and add queue 3 to PG1 (by default), and other queues to PG0. PG15
is empty.
[*CX310_2] dcb ets-profile ets1
[*CX310_2-ets-ets1] priority-group 0 queue 0 to 2 4 to 7

# Configure PG-based flow control and set DRR weights of PG0 and PG1 to 60% and 40%
respectively.
[*CX310_2-ets-ets1] priority-group 0 drr weight 60
[*CX310_2-ets-ets1] quit

# Apply the ETS Profile to the FCoE interface.


[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] dcb ets enable ets1
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit


3.7 FCoE

3.7.1 FCoE Overview


The LAN and SAN in a traditional data center are deployed and maintained independently.
The LAN implements communication between servers and between servers and clients, and
the SAN implements communication between servers and storage devices. As the number of
servers dramatically increases with rapid development of data centers, independent
deployment of LANs and SANs results in the following problems:
1. Complicated networking: Independent LANs and SANs result in poor flexibility in
service deployment and network expansion, and high management costs.
2. Low energy efficiency: Four to six NICs must be installed on each server, including
NICs connected to LANs and host bus adapters (HBAs) connected to SANs. The use of
diversified NICs on servers increases the power consumption and cooling cost for data
centers.
FCoE and DCB are developed to resolve the preceding problems:
1. FCoE encapsulates FC frames into Ethernet frames, allowing LANs and SANs to share
network resources. With FCoE technology, LANs and SANs can be converged.
2. DCB builds a lossless Ethernet network in a data center network. This technology
enables traditional Ethernet to implement congestion control like the FC SAN and
ensures transmission quality for FCoE convergence services.

3.7.2 Basic Concepts


Figure 3-58 shows FCoE networking. The basic concepts of FCoE include ENode, FCoE
Forwarder (FCF), Fabric, FCoE virtual link, FCoE Initialization Protocol (FIP), FIP Snooping
Bridge (FSB), port roles, and FCoE VLAN.


Figure 3-58 FCoE networking

3.7.2.1 ENode
An ENode is a CNA that supports FCoE and FC. A traditional server houses two network
adapters: an NIC connected to a LAN and an HBA connected to a SAN. The CNA provides
both NIC and HBA functions. It can forward Ethernet data, process FCoE packets in upper
layers, and encapsulate and decapsulate FCoE frames.

3.7.2.2 FCoE Virtual Link


An FCoE virtual link is a point-to-point logical link between FCoE devices, for example, between
an ENode and an FCF. When the ENode and FCF are connected through a lossless Ethernet network,
the physical connection between them is not point-to-point; the FCoE virtual link provides the
point-to-point logical connection in this case.

3.7.2.3 FIP
FIP is an L2 protocol that discovers FC terminals on an FCoE network, implements fabric
login, and establishes FCoE virtual links. An ENode can log in to the fabric over FIP to
communicate with the target FC device. FIP can also maintain FCoE virtual links.

3.7.2.4 FCoE VLAN


FCoE frames are forwarded in specified VLANs as defined in FC-BB-5. In the FC protocol
stack, FC devices support multiple virtual storage area networks (VSANs), which are similar
to Ethernet VLANs. FC traffic in different VSANs is identified by FCoE VLANs during
FCoE encapsulation. An FCoE virtual link corresponds to one FCoE VLAN. An FCoE
VLAN bears only FCoE traffic and does not bear any Ethernet traffic, such as IP traffic.


3.7.3 FCoE Packet Format


In the traditional FC protocol, the FC protocol stack is divided into five layers:
1. FC-0: defines the bearer medium type.
2. FC-1: defines the frame coding and decoding mode.
3. FC-2: defines the frame division protocol and flow control mechanism.
4. FC-3: defines general services.
5. FC-4: defines the mapping from upper-layer protocols to FC.
FC-0 and FC-1 in the FCoE protocol stack map to the physical and MAC layers of IEEE 802.3
Ethernet, respectively. The FCoE protocol stack adds FCoE mapping as an adaptation layer
between the upper-layer FC protocol stack and the lower-layer Ethernet protocol stack, as shown
in Figure 3-59.

Figure 3-59 FCoE packet format

FCoE encapsulates an FC frame into an Ethernet frame. Figure 3-60 shows FCoE frame
encapsulation.

Figure 3-60 FCoE packet protocol

1. The Ethernet Header specifies the source and destination MAC addresses, the Ethernet
frame type, and the FCoE VLAN.
2. The FCoE Header specifies the FCoE frame version number and flow control
information.
3. Similar to a traditional FC frame, the FC Header specifies the source and destination
addresses of an FC frame.

3.7.4 FIP
FIP, an FCoE control protocol, establishes and maintains FCoE virtual links between FCoE
devices, for example, between ENodes and FCFs. In the process of creating a virtual link:


1. FIP discovers an FCoE VLAN and the FCoE virtual interface of the remote device.
2. FIP completes initialization tasks, such as fabric login (FLOGI) and fabric discovery
(FDISC), for the FCoE virtual link.

After an FCoE virtual link is set up, FIP maintains the FCoE virtual link in the following way:

1. Periodically checks whether the FCoE virtual interfaces at both ends of the FCoE virtual link
are reachable.
2. Removes the FCoE virtual link through fabric logout (FLOGO).

The following figure shows the process of setting up an FCoE virtual link between an ENode
and FCF. The ENode and FCF exchange FIP frames to establish the FCoE virtual link. After
the FCoE virtual link is set up, FCoE frames are transmitted over the link.

An FCoE virtual link is set up through three phases: FIP VLAN discovery, FIP FCF
discovery, and FIP FLOGI and FDISC. The FIP FLOGI and FDISC process is similar to the
FLOGI and FDISC process defined in traditional FC protocol.

3.7.4.1 FIP VLAN Discovery


FIP VLAN discovery discovers the FCoE VLANs that will transmit FCoE frames. In this
phase, an ENode discovers all potential FCoE VLANs but does not select an FCF. The FIP
VLAN discovery process is as follows:


1. An ENode sends an FIP VLAN discovery request packet (FIP VLAN request) to the
multicast MAC address All-FCF-MAC 01-10-18-01-00-02. All FCFs listen for packets
destined for this MAC address.
2. All FCFs report one or more FCoE VLANs to the ENode through a common VLAN.
These FCoE VLANs are available for the ENode's VN_Port login. FIP VLAN discovery is
an optional phase as defined in FC-BB-5. An FCoE VLAN can be manually configured
by an administrator or dynamically discovered using FIP VLAN discovery.

3.7.4.2 FIP FCF Discovery


FIP FCF discovery is used by ENodes to discover FCFs that allow logins. The FIP FCF
discovery process is as follows:
1. Each FCF periodically sends FIP FCF discovery advertisement messages in each
configured FCoE VLAN. The advertisement messages are destined for the multicast
MAC address All-ENode-MAC 01-10-18-01-00-01 to which all ENodes can listen. The
FIP FCF discovery advertisement messages contain the FCF MAC address and FCoE
virtual link parameters, such as the FCF priority and the timeout interval of FIP packets.
2. The ENode obtains FCF information from the received discovery advertisement
messages, selects an FCF with the highest priority, and sends a unicast FIP FCF
discovery solicitation message to the selected FCF.
3. After receiving the discovery solicitation message, the FCF sends a unicast discovery
advertisement message, allowing the ENode to log in.
An ENode that newly joins the network does not need to wait for the periodic discovery
advertisement messages from all FCFs. FC-BB-5 allows ENodes to send FIP FCF discovery
solicitation messages to the multicast MAC address All-FCF-MAC. FCFs that receive the
solicitation messages send a unicast FIP FCF discovery advertisement message to the
requesting ENode. Based on the received advertisement messages, the ENode selects an FCF
with the highest priority to set up a virtual link with its VN_Port.

3.7.4.3 FIP FLOGI and FDISC


After discovering all FCFs and selecting one for login, the ENode notifies the selected FCF to
set up a virtual link with its VF_Port. Then FCoE frames can be exchanged on the established
FCoE virtual link. FIP FLOGI and FIP FDISC packets are unicast packets, similar to the FC
FLOGI and FDISC packets that they replace. These packets assign a MAC address to the
ENode so that it can log in to the fabric. FIP FDISC is similar to FIP FLOGI. The difference
is that FIP FLOGI sets up a virtual link when the ENode logs in to the fabric for the first time,
whereas FIP FDISC sets up a virtual link for each virtual machine (VM) when multiple VMs
run on an ENode. Take FIP FLOGI as an example. The process is as follows:
1. An ENode sends an FIP FLOGI request to an FCF.
2. The FCF allocates a locally unique fabric provided MAC address (FPMA) or a server
provided MAC address (SPMA) to the ENode.

3.7.5 FCoE Virtual Link Maintenance


On the traditional FC network, FC can immediately detect faults on a physical link. In FCoE,
FC cannot immediately detect faults on a physical link because Ethernet encapsulation is


used. FIP provides a Keepalive mechanism to solve the problem. FCoE monitors FCoE
virtual links as follows:

1. The ENode periodically sends FIP Keepalive packets to the FCF. If the FCF does not
receive FIP Keepalive packets within 2.5 times the keepalive interval, the FCF considers
the FCoE virtual link faulty and terminates the FCoE virtual link.
2. The FCF periodically sends multicast discovery advertisement messages to the
destination MAC address ALL-ENode-MAC to all ENodes. If the ENode does not
receive multicast discovery advertisement messages within 2.5 times the keepalive
interval, the ENode considers the FCoE virtual link faulty and terminates the FCoE
virtual link.

If the FCF does not receive FIP keepalive packets from the ENode, the FCF sends an FIP
clear virtual link message to the ENode to clear the FCoE virtual link. When the ENode logs
out, it sends a Fabric Logout request message to the FCF to terminate the virtual link.

3.7.6 FIP Snooping


On an FC network, FC switches are considered trusted devices. Other FC devices, such as
ENodes, must log in to an FC switch before they can access the FC network. The FC switch
then assigns addresses to the FC devices. FC links are point-to-point, and an FC switch
can completely control the traffic received and sent by FC devices. Therefore, FC switches enable
FC devices to exchange packets using the assigned addresses and protect FC devices against
malicious attacks.

When an intermediate Ethernet switch is deployed between an ENode and an FCF, it forwards
FCoE frames based on Ethernet forwarding rules because it does not terminate the FC protocol.
In this case, FCoE frames may not be destined for the FCF, and the point-to-point
connection between the ENode and the FCF is broken. To achieve robustness equivalent to that of
an FC network, the intermediate switch must forward FCoE traffic from all ENodes to the FCF. FIP
snooping obtains FCoE virtual link information by listening to FIP packets, controls the setup
of FCoE virtual links, and defends against malicious attacks.

The FCoE switch running FIP snooping is called an FSB. The 10GE switch modules of the
E9000 support FIP Snooping.

3.7.7 Configuration Instance


# Create an FCoE instance.
[~CX310_2] fcoe FSB
[*CX310_2-fcoe-FSB] vlan 2094
[*CX310_2-fcoe-FSB] quit
[*CX310_2] commit

# Configure port roles.

[~CX310_2] interface 10GE 2/17/1
[~CX310_2-10GE2/17/1] fcoe role vnp
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit
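
For the FSB configuration to actually carry FCoE traffic, the FCoE VLAN must also be allowed on the downlink ports facing the compute nodes and on the uplink port facing the FCF. The following is a minimal sketch based on the port commands used elsewhere in this document; the port numbers (10GE 2/1/1 toward a compute node and 10GE 2/17/1 toward the FCF) are examples only.

# Allow FCoE VLAN 2094 on a compute-node-facing port and on the uplink port (example port numbers).
[~CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] port link-type hybrid
[*CX310_2-10GE2/1/1] port hybrid tagged vlan 2094
[*CX310_2-10GE2/1/1] quit
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] port link-type hybrid
[*CX310_2-10GE2/17/1] port hybrid tagged vlan 2094
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit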

3.8 Smart Link


3.8.1 Background
Dual-uplink networking is commonly used to connect E9000 servers to the existing network
system to ensure network reliability. Figure 3-61 and Figure 3-62 show two types of dual-
uplink networking.

Figure 3-61 Switch module dual-uplink networking 1


Figure 3-62 Switch module dual-uplink networking 2

The dual-uplink networking, however, creates a loop between the cascaded access switches A
and B and switch modules in slots 2X and 3X of the E9000, which may cause network
broadcast storms. Generally, Spanning Tree Protocol (STP) can be used to prevent loops.
However, STP convergence is slow and a large amount of traffic is lost during the
convergence, so STP is not suitable for networks that demand short convergence times.
In addition, STP cannot directly interoperate with Cisco devices running the Cisco
proprietary Per-VLAN Spanning Tree Plus (PVST+) protocol. To address these issues,
Huawei provides the Smart Link solution.

3.8.2 Smart Link Basic Concepts

3.8.2.1 Smart Link Group


A Smart Link group consists of two member interfaces: a master port and a slave port.
Typically, one port is active, and the other is blocked and in the standby state. When a link
fault occurs on the active port, the Smart Link group automatically blocks it, and the
previously blocked standby port switches to the active state. A link fault is typically a port
going Down or an Ethernet operation, administration, and maintenance (OAM) link failure.


Figure 3-63 Smart Link group

3.8.2.2 Master Port


The master port is a port role in a Smart Link group specified by using the command line. It
can be an Ethernet interface (an electrical or optical interface) or aggregation interface.

3.8.2.3 Slave Port


The slave port is another port role in a Smart Link group specified by using the command
line. It can be an Ethernet interface (an electrical or optical interface) or aggregation interface.
The link of the slave port is also known as the standby link.

3.8.2.4 Control VLAN


Control VLANs consist of transmit control VLANs and receive control VLANs. A transmit
control VLAN is used by a Smart Link group to broadcast Flush packets. A receive control
VLAN is used by upstream devices to receive and process Flush packets.

3.8.2.5 Flush Packet


When a failover occurs between links of a Smart Link group, the original forwarding entries
no longer apply to the new topology. All MAC address forwarding entries and ARP entries on
the network need to be updated. The Smart Link group notifies other devices to refresh the
entries by sending Flush packets.
Flush packets are encapsulated using IEEE 802.3, including information fields, such as the
destination MAC address, source MAC address, control VLAN ID, and VLAN bitmap.
Figure 3-64 shows the Flush packet format.


Figure 3-64 Flush packet format

3.8.3 Smart Link Working Mechanism

When all links and devices are running properly, the links between the switch module in slot
2X and switch A are active links and forward service data. The links between the switch
module in slot 3X and switch B are blocked. When switch A is faulty (a VRRP switchover is
required) or the links connected to the switch module in slot 2X are faulty, the links between
the switch module in slot 3X and switch B change to the active state and forward data. That
is, data is forwarded along the red lines shown in the figure.
As Flush packets are Huawei proprietary protocol packets, traffic switchovers can be
performed through Flush packets only when uplink devices are Huawei switches and support


the Smart Link feature. Otherwise, the forwarding path is switched over only after the uplink
and downlink devices update their forwarding entries or the MAC address entries age out.

3.8.4 Configuration Instance

# Cascade the CX310s in slots 2X and 3X. (For details, see the iStack
configuration instance.)

# Create two Eth-Trunks, add the first and second uplink ports of each switch module, and
disable STP.
[~CX310_C] interface Eth-Trunk 2
[*CX310_C-Eth-Trunk2] mode lacp-static
[*CX310_C-Eth-Trunk2] stp disable
[*CX310_C-Eth-Trunk2] trunkport 10GE 2/17/1 to 2/17/2
[*CX310_C-Eth-Trunk2] quit
[*CX310_C] interface Eth-Trunk 3
[*CX310_C-Eth-Trunk3] mode lacp-static
[*CX310_C-Eth-Trunk3] stp disable
[*CX310_C-Eth-Trunk3] trunkport 10GE 3/17/1 to 3/17/2
[*CX310_C-Eth-Trunk3] quit
[*CX310_C] commit

# Create a Smart Link group, and set Eth-Trunk 2 as the master port and Eth-Trunk
3 as the slave port.
[~CX310_C] smart-link group 1
[*CX310_C-smlk-group1] port Eth-Trunk 2 master
[*CX310_C-smlk-group1] port Eth-Trunk 3 slave
[*CX310_C-smlk-group1] quit
[*CX310_C] commit

3.9 Monitor Link

3.9.1 Monitor Link Overview


A Monitor Link group consists of one or more uplink and downlink ports. The downlink port
status changes based on the status of uplink ports.


3.9.1.1 Uplink Ports


An uplink port is a monitored object specified by using the command line. The uplink port of
a Monitor Link group can be an Ethernet port (an electrical or optical port), aggregation port,
or a Smart Link group. If multiple ports are configured as uplink ports of a Monitor Link
group, the Monitor Link group status is Up if any of the ports is forwarding data. If all uplink
ports are faulty, the Monitor Link group status changes to Down and all downlink ports will
be shut down.

3.9.1.2 Downlink Ports


The downlink port, another port role specified by using the command line, monitors the
uplink port in the same Monitor Link group. The downlink port of a Monitor Link group can
be an Ethernet port (an electrical or optical port) or aggregation port.

3.9.2 Monitor Link Working Mechanism


As shown in the preceding figure, the CX310s in slots 2X and 3X are each configured with one
Monitor Link group to monitor the uplinks to switch A and switch B. Blades 1, 2, and 3 each
connect to one 10GE port on the CX310 in slot 2X and one on the CX310 in slot 3X, with the two
NIC ports working in active/standby mode (or in VM-based load-sharing mode). If the link
between the CX310 in slot 3X and switch B is faulty, the Monitor Link group on that CX310
shuts down all downlink ports of the group (the ports connected to the blade servers). When the
blade servers detect the port faults, service data is switched to the link between the CX310 in
slot 2X and switch A (the red lines shown in the figure). In this way, the downlink status stays
synchronized with the uplink status.

3.9.3 Configuration Instance


The CX310 switch modules in slots 2X and 3X connect to access switches A and B
respectively and work independently. Create a Monitor Link group for each of the two switch
modules to monitor the uplinks. Set the ports connected to compute nodes 1, 2, and 3 as the
downlink ports of the Monitor Link groups.

# Create a Monitor Link group, and add uplink and downlink ports to the group.
(Perform these operations for the CX310s in slots 2X and 3X.)
[~CX310_2] interface 10GE2/17/1
[~CX310_2-10GE2/17/1] stp disable
[*CX310_2-10GE2/17/1] quit
[*CX310_2] monitor-link group 1
[*CX310_2-mtlk-group1] port 10GE2/17/1 uplink
[*CX310_2-mtlk-group1] port 10GE2/1/1 downlink 1
[*CX310_2-mtlk-group1] port 10GE2/2/1 downlink 2
[*CX310_2-mtlk-group1] port 10GE2/3/1 downlink 3
[*CX310_2-mtlk-group1] quit
[*CX310_2] commit

3.10 Configuration Restoration


When a newly installed switch module starts, it obtains the configuration file from the active
MM910. After the switch module is started, it uses the configuration file to update


configuration information. This prevents configuration loss during switch module
replacement. The following figure shows how the configuration file is transmitted between
modules.

If the switch module fails to obtain the configuration file from the active MM910 during the
startup process, an error message is always displayed over the serial port (or SOL).


4 Networking Applications

This section describes common networking modes of the E9000 in actual application
scenarios, including Ethernet networking, FC networking, FCoE converged networking, and
networking with Cisco switches that run proprietary protocols.

4.1 Ethernet Networking


4.2 Networking with Cisco Switches (PVST+)
4.3 Cisco vPC Interoperability
4.4 FCoE Converged Networking
4.5 FC Networking

4.1 Ethernet Networking

4.1.1 Stack Networking


The switch modules in slots 2X and 3X or in slots 1E and 4E are stacked as a logical switch.
The switch modules in slots 2X and 3X or in slots 1E and 4E must be of the same type. The
ports on the switch modules are connected to switch A and switch B that are also stacked. The
links between the two logical switches are configured with (manual or static LACP) link
aggregation. The NIC ports of the blade servers connect to the switch modules and can be
bonded in active/standby or load-sharing mode as required. In this way, the networking between
one server and two switches is simplified. Figure 4-1 shows the networking.


Figure 4-1 Stack networking
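
The uplink aggregation in this topology can be configured in the same way as the Eth-Trunks in section 3.8.4. The following is a minimal sketch, assuming the stacked switch modules (named CX310_C as in that example) use panel ports 10GE 2/17/1 and 10GE 3/17/1 as uplinks to the stacked switches A and B; the port numbers are examples only.

# Create one Eth-Trunk that spans the two stacked switch modules and add one uplink port from each module.
[~CX310_C] interface Eth-Trunk 1
[*CX310_C-Eth-Trunk1] mode lacp-static
[*CX310_C-Eth-Trunk1] trunkport 10GE 2/17/1
[*CX310_C-Eth-Trunk1] trunkport 10GE 3/17/1
[*CX310_C-Eth-Trunk1] quit
[*CX310_C] commit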

4.1.2 Smart Link Networking


In stack networking, the uplink switches A and B must also be stacked. In actual networking,
however, the existing switches may not support stacking. In this case, you can use Smart
Link, which allows the E9000 to connect to the existing network without any modification.
Figure 4-2 shows the Smart Link networking architecture.

Figure 4-2 Smart Link networking architecture


For details about Smart Link, see section 3.8. Smart Link prevents a loop
between multiple switches by blocking the standby links, and it requires no modification
on the access switches A and B.

4.1.3 STP/RSTP Networking


Activate the 40GE link between switch modules in slots 2X and 3X as a cascading link.
Connect the switch modules to uplink access switches A and B. Enable STP or Rapid
Spanning Tree Protocol (RSTP) on uplink ports and cascading ports to prevent a loop between
the switch modules and access switches.

Figure 4-3 STP/RSTP networking
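
A minimal sketch of enabling RSTP on a CX switch module is shown below. It assumes the stp mode rstp and stp enable commands are available on the switch module CLI (the exact command set depends on the software version); STP parameters such as bridge priorities are omitted.

# Set the spanning tree mode to RSTP and enable it globally; uplink and cascading ports then participate in the calculation.
[~CX310_2] stp mode rstp
[*CX310_2] stp enable
[*CX310_2] commit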

4.1.4 Monitor Link Networking


Deactivate the 40GE link between switch modules in slots 2X and 3X. (The 40GE link is
disabled by default.) Create a Monitor Link group for each switch module. The physical or
aggregation links between the switch modules and switches A and B are uplinks. The links between
the switch modules and blade servers are downlinks. Configure NIC teaming for the compute
nodes. If the uplinks in a Monitor Link group are unavailable, the downlinks are blocked. The
NIC switches over services to the other Monitor Link group. Figure 4-4 shows the Monitor
Link networking architecture.


Figure 4-4 Monitor Link networking

4.2 Networking with Cisco Switches (PVST+)

4.2.1 Cisco PVST+ Protocol


Cisco switches support the following spanning tree protocols: Per-VLAN Spanning Tree
(PVST), PVST+, Rapid-PVST+, Multiple Instance Spanning Tree Protocol (MISTP), and
Multiple Spanning Tree (MST). MST is an MSTP protocol that complies with the IEEE
802.1s standard. Table 4-1 lists protocols supported by different series of switches.

Table 4-1 Protocols supported by Catalyst and Nexus series switches

Switch Series | PVST+ | Rapid-PVST+ | MST

Catalyst series switches (IOS 12.2 and later) | Y | Y | Y

Nexus series switches | N | Y | Y

PVST runs a common STP in each VLAN. Each VLAN has its independent STP status and
calculation. PVST series protocols cannot interact with IEEE standard STP/RSTP/MSTP
series protocols. Table 4-2 lists the differences between PVST and STP/RSTP/MSTP frames.

Table 4-2 Differences between PVST and STP/RSTP/MSTP frames

Item | STP/RSTP/MSTP | PVST Series Protocols

The Ethernet frame header carries a VLAN ID | No | Yes

Destination MAC address | 01-80-C2-00-00-00, a bridge protocol data unit (BPDU) MAC address defined by the IEEE 802.1D spanning tree protocols | 01-00-0C-CC-CC-CD, a BPDU MAC address defined by Cisco

Protocol frame format | Frame format defined by IEEE 802.1D/802.1w/802.1s | Cisco-defined frame format

Cisco develops PVST+ based on PVST and develops Rapid-PVST+ based on PVST+. Table
4-3 describes the improvements.

Table 4-3 PVST+ evolution

Protocol | Improvement

PVST+ | Enables interworking with standard STPs. Adds the PortFast, UplinkFast, and BackboneFast functions.

Rapid-PVST+ | Uses the RSTP mechanism to implement rapid migration.

Table 4-4 describes how interworking between PVST+ and standard STPs is implemented on
different types of ports.

Table 4-4 PVST+ interoperability with standard STPs

Port Type | PVST+ Interoperability with Standard STPs

Access | PVST+ allows only BPDUs in standard STP/RSTP format to be sent over an Access port.

Trunk | The default VLAN (VLAN 1) allows two types of packets: BPDUs in standard STP/RSTP format and untagged private PVST BPDUs. Private PVST BPDUs (with the destination MAC address 01-00-0C-CC-CC-CD) are sent over the other allowed VLANs.

NOTE: If the Trunk port does not allow packets from the default VLAN to pass through, the port transmits neither standard STP/RSTP BPDUs nor untagged private PVST BPDUs.


4.2.2 Processing of PVST+ BPDUs


By default, the CX switch modules consider Cisco PVST BPDUs (the destination MAC
address is 01-00-0C-CC-CC-CD) as unknown multicast packets and broadcast the packets.
The mac-address bpdu command can be used to set the BPDU MAC address to 01-00-0C-
CC-CC-CD. Then, the system discards PVST BPDUs.
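
A minimal sketch is shown below. The mac-address bpdu command is the one named above, but the exact parameter format is an assumption and may differ between software versions.

# Treat frames destined for the Cisco PVST BPDU MAC address as BPDUs so that they are dropped instead of flooded (parameter format is an assumption).
[~CX310_2] mac-address bpdu 01-00-0c-cc-cc-cd
[*CX310_2] commit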

4.2.3 Standard MSTP


MSTP is a new spanning tree protocol defined by IEEE 802.1s. Compared with STP and
RSTP, MSTP has the following advantages:
1. MSTP divides a switching network into multiple domains. Each domain has multiple
spanning trees that are independent from each other. MSTP uses a Common and Internal
Spanning Tree (CIST) to prevent loops in the entire network topology.
2. MSTP maps multiple VLANs to one instance to reduce communication overheads and
conserve resources. The topology of each MSTP instance is calculated independently
(each instance has an independent spanning tree). The traffic from different VLANs is
evenly distributed by the instances.
3. MSTP provides a fast port transition mechanism similar to that used in RSTP.
4. MSTP is compatible with STP and RSTP.
MSTP and RSTP can recognize each other's BPDUs. STP cannot identify MSTP BPDUs. To
implement networking with STP devices, MSTP is compatible with RSTP and provides STP-
compatible mode, RSTP mode, and MSTP mode.
1. In STP-compatible mode, each port sends STP BPDUs.
2. In RSTP mode, each port sends RSTP BPDUs. When a port is connected to a device
running STP, the port transits to the STP-compatible mode automatically.
3. In MSTP mode, each port sends MSTP BPDUs. When a port is connected to a device
running STP, the port transits to the STP-compatible mode automatically.
A device working in RSTP or MSTP mode can transit to the STP-compatible mode
automatically, but a device working in STP-compatible mode cannot transit to the RSTP or
MSTP mode automatically. To change the working mode from STP-compatible to RSTP or
MSTP, perform the mCheck operation. If a port of an MSTP or RSTP device on a switching
network is connected to an STP device, the port automatically transits to the STP-compatible
mode. If the STP device is removed, the port cannot automatically transit back to MSTP or
RSTP mode. You can perform the mCheck operation to forcibly transit the port to MSTP or
RSTP mode. In STP-compatible or RSTP mode, multiple instances can be configured and the
status of each port of MSTP is consistent with that of the CIST. To reduce loads on CPUs, do
not configure multiple instances in STP-compatible or RSTP mode.
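
A minimal sketch of the mCheck operation is shown below, assuming the stp mcheck interface command is available on the switch module and using an example uplink port number.

# Force the port back to MSTP/RSTP mode after the connected STP device has been removed (example port).
[~CX310_2] interface 10GE 2/17/1
[~CX310_2-10GE2/17/1] stp mcheck
[~CX310_2-10GE2/17/1] quit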

4.2.4 Difference Between Cisco and Huawei MSTPs


Cisco MST is a standard MSTP protocol. MST BPDUs use the standard format defined by
IEEE. Huawei and Cisco switches use different keys to generate MSTP digests in BPDUs. By
default, MSTP and Cisco MST can implement only inter-domain interoperation because
Huawei and Cisco switches generate different digests. To implement communication between
MSTP and Cisco MST within an MST domain, enable the digest snooping function for the
ports on the Huawei and Cisco switches.


4.2.5 Interconnection Scheme

4.2.5.1 Smart Link Group Interconnecting with Cisco PVST+ Network


Stack the switch modules in slots 2X and 3X or in slots 1E and 4E. Connect the stacked
switch modules to the Cisco PVST+ network through two uplinks. Configure the two uplinks
as one Smart Link group, enable one link (Master) to forward data, and block the other link
(Slave). All packets, including Cisco PVST BPDUs, received on the slave link are discarded.
Loops are prevented in this way. Service data sent from blade servers may pass through the
stack link between slots 2X and 3X. Therefore, ensure that the stack links provide sufficient
bandwidth.

Figure 4-5 Smart Link group interconnecting with Cisco PVST+ network


4.2.5.2 IEEE Standard Protocol (Root Bridge on the Cisco PVST+ Network Side)

Figure 4-6 Interconnecting with a Cisco PVST+ network through MSTP

The CX switch modules connect to the Cisco PVST+ network over MSTP, and the
interconnection ports automatically switch to RSTP mode. To ensure that the root ports are on
the Cisco PVST+ network side, the CX switch modules and Cisco switches must be
configured with proper cost values and priorities (the bridge priority for VLAN 1 on the Cisco
side should be higher than that of the Huawei CST). Ensure that the root bridges are on the
Cisco switches and that the blocked ports of VLAN 1 are on the CX switch module ports. The CX
switch modules also block the packets of other VLANs, so the block points of the other VLANs
of the Cisco PVST+ network fall on the same port of the CX switch module.

4.3 Cisco vPC Interoperability


Virtual Port Channel (vPC) implements network virtualization of Cisco Nexus series data
center switches, allowing multiple physical links of downstream devices to connect to two
different Nexus switches by using link aggregation. Logically, the two physical switches are
presented as one logical switch.


Figure 4-7 vPC network topology

vPC uses two independent control planes, whereas the VSS (stack) technology uses one
control plane. Figure 4-8 shows the functional components involved in vPC.

Figure 4-8 vPC functional components

l vPC peer link: synchronizes vPC device status.


l vPC peer-keepalive link: checks whether the peer vPC device is available.
l vPC: a link aggregation group that spans the two Nexus switches; the connected
downstream device only needs to support manual or static LACP link aggregation.
l vPC member port: a port on one of the Nexus switches that belongs to a vPC and connects
to a single link aggregation port of the downstream device.
l Router: an upstream router. Independent links connect to two vPC switches, and the
router selects a path over ECMP.

If the E9000 connects to the vPC network, CX switch modules are used as downstream
devices, which connect to the vPC domain through L2 link aggregation ports. Figure 4-9
shows two access scenarios.


Figure 4-9 Connecting CX switch modules to the vPC domain

Stack the two CX switch modules as one logical switch and connect them to the vPC domain
through an Eth-Trunk that spans the two switch modules. Alternatively, connect each switch
module to the vPC domain through its own Eth-Trunk.
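
On the CX side, the connection to a vPC domain is an ordinary Eth-Trunk; the vPC pairing itself is configured on the Nexus switches. The following is a minimal sketch for one non-stacked switch module, assuming static LACP is used and panel ports 10GE 2/17/1 and 2/17/2 connect to the two vPC peers; the port numbers are examples only.

# Aggregate the two uplinks (one to each Nexus vPC peer) into a single LACP Eth-Trunk.
[~CX310_2] interface Eth-Trunk 10
[*CX310_2-Eth-Trunk10] mode lacp-static
[*CX310_2-Eth-Trunk10] trunkport 10GE 2/17/1 to 2/17/2
[*CX310_2-Eth-Trunk10] quit
[*CX310_2] commit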

4.4 FCoE Converged Networking

4.4.1 CX311 FCoE Converged Networking

4.4.1.1 CX311 Switch Module


The CX311 converged switch module provides sixteen 10GE ports and eight 8G FC ports on
the panel to connect to Ethernet devices, FC storage devices, or FC switches. The CX311 has
a built-in QLogic FCoE switch module MX510. Figure 4-10 shows the networking of CX311
converged switch modules.


Figure 4-10 CX311 converged switch module networking

The CX311 10GE Ethernet switching chip (LSW) connects to the MX510 through eight
10GE ports. FCoE traffic sent from CNAs is forwarded to the MX510s through the eight
10GE ports. The MX510 implements the conversion between FCoE and FC ports and
externally connects to FC storage or switches.

The MX510 can work in Transparent (NPV) mode or Full Fabric mode (the default). You can
change the working mode on the MX510 CLI. Table 4-5 lists the working modes of the
MX510 when it is connected to different devices.

Table 4-5 Working modes of the MX510

Connected Device | Model | MX510 Working Mode

FC storage device (direct connection) | N/A | Full fabric

FC switch | MDS series: MDS 9100/MDS 9500 | Transparent

FC switch | QLogic 3000/QLogic 5000/QLogic 9000 | Full fabric

FC switch | Brocade 300/Brocade 5000/Brocade 7500 | Transparent

When connecting to FC switches from other vendors, use Transparent (NPV) mode to reduce
interconnection risks.


4.4.1.2 Default Configuration of the CX311


The default working mode of the MX510 is full fabric. The MX510 in default working mode
can directly connect to FC storage devices. FCoE features take effect only when FSB and
security-related features are configured for the 10GE switching plane of the CX311. Table
4-6 describes the default FCoE configuration on the 10GE switching plane of the CX311.

Table 4-6 Default configuration


Configuration Task | Default Configuration | Command

Create an FCoE VLAN.
Default configuration: Create FCoE VLAN 1002, and add the ports connected to compute nodes and those connected to the MX510 to the VLAN. Set the former port type to Hybrid and the latter to Trunk.
Command:
[~CX311_C] fcoe FCOE-1002
[~CX311_C-fcoe-FCOE-1002] vlan 1002

Configure port roles.
Default configuration: VNP (ENode-facing by default)
Command:
# Configure roles for the ports connected to the MX510.
[~CX311_C] interface 10ge 2/20/3
[~CX311_C-10GE2/20/3] fcoe role vnp

Configure DCB features.
Default configuration:
l Enable the DCBX protocol on the ports connected to compute nodes and the ports connected to the MX510. The DCBX version is Intel-DCBX.
l Set ETS parameters. The DRR weights of PG0 and PG1 are both 50. Queues 0, 1, 2, 4, 5, 6, and 7 belong to PG0, and queue 3 belongs to PG1.
l Prevent the ports connected to the MX510 from advertising three TLVs: basic, dot1, and dot3.
Command:
# Enable the DCBX protocol.
[~CX311_C-10GE2/1/1] lldp tlv-enable dcbx
# Enable PFC (ports connected to the MX510 are in manual mode, and other ports are in auto mode).
[~CX311_C-10GE2/1/1] dcb pfc enable mode auto
[~CX311_C-10GE2/20/1] dcb pfc enable mode manual
# Configure the ETS queue and bandwidth profile.
[~CX311] dcb ets-profile DCBX
[*CX311-ets-DCBX] priority-group 0 queue 0 to 2 4 to 7
[*CX311_C-10GE3/1/1] dcb ets enable DCBX
# Set the DCBX version to Intel.
[*CX311_C-10GE2/1/1] dcb compliance intel-oui
# Prevent the ports connected to the MX510 from advertising several TLVs.
[*CX311_C-10GE2/20/1] lldp tlv-disable basic-tlv all
[*CX311_C-10GE2/20/1] lldp tlv-disable dot1-tlv all
[*CX311_C-10GE2/20/1] lldp tlv-disable dot3-tlv all

Configure security features.
Default configuration:
l Add all the ports connected to the MX510 to a port group to prevent mutual interference between Ethernet packets. Suppress outbound broadcast, unknown unicast, and multicast packets on all ports. (Only allow port */20/1 to send multicast packets with the destination MAC address 0110-1801-0002 and traffic less than or equal to 2 Mbit/s.) Of all outbound packets, only FIP and FCoE packets are allowed.
l Prevent panel ports from sending or receiving FIP and FCoE packets.
l Prevent the ports connected to MM910s and the 40GE ports interconnecting switch modules from sending or receiving FIP and FCoE packets.
Command:
# Apply the traffic policy FCOE-p1 to the ports connected to the MX510.
[*CX311_C-10GE2/20/1] port-isolate enable group 1
[*CX311_C-10GE2/20/1] stp disable
[*CX311_C-10GE2/20/1] storm suppression broadcast block outbound
[*CX311_C-10GE2/20/1] storm suppression unknown-unicast block outbound
[*CX311_C-10GE2/20/1] traffic-policy FCOE-p1 outbound
# Apply the traffic policy FCOE-p11 to panel ports.
[*CX311_C-10GE2/17/1] traffic-policy FCOE-p11 inbound
[*CX311_C-10GE2/17/1] traffic-policy FCOE-p11 outbound
Note: If the panel ports are added to an Eth-Trunk, apply the traffic policy FCOE-p11 to the Eth-Trunk.

Configure the MX510.
Default configuration: Create FCoE VLAN 1002 and add the eight MX510 ports connected to the LSW to the VLAN.
Command:
FCoE_GW(admin-config): admin> vlan 1002 create
FCoE_GW(admin-config): admin> vlan 1002 add port 12
FCoE_GW(admin-config): admin> vlan 1002 add port 13
FCoE_GW(admin-config): admin> vlan 1002 add port 14
FCoE_GW(admin-config): admin> vlan 1002 add port 15
FCoE_GW(admin-config): admin> vlan 1002 add port 16
FCoE_GW(admin-config): admin> vlan 1002 add port 17
FCoE_GW(admin-config): admin> vlan 1002 add port 18
FCoE_GW(admin-config): admin> vlan 1002 add port 19

The CX311 with the default configuration (non-stacking) can directly connect to FC storage
devices. To connect the CX311 to FC switches, change the MX510 working mode to
Transparent or Full fabric based on the type of the connected FC switch.


4.4.1.3 Connecting MX510s to Storage Devices

Figure 4-11 Connecting MX510s to storage devices

The MX510 in full fabric mode can directly connect to storage devices. Connect the FC ports
on the switch module panel to both controllers A and B of the storage device to ensure
system reliability and storage performance.

4.4.1.4 Connecting MX510s to FC Switches


Connect each MX510 to one external FC switch through one or more FC ports on the switch
module panel; multiple FC ports provide multiple paths to the same switch. Do not connect
one MX510 to two FC switches.


Figure 4-12 Connecting MX510s to FC switches

4.4.1.5 MX510 Link Load Balancing and Failover

Figure 4-13 MX510 link load balancing and failover

A CNA of a blade server provides at least two ports to connect to the switch modules in slots
2X and 3X or in slots 1E and 4E. The two ports perform the FIP VLAN discovery, FIP FCF
discovery, and FIP FLOGI operations on the connected MX510 in sequence. When receiving
a CNA registration request, the MX510 randomly selects a port from the eight 10GE ports
that connect the MX510 to the LSW as the FCoE session port. Then, all FCoE packets are
sent and received through this port. If the port is faulty, FIP Keepalive fails and the FCoE


session is invalid. The CNA repeats the FIP VLAN discovery, FIP FCF discovery, and FIP
FLOGI operations to enable the MX510 to assign a new port. After the port is assigned, a new
FCoE session is set up automatically.
Similarly, the MX510 connects to an FC switch through multiple FC links. During FIP login,
the MX510 randomly selects an FC link as the FCoE session transmission link. If the FC link
is faulty, the CNA registers again and the MX510 reselects an FC link randomly to implement
a failover. During the failover, the FC panel port and the 10GE port between the LSW and
MX510 are both reselected randomly, as shown in Figure 4-13.

4.4.1.6 CX311s in Stacking Mode


After the CX311s in slots 2X and 3X or in slots 1E and 4E are stacked, the default
configuration of the CX311s changes. Only the master CX311 retains the default FCoE
configuration during the initial setup of the stack. The standby CX311 does not keep the
default FCoE configuration. Therefore, after the stack is set up, you must add the FCoE
configuration. For example, if the CX311 in slot 2X is the master module, Table 4-7 lists the
FCoE configuration to be added for the stack.

Table 4-7 FCoE configuration to be added


Configuration Task | Configuration Items | Main Command

Create an FCoE VLAN.
Configuration items: Create FCoE VLAN 1003.
Main command:
[~CX311_C] fcoe FCOE-1003
[~CX311_C-fcoe-FCOE-1003] vlan 1003

Configure port roles for the MX510.
Configuration items: Set the role of the ports (in slot 3X) connected to the MX510 to VNP.
Main command: Same as the default configuration.

Configure DCB features.
Configuration items: Configure the ports (in slot 3X) connected to the MX510 and compute nodes. The configuration items are the same as the default configuration.
Main command: Same as the default configuration.

Configure the MX510.
Configuration items: Create FCoE VLAN 1003, remove the eight ports connected to the LSW from FCoE VLAN 1002, and add the ports to FCoE VLAN 1003.
Main command:
FCoE_GW(admin-config): admin> vlan 1002 remove port 12
FCoE_GW(admin-config): admin> vlan 1002 remove port 13
FCoE_GW(admin-config): admin> vlan 1002 remove port 14
FCoE_GW(admin-config): admin> vlan 1002 remove port 15
FCoE_GW(admin-config): admin> vlan 1002 remove port 16
FCoE_GW(admin-config): admin> vlan 1002 remove port 17
FCoE_GW(admin-config): admin> vlan 1002 remove port 18
FCoE_GW(admin-config): admin> vlan 1002 remove port 19
FCoE_GW(admin-config): admin> vlan 1003 create
FCoE_GW(admin-config): admin> vlan 1003 add port 12
FCoE_GW(admin-config): admin> vlan 1003 add port 13
FCoE_GW(admin-config): admin> vlan 1003 add port 14
FCoE_GW(admin-config): admin> vlan 1003 add port 15
FCoE_GW(admin-config): admin> vlan 1003 add port 16
FCoE_GW(admin-config): admin> vlan 1003 add port 17
FCoE_GW(admin-config): admin> vlan 1003 add port 18
FCoE_GW(admin-config): admin> vlan 1003 add port 19

4.4.2 Connecting CX310s to Cisco Nexus 5000 Series Switches


The CX310 converged switch module supports FSB and provides sixteen 10GE ports on the
panel. The ports can be connected to external FCoE switches, which provide FC ports to
connect to FC storage devices or FC switches. Figure 4-14 shows the connections between
CX310s and Cisco Nexus 5000 series FCoE switches.


Figure 4-14 Connecting CX310s to Cisco Nexus 5000 series switches

The CX310s are not stacked. (If CX310s are stacked, you only need to configure the ETS
template and LLDP once. Port configurations of slots 2X and 3X are the same except the
FCoE VLAN.) This section uses the CX310 in slot 2X as an example to describe how to
configure the connections between the CX310 switch module and Cisco Nexus 5000 switch.


Table 4-8 Configuration items


Configuration Item | Description | Command

Create an FCoE VLAN.
Description: Create FCoE VLAN 1002 and add the ports connected to compute nodes and the FCoE panel ports to the VLAN. The port type is Hybrid.
Command:
[~CX310_2] fcoe FCOE-1002
[*CX310_2-fcoe-FSB] vlan 1002
[*CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] port link-type hybrid
[*CX310_2-10GE2/1/1] port hybrid tagged vlan 1002
[*CX310_2-10GE2/1/1] quit
[*CX310_2] interface 10GE 2/2/1
[*CX310_2-10GE2/2/1] port link-type hybrid
[*CX310_2-10GE2/2/1] port hybrid tagged vlan 1002
[*CX310_2-10GE2/2/1] quit
[*CX310_2] interface 10GE 2/3/1
[*CX310_2-10GE2/3/1] port link-type hybrid
[*CX310_2-10GE2/3/1] port hybrid tagged vlan 1002
[*CX310_2-10GE2/3/1] quit
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] port link-type hybrid
[*CX310_2-10GE2/17/1] port hybrid tagged vlan 1002
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit

Configure port roles.
Description: Set the type of the panel ports to VNP.
Command:
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] fcoe role vnp

Configure DCB features.
Description:
l Enable LLDP and PFC.
l Configure the ETS queue bandwidth control template and DCBX version.
l Prevent the ports connected to Cisco Nexus 5000 switches from advertising some TLVs.
Command:
# Enable LLDP and PFC.
[*CX310_2] lldp enable
[*CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/1/1] dcb pfc enable mode manual
[*CX310_2-10GE2/1/1] quit
[*CX310_2] interface 10GE 2/2/1
[*CX310_2-10GE2/2/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/2/1] dcb pfc enable mode manual
[*CX310_2-10GE2/2/1] quit
[*CX310_2] interface 10GE 2/3/1
[*CX310_2-10GE2/3/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/3/1] dcb pfc enable mode manual
[*CX310_2-10GE2/3/1] quit
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/17/1] dcb pfc enable mode manual
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit
# Configure ETS parameters (ports connecting CNAs and Cisco Nexus 5000 switches) and set the DCBX version to Intel DCBX.
[~CX310_2] dcb ets-profile DCBX
[*CX310_2-ets-DCBX] priority-group 0 queue 0 to 2 4 to 7
[*CX310_2-ets-DCBX] quit
[*CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] dcb ets enable DCBX
[*CX310_2-10GE2/1/1] dcb compliance intel-oui
[*CX310_2-10GE2/1/1] quit
[*CX310_2] interface 10GE 2/2/1
[*CX310_2-10GE2/2/1] dcb ets enable DCBX
[*CX310_2-10GE2/2/1] dcb compliance intel-oui
[*CX310_2-10GE2/2/1] quit
[*CX310_2] interface 10GE 2/3/1
[*CX310_2-10GE2/3/1] dcb ets enable DCBX
[*CX310_2-10GE2/3/1] dcb compliance intel-oui
[*CX310_2-10GE2/3/1] quit
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] dcb ets enable DCBX
[*CX310_2-10GE2/17/1] dcb compliance intel-oui
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit
# Prevent the ports connecting to Cisco Nexus 5000 switches from advertising some TLVs.
[~CX310_2-10GE2/17/1] lldp tlv-disable basic-tlv all
[*CX310_2-10GE2/17/1] lldp tlv-disable dot1-tlv all
[*CX310_2-10GE2/17/1] lldp tlv-disable dot3-tlv all
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit

4.4.3 Connecting CX310s to Brocade VDX6700 Series Switches


The connections between CX310s and Brocade VDX6700 series switches over FCoE links
are the same as those between CX310s and Cisco Nexus 5000 series switches, except for the
mapping between ETS queues and PGs. Table 4-9 describes the ETS configuration.

Table 4-9 Configuration items


Configuration Item | Description | Command

Configure DCB features.
Description: Add an ETS profile DCBXF to configure the switch ports connected to the Brocade VDX6700 series switches.
Command:
# Configure ETS parameters (ports connected to Brocade VDX6700 series switches) and the DCBXF version.
[*CX310_2] dcb ets-profile DCBXF
[*CX310_2-ets-DCBXF] priority-group 0 queue 0 to 2 4 to 6
[*CX310_2-ets-DCBXF] priority-group 15 queue 7
[*CX310_2-ets-DCBXF] quit
[*CX310_2] interface 10GE 2/17/1
[*CX310_2-10GE2/17/1] dcb ets enable DCBXF
[*CX310_2-10GE2/17/1] dcb compliance intel-oui
[*CX310_2-10GE2/17/1] quit
[*CX310_2] commit


4.4.4 CX320 Converged Networking

4.4.4.1 CX320 Converged Switch Module


The CX320 converged switch module provides fixed ports on the panel: 8 x GE + 2 x 40GE.
It also supports two flexible cards, such as the MX517 card, which provides four SFP+
Unified ports that can be configured as four 10GE or 8G FC ports. Figure 4-15 shows the
CX320.

Figure 4-15 CX320 switch module

The CX320 supports three FCoE networking modes: FSB, FCF, and NPV. It is the best choice
for converged networks of enterprises and carriers. The CX320 can be configured with an
FCF instance to connect to FC or FCoE storage devices or configured with an NPV instance
to connect to FC or FCoE switches, so that various converged networking requirements are
met.


Figure 4-16 CX320 converged network

The CX320 switch modules can be installed in slots 2X and 3X or slots 1E and 4E and work
with 2-port or 4-port CNAs to meet various converged application requirements. As shown in
Figure 4-16, CX320 switch modules are installed in the E9000 server to implement a
converged network inside the E9000 chassis and convergence of the external LAN and SAN.
The MX517 flexible cards provide 8G FC ports to connect to FC switches or storage so that
the existing SAN resources are fully utilized. With flexible cards, the CX320 supports
evolution to 16G FC, 32G FC, and 25GE to fully protect customers' investments.

4.4.4.2 FCF Networking


l FCF network
The CX320 switch modules function as FCF switches and can connect to FC and FCoE
storage devices at the same time, as shown in Figure 4-17.

Figure 4-17 FCF network

The CNAs on compute nodes connect to the CX320 switch modules in slots 2X and 3X or
slots 1E and 4E. Each CX320 connects to both storage controllers of the FC or FCoE storage.
If one CX320 fails, the other one is still connected to the two controllers so that the number of


storage controllers and storage performance are not reduced. After the CX320 switch modules
are configured with FCF instances and connected to FC storage, the port roles are as shown in
Figure 4-18.

Figure 4-18 Port roles in an FCF network

In the figure, the CX320 switch modules use VF_Ports to connect to the CNAs of compute nodes
and use F_Ports to connect to storage devices.

l FCF network configuration example

As shown in Figure 4-19, CX320 switch modules are installed in slots 2X and 3X; an MX517
is installed in flexible card slot 1 of each CX320; the compute node is installed in slot 1; an
MZ510 is installed on the compute node; each CX320 uses one 10GE link to connect to the
LAN and one FC link to connect to FC storage so that network reliability is ensured.

Figure 4-19 FCF network configuration example

Step 1 Plan the configuration process.


1. Configure the MZ510 working mode to NIC + FCoE.


2. Configure the ports of the flexible card as FC ports.
3. Create an FCF instance, specify an FCoE VLAN, and add FC and FCoE ports to the
FCF instance.

Step 2 Prepare data.


Prepare the following data for this example.

Table 4-10 Device information


Device | Port No. | WWPN/VLAN | FC_ID | Connected To

CX320 (2X) FC 2/21/1 - - FC storage

10GE 2/1/1 2094, 2093 - CH121 V3

10GE 2/17/1 2093 - LAN

CX320 (3X) FC 3/21/1 - - FC storage

10GE 3/1/1 2094, 2093 - CH121 V3

10GE 3/17/1 2093 - LAN

CH121 V3 - 30:00:00:68:50:40:30:02 16.00.01 CX320 (2X)

- 30:00:00:68:50:40:30:04 16.00.03 CX320 (3X)

FC storage - 30:00:00:68:50:40:30:01 16.00.02 CX320 (2X)

- 30:00:00:68:50:40:30:03 16.00.04 CX320 (3X)

Step 3 Perform the configuration.

Configure the MZ510.


1. During the BIOS startup, press Ctrl+P when "Press <Ctrl><P> for PXESelect (TM)
Utility" is displayed. The Controller Configuration screen shown in Figure 4-20 is
displayed.

Figure 4-20 Configuring the MZ510


2. Select Personality by pressing arrow keys and press Enter. The Personality options
shown in Figure 4-21 are displayed.

Figure 4-21 Setting the NIC working mode

3. Select FCoE by pressing arrow keys and press Enter. Then select Save by pressing
arrow keys and press Enter, as shown in Figure 4-22.

Figure 4-22 Saving the configuration

Configure FC ports.
# Before configuring FC ports, insert FC optical modules into the ports. If the Link indicator
is steady green, the port is up.

# Configure port 10GE 2/21/1 of the CX320 in slot 2X as an FC port.


<HUAWEI> system-view
[~HUAWEI] sysname CX320_2X
[*HUAWEI] commit
[~CX320_2X] port mode fc 10GE 2/21/1
Warning: This operation will cause all the other configurations on the port to be lost.
Continue?[Y/N]:Y
[*CX320_2X] commit

# Check the FC port status.


[~CX320_2X] display interface fc 2/21/1


FC2/21/1 current state : UP (ifindex:144)


Line protocol current state : UP
......

Create an FCF instance.


# Create VLAN 2093 for Ethernet services.
[~CX320_2X] vlan 2093

# For the CX320 in slot 2X, create an FCF instance fcf1 and specify VLAN 2094 as the FCoE
VLAN of the instance.
[*CX320_2X] fcoe fcf1 fcf
[*CX320_2X-fcoe-fcf-fcf1] vlan 2094
[*CX320_2X-fcoe-fcf-fcf1] quit

# Create FCoE port 1.


[*CX320_2X] interface fcoe-port 1
[*CX320_2X-FCoE-Port1] quit

# Set the ports connected to CH121 compute nodes as hybrid ports and add the FCoE and
Ethernet VLANs.
[*CX320_2X] interface 10GE 2/1/1
[*CX320_2X-10GE2/1/1] port link-type hybrid
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2094
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2093
[*CX320_2X-10GE2/1/1] fcoe-port 1
[*CX320_2X-10GE2/1/1] quit

# Set ports connected to the LAN as access ports and add the Ethernet VLAN.
[*CX320_2X] interface 10GE 2/17/1
[*CX320_2X-10GE2/17/1] port link-type access
[*CX320_2X-10GE2/17/1] port default vlan 2093

# Add port FC 2/21/1 of the CX320 in slot 2X to the fcf1 instance.


[*CX320_2X] fcoe fcf1
[*CX320_2X-fcoe-fcf-fcf1] member interface fc 2/21/1
[*CX320_2X-fcoe-fcf-fcf1] member interface fcoe-port 1
[*CX320_2X-fcoe-fcf-fcf1] quit
[*CX320_2X] commit


Verify the configuration.


# View registration information of servers and storage.
[~CX320_2X] display fcoe name-server brief
The Name-Server Information:
-------------------------------------------------------------------------------
Interface FC-ID WWPN
-------------------------------------------------------------------------------
FCoE-Port1 16.00.01 30:00:00:68:50:40:30:02
FC2/21/1 16.00.02 30:00:00:68:50:40:30:01
-------------------------------------------------------------------------------
Total: 2

Configure the CX320 in slot 3X.

The configuration procedure is the same as that of the CX320 in slot 2X, except for the FC
port numbers and the numbers of the ports connected to compute nodes.

----End

4.4.4.3 NPV Networking


l NPV network
Functioning as an NPV switch, the CX320 is located at the edge of the SAN fabric network
and between node devices and FCF switches, as shown in Figure 4-23.

Figure 4-23 NPV network

The CNA on a compute node connects to the CX320 switch modules in slots 2X and 3X or
slots 1E and 4E. Each CX320 connects to an FC or FCoE switch, which is connected to
storage through a SAN. After the CX320 switch modules are configured with NPV instances
and connected to FC switches, the port roles are as shown in Figure 4-24.


Figure 4-24 Port roles in an NPV network

In the figure, the NPV instances of the CX320 switch modules use VF_Ports to connect to
CNAs of compute nodes and use NP_Ports to connect to FC switches.

l NPV network configuration example


As shown in Figure 4-25, CX320 switch modules are installed in slots 2X and 3X; an MX517
is installed in flexible card slot 1 of each CX320; the compute node is installed in slot 1; an
MZ510 is installed on the compute node; each CX320 uses one 10GE link to connect to the
LAN and one FC link to connect to an FC switch so that network reliability is ensured.

Figure 4-25 NPV network configuration example

Step 1 Plan the configuration process.


1. Configure the MZ510 working mode to NIC + FCoE.
2. Configure the ports of the flexible card as FC ports.
3. Create an NPV instance, specify an FCoE VLAN, and add FC and FCoE ports to the
NPV instance.


Step 2 Prepare data.


Prepare the following data for this example.

Table 4-11 Device information


Device | Port No. | WWPN/VLAN | FC_ID | Connected To

CX320 (2X) FC 2/21/1 - - FCA

10GE 2/1/1 2094, 2093 - CH121 V3

10GE 2/17/1 2093 - LAN

CX320 (3X) FC 3/21/1 - - FCB

10GE 3/1/1 2094, 2093 - CH121 V3

10GE 3/17/1 2093 - LAN

CH121 V3 - 30:00:00:68:50:40:30:02 16.00.01 CX320 (2X)

- 30:00:00:68:50:40:30:04 16.00.03 CX320 (3X)

FCA - 30:00:00:68:50:40:30:01 16.00.02 CX320 (2X)

FCB - 30:00:00:68:50:40:30:03 16.00.04 CX320 (3X)

Step 3 Perform the configuration.


Configure the MZ510.
1. During the BIOS startup, press Ctrl+P when "Press <Ctrl><P> for PXESelect (TM)
Utility" is displayed. The Controller Configuration screen shown in Figure 4-26 is
displayed.

Figure 4-26 Configuring the MZ510

2. Select Personality by pressing arrow keys and press Enter. The Personality options
shown in Figure 4-27 are displayed.


Figure 4-27 Setting the NIC working mode

3. Select FCoE by pressing arrow keys and press Enter. Then select Save by pressing
arrow keys and press Enter, as shown in Figure 4-28.

Figure 4-28 Saving the configuration

Configure FC ports.
# Before configuring FC ports, insert FC optical modules into the ports. If the Link indicator
is steady green, the port is up.
# Configure port 10GE 2/21/1 of the CX320 in slot 2X as an FC port.
<HUAWEI> system-view
[~HUAWEI] sysname CX320_2X
[*HUAWEI] commit
[~CX320_2X] port mode fc 10GE 2/21/1
Warning: This operation will cause all the other configurations on the port to be lost.
Continue?[Y/N]:Y
[*CX320_2X] commit

# Check the FC port status.


[~CX320_2X] display interface fc 2/21/1
FC2/21/1 current state : UP (ifindex:144)
Line protocol current state : UP
......


Create an NPV instance.


# Create VLAN 2093 for Ethernet services.
[~CX320_2X] vlan 2093

# Create FCoE port 1.


[*CX320_2X] interface fcoe-port 1
[*CX320_2X-FCoE-Port1] quit

# For the CX320 in slot 2X, create an NPV instance npv1 and specify VLAN 2094 as the
FCoE VLAN and VLAN 2095 as the NPV VLAN.
[*CX320_2X] fcoe npv1 npv
[*CX320_2X-fcoe-npv-npv1] vlan 2094
[*CX320_2X-fcoe-npv-npv1] npv-vlan 2095
[*CX320_2X-fcoe-npv-npv1] quit

# Add port 10GE 2/1/1 of the CX320 in slot 2X to FCoE port 1.

[*CX320_2X] interface 10GE 2/1/1
[*CX320_2X-10GE2/1/1] fcoe-port 1
[*CX320_2X-10GE2/1/1] quit

# Set the ports connected to CH121 compute nodes as hybrid ports and add the FCoE and
Ethernet VLANs.
[*CX320_2X] interface 10GE 2/1/1
[*CX320_2X-10GE2/1/1] port link-type hybrid
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2094
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2093
[*CX320_2X-10GE2/1/1] fcoe-port 1
[*CX320_2X-10GE2/1/1] quit

# Set the ports connected to the LAN as access ports and add the Ethernet VLAN.
[*CX320_2X] interface 10GE 2/17/1
[*CX320_2X-10GE2/17/1] port link-type access
[*CX320_2X-10GE2/17/1] port default vlan 2093
[*CX320_2X-10GE2/17/1] quit

# Add port FC 2/21/1 and FCoE port 1 of the CX320 in slot 2X to the npv1 instance, and set
FC 2/21/1 as the NP_Port.

[*CX320_2X] fcoe npv1
[*CX320_2X-fcoe-fcf-npv1] member interface fc 2/21/1
[*CX320_2X-fcoe-fcf-npv1] member interface fcoe-port 1
[*CX320_2X-fcoe-fcf-npv1] fcoe role np-port interface fc 2/21/1
[*CX320_2X-fcoe-fcf-npv1] quit
[*CX320_2X] commit

Verify the configuration.

# View registration information of servers and storage.

[~CX320_2X] display fcoe instance npv
FCoE instance with NPV type:
-------------------------------------------------------------------------------
Instance name                     : npv1
VLAN                              : 2094
Instance MAC                      : 20:0b:c7:23:42:03
FKA-ADV-Period(ms)                : 8000
Number of FCoE Port(VF & F)       : 1
Number of FCoE Port(VNP & NP)     : 1
Number of online VF(F)-Port       : 1
Number of online ENode VN(N)-Port : 1
-------------------------------------------------------------------------------
Total: 1

Configure the CX320 in slot 3X.

The configuration procedure is the same as that of the CX320 in slot 2X, except that the FC
port numbers and the numbers of the ports connected to compute nodes are different. A sketch
of the slot 3X commands is provided below for reference.
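
The following is a minimal sketch of the slot 3X configuration, assuming the same VLAN plan
(FCoE VLAN 2094 and NPV VLAN 2095 for FCoE traffic, VLAN 2093 for Ethernet traffic) and simply
mirroring the slot 2X commands with the slot 3X port numbers from Table 4-11. Verify the port
numbers against the actual cabling before applying the commands.
<HUAWEI> system-view
[~HUAWEI] sysname CX320_3X
[*HUAWEI] commit
[~CX320_3X] port mode fc 10GE 3/21/1
Warning: This operation will cause all the other configurations on the port to be lost.
Continue?[Y/N]:Y
[*CX320_3X] commit
[~CX320_3X] vlan 2093
[*CX320_3X] interface fcoe-port 1
[*CX320_3X-FCoE-Port1] quit
[*CX320_3X] fcoe npv1 npv
[*CX320_3X-fcoe-npv-npv1] vlan 2094
[*CX320_3X-fcoe-npv-npv1] npv-vlan 2095
[*CX320_3X-fcoe-npv-npv1] quit
[*CX320_3X] interface 10GE 3/1/1
[*CX320_3X-10GE3/1/1] port link-type hybrid
[*CX320_3X-10GE3/1/1] port hybrid tagged vlan 2094
[*CX320_3X-10GE3/1/1] port hybrid tagged vlan 2093
[*CX320_3X-10GE3/1/1] fcoe-port 1
[*CX320_3X-10GE3/1/1] quit
[*CX320_3X] interface 10GE 3/17/1
[*CX320_3X-10GE3/17/1] port link-type access
[*CX320_3X-10GE3/17/1] port default vlan 2093
[*CX320_3X-10GE3/17/1] quit
[*CX320_3X] fcoe npv1
[*CX320_3X-fcoe-fcf-npv1] member interface fc 3/21/1
[*CX320_3X-fcoe-fcf-npv1] member interface fcoe-port 1
[*CX320_3X-fcoe-fcf-npv1] fcoe role np-port interface fc 3/21/1
[*CX320_3X-fcoe-fcf-npv1] quit
[*CX320_3X] commit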

----End

4.5 FC Networking

4.5.1 Multi-Plane Switch Module

4.5.1.1 Overview
The CX9xx series multi-plane switch modules include the CX911 and CX912 with 10GE + 8G FC
ports, the CX915 with GE + 8G FC ports, and the CX916 with 10GE + 16G FC ports.
Figure 4-29 shows the internal structure of multi-plane switch modules.


Figure 4-29 Internal structure of multi-plane switch modules

The Ethernet and FC planes are independent in their management, control, and data channels,
similar to two independent switches combined into one switch module. The two planes share the
same hardware management and monitoring system, improving system integration. In addition,
the CX210 and CX220 consist of only an FC switching plane. The CX210 and CX912 both integrate
the MX210 as the FC switching plane, and the CX220 and CX916 both integrate the MX220 as
the FC switching plane. The CX916 provides 16G FC switching when working with the MZ220 NIC.
The CX916 provides Ethernet switching only if it is installed in slot 2X or 3X and works with
V5 compute nodes (such as the CH121 V5 with the two-port 10GE LOM). If the CX916 is installed
in slot 1E or 4E, it provides only FC switching and works with the MZ220 NIC.

4.5.1.2 MZ910 Port Types


The FC switching plane of the CX911 and CX915 is the QLogic FCoE switch (MX510), and the FC
switching plane of the CX912 is the Brocade 300 FC switch (MX210). The MX510 switch module
provides 12 FCoE ports and 12 FC ports, whereas the E9000 supports a maximum of 16 compute
nodes. Therefore, if a multi-plane switch module is used with the MZ910, the port types
depend on the slot numbers.
The MZ910 provides four ports. Ports 1 and 3 are 10GE Ethernet ports, and ports 2 and 4 are
FC or FCoE ports. (The MM910 configures the MZ910 port types based on the switch
module type.) Figure 4-30 shows port types of the MZ910s connected to the CX911s or
CX915s.


Figure 4-30 MZ910 port types

From the OS perspective, each FCoE port has a NIC and an HBA, and each 10GE port has a
NIC. Therefore, if the CX911 or CX915 is used with the MZ910, each MZ910 in slots 1 to 12
provides four NICs and two HBAs and each MZ910 in slots 13 to 16 provides two NICs and
two HBAs. See Table 4-12.

Table 4-12 Quantities of NICs and HBAs


Switch Module   Slot Number   MZ910 Port Type   Number of NICs   Number of HBAs
CX911/CX915     01-12         FCoE              4                2
                13-16         FC                2                2
CX912           01-16         FC                2                2
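
From the compute node OS, these quantities can be checked directly. The following is a minimal
sketch, assuming a Linux OS on the compute node with the pciutils package installed; the exact
device names shown depend on the driver and OS version.
# List the Ethernet functions (NICs) presented by the mezzanine card
lspci | grep -i ethernet
# List the Fibre Channel functions (HBAs) presented by the mezzanine card
lspci | grep -i "fibre channel"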

4.5.1.3 MX210/MX510 Working Modes

Table 4-13 MX210/MX510 working modes

Connected Device       Vendor/Model                   MX510 Working Mode   MX210 Working Mode
FC storage device      NA                             Full fabric          Native (switch)
(direct connection)
FC switch              MDS series:                    Transparent (NPV)    Access Gateway (NPV)
                       MDS 9100/MDS 9500
                       Nexus series:
                       Nexus 5000/Nexus 7000
                       QLogic 3000/QLogic 5000/       Full fabric          Access Gateway (NPV)
                       QLogic 9000
                       Brocade 300/Brocade 5000/      Transparent (NPV)    Access Gateway or Native
                       Brocade 7500                                        (switch, requiring the Full
                                                                           Fabric license)

When MX210s (Native) are connected to Brocade switches through E-Ports, the Full Fabric
license needs to be configured.
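
For illustration only, the working mode of the Brocade-based MX210 can be reviewed from its
CLI. The following is a hedged sketch using standard Brocade Fabric OS commands; it is not the
authoritative MX210 procedure, and command output can vary by Fabric OS version.
switch:admin> switchshow      # display the switch state and port information
switch:admin> ag --modeshow   # check whether Access Gateway (NPV) mode is enabled
switch:admin> licenseshow     # verify that the Full Fabric license is installed before using E-Ports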

4.5.2 Connecting MX210s/MX510s to Storage Devices


Figure 4-31 Connecting MX210s/MX510s to storage devices

Connect the MX510s of the CX911/CX915, the MX210s of the CX912, and the CX210s to storage
devices in crossover mode to ensure system reliability and storage performance. With their
default configurations, the MX210s and MX510s can directly connect to storage devices.


4.5.3 Connecting MX210s/MX510s to FC Switches


Figure 4-32 Connecting MX210s/MX510s to FC switches

Connect MX210s/MX510s to FC switches in parallel connection mode. (The crossover connection
mode is not recommended because cross-connections may cause multiple paths to connect to one
switch.)

4.5.3.1 MX210 Load Balancing and Failover


Like the CX311, the CX911 and CX915 use the MX510 as the FC plane module, so their link load
balancing and failover behavior is the same as that of the CX311. When registering with an
external switch, the MX510 randomly selects an FC port from the available FC ports on the
panel to transmit data. If that port becomes faulty, the HBA registers again and selects a
new FC port.
The CX912 and CX210 integrate the MX210. When connecting a CX912 or CX210 to an
external Brocade FC switch, bind the links using ISL Trunking. The link load balancing and
failover of the CX912 and CX210 are as follows:
1. Access Gateway: Use N_Port trunking to implement link aggregation and load
balancing. If a physical link is faulty, the CX912 and CX210 automatically switch traffic
to another link. The HBA does not need to register again.
2. Native (Switch): Use E_Port trunking to implement link aggregation and load balancing.
If an FC link is faulty, the CX912 and CX210 automatically switch traffic to another
link. The HBA does not need to register again.
To implement the ISL Trunking feature, the MX210s and external FC switches must support
ISL Trunking, and the MX210 must be configured with an independent license. Compared
with the MX510, the MX210 offers a shorter failover duration when an FC link is faulty.
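
For illustration only, the following is a hedged sketch of enabling and checking ISL Trunking
with standard Brocade Fabric OS commands (licensing requirements and syntax can vary by Fabric
OS version; consult the switch documentation for the authoritative procedure):
switch:admin> licenseshow        # confirm that the ISL Trunking license is installed
switch:admin> switchcfgtrunk 1   # enable trunking on all ports of the switch (1 = enable, 0 = disable)
switch:admin> trunkshow          # display the trunk groups and their member ports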


A Acronyms and Abbreviations

Acronyms and Abbreviations   Full Name

BPDU                         Bridge Protocol Data Unit
CNA                          Converged Network Adapter
DAD                          Dual-Active Detection
DCB                          Data Center Bridging
DCBX                         Data Center Bridging Exchange Protocol
DRR                          Deficit Round Robin
ETS                          Enhanced Transmission Selection
FC                           Fibre Channel
FCoE                         Fibre Channel over Ethernet
FCF                          Fibre Channel Forwarder
FDR                          Fourteen Data Rate
FIP                          FCoE Initialization Protocol
FSB                          FCoE Initialization Protocol Snooping Bridge
HBA                          Host Bus Adapter
LACP                         Link Aggregation Control Protocol
LAG                          Link Aggregation Group
LACPDU                       Link Aggregation Control Protocol Data Unit
LLDP                         Link Layer Discovery Protocol
MSTP                         Multiple Spanning Tree Protocol
NPV                          N_Port Virtualizer
NPIV                         N_Port ID Virtualization
PFC                          Priority-based Flow Control
PG                           Port Group
PVST                         Per-VLAN Spanning Tree
