Huawei E9000 Server Network Technology White Paper
V100R001
Issue 04
Date 2019-03-22
Huawei and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: http://e.huawei.com
Purpose
This document describes network technologies of the Huawei E9000 server and typical
networking suggestions and configurations for connecting to other vendors' switches in
various application scenarios. This document helps you quickly understand E9000 network
features and improve deployment efficiency.
For more information about E9000 network features, see the switch module white papers,
configuration guides, and command references of the E9000 server.
Intended Audience
This document is intended for:
l Marketing engineers
l Technical support engineers
l Maintenance engineers
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol Description
Change History
Issue Date Description
Contents
3 Network Technologies
3.1 iStack
3.1.1 iStack Overview
3.1.2 Technical Advantages
3.1.2.1 Simplified Configuration and Management
3.1.2.2 Control Planes in 1+1 Redundancy
3.1.2.3 Link Backup
3.1.3 Basic Concepts
3.1.3.1 Role
3.1.3.2 Stack Domain
3.1.3.3 Stack Member ID
3.1.3.4 Stack Priority
3.1.3.5 Physical Stack Member Port
3.1.3.6 Stack Port
3.1.4 Basic Principles
3.1.4.1 Stack Setup
3.1.4.2 Removing Stack Members
3.1.4.3 Stack Split
3.1.4.4 DAD
3.1.4.5 Fast Upgrade
3.1.5 Local Preferential Forwarding
3.1.6 Configuration Instance
3.2 LACP
3.2.1 LACP Overview
3.2.2 Basic Concepts
3.2.2.1 Link Aggregation, LAG and LAI
3.7.2.1 ENode
3.7.2.2 FCoE Virtual Link
3.7.2.3 FIP
3.7.2.4 FCoE VLAN
3.7.3 FCoE Packet Format
3.7.4 FIP
3.7.4.1 FIP VLAN Discovery
3.7.4.2 FIP FCF Discovery
3.7.4.3 FIP FLOGI and FDISC
3.7.5 FCoE Virtual Link Maintenance
3.7.6 FIP Snooping
3.7.7 Configuration Instance
3.8 Smart Link
3.8.1 Background
3.8.2 Smart Link Basic Concepts
3.8.2.1 Smart Link Group
3.8.2.2 Master Port
3.8.2.3 Slave Port
3.8.2.4 Control VLAN
3.8.2.5 Flush Packet
3.8.3 Smart Link Working Mechanism
3.8.4 Configuration Instance
3.9 Monitor Link
3.9.1 Monitor Link Overview
3.9.1.1 Uplink Ports
3.9.1.2 Downlink Ports
3.9.2 Monitor Link Working Mechanism
3.9.3 Configuration Instance
3.10 Configuration Restoration
4 Networking Applications
4.1 Ethernet Networking
4.1.1 Stack Networking
4.1.2 Smart Link Networking
4.1.3 STP/RSTP Networking
4.1.4 Monitor Link Networking
4.2 Networking with Cisco Switches (PVST+)
4.2.1 Cisco PVST+ Protocol
4.2.2 Processing of PVST+ BPDUs
4.2.3 Standard MSTP
4.2.4 Difference Between Cisco and Huawei MSTPs
4.2.5 Interconnection Scheme
4.2.5.1 Smart Link Group Interconnecting with Cisco PVST+ Network
4.2.5.2 IEEE Standard Protocol (Root Bridge on the Cisco PVST+ Network Side)
4.3 Cisco vPC Interoperability
4.4 FCoE Converged Networking
4.4.1 CX311 FCoE Converged Networking
4.4.1.1 CX311 Switch Module
4.4.1.2 Default Configuration of the CX311
4.4.1.3 Connecting MX510s to Storage Devices
4.4.1.4 Connecting MX510s to FC Switches
4.4.1.5 MX510 Link Load Balancing and Failover
4.4.1.6 CX311s in Stacking Mode
4.4.2 Connecting CX310s to Cisco Nexus 5000 Series Switches
4.4.3 Connecting CX310s to Brocade VDX6700 Series Switches
4.4.4 CX320 Converged Networking
4.4.4.1 CX320 Converged Switch Module
4.4.4.2 FCF Networking
4.4.4.3 NPV Networking
4.5 FC Networking
4.5.1 Multi-Plane Switch Module
4.5.1.1 Overview
4.5.1.2 MZ910 Port Types
4.5.1.3 MX210/MX510 Working Modes
4.5.2 Connecting MX210s/MX510s to Storage Devices
4.5.3 Connecting MX210s/MX510s to FC Switches
4.5.3.1 MX210 Load Balancing and Failover
1 E9000 Overview
The Huawei E9000 converged architecture blade server (E9000 for short) is a new-generation
powerful infrastructure platform that integrates computing, storage, switching, and
management. It delivers high availability, high computing density, high energy efficiency, and high
midplane bandwidth, as well as low network latency and intelligent management and control. It also
allows elastic configuration and flexible expansion of computing and storage resources and
application acceleration. It supports eight full-width compute nodes, 16 half-width compute
nodes, or 32 child compute nodes. Half-width and full-width compute nodes can be combined
flexibly to meet various service needs. Each compute node provides two or four CPUs and up
to 48 DIMMs, supporting a maximum of 6 TB of memory.
The E9000 provides flexible computing and I/O expansion. The compute nodes will support the
next three generations of Intel CPUs. The E9000 supports Ethernet, Fibre Channel (FC),
InfiniBand (IB), and Omni-Path Architecture (OPA) switching. The supported network ports
range from mainstream GE and 10GE ports to 40GE ports and future 100GE ports. The
E9000 houses four switch modules at the rear of the chassis to support Ethernet, FC, IB, and
OPA switching networks. The switch modules provide powerful data switching capability and
rich network features. The compute nodes can provide standard PCIe slots. A full-width
compute node supports a maximum of six PCIe cards.
2 Switch Modules
The E9000 chassis provides four rear slots for installing pass-through and switch modules that
support GE, 10GE, 40GE, 8G/16G FC, 40G/56G IB, 100G OPA, and evolution towards
100GE, 100G IB, and 200G OPA. The switch modules provide powerful data switching
capabilities.
The four switch module slots are numbered 1E, 2X, 3X, and 4E from left to right. The chassis
midplane provides eight pairs of links (10GE or 40GE ports) to connect switch modules in
slots 1E and 4E and switch modules in slots 2X and 3X. The switch modules can be stacked
or cascaded through the links. Figure 2-2 shows the switch module slots and their
interconnections on the midplane.
If switch modules in slots 1E and 4E or those in slots 2X and 3X need to be stacked and local
preferential forwarding is enabled (default setting) for the Eth-Trunk, the midplane ports can
provide the bandwidth required for stacking. No stacking cable is required.
NOTE
The midplane does not provide links for interconnecting CX915 or CX111 switch modules. To stack
CX915 or CX111 switch modules, use optical fibers or electrical cables to connect the 10GE ports on their panels.
MZ512: 2 x (2 x 10GE), CNA
MZ312: 4 x 10GE
MZ310: 2 x 10GE
MZ312: 2 x (2 x 10GE)
MZ532: 4 x 25GE (reduced to 10GE)
NOTE
l If the MZ910 is used with a CX210 switch module, the two 10GE ports cannot be used, because the
CX210 does not provide an Ethernet switching plane.
l If the MZ910 is used with a switch module that provides the QLogic FC switching plane, ports in
slots 1 to 12 are FCoE ports and ports in slots 13 to 16 are FC ports. If the MZ910 is used with a
switch module that provides the Brocade FC switching plane, ports in slots 1 to 16 are all FC ports.
l If the MZ910 is used with a CX915 switch module, the rate of 10GE ports is automatically reduced
to GE.
l The CX916 provides Ethernet switching only if it is installed in slot 2X or 3X and works with V5
compute nodes (such as the CH121 V5 with the two-port 10GE LOM). If the CX916 is installed in
slot 1E or 4E, it only provides FC switching and works with the MZ220 NIC.
l The CX930 provides 10GE switching capability only if it is installed in slot 2X or 3X and used with
V5 compute nodes (such as the CH121 V5).
A multi-plane mezzanine card can be in slot Mezz 1 or Mezz 2, and connects to switch
modules in slots 2X and 3X or slots 1E and 4E. An OPA mezzanine card can only be in slot
Mezz 2 because the connected OPA switch module can only be installed in slot 1E. A
mezzanine card other than multi-plane and OPA mezzanine cards provides two or four ports.
A 4-port mezzanine card is connected to two ports of each connected switch module. Figure
2-4 shows the connections between non-OPA mezzanine cards and switch modules.
Figure 2-4 Connections between non-OPA mezzanine cards and switch modules
Figure 2-5 Connections between OPA mezzanine cards and the OPA switch module
1. Choose the compute node type and mezzanine card type. If the mezzanine card is a
CNA, configure the mezzanine card attributes.
2. Choose the switch module type.
3. Click Show Network Connection.
The mapping between the ports on CNAs and switch modules is displayed, as shown in
Figure 2-7.
3 Network Technologies
This section describes common technologies used for networking of the E9000 and provides
some configuration instances. The networking technologies include iStack, Link Aggregation
Control Protocol (LACP), NIC teaming, Data Center Bridging (DCB), FCoE, Smart Link, and
Monitor Link.
3.1 iStack
3.2 LACP
3.3 M-LAG
3.4 NIC Teaming
3.5 FC Technology
3.6 DCB
3.7 FCoE
3.8 Smart Link
3.9 Monitor Link
3.10 Configuration Restoration
3.1 iStack
3.1.3.1 Role
Switch modules that have joined a stack are member switch modules. Each member switch
module in a stack plays one of the following roles:
1. A master switch module manages the entire stack. A stack has only one master switch
module.
2. A standby switch module serves as a backup of the master switch module. A stack has
only one standby switch module.
3. Except the master switch module, all switch modules in a stack are slave switch
modules. The standby switch module also functions as a slave switch module.
A stack consists of multiple member switch modules, each of which has a specific role. When
a stack is created, the member switch modules send stack competition packets to each other to
elect the master switch module. The remaining switch modules function as slave switch
modules. The master switch module is elected based on the following rules in sequence (a sketch
of this election order follows the list):
1. Running status: The switch module that has fully started and entered running state
becomes the master switch module.
2. Stack priority: The switch module with the highest stack priority becomes the master
switch module.
3. Software version: The switch module running the latest software version becomes the
master switch module.
4. MAC address: The switch module with the smallest MAC address becomes the master
switch module.
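The election order above can be expressed as a single comparison key. The following Python sketch is illustrative only (it is not E9000 software); the field names, the dotted software-version format, and the MAC notation are assumptions:

# Illustrative sketch of the master election order described above; not E9000 code.
# Assumption: software versions compare as dotted-number tuples, and MAC addresses
# compare as integers (the smaller value wins).
def election_key(module):
    """Build a sort key so that the winner of the election sorts first."""
    return (
        0 if module["running"] else 1,                          # rule 1: a module already in running state wins
        -module["priority"],                                    # rule 2: higher stack priority wins
        tuple(-int(x) for x in module["version"].split(".")),   # rule 3: newer software version wins
        int(module["mac"].replace("-", ""), 16),                # rule 4: smaller MAC address wins
    )

candidates = [
    {"name": "slot 2X", "running": False, "priority": 150, "version": "1.170", "mac": "0004-9f31-d540"},
    {"name": "slot 3X", "running": False, "priority": 100, "version": "1.170", "mac": "0004-9f62-1f80"},
]
print("elected master:", min(candidates, key=election_key)["name"])   # slot 2X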
The master switch module collects stack member information, works out the stack topology,
and synchronizes the topology information to the other member switch modules. If the stack
member ID of a slave switch module conflicts with that of an existing member, the switch module restarts
repeatedly. If the master and slave switch modules use different software versions, the
software version of the master switch module will be synchronized to the slave switch
module, which then restarts and joins the stack again.
The master switch module elects a standby switch module from among the slave switch modules. If the
master switch module fails, the standby switch module takes over all services from the master
switch module. The following conditions of the member switch modules are compared in
sequence until a standby switch module is elected:
1. Stack priority: The switch module with the highest stack priority becomes the standby
switch module.
2. MAC address: The switch module with the smallest MAC address becomes the standby
switch module.
Before a stack is set up, each switch module is independent and has its own IP address. Users
need to manage the switch modules separately. After a stack is set up, the switch modules in
the stack form a logical entity, and users can use a single IP address to manage and maintain
all the member switch modules. The IP address and MAC address of the master switch
module when the stack is set up for the first time are used as the IP address and MAC address
of the stack.
1. If a master switch module exits, the standby switch module becomes the master, updates
the stack topology, and designates a new standby switch module.
2. If the standby switch module exits, the master switch updates the stack topology and
designates a new standby switch module.
3. If a slave switch module exits, the master switch module updates the stack topology.
4. If both the master and standby switch modules exit, all slave switch modules restart and set up a
new stack.
3.1.4.4 DAD
Dual-active detection (DAD) is a protocol used to detect a stack split, handle conflicts, and
take recovery actions to minimize the impact of a stack split on services. A DAD link directly
connecting stacked switches can detect dual-active switches, as shown in Figure 3-5.
After a stack splits, the switch modules exchange competition packets and compare the
received competition packets with the local ones. The switch module that wins the
competition becomes the master switch, remains active, and continues forwarding service
packets. If a switch module becomes a standby switch after the competition, it shuts down all
service ports except the reserved ones, enters the recovery state, and stops forwarding service
packets. The master switch is determined based on the following DAD competition rules in
sequence:
1. Stack priority: The switch module with the highest stack priority becomes the master
switch.
2. MAC address: The switch module with the smallest MAC address becomes the master
switch.
When the faulty stack links recover, the switch modules in recovery state restart and the shut-down
ports go up again. The entire stack system then recovers.
When an Eth-Trunk contains member ports on different switch modules, some traffic is forwarded
among switch modules. Enabling local preferential forwarding for the Eth-Trunk interface reduces the
traffic forwarded among switch modules, as shown in Figure 3-6.
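A minimal Python sketch of this behavior follows (illustrative only, not E9000 software; the port names, stack member IDs, and flow hash are assumptions):

# Illustrative sketch of local preferential forwarding on a stacked Eth-Trunk; not E9000 code.
def pick_egress_port(ingress_member_id, eth_trunk_ports, flow_hash):
    """Prefer Eth-Trunk member ports on the switch module that received the frame,
    so the traffic does not have to cross the stack links between modules."""
    local_ports = [p for p in eth_trunk_ports if p["member_id"] == ingress_member_id]
    pool = local_ports if local_ports else eth_trunk_ports   # fall back if no local member port is up
    return pool[flow_hash % len(pool)]["name"]

eth_trunk = [
    {"name": "10GE2/17/1", "member_id": 2},
    {"name": "10GE2/17/2", "member_id": 2},
    {"name": "10GE3/17/1", "member_id": 3},
]
# A flow entering the module with stack member ID 2 stays on that module's uplinks.
print(pick_egress_port(2, eth_trunk, flow_hash=7))   # 10GE2/17/2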
# Run the reset saved-configuration command to restore the default configuration of the
switch module in slot 2X and restart the switch module.
<HUAWEI>reset saved-configuration
The action will delete the saved configuration in the device. The
configuration will be erased to reconfigure.Continue? [Y/N]:Y
Warning: Now clearing the configuration in the device.......
begin synchronize configuration to SMM ...
slot 2: upload configuration to SMM successfully.
# Run the reset saved-configuration command to restore the default configuration of the
switch module in slot 3X and restart the switch module.
<HUAWEI>reset saved-configuration
The action will delete the saved configuration in the device. The
configuration will be erased to reconfigure.Continue? [Y/N]:Y
Warning: Now clearing the configuration in the device.
begin synchronize configuration to SMM ...
slot 3: upload configuration to SMM successfully.
# Set the default stack member ID to 2, domain ID to 10, and priority to 150 for the CX310 in
slot 2X.
<HUAWEI> system-view
[~HUAWEI] sysname CX310_2
[*HUAWEI] commit
[~CX310_2] stack
[*CX310_2-stack] stack member 2 priority 150
[*CX310_2-stack] stack member 2 domain 10
[*CX310_2-stack] quit
[*CX310_2] commit
# Add service port 40GE 2/18/1 of the CX310 in slot 2X to stack port 2/1.
[~CX310_2] interface 40GE 2/18/1
[*CX310_2-40GE2/18/1] port mode stack
[*CX310_2-40GE2/18/1] quit
[*CX310_2] commit
[~CX310_2] interface stack-port 2/1
[*CX310_2-Stack-Port2/1] port member-group interface 40GE 2/18/1
[*CX310_2-Stack-Port2/1] quit
[*CX310_2] commit
# Set the default stack member ID to 3, domain ID to 10, and priority to 100 for the CX310 in
slot 3X.
<HUAWEI> system-view
[~HUAWEI] sysname CX310_3
[*HUAWEI] commit
[~CX310_3] stack
[*CX310_3-stack] stack member 3 priority 100
[*CX310_3-stack] stack member 3 domain 10
[*CX310_3-stack] quit
[*CX310_3] commit
# Add service port 40GE 3/18/1 of the CX310 in slot 3X to stack port 3/1.
[~CX310_3] interface 40GE 3/18/1
[*CX310_3-40GE3/18/1] port mode stack
[*CX310_3-40GE3/18/1] quit
[*CX310_3] commit
[~CX310_3] interface stack-port 3/1
[*CX310_3-Stack-Port3/1] port member-group interface 40GE 3/18/1
[*CX310_3-Stack-Port3/1] quit
[*CX310_3] commit
[~CX310_3] quit
<CX310_3> save
NOTE
The switch modules in slots 2X and 3X must be in the same stack domain. After the lower-priority switch
module (in slot 3X) is configured, it restarts automatically and a stack system
is created. If the stack system configuration is correct, run the save command to save the configuration.
(If the switch module in slot 2X is not the master switch after the first master competition, run the
reboot command to restart the stack system. Then, the system will select the switch module in slot 2X
as the master switch based on the priority.)
# Change the system name and view the stack system information.
[~CX310_2] sysname CX310_C
[*CX310_2] commit
[~CX310_C] display stack
---------------------------------------------------------------------
MemberID Role MAC Priority Device Type Bay/Chassis
---------------------------------------------------------------------
2 Master 0004-9f31-d540 150 CX310 2X/1
3 Standby 0004-9f62-1f80 100 CX310 3X/1
---------------------------------------------------------------------
[~CX310_C] quit
<CX310_C> save
3.2 LACP
1. Increased bandwidth: The bandwidth of a link aggregation group (LAG) is the sum of
bandwidth of its member ports. The maximum number of LAG members is 16.
2. High reliability: If an active link fails, traffic of the link is switched to other active links,
improving the LAG reliability.
3. Load balancing: In a LAG, traffic is evenly distributed among active member links.
A switch module LAG is named Eth-Trunk. In an Eth-Trunk, all the member ports must be of
the same type and use the default configuration. Figure 3-8 shows the LAG of switch
modules.
For example, an Eth-Trunk has eight links, and each link provides a bandwidth of 1 Gbit/s. If
the maximum bandwidth required is 5 Gbit/s, you can set the upper threshold to 5. As a result,
the other three member links automatically enter the backup state to improve the network
reliability.
For example, if each physical link provides a bandwidth of 1 Gbit/s and the minimum
bandwidth required is 2 Gbit/s, you can set the lower threshold to 2 or a larger value.
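The effect of the two thresholds can be sketched as follows in Python (illustrative only, not E9000 software; ordering the links by LACP port priority is an assumption):

# Illustrative sketch of the Eth-Trunk upper and lower thresholds for active links; not E9000 code.
def apply_thresholds(links, max_active, least_active):
    """Return (active, backup, trunk_up): at most max_active links stay active, and the
    Eth-Trunk goes down if fewer than least_active usable links remain."""
    usable = [link for link in links if link["up"]]
    active = usable[:max_active]            # the best-ranked links become active
    backup = usable[max_active:]            # the remaining links become backup links
    return active, backup, len(active) >= least_active

links = [{"name": f"link{i}", "up": True} for i in range(8)]   # 8 x 1 Gbit/s member links
active, backup, trunk_up = apply_thresholds(links, max_active=5, least_active=2)
print(len(active), "active,", len(backup), "backup, Eth-Trunk up:", trunk_up)   # 5 active, 3 backup, True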
LACP has two modes: lacp static and lacp dynamic. They handle link negotiation failures in
different ways. In lacp static mode, the Eth-Trunk becomes Down and cannot forward data
after the LACP negotiation fails. In lacp dynamic mode, the Eth-Trunk becomes Down after
the LACP negotiation fails, but member ports inherit Eth-Trunk VLAN attributes and change
to Indep state to independently perform L2 data forwarding.
Table 3-2 Comparison of working modes between Huawei Eth-Trunks and Cisco port
channels
A Huawei Eth-Trunk determines the proactive and passive ends based on system LACP
priorities, that is, the end with a higher system priority is the proactive end and the end with a
lower system priority is the passive end.
3.3 M-LAG
(Figure: M-LAG networking. SwitchA and SwitchB form a dual-active system connected by a peer-link, exchange dual-active detection packets, and connect a downstream switch through M-LAG member interfaces.)
M-LAG master device: The device is configured with M-LAG and is in master state.
M-LAG backup device: The device is configured with M-LAG and is in slave state.
NOTE
Normally, both the master and backup devices forward service traffic.
Dual-Active Detection (DAD): The M-LAG master and backup devices send DAD packets at an interval of 1s on the link of the DAD-enabled interface. When the peer-link fails, DAD is performed.
3.3.3 Implementation
The dual-active system that is set up based on M-LAG provides device-level reliability.
Figure 3-12 shows the M-LAG establishment process. The process includes the following
stages:
1. The devices at both ends of the M-LAG periodically send M-LAG negotiation packets
through the peer-link. When receiving M-LAG negotiation packets from the remote end,
the local end determines whether a DFS group ID in the M-LAG negotiation packets is
the same as that on the local end. If they are the same, device pairing is successful.
2. Both devices compare the DFS group priority in M-LAG negotiation packets to
determine the master and slave status. SwitchB (E9000 switch module) is used as an
example. When receiving packets from SwitchA (E9000 switch module), SwitchB
checks and records information about SwitchA, and compares its DFS group priority
with that of SwitchA. If SwitchA has a higher DFS group priority than SwitchB,
SwitchA is the master device and SwitchB is the backup device. If SwitchA and SwitchB
have the same DFS group priority, the device with a smaller MAC address functions as
the master device.
The master and backup states take effect only when a fault occurs; during normal operation
they do not affect traffic forwarding.
3. After master and backup devices are negotiated, the two devices periodically send M-
LAG dual-active detection packets every second through the dual-active detection link.
When two devices can receive packets from each other, the dual-active system starts to
work.
M-LAG dual-active detection packets are used to detect dual master devices when the
peer-link fails.
– (Recommended) The dual-active detection packets are sent over the MEth
management interface, and the IP address of the MEth management interface
bound to the DFS group must be reachable. The VPN instance bound to the MEth
management interface also separates the detection traffic from network-side routing.
– The dual-active detection packets can also be sent over a network-side link. If
the routing neighbor relationship between the M-LAG master and backup devices
is set up over the peer-link, the dual-active detection packets are sent along the
shortest path. Once the peer-link fails, the packets are sent along the
second shortest path, and dual-active detection may be delayed by half a
second to one second.
4. The two devices send M-LAG synchronization packets through the peer-link to
synchronize information from each other in real time. M-LAG synchronization packets
include MAC address entries and ARP entries, so a fault of any device does not affect
traffic forwarding.
After the M-LAG dual-active system is set up successfully, the M-LAG dual-active system
starts to work. The M-LAG master and backup devices load balance traffic. If a link, device,
or peer-link fault occurs, M-LAG ensures nonstop service transmission. The following
describes traffic forwarding when M-LAG works properly and a fault occurs.
Dual homing a switch to an Ethernet network and an IP network is used as an example.
l Multicast or broadcast traffic from the user side
Multicast or broadcast traffic from S-2 is load balanced between SwitchA and SwitchB.
The following uses the forwarding process on SwitchA as an example.
After receiving multicast traffic, SwitchA forwards traffic to each next hop. The traffic
that reaches SwitchB is not forwarded to S-2 because unidirectional isolation is
configured between the peer-link and M-LAG member interface.
l Unicast traffic from the network side
Unicast traffic sent from the network side to M-LAG member interfaces is forwarded to
the dual-active devices by SwitchA and SwitchB in load balancing mode.
Load balancing is not performed for unicast traffic sent from the network side to non-M-
LAG member interfaces. For example, traffic sent to S-1 is directly sent to SwitchA,
which then forwards the traffic to S-1.
l Multicast or broadcast traffic from the network side
Multicast or broadcast traffic from the network side is load balanced between SwitchA
and SwitchB. The following uses the forwarding process on SwitchA as an example.
SwitchA forwards traffic to each user-side interface. The traffic that reaches SwitchB is
not forwarded to S-2 because unidirectional isolation is configured between the peer-link
and M-LAG member interface.
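The unidirectional isolation mentioned above can be written as a simple flooding filter. The following Python sketch is illustrative only (not E9000 software); the port names are assumptions:

# Illustrative sketch of unidirectional isolation between the peer-link and the
# M-LAG member interfaces; not E9000 code.
PEER_LINK = "Eth-Trunk0"
MLAG_MEMBER_PORTS = {"Eth-Trunk1"}      # dual-homed downlink (for example, towards S-2)
OTHER_PORTS = {"10GE2/17/5"}            # single-homed downlink (for example, towards S-3)

def flood_targets(ingress_port):
    """Ports that receive a copy of broadcast or multicast traffic. Traffic arriving over
    the peer-link is never flooded to M-LAG member ports, because the peer device has
    already delivered a copy to the dual-homed destination."""
    targets = MLAG_MEMBER_PORTS | OTHER_PORTS | {PEER_LINK}
    targets.discard(ingress_port)               # never send traffic back to its ingress port
    if ingress_port == PEER_LINK:
        targets -= MLAG_MEMBER_PORTS            # unidirectional isolation
    return targets

print(flood_targets("Eth-Trunk1"))   # flooded to the peer-link and the single-homed port
print(flood_targets(PEER_LINK))      # flooded to the single-homed port only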
A fault occurs:
l Uplink fails
When a switch is dual-homed to an Ethernet network, DAD packets are often transmitted
through the management network. DAD detection on the M-LAG master device is not
affected, the dual-active system is not affected, and the M-LAG master and backup
devices can still forward traffic. Because the uplink of the M-LAG master device fails,
traffic passing through the M-LAG master device is forwarded through the peer-link.
l Downlink Eth-Trunk fails
The M-LAG master and backup states remain unchanged, and traffic is switched to the
other Eth-Trunk. The faulty Eth-Trunk becomes Down, and the dual-homing networking
changes into a single-homing networking.
l M-LAG master device fails
The M-LAG backup device becomes the master device and continues forwarding traffic,
and its Eth-Trunk is still in Up state. The Eth-Trunk on the master device becomes
Down, and the dual-homing networking changes into a single-homing networking.
NOTE
If the M-LAG backup device fails, the master and backup states remain unchanged and the Eth-Trunk
of the M-LAG backup device becomes Down. The Eth-Trunk on the master device is still in Up state
and continues forwarding traffic. The dual-homing networking changes into a single-homing
networking.
l Peer-link fails
Dual homing a switch to an IP network:
(Figure: SwitchA and SwitchB connect S-1, S-2, and S-3 to an IP network through M-LAG; a Layer 3 link runs between the M-LAG master and backup devices.)
l Unicast traffic from the network side
Unicast traffic sent from the network side to M-LAG member interfaces is forwarded to
the dual-active devices by SwitchA and SwitchB in load balancing mode.
Load balancing is not performed for unicast traffic sent from the network side to non-M-
LAG member interfaces. For example, traffic sent to S-1 is directly sent to SwitchA,
which then forwards the traffic to S-1.
l Multicast traffic from the network side
Normally, network-side multicast traffic is forwarded to S-2 through the M-LAG master
device (SwitchA) only. The M-LAG backup device (SwitchB) cannot forward multicast
traffic through M-LAG member interfaces.
NOTE
To ensure normal Layer 3 multicast transmission on the M-LAG, a Layer 3 direct link
needs to be deployed between the M-LAG master and backup devices. Some multicast traffic can
be forwarded through the Layer 3 link.
A fault occurs:
l Downlink Eth-Trunk fails
The M-LAG master and backup states remain unchanged, and traffic is switched to the
other Eth-Trunk. The faulty Eth-Trunk becomes Down, and the dual-homing networking
changes into a single-homing networking.
l M-LAG master device fails
The M-LAG backup device becomes the master device and continues forwarding traffic,
and its Eth-Trunk is still in Up state. The Eth-Trunk on the master device becomes
Down, and the dual-homing networking changes into a single-homing networking.
NOTE
If the M-LAG backup device fails, the master and backup states remain unchanged and the Eth-Trunk
of the M-LAG backup device becomes Down. The Eth-Trunk on the master device is still in Up state
and continues forwarding traffic. The dual-homing networking changes into a single-homing
networking.
l Peer-link fails
NOTE
The configuration of dual homing a server is the same as common link aggregation configuration. Ensure that
the server and switches use the same link aggregation mode. The LACP mode at both ends is recommended.
(Figure: a server dual-homed to SwitchA and SwitchB through an M-LAG; the two switches are connected by a peer-link and uplink to an Ethernet/IP/TRILL/VXLAN network.)
(Figure: multi-level M-LAG networking. SwitchA and SwitchB form one M-LAG pair connecting the server; SwitchC and SwitchD form another M-LAG pair connecting to the network.)
3.4.1 Overview
NIC teaming allows multiple physical NICs on an Ethernet server to be bound into one virtual
NIC using software. The server then presents only one NIC to the external network and only
one network connection to any application. After the NICs are bonded, data can be sent in
load-sharing or active/standby mode. If one link fails, traffic is switched to another link or to the
standby NIC to ensure server network reliability.
On the GUI, enter a team name, select the NICs to be bound, and set the working mode of the
NIC team.
VSS
The VSS running mode is similar to that of a physical Ethernet switch. The VSS detects VMs
that logically connect to its virtual ports and forwards traffic to the correct VM based on the
detection result. Physical Ethernet adapters (also called uplink adapters) can be used to
connect a virtual network and a physical network so as to connect vSphere standard switches
to physical switches. This connection is similar to the connection between physical switches
for creating a large-scale network. Although similar to physical switches, vSphere VSSs do
not provide certain advanced functions of physical switches. Figure 3-40 shows the
architecture of vSphere VSSs.
VDS
The VDS can be used as a single switch for all associated hosts in a data center, to provide
centralized deployment, management, and monitoring of the virtual network. The
administrator can configure a vSphere distributed switch on a vCenter server. This
configuration will be sent to all hosts associated with the switch, which allows consistent
network configuration when VMs are migrated across hosts. Figure 3-41 shows the
architecture of vSphere VDSs.
Like the physical network, the virtual network also needs to improve network connection
reliability and single-port bandwidth. NIC teaming is provided for the networking in vSphere
virtual environments. vSphere NIC teaming allows the traffic between the physical and virtual
networks to be shared by some or all members and implements switchovers when a hardware
fault or network interruption occurs. Table 3-4 describes the four types of NIC teaming
provided by VMware vSphere 5.5 VSSs.
Route based on the originating virtual port ID: Choose an uplink based on the virtual port through which the traffic entered the virtual switch.
Route based on IP hash: Choose an uplink based on a hash of the source and destination IP addresses of each packet.
Route based on source MAC hash: Choose an uplink based on a hash of the source Ethernet MAC address.
Use explicit failover order: Always use the highest-order uplink from the list of active adapters that passes failover detection criteria.
The NIC teaming policies can be found in the Load Balancing drop-down list shown in NIC
teaming policies.
In addition to NIC teaming policies supported by VSSs, NIC teaming provided by port groups
of VMware vSphere 5.5 VDSs also includes "Route based on physical NIC load", as shown in
Figure 3-43. Only vSphere Enterprise Plus supports distributed switches. vSphere distributed
switches support LACP, as shown in Figure 3-43 and Figure 3-44.
For load balancing algorithms, the components for performing hash functions in the physical
environment include the source and destination IP addresses, TCP/UDP port numbers, and
source and destination MAC addresses. In LACP load balancing mode, virtual switches and
physical switches negotiate with each other to determine the forwarding policy. In the
vSphere virtual environment, components for performing hash functions include the source
and destination IP addresses, source MAC address, and switch port numbers.
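The following Python sketch illustrates how such hash-based policies pin a flow to one uplink (illustrative only, neither VMware nor E9000 code; the uplink names and the use of CRC32 as the hash are assumptions):

# Illustrative sketch of hash-based uplink selection in NIC teaming; not VMware or E9000 code.
import zlib

UPLINKS = ["vmnic0", "vmnic1"]

def uplink_by_mac(src_mac):
    """'Route based on source MAC hash': the same source MAC always uses the same uplink."""
    return UPLINKS[zlib.crc32(src_mac.encode()) % len(UPLINKS)]

def uplink_by_ip(src_ip, dst_ip):
    """'Route based on IP hash': the source/destination IP pair selects the uplink, so one
    VM can spread different sessions over several uplinks."""
    return UPLINKS[zlib.crc32(f"{src_ip}-{dst_ip}".encode()) % len(UPLINKS)]

print(uplink_by_mac("00:50:56:aa:bb:01"))
print(uplink_by_ip("192.168.1.10", "192.168.2.20"))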
1. Switch Independent: The working mode of a NIC team does not depend on a switch.
The NIC teaming works independently. For example, mode = 1/5/6 in Linux, Route
based on the originating virtual port ID, Route based on source MAC hash, and Use
explicit failover order in VMware, and Switch Independent mode in Windows.
2. Manual load balancing: In manual load balancing mode, the NIC team chooses active
links by using the hash algorithm, regardless of LACP negotiation results. For example,
mode = 0/2/3 in Linux, Route based on IP hash in VMware, and Static teaming in
Windows.
3. LACP load balancing: NICs and switch modules on a server negotiate with each other
about NIC teaming through LACP. Physical links become active only after the LACP
negotiation is successful.
Table 3-5 Mapping between working modes of NIC teams and switch modules
Operating System / NIC Teaming Mode / Eth-Trunk Mode
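As a rough guide, the three categories above map to the switch-side configuration as sketched below in Python (an assumed summary, not the original Table 3-5; the Eth-Trunk mode names follow common Huawei usage):

# Assumed mapping of NIC teaming categories to switch-side Eth-Trunk configuration; a sketch, not Table 3-5.
TEAMING_TO_ETH_TRUNK = {
    # Switch Independent: the switch module ports are not bundled into an Eth-Trunk.
    "Linux mode=1/5/6":                   "no Eth-Trunk (independent ports)",
    "VMware originating virtual port ID": "no Eth-Trunk (independent ports)",
    "Windows Switch Independent":         "no Eth-Trunk (independent ports)",
    # Manual load balancing: a manually configured (static) Eth-Trunk.
    "Linux mode=0/2/3":                   "Eth-Trunk in manual load-balance mode",
    "VMware route based on IP hash":      "Eth-Trunk in manual load-balance mode",
    "Windows Static teaming":             "Eth-Trunk in manual load-balance mode",
    # LACP: an Eth-Trunk negotiated through LACP.
    "Linux mode=4 (802.3ad)":             "Eth-Trunk in LACP mode",
    "Windows LACP":                       "Eth-Trunk in LACP mode",
}
print(TEAMING_TO_ETH_TRUNK["VMware route based on IP hash"])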
3.5 FC Technology
This chapter describes the working principles of the FC technology.
l Fabric
A fabric is the network topology where servers and storage devices are interconnected
through one or more switches.
l FCF
A Fibre Channel Forwarder (FCF) is a switch that supports both FCoE and FC protocol stacks
and is used to connect to a SAN or LAN. An FCF forwards FCoE packets and encapsulates or
decapsulates them.
l NPV
An N-Port Virtualization (NPV) switch is at the edge of a fabric network and between ENodes
and FCFs, forwarding traffic from node devices to FCF switches.
l WWN
A World Wide Name (WWN) identifies an entity in a fabric network. A WWN is either a
World Wide Node Name (WWNN) that identifies a node device or a World Wide Port Name
(WWPN) that identifies a device port. Each entity in a SAN is assigned a WWN before
it is delivered from the factory.
l FC ID
FCoE frames are forwarded by using locally unique MAC addresses (unique only in the local
Ethernet subnetwork). FCFs assign locally unique MAC addresses to ENodes or ENodes
specify their own locally unique MAC addresses and inform FCFs. In FPMA mode, FCFs
assign locally unique MAC addresses to ENodes. An FPMA is an FC ID prefixed with a 24-bit
FCoE MAC address prefix (FC-MAP).
l Zone
N_Ports are added to different zones so that the N_Ports are isolated. A zone set is a set of
zones. It is a logical control unit between zones and instances and simplifies configurations.
Each instance can have only one activated zone set.
l Port roles
In a traditional FC network, FC devices interact with each other through FC ports. FC ports
include N-Ports, F-Ports, and NP-Ports.
1. Node port (N_Port): indicates a port on an FC host (server or storage device) and
connects to an FC switch.
2. Fabric port (F_Port): indicates a port on an FC switch and connects to an FC host,
enabling the FC host to access the fabric.
3. N_Port Proxy (NP_Port): indicates a port on an NPV switch and connects to an FCF
switch.
1. When servers or storage arrays go online in the fabric network, they request FC switches
to provide services and register with FC switches through fabric login (FLOGI) packets.
2. FC switches allocate FC addresses to servers and storage devices.
3. Servers and storage devices send name service registration requests to FC switches,
which create and maintain the mapping table between FC addresses and WWNs.
4. A server sends session requests to a target node through a port login (PLOGI).
5. After the session is established between a server and a storage device through a PLOGI,
FC data can be transmitted. An FC switch determines the route and forwards data based
on FC addresses of the server and the storage device.
3.5.3 FCF
A Fibre Channel Forwarder (FCF) switch supports both FCoE and FC protocol stacks for
connecting to SAN and LAN environments. In an FC SAN, an FCF is mainly used for
transmitting FC data. An FCF forwards FCoE packets and encapsulates or decapsulates them.
As shown in Figure 3-50, F_Ports of the FCF directly connect to N_Ports of a server and a
storage array. Each FCF switch has a domain ID. Each FC SAN supports a maximum of 239
domain IDs. Therefore, each FC SAN can contain a maximum of 239 FCF switches.
3.5.4 NPV
A SAN has high demands for edge switches directly connected to node devices. N-Port
virtualization (NPV) switches do not occupy domain IDs and enable a SAN to exceed the
limit of 239 edge switches.
As shown in Figure 3-51, an NPV switch is located at the edge of a fabric network and
between node devices and an FCF. The NPV switch uses F_Ports to connect to N_Ports of
node devices and uses NP_Ports to connect to F_Ports of the FCF switch. As a result, the node
devices connect to the fabric network through the NPV switch, which forwards traffic from all
node devices to the core switch.
For a node device, the NPV switch is an FCF switch that provides F_Ports. For an FCF
switch, an NPV switch is a node device that provides N_Ports.
3.5.5 Zone
In an FCoE network, users can use zones to control access between node devices to improve
network security.
l Zone
A zone contains multiple zone members. A node device can join different zones at the same
time. Node devices in the same zone can access each other. Node devices in different zones
cannot access each other. A zone member can be defined in the following ways:
1. Zone alias: After a zone alias joins a zone, the members in the zone alias also join the
zone.
2. FC_ID: indicates an FC address. Node devices in an FC network access each other
through FC addresses.
3. FCoE port: Node devices in an FCoE network interact with each other through FCoE
ports.
4. WWNN (World Wide Node Name): A WWNN is a 64-bit address used to identify a
node device in an FC network.
5. WWPN (World Wide Port Name): A WWPN is a 64-bit address used to identify a port of
a node device in an FC network.
As shown in Figure 3-52, users can control access between node devices by adding node
devices to different zones. For example, array B can only interact with Server B and Server C,
not Server A.
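The resulting access rule is simple: two node devices may communicate only if some zone in the active zone set contains both of them. A minimal Python sketch mirroring the example above (illustrative only; the WWPN and zone names are assumptions):

# Illustrative sketch of zone-based access control; not E9000 code.
ACTIVE_ZONE_SET = {
    "zone_a": {"server_a_wwpn", "array_a_wwpn"},
    "zone_b": {"server_b_wwpn", "server_c_wwpn", "array_b_wwpn"},
}

def can_access(member1, member2):
    """Two node devices may communicate only if they share at least one zone
    in the activated zone set."""
    return any(member1 in zone and member2 in zone for zone in ACTIVE_ZONE_SET.values())

print(can_access("server_b_wwpn", "array_b_wwpn"))   # True: both are in zone_b
print(can_access("server_a_wwpn", "array_b_wwpn"))   # False: they share no zone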
l Zone set
A zone set contains multiple zones. A zone can join different zone sets at the same time.
Zones in a zone set are valid only after the zone set is activated. Each instance can have only
one activated zone set.
l Zone alias
Applying zone aliases to zone configurations simplifies the configurations. If multiple zone
members need to join multiple zones, you can add the zone members to a zone alias and then
add the zone alias to a zone as a zone member. This avoids adding zone members one by one.
As shown in Figure 3-53, if Members C and D both need to join Zones A and B, you can add
Members C and D to Zone Alias A, and then add Zone Alias A to Zones A and B. This
simplifies configurations.
3.6 DCB
3.6.2 PFC
PFC is also called Per Priority Pause or Class Based Flow Control (CBFC). It is an
enhancement to the Ethernet Pause mechanism. PFC is a priority-based flow control
mechanism. As shown in Figure 3-54, the transmit interface of Device A is divided into eight
queues of different priorities. The receive interface of Device B is divided into eight buffers.
The eight queues and eights buffers are in one-to-one correspondence. When a receive buffer
on Device B is to be congested, Device B sends a STOP signal to Device A. Device A stops
sending packets in the corresponding priority queue when receiving the STOP signal.
PFC allows traffic in one or multiple queues to be stopped, which does not affect data
exchange on the entire interface. Data transmission in each queue can be separately stopped or
resumed without affecting other queues. This feature enables various types of traffic to be
transmitted on the same link. The system does not apply the backpressure mechanism to the
priority queues with PFC disabled and directly discards packets in these queues when
congestion occurs.
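A minimal Python sketch of this per-priority behavior (illustrative only, not E9000 software; the buffer threshold and the set of PFC-enabled priorities are assumptions):

# Illustrative sketch of priority-based flow control on a receive interface; not E9000 code.
PFC_ENABLED_PRIORITIES = {3}     # for example, only the FCoE priority queue is lossless
XOFF_THRESHOLD = 80              # pause when a per-priority buffer is this full (arbitrary units)

def handle_buffer(priority, occupancy):
    """Decide what the receiver does when a per-priority buffer fills up."""
    if occupancy < XOFF_THRESHOLD:
        return "forward normally"
    if priority in PFC_ENABLED_PRIORITIES:
        return f"send a PFC PAUSE for priority {priority} only"   # other queues keep flowing
    return "drop excess packets"   # no backpressure for PFC-disabled priorities

print(handle_buffer(3, 90))   # lossless queue: pause just this priority
print(handle_buffer(1, 90))   # best-effort queue: congestion drops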
3.6.3 ETS
The converged data center network bears three types of traffic: inter-process communication
(IPC) traffic, local area network (LAN) traffic, and storage area network (SAN) traffic. The
converged network has high QoS requirements. The traditional QoS cannot meet
requirements of the converged network, whereas Enhanced Transmission Selection (ETS)
uses hierarchical scheduling to guarantee QoS on the converged network. ETS provides two
levels of scheduling: scheduling based on the priority group (PG) and scheduling based on the
priority. Figure 3-55 illustrates how ETS works. On an interface, PG-based scheduling is
performed first, and then priority-based scheduling is performed.
A PG is a group of priority queues with the same scheduling attributes. Users can add queues
with different priorities to a PG. PG-based scheduling is called level-1 scheduling. ETS
defines three PGs: PG0 for LAN traffic, PG1 for SAN traffic, and PG15 for IPC traffic.
As defined by ETS, PG0, PG1, and PG15 use priority queue (PQ)+Deficit Round Robin
(DRR). PG15 uses PQ to schedule delay-sensitive IPC traffic. PG0 and PG1 use DRR. In
addition, bandwidth can be allocated to PGs based on actual networking.
As shown in Figure 3-56, the queue with priority 3 carries FCoE traffic and is added to the
SAN group (PG1). Queues with priorities 0, 1, 2, 4, and 5 carry LAN traffic and are added to
the LAN group (PG0). The queue with priority 7 carries IPC traffic and is added to the IPC
group (PG15). The total bandwidth of the interface is 10 Gbit/s. Each of PG1 and PG0 is
assigned 50% of the total bandwidth, that is, 5 Gbit/s.
At t1 and t2, all traffic can be forwarded because the total traffic on the interface is within the
interface bandwidth. At t3, the total traffic exceeds the interface bandwidth and LAN traffic
exceeds the given bandwidth. At this time, LAN traffic is scheduled based on ETS parameters
and 1 Gbit/s LAN traffic is discarded.
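The t3 case can be reproduced with a small calculation. The following Python sketch is illustrative only (not E9000 software); letting one group borrow bandwidth that the other leaves unused is an assumption:

# Illustrative sketch of two-level ETS scheduling on a 10 Gbit/s port; not E9000 code.
PORT_BW = 10.0                                      # Gbit/s
DRR_WEIGHTS = {"PG0(LAN)": 0.5, "PG1(SAN)": 0.5}    # 50%/50%, as in the example above

def ets_allocate(offered_pq, offered_drr):
    """PG15 is served first (PQ); the remaining bandwidth is shared by PG0 and PG1
    according to their DRR weights, and unused bandwidth may be borrowed."""
    remaining = PORT_BW - min(offered_pq, PORT_BW)
    granted = {pg: min(load, DRR_WEIGHTS[pg] * remaining) for pg, load in offered_drr.items()}
    spare = remaining - sum(granted.values())
    for pg, load in offered_drr.items():            # unsatisfied groups may use spare bandwidth
        extra = min(spare, load - granted[pg])
        granted[pg] += extra
        spare -= extra
    return granted

# At t3: 5 Gbit/s of SAN traffic and 6 Gbit/s of LAN traffic are offered to the 10 Gbit/s port.
print(ets_allocate(0.0, {"PG1(SAN)": 5.0, "PG0(LAN)": 6.0}))
# {'PG1(SAN)': 5.0, 'PG0(LAN)': 5.0} -> 1 Gbit/s of LAN traffic is discarded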
ETS also provides PG-based traffic shaping. The traffic shaping mechanism limits traffic
bursts in a PG to ensure that traffic in this group is sent out at an even rate.
In addition to PG-based scheduling, ETS also provides priority-based scheduling, that is,
level-2 scheduling, for queues in the same PG. Queues in the same PG support queue
congestion management, queue shaping, and queue congestion avoidance.
3.6.4 DCBX
To implement lossless Ethernet on a converged data center network, both ends of an FCoE
link must have the same PFC and ETS parameter settings. Manual configuration of PFC and
ETS parameters may increase administrator's workloads and cause configuration errors.
DCBX, a link discovery protocol, enables devices at both ends of a link to discover and
exchange DCB configurations. This greatly reduces administrator's workloads. DCBX
provides the following functions:
1. Detects the DCB configurations of the peer device.
2. Detects the DCB configuration errors of the peer device.
3. Configures DCB parameters for the peer device.
DCBX enables DCB devices at both ends to exchange the following DCB configurations:
1. ETS PG information
2. PFC
DCBX encapsulates DCB configurations into Link Layer Discovery Protocol (LLDP) type-
length-values (TLVs) so that devices at both ends of a link can exchange DCB configurations.
# Create an ETS profile and add queue 3 to PG1 (by default), and other queues to PG0. PG15
is empty.
[*CX310_2] dcb ets-profile ets1
[*CX310_2-ets-ets1] priority-group 0 queue 0 to 2 4 to 7
# Configure PG-based flow control and set DRR weights of PG0 and PG1 to 60% and 40%
respectively.
[*CX310_2-ets-ets1] priority-group 0 drr weight 60
[*CX310_2-ets-ets1] quit
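For DCBX itself, LLDP must be enabled so that the ETS and PFC parameters can be advertised to the CNA. The following is a minimal sketch; the lldp tlv-enable dcbx keyword is an assumption that may differ by software version, and the port number is illustrative.
[~CX310_2] lldp enable
[*CX310_2] interface 10GE 2/1/1
[*CX310_2-10GE2/1/1] lldp tlv-enable dcbx
[*CX310_2-10GE2/1/1] quit
[*CX310_2] commit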
3.7 FCoE
3.7.2.1 ENode
An ENode is a CNA that supports both FCoE and FC. A traditional server houses two
network adapters: a NIC connected to the LAN and an HBA connected to the SAN. A CNA
provides both NIC and HBA functions: it can forward Ethernet data, process upper-layer
FCoE packets, and encapsulate and decapsulate FCoE frames.
3.7.2.3 FIP
FIP is a L2 protocol that discovers FC terminals on an FCoE network, implements fabric
login, and establishes FCoE virtual links. An ENode can log in to the fabric over FIP to
communicate with the target FC device. FIP can also maintain FCoE virtual links.
FCoE encapsulates an FC frame into an Ethernet frame. Figure 3-60 shows FCoE frame
encapsulation.
1. The Ethernet Header specifies the source and destination MAC addresses, the Ethernet
frame type, and the FCoE VLAN.
2. The FCoE Header specifies the FCoE frame version number and flow control
information.
3. Similar to a traditional FC frame, the FC Header specifies the source and destination
addresses of an FC frame.
3.7.4 FIP
FIP, an FCoE control protocol, establishes and maintains FCoE virtual links between FCoE
devices, for example, between ENodes and FCFs. In the process of creating a virtual link:
1. FIP discovers an FCoE VLAN and the FCoE virtual interface of the remote device.
2. FIP completes initialization tasks, such as fabric login (FLOGI) and fabric discovery
(FDISC), for the FCoE virtual link.
After an FCoE virtual link is set up, FIP maintains it in the following ways:
1. Periodically checks whether the FCoE virtual interfaces at both ends of the FCoE virtual
link are reachable.
2. Removes the FCoE virtual link through fabric logout (FLOGO) when it is no longer
needed.
The following figure shows the process of setting up an FCoE virtual link between an ENode
and FCF. The ENode and FCF exchange FIP frames to establish the FCoE virtual link. After
the FCoE virtual link is set up, FCoE frames are transmitted over the link.
An FCoE virtual link is set up through three phases: FIP VLAN discovery, FIP FCF
discovery, and FIP FLOGI and FDISC. The FIP FLOGI and FDISC process is similar to the
FLOGI and FDISC process defined in traditional FC protocol.
1. An ENode sends a FIP VLAN discovery request packet (FIP VLAN request) to the
multicast MAC address All-FCF-MAC (01-10-18-01-00-02). All FCFs listen for packets
destined for this MAC address.
2. All FCFs report one or more FCoE VLANs to the ENode through a common VLAN.
The FCoE VLANs are available for the ENode's VN_Port login. FIP VLAN discovery is
an optional phase as defined in FC-BB-5. An FCoE VLAN can be manually configured
by an administrator, or dynamically discovered using FIP VLAN discovery.
On an FCoE virtual link, intermediate Ethernet devices may exist between the ENode and the
FCF, so the physical port state cannot be used to determine whether the virtual link is still
operational. FIP provides a keepalive mechanism to solve this problem. FCoE virtual links
are monitored as follows:
1. The ENode periodically sends FIP Keepalive packets to the FCF. If the FCF does not
receive FIP Keepalive packets within 2.5 times the keepalive interval, it considers the
FCoE virtual link faulty and terminates it.
2. The FCF periodically sends multicast discovery advertisement messages to the
destination MAC address ALL-ENode-MAC, that is, to all ENodes. If the ENode does
not receive these advertisements within 2.5 times the keepalive interval, it considers the
FCoE virtual link faulty and terminates it.
For example, with the FKA advertisement period of 8000 ms shown later in this document, a
virtual link is torn down after roughly 20 seconds without keepalives.
If the FCF does not receive FIP keepalive packets from the ENode, the FCF sends an FIP
clear virtual link message to the ENode to clear the FCoE virtual link. When the ENode logs
out, it sends a Fabric Logout request message to the FCF to terminate the virtual link.
When an FCoE switch is deployed between an ENode and an FCF, the FCoE switch forwards
FCoE frames based on Ethernet because it does not process the FC protocol. In this case,
FCoE frames may not be destined for the FCF, and the point-to-point association between the
ENode and the FCF is lost. To achieve robustness equivalent to that of an FC network, the
FCoE switch must forward FCoE traffic from all ENodes to the FCF. FIP snooping obtains
FCoE virtual link information by listening to FIP packets, controls the setup of FCoE virtual
links, and defends against malicious attacks.
An FCoE switch running FIP snooping is called an FIP Snooping Bridge (FSB). The 10GE
switch modules of the E9000 support FIP snooping.
3.8.1 Background
Dual-uplink networking is commonly used to connect E9000 servers to the existing network
system to ensure network reliability. Figure 3-61 and Figure 3-62 show two types of dual-
uplink networking.
However, this dual-uplink networking creates a loop between the cascaded access switches A
and B and the switch modules in slots 2X and 3X of the E9000, which may cause broadcast
storms. The Spanning Tree Protocol (STP) is generally used to prevent loops. However, STP
convergence is slow and a large amount of traffic is lost during convergence, so STP is
unsuitable for networks that demand short convergence times. In addition, STP cannot
directly interwork with Cisco network devices running the Cisco proprietary Per-VLAN
Spanning Tree Plus (PVST+) protocol. To address these issues, Huawei provides the Smart
Link solution.
When all links and devices are running properly, the links between the switch module in slot
2X and switch A are active links and forward service data. The links between the switch
module in slot 3X and switch B are blocked. When switch A is faulty (a VRRP switchover is
required) or the links connected to the switch module in slot 2X are faulty, the links between
the switch module in slot 3X and switch B change to the active state and forward data. That
is, data is forwarded along the red lines shown in the figure.
As Flush packets are Huawei proprietary protocol packets, traffic switchovers can be
triggered by Flush packets only when the uplink devices are Huawei switches that support the
Smart Link feature. Otherwise, the forwarding path must be switched over by the uplink and
downlink devices themselves, or traffic recovers only after the MAC address entries age out.
# Cascade the CX310s in slots 2X and 3X. (For details, see the iStack
configuration instance.)
# Create an Eth-Trunk, add the first and second ports to the Eth-Trunk, and
disable STP.
[~CX310_C] interface Eth-Trunk 2
[*CX310_C-Eth-Trunk2] mode lacp-static
[*CX310_C-Eth-Trunk2] stp disable
[*CX310_C-Eth-Trunk2] trunkport 10GE 2/17/1 to 2/17/2
[*CX310_C-Eth-Trunk2] quit
[*CX310_C] interface Eth-Trunk 3
[*CX310_C-Eth-Trunk3] mode lacp-static
[*CX310_C-Eth-Trunk3] stp disable
[*CX310_C-Eth-Trunk3] trunkport 10GE 3/17/1 to 3/17/2
[*CX310_C-Eth-Trunk3] quit
[*CX310_C] commit
# Create a Smart Link group, and set Eth-Trunk 2 as the master port and Eth-Trunk
3 as the slave port.
[~CX310_C] smart-link group 1
[*CX310_C-smlk-group1] port Eth-Trunk 2 master
[*CX310_C-smlk-group1] port Eth-Trunk 3 slave
[*CX310_C-smlk-group1] quit
[*CX310_C] commit
As shown in the preceding figure, the CX310s in slots 2X and 3X are each configured with
one Monitor Link group to monitor the uplinks to switch A and switch B. Blades 1, 2, and 3
each connect to one 10GE port on the CX310 in slot 2X and one on the CX310 in slot 3X,
working in active/standby mode (or in VM-based load-sharing mode). If the link between the
CX310 in slot 3X and switch B is faulty, the Monitor Link group on that CX310 shuts down
all downlink ports of the group (the ports connected to the blade servers). When the blade
servers detect the port faults, service data is switched to the link between the CX310 in slot
2X and switch A (the red lines shown in the figure), so that the downlink status follows the
uplink status.
# Create a Monitor Link group, and add uplink and downlink ports to the group.
(Perform these operations for the CX310s in slots 2X and 3X.)
[~CX310_2] interface 10GE2/17/1
[~CX310_2-10GE2/17/1] stp disable
[*CX310_2-10GE2/17/1] quit
[*CX310_2] monitor-link group 1
[*CX310_2-mtlk-group1] port 10GE2/17/1 uplink
[*CX310_2-mtlk-group1] port 10GE2/1/1 downlink 1
[*CX310_2-mtlk-group1] port 10GE2/2/1 downlink 2
[*CX310_2-mtlk-group1] port 10GE2/3/1 downlink 3
[*CX310_2-mtlk-group1] quit
[*CX310_2] commit
If the switch module fails to obtain the configuration file from the active MM910 during
startup, an error message is continuously displayed on the serial port (or through SOL).
4 Networking Applications
This section describes common networking modes of the E9000 in actual application
scenarios, including Ethernet networking, FC networking, FCoE converged networking, and
networking with Cisco switches running proprietary protocols.
For details about Smart Link networking, see section 3.8. Smart Link prevents loops between
multiple switches by blocking the standby links, and it requires no modification on the access
switches A and B.
PVST runs an independent STP instance in each VLAN, so each VLAN has its own STP
state and calculation. The PVST series protocols cannot interwork with the IEEE standard
STP/RSTP/MSTP protocols. Table 4-2 lists the differences between PVST frames and
STP/RSTP/MSTP frames. Cisco developed PVST+ based on PVST and Rapid-PVST+ based
on PVST+. Table 4-3 describes the improvements.
Table 4-4 describes how interworking between PVST+ and standard STPs is implemented on
different types of ports.
Trunk port: The default VLAN (VLAN 1) allows two types of packets: BPDUs in standard
STP/RSTP format and untagged private PVST BPDUs. Private PVST BPDUs (with the
destination MAC address 01-00-0C-CC-CC-CD) are sent over the other allowed VLANs.
NOTE
If the trunk port does not allow packets from the default VLAN to pass through, the port transmits
neither standard STP/RSTP BPDUs nor untagged private PVST BPDUs.
Figure 4-5 Smart Link group interconnecting with Cisco PVST+ network
4.2.5.2 IEEE Standard Protocol (Root Bridge on the Cisco PVST+ Network Side)
CX switch modules connect to the Cisco PVST+ network over MSTP, and the
interconnection ports automatically switch to RSTP mode. To ensure that the root bridge is
on the Cisco PVST+ network side, the CX switch modules and Cisco switches must be
configured with proper cost values and priorities (the bridge priority for VLAN 1 on the
Cisco side should be higher than the Huawei CST priority). Ensure that the root bridges are
on the Cisco switches and that the blocked ports of VLAN 1 are on the CX switch module
ports. The CX switch modules also block the packets of other VLANs, so the blocking points
of the other VLANs of the Cisco PVST+ network fall on the same CX switch module port.
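A minimal sketch of the CX-side bridge priority setting is shown below, assuming the standard stp priority command; the value 61440 is only an illustrative (largest) priority value that keeps the root bridge on the Cisco side, and the Cisco-side VLAN priorities must be configured accordingly.
[~CX310_C] stp priority 61440
[*CX310_C] commit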
vPC uses two independent control planes, whereas the VSS (stack) technology uses one
control plane. Figure 4-8 shows the functional components involved in vPC.
If the E9000 connects to the vPC network, CX switch modules are used as downstream
devices, which connect to the vPC domain through L2 link aggregation ports. Figure 4-9
shows two access scenarios.
Stack the two CX switch modules into one logical switch and connect them to the vPC
domain through an Eth-Trunk that spans both switch modules. Alternatively, connect each
switch module to the vPC domain through its own Eth-Trunk.
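For the stacked case, the cross-module Eth-Trunk can be created with the same commands used in the Smart Link example earlier in this document. The following is a minimal sketch; the Eth-Trunk ID and port numbers are illustrative, and the Cisco vPC side must be configured with a matching port channel.
[~CX310_C] interface Eth-Trunk 10
[*CX310_C-Eth-Trunk10] mode lacp-static
[*CX310_C-Eth-Trunk10] trunkport 10GE 2/17/1 to 2/17/2
[*CX310_C-Eth-Trunk10] trunkport 10GE 3/17/1 to 3/17/2
[*CX310_C-Eth-Trunk10] quit
[*CX310_C] commit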
The CX311 10GE Ethernet switching chip (LSW) connects to the MX510 through eight
10GE ports. FCoE traffic sent from CNAs is forwarded to the MX510s through the eight
10GE ports. The MX510 implements the conversion between FCoE and FC ports and
externally connects to FC storage or switches.
The MX510 can work in Transparent (NPV) or Full Fabric (by default) mode. You can
change the working mode on the MX510 CLI. Table 4-5 lists the working modes of the
MX510 when it is connected to different devices.
For example, when the MX510 connects to an FC switch such as the Brocade 300, Brocade
5000, or Brocade 7500, it works in Transparent mode. Connect FC switches from different
vendors in NPV (Transparent) mode to reduce interconnection risks.
Configure port roles. Set the role of the ports connected to the MX510 to VNP (ports are
ENode-facing by default).
# Configure roles for the ports connected to the MX510.
[~CX311_C] interface 10ge 2/20/3
[~CX311_C-10GE2/20/3] fcoe role vnp
Configure security features.
l Add all the ports connected to the MX510 to a port group to prevent mutual interference
between Ethernet packets. Suppress outbound broadcast, unknown unicast, and multicast
packets on all ports. (Only allow port */20/1 to send multicast packets with the
destination MAC address 0110-1801-0002 at a rate of no more than 2 Mbit/s.) Of all
outbound packets, only FIP and FCoE packets are allowed.
l Prevent panel ports from sending or receiving FIP and FCoE packets.
l Prevent the ports connected to MM910s and the 40GE ports interconnecting switch
modules from sending or receiving FIP and FCoE packets.
# Apply the traffic policy FCOE-p1 to the ports connected to the MX510.
[*CX311_C-10GE2/20/1] port-isolate enable group 1
[*CX311_C-10GE2/20/1] stp disable
[*CX311_C-10GE2/20/1] storm suppression broadcast block outbound
[*CX311_C-10GE2/20/1] storm suppression unknown-unicast block outbound
[*CX311_C-10GE2/20/1] traffic-policy FCOE-p1 outbound
# Apply the traffic policy FCOE-p11 to panel ports.
[*CX311_C-10GE2/17/1] traffic-policy FCOE-p11 inbound
[*CX311_C-10GE2/17/1] traffic-policy FCOE-p11 outbound
Note: If the panel ports are added to an Eth-Trunk, apply the traffic policy FCOE-p11 to the
Eth-Trunk.
The MX510 in full fabric mode can directly connect to storage devices. Connect the FC ports
on the switch module panel to both the controllers A and B of the storage device to ensure
system reliability and storage performance.
A CNA of a blade server provides at least two ports to connect to the switch modules in slots
2X and 3X or in slots 1E and 4E. The two ports perform the FIP VLAN discovery, FIP FCF
discovery, and FIP FLOGI operations on the connected MX510 in sequence. When receiving
a CNA registration request, the MX510 randomly selects a port from the eight 10GE ports
that connect the MX510 to the LSW as the FCoE session port. Then, all FCoE packets are
sent and received through this port. If the port becomes faulty, FIP Keepalive fails and the
FCoE session becomes invalid. The CNA then repeats the FIP VLAN discovery, FIP FCF
discovery, and FIP FLOGI operations so that the MX510 assigns a new port. After the new
port is assigned, a new FCoE session is set up automatically.
Similarly, the MX510 connects to an FC switch through multiple FC links. During FIP login,
the MX510 randomly selects an FC link as the FCoE session transmission link. If the FC link
is faulty, the CNA registers again and the MX510 reselects an FC link randomly to implement
a failover. During the failover, the FC panel port and the 10GE port between the LSW and
MX510 are both reselected randomly, as shown in Figure 4-13.
Create an FCoE VLAN. Create FCoE VLAN 1003.
[~CX311_C] fcoe FCOE-1003
[~CX311_C-fcoe-FCOE-1003] vlan 1003
Configure port roles for the MX510. Set the role of the ports (in slot 3X) connected to the
MX510 to VNP. Same as the default configuration.
Configure DCB features. Configure the ports (in slot 3X) connected to the MX510 and
compute nodes. The configuration items are the same. Same as the default configuration.
The CX310s are not stacked. (If the CX310s are stacked, you only need to configure the ETS
template and LLDP once; the port configurations of slots 2X and 3X are the same except for
the FCoE VLAN.) This section uses the CX310 in slot 2X as an example to describe how to
configure the connection between the CX310 switch module and a Cisco Nexus 5000 switch.
The CX320 supports three FCoE networking modes: FSB, FCF, and NPV. It is the best choice
for converged networks of enterprises and carriers. The CX320 can be configured with an
FCF instance to connect to FC or FCoE storage devices or configured with an NPV instance
to connect to FC or FCoE switches, so that various converged networking requirements are
met.
The CX320 switch modules can be installed in slots 2X and 3X or slots 1E and 4E and work
with 2-port or 4-port CNAs to meet various converged application requirements. As shown in
Figure 4-16, CX320 switch modules are installed in the E9000 server to implement a
converged network inside the E9000 chassis and convergence of the external LAN and SAN.
The MX517 flexible cards provide 8G FC ports to connect to FC switches or storage so that
the existing SAN resources are fully utilized. With flexible cards, the CX320 supports
evolution to 16G FC, 32G FC, and 25GE to fully protect customers' investments.
The CNAs on compute nodes connect to the CX320 switch modules in slots 2X and 3X or
slots 1E and 4E. Each CX320 connects to both storage controllers of the FC or FCoE storage.
If one CX320 fails, the other one is still connected to the two controllers so that the number of
storage controllers and storage performance are not reduced. After the CX320 switch modules
are configured with FCF instances and connected to FC storage, the port roles are as shown in
Figure 4-18.
In the figure, the CX320 switch modules use VF_Ports to connect to the CNAs of compute
nodes and F_Ports to connect to storage devices.
As shown in Figure 4-19, CX320 switch modules are installed in slots 2X and 3X; an MX517
is installed in flexible card slot 1 of each CX320; the compute node is installed in slot 1; an
MZ510 is installed on the compute node; each CX320 uses one 10GE link to connect to the
LAN and one FC link to connect to FC storage so that network reliability is ensured.
2. Select Personality by pressing arrow keys and press Enter. The Personality options
shown in Figure 4-21 are displayed.
3. Select FCoE by pressing arrow keys and press Enter. Then select Save by pressing
arrow keys and press Enter, as shown in Figure 4-22.
Configure FC ports.
# Before configuring FC ports, insert FC optical modules into the ports. If the Link indicator
is steady green, the port is up.
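The FC port conversion itself is not shown at this point in the document; a minimal sketch follows, reusing the port mode fc command from the NPV example later in this section with an assumed port number.
[~CX320_2X] port mode fc 10GE 2/21/1
Warning: This operation will cause all the other configurations on the port to be lost.
Continue?[Y/N]:Y
[*CX320_2X] commit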
# For the CX320 in slot 2X, create an FCF instance fcf1 and specify VLAN 2094 as the FCoE
VLAN of the instance.
[*CX320_2X] fcoe fcf1 fcf
[*CX320_2X-fcoe-fcf-fcf1] vlan 2094
[*CX320_2X-fcoe-fcf-fcf1] quit
# Set the ports connected to CH121 compute nodes as hybrid ports and add the FCoE and
Ethernet VLANs.
[*CX320_2X] interface 10GE 2/1/1
[*CX320_2X-10GE2/1/1] port link-type hybrid
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2094
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2093
[*CX320_2X-10GE2/1/1] fcoe-port 1
[*CX320_2X-10GE2/1/1] quit
# Set ports connected to the LAN as access ports and add the Ethernet VLAN.
[*CX320_2X] interface 10GE 2/17/1
[*CX320_2X-10GE2/17/1] port link-type access
[*CX320_2X-10GE2/17/1] port default vlan 2093
----End
The CNA on a compute node connects to the CX320 switch modules in slots 2X and 3X or
slots 1E and 4E. Each CX320 connects to an FC or FCoE switch, which is connected to
storage through a SAN. After the CX320 switch modules are configured with NPV instances
and connected to FC switches, the port roles are as shown in Figure 4-24.
In the figure, the NPV instances of the CX320 switch modules use VF_Ports to connect to
CNAs of compute nodes and use NP_Ports to connect to FC switches.
2. Select Personality by pressing arrow keys and press Enter. The Personality options
shown in Figure 4-27 are displayed.
3. Select FCoE by pressing arrow keys and press Enter. Then select Save by pressing
arrow keys and press Enter, as shown in Figure 4-28.
Configure FC ports.
# Before configuring FC ports, insert FC optical modules into the ports. If the Link indicator
is steady green, the port is up.
# Configure port 10GE 2/21/1 of the CX320 in slot 2X as an FC port.
<HUAWEI> system-view
[~HUAWEI] sysname CX320_2X
[*HUAWEI] commit
[~CX320_2X] port mode fc 10GE 2/21/1
Warning: This operation will cause all the other configurations on the port to be lost.
Continue?[Y/N]:Y
[*CX320_2X] commit
# For the CX320 in slot 2X, create an NPV instance npv1 and specify VLAN 2094 as the
FCoE VLAN and VLAN 2095 as the NPV VLAN.
[*CX320_2X] fcoe npv1 npv
[*CX320_2X-fcoe-npv-npv1] vlan 2094
[*CX320_2X-fcoe-npv-npv1] npv-vlan 2095
[*CX320_2X-fcoe-npv-npv1] quit
# Set the ports connected to CH121 compute nodes as hybrid ports and add the FCoE and
Ethernet VLANs.
[*CX320_2X] interface 10GE 2/1/1
[*CX320_2X-10GE2/1/1] port link-type hybrid
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2094
[*CX320_2X-10GE2/1/1] port hybrid tagged vlan 2093
[*CX320_2X-10GE2/1/1] fcoe-port 1
[*CX320_2X-10GE2/1/1] quit
# Set the ports connected to the LAN as access ports and add the Ethernet VLAN.
[*CX320_2X] interface 10GE 2/17/1
[*CX320_2X-10GE2/17/1] port link-type access
[*CX320_2X-10GE2/17/1] port default vlan 2093
[*CX320_2X] commit
# Verify the configuration. The FCoE instance information shows the FCoE VLAN and the
FKA advertisement period:
-------------------------------------------------------------------------------
VLAN : 2094
FKA-ADV-Period(ms) : 8000
-------------------------------------------------------------------------------
Total: 1
The configuration procedure for the CX320 in slot 3X is the same as that for the CX320 in
slot 2X, except that the FC port numbers and the numbers of the ports connected to compute
nodes are different.
----End
4.5 FC Networking
4.5.1.1 Overview
The CX9xx series multi-plane switch modules include the CX911 and CX912 with 10GE
+ 8G FC ports, CX915 with GE + 8G FC ports, and CX916 with 10GE + 16G FC ports.
Figure 4-29 shows the internal structure of multi-plane switch modules.
The Ethernet and FC planes are independent in their management, control, and data channels,
like two independent switches combined into one switch module. The two planes share the
same hardware management and monitoring system, improving system integration. The
CX210 and CX220 contain only an FC switching plane. The CX210 and CX912 both
integrate the MX210 as the FC switching plane, and the CX220 and CX916 both integrate the
MX220 as the FC switching plane. The CX916 provides 16G FC switching when working
with the MZ220 NIC. The CX916 provides Ethernet switching only if it is installed in slot 2X
or 3X and works with V5 compute nodes (such as the CH121 V5 with the two-port 10GE
LOM). If the CX916 is installed in slot 1E or 4E, it provides only FC switching and works
with the MZ220 NIC.
From the OS perspective, each FCoE port has a NIC and an HBA, and each 10GE port has a
NIC. Therefore, if the CX911 or CX915 is used with the MZ910, each MZ910 in slots 1 to 12
provides four NICs and two HBAs and each MZ910 in slots 13 to 16 provides two NICs and
two HBAs. See Table 4-12.
Table 4-12 (excerpt): CX911, slots 13-16, FC plane: 2 NICs, 2 HBAs; CX912, slots 01-16,
FC plane: 2 NICs, 2 HBAs.
When MX210s (Native) are connected to Brocade switches through E_Ports, the Full Fabric
license must be configured.
Connect the MX510s of the CX911/CX915, the MX210s of the CX912, and the CX210s to
storage devices in crossover mode to ensure system reliability and storage performance. The
MX210s and MX510s in the default configuration can directly connect to storage devices.
FC Fibre Channel
PG Port Group