Title Page
IP Multicast, Volume II
Advanced Multicast Concepts and Large-Scale Multicast Design
Copyright Page
IP Multicast, Volume II
Advanced Multicast Concepts and Large-Scale Multicast Design
Josh Loveless, Ray Blair, and Arvind Durai
Copyright© 2018 Cisco Systems, Inc.
Published by:
Cisco Press
800 East 96th Street
Indianapolis, IN 46240 USA
All rights reserved. No part of this book may be reproduced or transmitted in any form or
by any means, electronic or mechanical, including photocopying, recording, or by any
information storage and retrieval system, without written permission from the publisher,
except for the inclusion of brief quotations in a review.
Printed in the United States of America
1 18
Library of Congress Control Number: 2017962613
ISBN-13: 978-1-58714-493-6
ISBN-10: 1-58714-493-X
Warning and Disclaimer
This book is designed to provide information about advanced topics in IP Multicast
networking. Every effort has been made to make this book as complete and as accurate
as possible, but no warranty or fitness is implied.
The information is provided on an “as is” basis. The authors, Cisco Press, and Cisco
Systems, Inc. shall have neither liability nor responsibility to any person or entity with
respect to any loss or damages arising from the information contained in this book or
from the use of the discs or programs that may accompany it.
The opinions expressed in this book belong to the authors and are not necessarily those of
Cisco Systems, Inc.
Trademark Acknowledgments
All terms mentioned in this book that are known to be trademarks or service marks have
been appropriately capitalized. Cisco Press or Cisco Systems, Inc., cannot attest to the
accuracy of this information. Use of a term in this book should not be regarded as
affecting the validity of any trademark or service mark.
Special Sales
For information about buying this title in bulk quantities, or for special sales opportunities
(which may include electronic versions; custom cover designs; and content particular to
your business, training goals, marketing focus, or branding interests), please contact our
corporate sales department at [email protected] or (800) 382-3419.
For government sales inquiries, please contact [email protected].
For questions about sales outside the U.S., please contact [email protected].
Feedback Information
At Cisco Press, our goal is to create in-depth technical books of the highest quality and
value. Each book is crafted with care and precision, undergoing rigorous development
that involves the unique expertise of members from the professional technical
community.
Readers’ feedback is a natural continuation of this process. If you have any comments
regarding how we could improve the quality of this book, or otherwise alter it to better
suit your needs, you can contact us through email at [email protected]. Please
make sure to include the book title and ISBN in your message.
We greatly appreciate your assistance.
Editor-in-Chief: Mark Taub
Alliances Manager, Cisco Press: Arezou Gol
Managing Editor: Sandra Schroeder
Composition: codemantra
Americas Headquarters
Cisco Systems, Inc.
San Jose, CA
Asia Pacific Headquarters
Cisco Systems (USA) Pte. Ltd.
Singapore
Europe Headquarters
Cisco Systems International BV
Amsterdam, The Netherlands
Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers
are listed on the Cisco Website at www.cisco.com/go/offices.
Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its
affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this
URL: www.cisco.com/go/trademarks. Third party trademarks mentioned are the property
of their respective owners. The use of the word partner does not imply a partnership
relationship between Cisco and any other company. (1110R)
About This E-Book
EPUB is an open, industry-standard format for e-books. However, support for EPUB and
its many features varies across reading devices and applications. Use your device or app
settings to customize the presentation to your liking. Settings that you can customize
often include font, font size, single or double column, landscape or portrait mode, and
figures that you can click or tap to enlarge. For additional information about the settings
and features on your reading device or app, visit the device manufacturer’s Web site.
Many titles include programming code or configuration examples. To optimize the
presentation of these elements, view the e-book in single-column, landscape mode and
adjust the font size to the smallest setting. In addition to presenting code and
configurations in the reflowable text format, we have included images of the code that
mimic the presentation found in the print book; therefore, where the reflowable format
may compromise the presentation of the code listing, you will see a “Click here to view
code image” link. Click the link to view the print-fidelity code image. To return to the
previous page viewed, click the Back button on your device or app.
About the Authors
Josh Loveless, CCIE No. 16638, is a systems engineering manager for Cisco Systems.
He has been with Cisco since 2012, providing architecture and support services for tier 1
service providers as well as for many of Cisco’s largest enterprise customers, specializing
in large-scale routing and switching designs. Prior to joining Cisco, he spent 15 years
working for large service providers and enterprises as both an engineer and an architect,
as well as providing training and architecture services to some of Cisco’s trusted partners.
Josh maintains two CCIE certifications, Routing and Switching and Service Provider.
Ray Blair, CCIE No. 7050, is a distinguished systems engineer and has been with Cisco
Systems since 1999. He uses his years of experience to align technology solutions with
business needs to ensure customer success. Ray started his career in 1988, designing
industrial monitoring and communication systems. Since that time, he has been involved
with server/database administration and the design, implementation, and management of
networks that included networking technologies from ATM to ZMODEM. He maintains
three CCIE certifications in Routing and Switching, Security, and Service Provider (No.
7050), and he is also a Certified Information Systems Security Professional (CISSP), and
a Certified Business Architect (No. 00298). Ray is coauthor of three Cisco Press books,
Cisco Secure Firewall Services Module, Tcl Scripting for Cisco IOS, and IP Multicast,
Volume 1. He speaks at many industry events and is a Cisco Live distinguished speaker.
Arvind Durai, CCIE No. 7016, is a director of solution integration for Cisco Advanced
Services. Arvind is a chief architect for advanced services for the West Enterprise
Region, an organization of around 100 consultants focused on customer success for
approximately 150 enterprise accounts. Over the past 18 years, Arvind has been
responsible for supporting major Cisco customers in the enterprise sector, including
financial, retail, manufacturing, e-commerce, state government, utility, and health care
sectors. Some of his focuses have been on security, multicast, network virtualization, data
center enterprise cloud adoption, automation, and software-defined infrastructure, and he
has authored several white papers on various technologies. He has been involved in
multicast designs for several enterprise customers in different verticals. He is also one of
the contributors to the framework for the Advanced Services Multicast Audit tool, which
helps customers assess their operational multicast network against industry best practices.
Arvind maintains two CCIE certifications, Routing and Switching and Security, and also
is a Certified Business Architect. He holds a bachelor of science degree in electronics and
communication, a master’s degree in electrical engineering, and a master’s degree in
business administration. He has coauthored four Cisco Press books: Cisco Secure
Firewall Services Module, Virtual Routing in the Cloud, Tcl Scripting for Cisco IOS,
and IP Multicast, Volume 1. He has also coauthored an IEEE paper on WAN smart grid architecture and has
presented at many industry forums, such as IEEE and Cisco Live.
About the Technical Reviewers
Nick Garner, CCIE No. 17871, is a solutions integration architect for Cisco Systems. He
has been in Cisco Advanced Services, supporting customers in both transactional and
subscription engagements, for 8 years. In his primary role, he has deployed and supported
large-scale data center designs for prominent clients in the San Francisco Bay Area. His
primary technical focus, outside of data center routing and switching designs, has been
security and multicast. Prior to joining Cisco, Nick worked for a large national financial
institution as a network security engineer. Nick maintains two CCIE certifications,
Routing and Switching and Security.
Yogeshwaran Raghunathan, CCIE No. 6583, is a senior solutions integration architect
on the Advanced Services team at Cisco Systems. Yogi holds an MBA and an
engineering degree in electronics and communication from CIT (Coimbatore, India). He
has 22 years of experience working in the networking industry, 17 of them with Cisco
Systems, supporting various service providers in North America. Yogi’s hands-on
experience in building and supporting large service provider networks has exposed him to
complex MPLS architectures, thus enabling different perspectives on the new world of
SDN and MPLS deployment. Yogi has in recent years been involved in design,
implementation, and planning for large web provider networks. He can be reached at
[email protected].
Dedication
This book is dedicated to my wonderful family and to all my friends who have supported
my career throughout many difficult years.—Josh Loveless
This book is dedicated to my wife, Sonya, and my children, Sam, Riley, Sophie, and
Regan. You guys mean the world to me!—Ray Blair
This book is dedicated to my parents and family for their support and blessings.—Arvind
Durai
Acknowledgments
Josh Loveless: A special thank you goes to my coauthors, Ray Blair and Arvind Durai,
for the great work they have done completing this two-volume set on IP Multicast. I
would also like to thank the technical reviewers, Yogi and Nick, and all the editors at
Pearson for all the tireless work they put into making this book pop!
Ray Blair: As with everything else in my life, I thank my Lord and Savior for his faithful
leading that has brought me to this place. Thank you, Josh and Arvind, for partnering in
this endeavor, Nick and Yogi for your excellent reviews, and Pearson for your support.
Arvind Durai: Thank you, Monica and Akhhill, for your continuous support and
patience that helped me complete my fifth book.
Thank you, Ray and Josh, for making this journey of writing IP Multicast, Volume 1 and
Volume 2 a joyful ride.
A special thanks to Brett Bartow, Yogi Raghunathan, and Nick Garner for your valuable
contributions.
As always, thank you, God, for giving me guidance, opportunity, and support in all my
endeavors!
Contents at a Glance
Introduction
Chapter 1 Interdomain Routing and Internet Multicast
Chapter 2 Multicast Scalability and Transport Diversification
Chapter 3 Multicast MPLS VPNs
Chapter 4 Multicast in Data Center Environments
Chapter 5 Multicast Design Solutions
Chapter 6 Advanced Multicast Troubleshooting
Index
Contents
Introduction
Chapter 1 Interdomain Routing and Internet Multicast
Introduction to Interdomain Multicast
What Is a Multicast Domain? A Refresher
PIM Domain Design Types
Domains by Group, or Group Scope
Domains by RP Scope
Overlapping Domains and Subdomains
Forwarding Between Domains
Autonomous System Borders and Multicast BGP
Configuring and Verifying MBGP for Multicast
Domain Borders and Configured Multicast Boundaries
Multicast Source Discovery Protocol
Understanding Source Actives (SA) and MSDP Mechanics
Configuring and Verifying MSDP
Basic MSDP Deployment Use Case
Intradomain versus Interdomain Design Models
Intra-AS Multidomain Design
Inter-AS and Internet Design
Protecting Domain Borders and Interdomain Resources
Firewalling IP Multicast
Controlling Domain Access through Filtering
Service Filtering at the Edge
Interdomain Multicast Without Active Source Learning
SSM
IPv6 with Embedded RP
Summary
References
Chapter 2 Multicast Scalability and Transport Diversification
Why Is Multicast Not Enabled Natively in a Public Cloud Environment?
Enterprise Adoption of Cloud Services
Cloud Connectivity to an Enterprise
Virtual Services in a Cloud
Service Reflection Feature
Use Case 1: Multicast-to-Multicast Destination Conversion
Use Case 2: Unicast-to-Multicast Destination Conversion
Use Case 3: Multicast-to-Unicast Destination Conversion
Multicast Traffic Engineering
Unicast Spoke-to-Spoke Intra-regional Communication
Unicast Spoke-to-Spoke Interregional Communication
Unicast Spoke-to-Central Hub Communication
Traffic Path of a Multicast Stream Sourced at the Central Hub
Intra-regional Multicast Flow Between Two Spokes
Enabling Multicast to the CSP Use Case 1
Point-to-Point VC Provided by the NSP
MPLS Layer 3 VPN Services for the NSP
Enabling Multicast to the CSP Use Case 2
Summary
Chapter 3 Multicast MPLS VPNs
Multicast in an MPLS VPN Network
Multicast Distribution Tree (MDT)
Default MDT
Data MDT
Multicast Tunnel Interface (MTI)
Multicast Signaling in the Core
Default MDT in Action
Data MDT Traffic Flow
Default MDT Example
MTI Example
Data MDT Example
Multicast LDP (MLDP)
FEC Elements
In-Band Signaling Operation
Out-of-Band (Overlay) Signaling Operation
Default MDT MLDP
Default MDT MLDP Root High Availability
MLDP in Action
Default MDT MLDP Example
Data MDT MLDP Example
Profiles
Migrating Between Profiles
Provider (P) Multicast Transport
PE–CE Multicast Routing
CE–CE Multicast Routing
PE–PE Ingress Replication
Multicast Extranet VPNs
Route Leaking
VRF Fallback
VRF Select
Fusion Router
VRF-Aware Service Infrastructure (VASI)
IPv6 MVPN
Bit Index Explicit Replication (BIER)
Summary
References
Chapter 4 Multicast in Data Center Environments
Multicast in a VPC Environment
Multicast Flow over a VPC
VXLAN
VTEP
VXLAN Flood and Learn
VXLAN with EVPN
Spine 1 Configuration
Leaf Configuration
Ingress Replication
Host-to-Host Multicast Communication in VXLAN
Layer 2 Communication Within the Boundary of the VNI
Layer 3 Multicast Communication
Multicast in ACI Data Center Networks
ACI Fabrics and Overlay Elements
Layer 2 IGMP Snooping in ACI
Layer 3 Multicast in ACI
Summary
Chapter 5 Multicast Design Solutions
Multicast-Enabled Clinical Networks
Accommodating Medical Device Communications Through Multicast
Multicast Considerations for Wireless Networks
Multicast in Multitenant Data Centers
ACI Multitenant Multicast
Multicast and Software-Defined Networking
LISP Map Resolver (MR)/Map Server (MS)
LISP PETRs/PITRs
LISP and Multicast
Multicast in Utility Networks
PMU
Radio over IP Design
Multicast-Enabled Markets
Multicast Design in a Market Data Environment
FSP Multicast Design
Brokerage Multicast Design
Service Provider Multicast
Service Provider PIM-Type Selection and RP Placement
IPTV Delivery over Multicast
Summary
References
Chapter 6 Advanced Multicast Troubleshooting
Troubleshooting Interdomain Multicast Networks
Troubleshooting PIM with Traffic Engineering
Troubleshooting MVPN
Verifying Multicast in VXLAN
Summary
Index
Command Syntax Conventions
The conventions used to present command syntax in this book are the same conventions
used in the IOS Command Reference. The Command Reference describes these
conventions as follows:
Boldface indicates commands and keywords that are entered literally as shown. In
actual configuration examples and output (not general command syntax), boldface
indicates commands that are manually input by the user (such as a show command).
Italic indicates arguments for which you supply actual values.
Vertical bars (|) separate alternative, mutually exclusive elements.
Square brackets ([ ]) indicate an optional element.
Braces ({ }) indicate a required choice.
Braces within brackets ([{ }]) indicate a required choice within an optional element.
Note: This book covers multiple operating systems, and icons and router names indicate
the appropriate OS that is being referenced. IOS and IOS-XE use router names like R1
and R2 and are referenced by the IOS router icon. IOS-XR routers use router names like
XR1 and XR2 and are referenced by the IOS-XR router icon.
Reader Services
Register your copy at www.ciscopress.com/title/ISBN for convenient access to
downloads, updates, and corrections as they become available. To start the registration
process, go to www.ciscopress.com/register and log in or create an account*. Enter the
product ISBN 9781587144936 and click Submit. When the process is complete, you will
find any available bonus content under Registered Products.
*Be sure to check the box that you would like to hear from us to receive exclusive
discounts on future editions of this product.
Introduction
IP Multicast, Volume 2 covers advanced IP Multicast designs and protocols specific to
Cisco Systems routers and switches. It includes pragmatic discussion of common features,
deployment models, and field practices for advanced IP Multicast networks. The
discussion culminates with commands and methodologies for implementing and
troubleshooting advanced Cisco IP Multicast networks.
Who Should Read This Book?
IP Multicast, Volume 2 is intended for any professional supporting IP Multicast
networks. This book primarily targets the following groups, but network managers and
administrators will also find value from the included case studies and feature
explanations:
IP network engineers and architects
Network operations technicians
Network consultants
Security professionals
Collaboration specialists and architects
How This Book Is Organized
This book is organized into six chapters that cover the following topics:
Chapter 1, “Interdomain Routing and Internet Multicast”: This chapter explains
the fundamental requirements for interdomain multicast and the three pillars of
interdomain design: control plane for source identification, control plane for receiver
identification, and downstream control plane.
Chapter 2, “Multicast Scalability and Transport Diversification”: Transportation of
multicast messages requires consideration of several factors, especially when cloud
service providers do not support native multicast. This chapter introduces the key
concepts of cloud services and explains the elements required to support multicast
services.
Chapter 3, “Multicast MPLS VPNs”: Multicast VPNs provide the ability to logically
separate traffic on the same physical infrastructure. Most service providers and many
enterprise customers implement Multiprotocol Label Switching (MPLS) so they can
separate or isolate traffic into logical domains or groups, generally referred to as virtual
private networks (VPNs). This chapter discusses the options for implementing multicast
VPNs.
Chapter 4, “Multicast in Data Center Environments”: This chapter explains the use
of multicast in the data center. Understanding the nuances of how multicast functions in
myriad solutions is critical to the success of your organization. The goal of this chapter is
to provide insight into the operation of multicast, using the most popular methods for data
center implementation, including virtual port channel (VPC), Virtual Extensible LAN
(VXLAN), and Application Centric Infrastructure (ACI).
Chapter 5, “Multicast Design Solutions”: This chapter examines several archetypical
network design models. One of the models represents a specific network strategy that
meets a specific commercial purpose—a trade floor. Another model is a general design
for a specific industry, focusing on the deployment of multicast in a hospital environment.
The intent of this chapter is to provide a baseline for each type of design as well as
examples of best practices for multicast deployments.
Chapter 6, “Advanced Multicast Troubleshooting”: This chapter explains the basic
methodology for troubleshooting IP Multicast networks.
Chapter 1
Interdomain Routing and Internet Multicast
This chapter explains the fundamental requirements for interdomain multicast and the three pillars of interdomain design:
the control plane for source identification, the control plane for receiver identification, and the downstream control plane.
Each of the network entities (surrounded by an oval) shown in Figure 1-1 is known as an autonomous system (AS)—that is,
a network that has administrative and operational boundaries, with clear demarcations between itself and any other
network AS. Like IP addresses, autonomous systems are numbered, and the assignment of numbers is controlled by the
Internet Assigned Numbers Authority (IANA). Figure 1-2 shows the previous Internet example network represented as
simple AS bubbles using private autonomous system numbers (ASNs).
Figure 1-2 The Mcast Enterprises Network as a System of Connected Autonomous Systems
Note: ASNs, as defined by the IETF, are public, just like IP addresses. However, as with IP addresses, there is a private
number range that is reserved for use in non-public networks. The standard 16-bit private ASN range is 64512–65534,
which is defined by RFC 6996 (a 2013 update to RFC 1930). Even though Internet functions may be discussed, all
numbers used in this text (for IP addressing, ASNs, and multicast group numbers) are private to prevent confusion with
existing Internet services and to protect public interest.
Some routing information is shared between the interconnected ASs to provide a complete internetwork picture. Best-
effort forwarding implies that as routers look up destination information and traffic transits between ASs, each AS has its
own forwarding rules. To illustrate this concept, imagine that a home user is connected to ISP Red (AS 65001) and sends
an IP web request to a server within Mcast Enterprises (AS 65100). The enterprise does not have a direct connection to
ISP Red. Therefore, ISP Red must forward the packets to ISP Blue, and ISP Blue can then forward the traffic to the
enterprise AS with the server. ISP Red knows that ISP Blue can reach the enterprise web server at address 10.10.1.100
because ISP Blue shared that routing information with all the ASs it is connected to. ISP Red does not control the network
policy or functions of ISP Blue and must trust that the traffic can be successfully passed to the server. In this situation, ISP
Blue acts as a transit network. Figure 1-3 illustrates this request.
Some protocols are designed to make best-effort forwarding between ASs more precise. This set of protocols is maintained
by the IETF, an international organization that governs all protocols related to IP forwarding, including PIM and BGP. The
IETF—like the Internet—is an open society in which standards are formed collaboratively. This means there is no inherent
administrative mandate placed on network operating system vendors or networks connecting to the Internet to follow
protocol rules precisely or even in the same way.
This open concept is one of the more miraculous and special characteristics of the modern Internet. Networks and devices
that wish to communicate across network boundaries should use IETF-compliant software and configurations. Internet
routers that follow IETF protocol specifications should be able to forward IP packets to any destination on any permitted
IP network in the world. This assumes that every Internet-connected router shares some protocols with its neighbors, and
those protocols are properly implemented (as discussed further later in this chapter); each router updates neighbor ASs
with at least a summary of the routes it knows about. This does not assume that every network is configured or
administered in the same way.
For example, an AS is created using routing protocols and policies as borders and demarcation points. In fact, routing
protocols are specifically built for these two purposes. Internal routing is handled by an Interior Gateway Protocol (IGP).
Examples of IGPs include Open Shortest Path First (OSPF) and Intermediate System-to-Intermediate System (IS-IS).
Routers use a chosen IGP on all the routed links within the AS. When routing protocols need to share information with
another AS, an External Gateway Protocol (EGP) is used to provide demarcation. This allows you to use completely
separate routing policies and security between ASs that might be obstructive or unnecessary for internal links.
Border Gateway Protocol version 4 (BGPv4) is the EGP that connects all Internet autonomous systems together. Figure 1-
4 shows an expanded view of the Mcast Enterprises connectivity from Figure 1-3, with internal IGP connections and an
EGP link with ISP Blue. Mcast Enterprises shares a single summary route for all internal links to ISP Blue via BGP.
Figure 1-4 AS Demarcation: IGP Links Using OSPF and EGP Links Using BGP
The separation between internal (IGP) and external (EGP) routes provides several important benefits. The first is
protection of critical internal infrastructure from outside influence, securing the internal routing domain. Both security and
routing policies can be applied to the EGP neighborship with outside networks. Another clear benefit is the ability to better
engineer traffic. Best-effort forwarding may be acceptable for Internet connections and internetwork routing. However,
you definitely need more granular control over internal routes. In addition, if there are multiple external links that share
similar routing information, you may want to control external path selection or influence incoming path selection—without
compromising internal routing fidelity. Finally, you may choose to import only specific routes from external neighbors or
share only specific internal routes with outside neighbors. Selective route sharing provides administrators control over how
traffic will or will not pass through each AS.
Why is all this relevant to multicast? PIM is the IETF standard for Any-Source Multicast (ASM) and Source-Specific
Multicast (SSM) in IP networks. Many people refer to PIM as a multicast routing protocol. However, PIM is unlike any
IGP or EGP. It is less concerned with complex route sharing policy than with building loop-free forwarding topologies or
trees. PIM uses the information learned from IP unicast routing protocols to build these trees. PIM networking and
neighborships have neither an internal nor an external characteristic. PIM neighborships on a single router can exist
between both IGP neighbors and EGP neighbors. If multicast internetworking is required between two ASs, PIM is a
requirement. Without this relationship, a tree cannot be completed.
Crossing the administrative demarcation point from one AS to another means crossing into a network operating under a
completely different set of rules and with potentially limited shared unicast routing information. Even when all the routers
in two different networks are using PIM for multicast forwarding, forming a forwarding tree across these networks using
PIM alone is virtually impossible because Reverse Path Forwarding (RPF) information will be incomplete. In addition, it is
still necessary to secure and create a prescriptive control plane for IP Multicast forwarding as you enter and exit each AS.
The following sections explore these concepts further and discuss how to best forward multicast application traffic across
internetworks and the Internet.
If the required entries are not in the unicast RIB and the RPF table, the router drops any multicast packets to prevent a
loop. The router doesn’t just make these RPF checks independently for every packet. RPF checks are performed by the
PIM process on the router, and they are also used for building the multicast forwarding tree. If a packet fails the RPF
check, the interface on which the packet is received is not added to any trees for the destination group. In fact, PIM uses
RPF in almost all tree-building activities.
In addition, PIM as a protocol is independent of any unicast routing protocols or unicast forwarding. Proper and efficient
multicast packet forwarding is PIM’s main purpose; in other words, PIM is designed for tree building and loop prevention.
As of this writing, most PIM domains run PIM Sparse-Mode (PIM–SM). A quick review of PIM–SM mechanics provides
additional aid in the exploration of multicast domains.
There are two types of forwarding trees in PIM–SM: the shortest-path tree (also known as a source tree) and the shared
tree. The source tree is a tree that flows from the source (the root of the tree) to the receivers (the leaves) via the shortest
(most efficient) network path. You can also think of the source as a server and the receivers as clients of that server. As
mentioned earlier, clients must be connected to the Layer 3 network to be included in the tree. This allows the router to see
the best path between the source and the receivers.
The subscribed clients become a “group,” and, in fact, clients use a multicast group address to perform the subscription.
Internet Group Management Protocol (IGMP) is the protocol that manages the client subscription process. IGMP shares
the IP group subscriptions with PIM. PIM then uses those groups to build the source tree and shares the tree information
with other PIM neighbors.
Note: This review is only meant to be high level as it is germane to understanding why additional protocols are required for
interdomain routing. If any of these topics are more than a simple review for you, you can find a deeper study of IGMP,
PIM mechanics, and tree building in additional texts, such as IP Multicast, Volume 1.
Keep in mind that there may be many locations in a network that have a receiver. Any time a router has receivers that are
reached by multiple interfaces, the tree must branch, and PIM must RPF check any received sources before adding
forwarding entries to the router’s Multicast Forwarding Information Base (MFIB). A source tree in a multicast forwarding
table is represented by an (S, G) entry (for source, group). The (S, G) entry contains RPF information such as the interface
closest to the source and the interfaces in the path of any downstream receivers.
The branches in the entry are displayed as a list of outgoing interfaces, called an outgoing interface list (OIL). This is
essentially how PIM builds the source tree within the forwarding table. Neighboring PIM routers share this information
with each other so that routers will independently build proper trees, both locally and across the network. Example 1-1
shows what a completed tree with incoming interfaces and OILs looks like on an IOS-XE router by using the show ip
mroute command.
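Example 1-1 itself is not reproduced in this extract. The following is a representative IOS-XE entry, with illustrative group, source addresses, and interface names that are not taken from the book's topology, showing the shape of a completed tree: the incoming interface and the OIL for both the shared (*, G) tree and the source (S, G) tree.

R1# show ip mroute 239.1.1.1

(*, 239.1.1.1), 00:05:31/stopped, RP 10.0.0.1, flags: SJC
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0/1, Forward/Sparse, 00:05:31/00:02:44

(10.1.1.10, 239.1.1.1), 00:01:12/00:02:50, flags: T
  Incoming interface: Ethernet0/0, RPF nbr 10.1.2.1
  Outgoing interface list:
    Ethernet0/1, Forward/Sparse, 00:01:12/00:02:44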
Because the definitions of multicast domains are fluid, it is important to define those types of domains that make logical
sense and those most commonly deployed. You also need to establish some guidelines and best practices around how best
to deploy them. In almost all cases, the best design is the one that matches the particular needs of the applications running
on the network. Using an application-centered approach means that administrators are free to define domains according to
need rather than a set prescriptive methodology.
The primary focus of defining a domain is drawing borders or boundaries around the desired reach of multicast
applications. To accomplish this, network architects use router configurations, IP Multicast groups and scoping, RP
placement strategies, and other policy to define and create domain boundaries. If the multicast overlay is very simple, then
the domain may also be very simple, even encompassing the entire AS (refer to Figure 1-6). This type of domain would
likely span an entire IGP network, with universal PIM neighbor relationships between all IGP routers. A single RP could
be used for all group mappings. In this type of domain, all multicast groups and sources would be available to all systems
within the domain.
These types of domains with AS-wide scope are becoming more and more rare in practice. Application and security
requirements often require tighter borders around specific flows for specific applications. An administrator could use
scoping in a much more effective way.
In many cases, the best way to scope a domain is by application. It is best practice for individual applications to use
different multicast groups across an AS. That means that you can isolate an application by group number and scope the
domain by group number. This is perhaps the most common method of bounding a domain in most enterprise networks.
It is also very common to have numerous applications with similar policy requirements within a network. They may need
the same security zoning, or the same quality of service (QoS), or perhaps similar traffic engineering. In these cases, you
can use group scopes to accomplish the proper grouping of applications for policy purposes. For example, a local-only set
of applications with a singular policy requirement could be summarized by a summary address. Two applications using
groups 239.20.2.100 and 239.20.2.200 could be localized by using summary address 239.20.0.0/16. To keep these
applications local to a specific geography, the domain should use a single RP or an Anycast RP pair for that specific
domain.
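On IOS-XE, a group scope like this is typically tied to an RP with a standard access list. The following minimal sketch assumes, for illustration only, that R2's loopback address is 10.0.0.2 and that the 239.20.0.0/16 range should map to it:

R2(config)# access-list 20 permit 239.20.0.0 0.0.255.255
R2(config)# ip pim rp-address 10.0.0.2 20

Every router in the local scope would carry the same mapping, either statically as shown here or learned through Auto-RP or BSR, so that only the 239.20.0.0/16 range maps to the local RP.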
Figure 1-7 shows just such a model—a domain scoping in the Mcast Enterprises network using a combination of geography
and application type, as defined by a group scope. Assuming that this network is using PIM–SM, each domain must have at
least one active RP mapped to the groups in the scope that is tied to the local region-specific application or enterprise-
specific groups. (Operational simplicity would require separate nodes representing RPs for different scope.) In this case,
R1, R2, and R3 are RPs/border routers for their respective domains.
Chapter 5, “IP Multicast Design Considerations and Implementation,” in IP Multicast, Volume 1 explains domain scoping
at length and provides additional examples of how to scope a domain based on application, geography, or other properties.
Most IP Multicast problems that occur in networks happen because the networks haven’t been properly segmented. (This
is actually often true of unicast networks as well.) A well-defined scope that is based on applications can be very helpful in
achieving proper segmentation.
Domains by RP Scope
In some larger networks, the location and policy of the applications are of less concern than the resiliency of network
resources. If there are many applications and the multicast network is very busy, RP resources could simply be overloaded.
It may be necessary to scope on RP location rather than on application type or group.
Scoping by RP allows the network architect to spread multicast pathing resources across the network. It is not uncommon
to use Anycast RP groupings to manage such large deployments. For example, with the network shown in Figure 1-7,
instead of using each router as a single RP, all three routers could be grouped together for Anycast RP in a single large
domain. Figure 1-8 shows an updated design for Mcast Enterprises for just such a scenario.
If the network design requires multiple types of domains and accompanying policies, it is likely that a hybrid design model
is required. This type of design might include many multicast domains that overlap each other in certain places. There
could also be sub-domains of a much larger domain.
Let’s look one more time at the Mcast Enterprises network, this time using a hybrid design (see Figure 1-9). In this case, an
application is sourced at R2 with group 239.20.2.100. That application is meant to be a geographically local resource and
should not be shared with the larger network. In addition, R2 is also connected to a source for group 239.1.2.200, which
has an AS-wide scope.
Remember that each domain needs a domain-specific RP or RP pair. In this case, R2 is used as the local domain RP and
R1 as the RP for an AS-wide scoped domain. Because the 239.20/16 domain is essentially a subset of the private multicast
group space designated by 239.0.0.0/8, the 239.20/16 domain is a subdomain of the much larger domain that uses R1 as
the RP. This design also uses two additional subdomains, with R1 as local RP and R3 as local RP.
For further domain resiliency, it would not be unreasonable to use R1 as a backup or Anycast RP for any of the other
groups as well. In these cases, you would use very specific source network filters at the domain boundaries. Otherwise, you
may want to look more closely at how to accomplish forwarding between these domains. Table 1-1 breaks down the scope
and RP assignments for each domain and subdomain.
In a hybrid design scenario, such as the example shown in Figure 1-9, it may be necessary to forward multicast traffic
down a path that is not congruent with the unicast path; this is known as traffic engineering. In addition, Mcast
Enterprises may have some multicast applications that require forwarding to the Internet, as previously discussed, or
forwarding between the domains outlined in Figure 1-9, which is even more common. Additional protocols are required to
make this type of interdomain forwarding happen.
Now that you understand the three pillars of interdomain design and the types of domains typically deployed, you also
need to meet these pillar requirements in order to forward traffic from one domain to another domain. If any of these
elements are missing from the network design, forwarding between domains simply cannot occur. But how does a network
build an MRIB for remote groups or identify sources and receivers that are not connected to the local domain?
Each domain in the path needs to be supplemented with information from all the domains in between the source and the
receivers. There are essentially two ways to accomplish this: statically or dynamically. A static solution involves a network
administrator manually entering information into the edge of the network in order to complete tree building that connects
the remote root to the local branches. Consider, for example, two PIM–SM domains: one that contains the source of a flow
and one that contains the receivers.
The first static tree-building method requires the source network to statically nail the multicast flow to the external
interface, using an IGMP static-group join. Once this is accomplished, if it is a PIM–SM domain, the receiver network can
use the edge router as either a physical RP or a virtual RP to which routers in the network can map groups and complete
shared and source trees. This is shown in Figure 1-10, using the Mcast Enterprises network as the receiver network and ISP
Blue as the source network. Both networks are using PIM–SM for forwarding.
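A minimal sketch of this first method on an IOS-XE edge router in the source network follows; it assumes the join is placed on ISP Blue's border router, that Ethernet0/0 is the border interface toward the receiver domain, and that 239.120.1.1 is the group being statically joined (all of these names are illustrative):

SP3-1(config)# interface Ethernet0/0
SP3-1(config-if)# ip pim sparse-mode
SP3-1(config-if)# ip igmp static-group 239.120.1.1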
When the edge router is used for RP services, it can pick up the join from the remote network and automatically form the
shared and source trees as packets come in from the source network. If Mcast Enterprises does not wish to use the edge
router as an RP but instead uses a centralized enterprise RP like R1, tree building will fail as the edge router will not have
the shared-tree information necessary for forwarding. The second interdomain static tree-building method solves this
problem by using a PIM dense-mode proxy (see Figure 1-11), which normally provides a proxy for connecting a dense-
mode domain to a sparse-mode domain.
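On IOS-XE, this proxy behavior is enabled per interface with the ip pim dense-mode proxy-register command. The sketch below is illustrative only; it assumes Ethernet0/0 faces the neighboring dense-mode side and that access list 10 matches the sources that should be proxy-registered toward the sparse-mode RP:

BR(config)# access-list 10 permit 172.23.0.0 0.0.255.255
BR(config)# interface Ethernet0/0
BR(config-if)# ip pim sparse-dense-mode
BR(config-if)# ip pim dense-mode proxy-register list 10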
As discussed earlier, each Internet-connected organization has unique needs and requirements that drive the
implementation of IETF protocol policies and configurations. A unique Internet-connected network, with singular
administrative control, is commonly referred to as an autonomous system (AS). The Internet is essentially a collection of
networks, with ASs connected together in a massive, global nexus. A connection between any two ASs is an administrative
demarcation (border) point. As IP packets cross each AS border, a best-effort trust is implied; that is, the packet is
forwarded to the final destination with reasonable service.
Each border-connecting router needs to share routing information through a common protocol with its neighbors in any
other AS. This does not mean that the two border routers use the same protocols, configuration, or policy to communicate
with other routers inside the AS. It is a near certain guarantee that an ISP will have a completely different set of protocols
and configurations than would an enterprise network to which it provides service.
The AS border represents a policy- and protocol-based plane of separation between routing information that an
organization wishes to make public and routing information that it intends to keep private. Routing protocols used within
the AS are generally meant to be private.
The modern Internet today uses Border Gateway Protocol (BGP)—specifically version 4, or BGPv4—as the EGP protocol
for sharing forwarding information between AS border routers. Any IP routing protocol could be used internally to share
routes among intra-AS IP routers. Internal route sharing may or may not include routes learned by border routers via BGP.
Any acceptable IP-based IGP can be used within the AS.
Figure 1-12 expands on the AS view and more clearly illustrates the demarcation between internal and external routing
protocols. In this example, not only is there an AS demarcation between Mcast Enterprises and ISP Blue but Mcast
Enterprises has also implemented a BGP confederation internally. Both AS border routers share an eBGP neighborship. All
Mcast Enterprises routes are advertised by the BR to the Internet via SP3-1, the border router of ISP Blue. Because Mcast
Enterprises is using a confederation, all routes are advertised from a single BGP AS, AS 65100. In this scenario, BR
advertises a single summary prefix of 10.0.0.0/8 to the Internet via ISP Blue.
Note: BGP confederations are not required in this design. The authors added confederations in order to more fully
illustrate multicast BGP relationships.
As mentioned earlier, the practice of best-effort networking also applies to internetworks that are not a part of the Internet.
Large internetworks may wish to segregate network geographies for specific policy or traffic engineering requirements.
Many enterprises use BGP with private AS numbers to accomplish this task. In some ways, this mirrors the best-effort
service of the Internet but on a much smaller scale. However, what happens when multicast is added to tiered networks or
interconnected networks such as these? Can multicast packets naturally cross AS boundaries without additional
considerations?
Remember the three pillars of interdomain design. Autonomous system borders naturally create a problem for the first of
these requirements: the multicast control plane for source identification. Remember that the router must know a proper
path to any multicast source, either from the unicast RIB or learned (either statically or dynamically) through a specific
RPF exception.
There is no RPF information from the IGP at a domain boundary. Therefore, if PIM needs to build a forwarding tree across
a domain boundary, there are no valid paths on which to build an OIL. If any part of the multicast tree lies beyond the
boundary, you need to add a static or dynamic RPF entry to complete the tree at the border router.
As discussed in IP Multicast, Volume 1, static entries are configured by using the ip mroute command in IOS-XE and NX-
OS. For IOS-XR, you add a static entry by using the static-rpf command. However, adding static RPF entries for large
enterprises or ISPs is simply not practical. You need a way to transport this information across multiple ASs, in the same
way you transport routing information for unicast networks. Autonomous system border routers typically use BGP to
engineer traffic between ASs. The IETF added specific, multiprotocol extensions to BGPv4 for this very purpose.
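For reference before moving on to MBGP, a static RPF entry of the kind just mentioned looks like the following on IOS-XE; the source prefix is purely illustrative, and the RPF neighbor used here is the eBGP peer address from Example 1-2:

BR(config)# ip mroute 172.21.100.0 255.255.255.0 172.23.31.1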
RFC 2283 created multiprotocol BGPv4 extensions (the most current RFC is RFC 4760, with updates in RFC 7606). This
is often referred to as Multiprotocol Border Gateway Protocol (MBGP). MBGP uses the same underlying IP unicast
routing mechanics inherent in BGPv4 to forward routing and IP prefix information about many other types of protocols.
Whereas BGPv4 in its original form was meant to carry routing information and updates exclusively for IPv4 unicast,
MBGP supports IPv6, multicast, and label-switched virtual private networks (VPNs) (using MPLS VPN technology, as
discussed in Chapter 3, “Multicast MPLS VPNs”), and other types of networking protocols as well. In the case of
multicast, MBGP carries multicast-specific route prefix information against which routers can RPF check source and
destination multicast traffic. The best part of this protocol arrangement is that MBGP can use all the same tuning and
control parameters for multicast prefix information sharing that apply to regular IPv4 unicast routing in BGP.
The MBGP RFC extends BGP reachability information by adding two additional path attributes,
MP_REACH_NLRI and MP_UNREACH_NLRI. NLRI, or Network Layer Reachability Information, is essentially a
descriptive name for the prefix, path, and attribute information shared between BGP speakers. These two additional
attributes create a simple way to learn and advertise multiple sets of routing information, individualized by an address
family. MBGP address families include IPv4 unicast, IPv6 unicast, Multiprotocol Label Switching labels, and, of course,
IPv4 and IPv6 multicast.
The main advantage of MBGP is that it allows AS-border and internal routers to support noncongruent (traffic engineered)
multicast topologies. This is discussed at length in Chapter 5 in IP Multicast, Volume 1. This concept is particularly
relevant to Internet multicast and interdomain multicast. Multicast NLRI reachability information can now be shared with
all the great filtering and route preferencing of standard BGP for unicast, allowing Internet providers to create a specific
feed for multicast traffic.
Configuring and operating MBGP is extraordinarily easy to do, especially if you already have a basic understanding of
BGP configurations for IPv4 unicast. The routing information and configuration are essentially identical to standard BGPv4 for
unicast prefixes. The differences between a unicast BGP table and multicast BGP table occur because of specific filtering
or sharing policies implemented per address family, leading to potentially noncongruent tables.
MBGP configuration begins by separating BGP neighbor activations by address family. A non-MBGP configuration
typically consists of a series of neighbor statements with filter and routing parameters. Example 1-2 is a non-MBGP-
enabled configuration on the BR between the other routers in Mcast Enterprises and ISP Blue from Figure 1-12.
Example 1-2 Standard BGP Configuration
BR(config)# router bgp 65100
BR(config-router)# neighbor 10.0.0.1 remote-as 65100
BR(config-router)# neighbor 10.0.0.1 update-source Loopback0
BR(config-router)# neighbor 10.0.0.2 remote-as 65100
BR(config-router)# neighbor 10.0.0.2 update-source Loopback0
BR(config-router)# neighbor 10.0.0.3 remote-as 65100
BR(config-router)# neighbor 10.0.0.3 update-source Loopback0
BR(config-router)# neighbor 172.23.31.1 remote-as 65003
BR(config-router)# network 10.0.0.0
MBGP commands can be separated into two configuration types: BGP neighborship commands and BGP policy
commands. In IOS-XE and NX-OS, all neighbors and neighbor-related parameters (for example, remote-as, MD5
authentication, update-source, AS pathing info, timers, and so on) must be configured and established under the global
BGP routing process subconfiguration mode. BGP policy commands (such as route-map filters, network statements, and
redistribution), are no longer configured globally but by address family.
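The book's corresponding MBGP example is not reproduced in this extract. As a minimal sketch only, the BR configuration from Example 1-2 might be restructured by address family as follows, activating the external neighbor for both unicast and multicast NLRI (the internal neighbors would be activated in the same manner):

BR(config)# router bgp 65100
BR(config-router)# neighbor 172.23.31.1 remote-as 65003
BR(config-router)# address-family ipv4
BR(config-router-af)# neighbor 172.23.31.1 activate
BR(config-router-af)# network 10.0.0.0
BR(config-router-af)# exit-address-family
BR(config-router)# address-family ipv4 multicast
BR(config-router-af)# neighbor 172.23.31.1 activate
BR(config-router-af)# network 10.0.0.0
BR(config-router-af)# exit-address-family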
Let’s look more closely at an example of an MBGP configuration. Figure 1-13 shows a snapshot of just the external,
unicast, BGP connection between the BR and the border router in ISP Blue, SP3-1.
Sent Rcvd
Opens: 1 1
Notifications: 0 0
Updates: 9 1
Keepalives: 35 36
Route Refresh: 0 0
Total: 45 38
Default minimum time between advertisement runs is 0 seconds
For address family: IPv4 Unicast
Session: 10.0.0.1
BGP table version 18, neighbor version 18/0
Output queue size : 0
Index 4, Advertise bit 1
4 update-group member
Slow-peer detection is disabled
Slow-peer split-update-group dynamic is disabled
Sent Rcvd
Prefix activity: ---- ----
Prefixes Current: 4 0
Prefixes Total: 7 0
Implicit Withdraw: 0 0
Explicit Withdraw: 3 0
Used as bestpath: n/a 0
Used as multipath: n/a 0
Outbound Inbound
Local Policy Denied Prefixes: -------- -------
Total: 0 0
Number of NLRIs in the update sent: max 2, min 0
Last detected as dynamic slow peer: never
Dynamic slow peer recovered: never
Refresh Epoch: 1
Last Sent Refresh Start-of-rib: never
Last Sent Refresh End-of-rib: never
Last Received Refresh Start-of-rib: never
Last Received Refresh End-of-rib: never
Sent Rcvd
Refresh activity: ---- ----
Refresh Start-of-RIB 0 0
Refresh End-of-RIB 0 0
For address family: IPv4 Multicast
BGP table version 2, neighbor version 1/2
Output queue size : 0
Index 0, Advertise bit 0
Uses NEXT_HOP attribute for MBGP NLRIs
Slow-peer detection is disabled
Slow-peer split-update-group dynamic is disabled
Sent Rcvd
Prefix activity: ---- ----
Prefixes Current: 0 0
Prefixes Total: 0 0
Implicit Withdraw: 0 0
Explicit Withdraw: 0 0
Used as bestpath: n/a 0
Used as multipath: n/a 0
Outbound Inbound
Local Policy Denied Prefixes: -------- -------
Total: 0 0
Number of NLRIs in the update sent: max 0, min 0
Last detected as dynamic slow peer: never
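The remainder of that neighbor output is not included in this extract. Beyond the per-address-family counters shown above, the multicast BGP table and the resulting RPF information can be checked with commands such as the following; the source address in the last command is illustrative, and the output is omitted here:

BR# show ip bgp ipv4 multicast summary
BR# show ip bgp ipv4 multicast
BR# show ip rpf 10.1.1.10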
As mentioned earlier, most autonomous systems use IGPs and EGPs to establish clear demarcation points and borders. The
easiest way to create a clear border around a PIM domain is to simply configure the router to not apply PIM–SM to border
interfaces. Without PIM–SM configuration for those interfaces, no neighborships form between the border router inside
the AS and the border router of the neighboring AS.
What does this mean for the PIM–SM domain border in an interdomain forwarding model? If PIM is not sharing join/leave
information between the neighboring border routers, the second critical element in the three pillars of interdomain design—
the multicast control plane for receiver identification—becomes a problem. Remember that the router must know about
any legitimate receivers that have joined the group and where they are located in the network.
Here then, is the big question: Is a PIM–SM relationship required between ASs in order to perform multicast interdomain
forwarding? The short answer is yes! There are certainly ways to work around this requirement (such as by using protocol
rules), but the best recommendation is to configure PIM on any interdomain multicast interfaces in order to maintain PIM
neighborships with external PIM routers.
Why is this necessary? The biggest reason is that the local domain needs the join/prune PIM–SM messages for any
receivers that may not be part of the local domain. Remember that without the (*, G) information, the local RP cannot help
the network build a shared tree linking the source and receivers through the RP. Let’s examine this relationship in action in
an example.
Let’s limit the scope to just the PIM relationship between the BR in Mcast Enterprises and SP3-1. In this particular
instance, there is a receiver for an Internet multicast group, 239.120.1.1, connected to R2, as shown in Figure 1-14, with
additional PIM configuration on the interface on BR. SP3-1 does not have PIM configured on the interface facing the BR,
interface Ethernet0/0.
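A minimal sketch of the additional PIM configuration referred to on the BR follows; the interface name is an assumption, since the BR's interface toward SP3-1 is not named in this extract:

BR(config)# interface Ethernet0/1
BR(config-if)# ip pim sparse-mode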
As you can see from the show ip mroute 239.120.1.1 command output in Example 1-7, router SP3-1 has no entries for
239.120.1.1. It does not have a PIM relationship to the BR in Mcast Enterprises, so there is no way for the BR to share
that information.
Example 1-7 No ip mroute Entries
Router(config)# ip msdp [ vrf vrf-name ] peer { peer-name | peer-address } [ connect-source interface-type interface-number ] [ remote-as as-number ]
Router(config)# no ip msdp [ vrf vrf-name ] peer { peer-name | peer-address }
Syntax Options Purpose
vrf (Optional) Supports the multicast VPN routing and forwarding (VRF) instance.
vrf-name (Optional) Specifies the name assigned to the VRF.
peer-name | peer-address Specifies the Domain Name System (DNS) name or IP address of the router that is to be the MSDP peer.
connect-source interface-type interface-number (Optional) Specifies the interface type and number whose primary address becomes the source IP address for the TCP connection. This interface is on the router being configured.
remote-as as-number (Optional) Specifies the autonomous system number of the MSDP peer. This keyword and argument are used for display purposes only.
RP/0/RP0/CPU0:router(config)#router msdp
RP/0/RP0/CPU0:router(config-msdp)#(no)originator-id type interface-path-id
RP/0/RP0/CPU0:router(config-msdp)#(no) peer peer-address
Syntax Options Purpose
type Specifies the interface type. For more information, use the question mark (?) online help function.
interface-path-id Specifies the physical interface or virtual interface.
peer-address Specifies the IP address or DNS name of the router that is to be the MSDP peer.
Example 1-14 shows a sample MSDP peering configuration on router SP3-2 to peer with the RP and border router, SP2-1
(172.22.0.1), in ISP Green, as depicted in the mock-Internet example.
Example 1-14 Basic MSDP Configuration, SP3-1 to SP2-1
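The body of Example 1-14 is not reproduced in this extract. A minimal sketch of such an IOS-XE MSDP peering toward SP2-1 (172.22.0.1), assuming Loopback0 is used as the connect source, would look like this:

Router(config)# ip msdp peer 172.22.0.1 connect-source Loopback0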
Figure 1-15 Mock-Internet Map
As you know, when a source-connected router sees new packets coming from a source, it registers that source with its
local RP. If the local RP is enabled for MSDP, it completes the shared tree and, if validated, it creates a special state entry
called a source active (SA). The local RP then shares that SA information with any MSDP peer RPs in other domains.
A remote RP that receives this SA advertisement validates the source against its own RPF table. Once validated, the
remote RP uses the source and location information to calculate an interdomain forwarding tree, if it has subscribed
receivers for that source’s group in its local PIM–SM domain. It also forwards the SA entry to any other peers to which it
is connected, except for the peer from which it received the SA advertisement. This processing allows a remote RP to learn
about remote active sources, while facilitating the shared-tree and source-tree building process for the local domain,
completing multicast forwarding trees across multiple domains. This means that in Example 1-14, the RP in ISP Blue (SP3-
1) receives SA advertisements from the configured MSDP peer and RP in ISP Green (SP2-1, 172.22.0.1).
Aside from being necessary for interdomain multicast forwarding, MSDP has some inherent advantages. For one, because a
single RP for the entire Internet is not required, no one entity is responsible for the quality of local domain forwarding.
Each AS can implement its own RP strategy and connect to other domains in a best-effort manner, just as with unicast
routing. Administrators can also use policy to control MSDP sharing behavior, which means autonomous systems can use a
single domain strategy for both internal and external forwarding. Shared multicast trees always stay local to the domain,
even when join messages are shared with neighboring domains. When receivers are downstream from the remote domain, a
(*, G) entry is created only for that domain, with branches at the border edge interface. Because RP resources are only
required within the local domain, there are no global resource exhaustion problems or global reliability issues.
Like BGP, MSDP uses TCP (port 639) for its peering sessions. Also like BGP, MSDP has a specific state machine that allows it to listen
for peers, establish TCP connections, and maintain those connections while checking for accuracy and authorization.
Figure 1-16 illustrates the MSDP state machine described in RFC 3618.
Table 1-5 explains the Events (EX) and Actions (AX) of the state machine shown in Figure 1-16.
Table 1-5 MSDP State Machine Events and Actions
Event/Action Description
E1 MSDP peering is configured and enabled.
E2 The peer IP address is less than the MSDP source address.
E3 The peer IP address is greater than the MSDP source address.
E4 TCP peering is established (active/master side of the connection).
E5 TCP peering is established (passive/listening side of the connection).
E6 The connect retry timer has expired.
E7 MSDP peering is disabled.
E8 The hold timer has expired.
E9 A TLV format error has been detected in the MSDP packet.
E10 Any other error is detected.
A1 The peering process begins, with resources allocated and peer IP addresses compared.
A2 TCP is set to Active OPEN and the connect retry timer is set to [ConnectRetry-Period].
A3 TCP is set to Passive OPEN (listen).
A4 The connect retry timer is deleted, and the keepalive TLV is sent, while the keepalive and hold timers are set to configured periods.
A5 The keepalive TLV is sent, while the keepalive and hold timers are set to configured periods.
A6 The TCP Active OPEN attempt is aborted, and MSDP resources for the peer are released.
A7 The TCP Passive OPEN attempt is aborted, and MSDP resources for the peer are released.
A8 The TCP connection is closed, and MSDP resources for the peer are released.
A9 The packet is dropped.
Figure 1-17, which includes configuration commands, shows how to enable basic MSDP peering between the RP router in
Mcast Enterprises (R1) and the RP for ISP Blue (SP3-2).
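The peering reduces to one MSDP statement on each RP; a sketch using the addresses and AS numbers that appear in the final configurations later in this chapter:
R1(config)# ip msdp peer 172.23.0.2 connect-source Loopback0 remote-as 65003
SP3-2(config)# ip msdp peer 10.0.0.1 connect-source Loopback0 remote-as 65100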
Example 1-15 MSDP State Machine: debug ip msdp peer
*Feb 20 21:33:40.390: MSDP(0): 172.23.0.2: Sending TCP connect
*Feb 20 21:33:40.392: %MSDP-5-PEER_UPDOWN: Session to peer 172.23.0.2 going up
*Feb 20 21:33:40.392: MSDP(0): 172.23.0.2: TCP connection established
*Feb 20 21:33:41.004: MSDP(0): Received 3-byte TCP segment from 172.23.0.2
*Feb 20 21:33:41.004: MSDP(0): Append 3 bytes to 0-byte msg 116 from 172.23.0.2, qs 1
*Feb 20 21:33:41.004: MSDP(0): 172.23.0.2: Received 3-byte msg 116 from peer
*Feb 20 21:33:41.004: MSDP(0): 172.23.0.2: Keepalive TLV
*Feb 20 21:33:41.218: MSDP(0): 172.23.0.2: Sending Keepalive message to peer
*Feb 20 21:33:42.224: MSDP(0): 172.23.0.2: Originating SA message
*Feb 20 21:33:42.224: MSDP(0): start_index = 0, mroute_cache_index = 0, Qlen = 0
*Feb 20 21:33:42.225: MSDP(0): 172.23.0.2: Building SA message from SA cache
*Feb 20 21:33:42.225: MSDP(0): start_index = 0, sa_cache_index = 0, Qlen = 0
*Feb 20 21:33:42.225: MSDP(0): Sent entire sa-cache, sa_cache_index = 0, Qlen = 0
As you can see from the debugging output in Example 1-15, MSDP exchanges TLV (Type-Length-Value) messages between these TCP peers to maintain multicast source state. All MSDP routers send TLV messages using the same basic packet format (see Figure 1-18).
Depending on the type of communication, the VALUE field in the TLV message may vary dramatically. As shown in
Figure 1-18, there are four basic, commonly used TLV messages types. MSDP uses these TLVs to exchange information
and maintain peer connections. The message type is indicated in the TYPE field of the packet. These are the four basic
message types:
SA message: Basic advertisement of SA information to a peer, including the RP address on which it was learned, the
source IP, and the group IP.
SA Request message: A request for any known SAs for a given group (usually this occurs because of configuration, in an
attempt to reduce state creation latency).
SA Response message: A response with SA information to an SA Request message.
Keepalive: Sent between peers when there are no current SA messages to maintain the TCP connection.
Each of these message types populates the VALUE field with different information necessary for MSDP processing. Figure 1-19 shows an expanded view of the VALUE field parameters for an SA message, and Figure 1-20 illustrates the VALUE field parameters for a Keepalive message.
It is important to note that the TCP connections used between RP routers are dependent on the underlying IP unicast
network. This means there must be IP reachability between the peers, just as there would be for BGP. However, unlike
with BGP, there is no measuring of the hop count of external systems and no requirement that the RPs be directly
connected. Any MSDP peer must be a proper RPF peer, meaning that an RPF check is applied to received SA
advertisements to ensure loop-free forwarding across domains. There are three MSDP rules that require BGP ASN
checking for loop prevention. The rules that apply to RPF checks for SA messages are dependent on the BGP peerings
between the MSDP peers:
Rule 1: When the sending MSDP peer is also an internal MBGP peer, MSDP checks the BGP MRIB for the best
path to the RP that originated the SA message. If the MRIB contains a best path, the MSDP peer uses that information
to RPF check the originating RP of the SA. If there is no best path in the MRIB, the unicast RIB is checked for a proper
RPF path. If no path is found in either table, the RPF check fails. The IP address of the sending MSDP peer must be the same
as the BGP neighbor address (not the next hop) in order to pass the RPF check.
Rule 2: When the sending MSDP peer is also an external MBGP peer, MSDP checks the BGP MRIB for the best
path to the RP that originated the SA message. If the MRIB contains a best path, the MSDP peer uses that information
to RPF check the originating RP of the SA. If there is no best path in the MRIB, the unicast RIB is checked for a proper
RPF path. A best path must be found in either of the tables, or the RPF check fails. After the best path is found, MSDP
checks the first autonomous system in the path to the RP. The check succeeds and the SA is accepted if the first AS in the
path to the RP is the same as the AS of the BGP peer (which is also the sending MSDP peer). Otherwise, the SA is
rejected.
Rule 3: When the sending MSDP peer is not an MBGP peer at all, MSDP checks the BGP MRIB for the best path
to the RP that originated the SA message. If the MRIB contains a best path, the MSDP peer uses that information to
RPF check the originating RP of the SA. If there is no best path in the MRIB, the unicast RIB is checked for a proper RPF
path. If no path is found in either table, the RPF check fails. When the previous check succeeds, MSDP then looks for a
best path to the MSDP peer that sent the SA message. If a path is not found in the MRIB, the peer searches the unicast
RIB. If a path is not found, the RPF check fails.
RPF checks are not performed in the following cases:
When the advertising MSDP peer is the only MSDP peer (which is the case if only a single MSDP peer or a default
MSDP peer is configured).
With mesh group peers.
When the advertising MSDP peer address is the originating RP address contained in the SA message.
The advertisement must pass one of the three rules; otherwise, MSDP fails the update and discards any received SA information. For these checks to work, MSDP needs a properly populated MBGP table. Therefore, both MBGP and MSDP must be configured on each RP that maintains interdomain operations.
MSDP makes exceptions for IGP checking, as previously mentioned, but they are not commonly used, except for Anycast
RP. MSDP uses several additional underlying RPF checks to ensure a loop-free forwarding topology. Some of the most
important checks are discussed in subsequent sections of this chapter. RFC 3618 section 10.1.3 lists the primary SA
acceptance and forwarding rules for loop prevention. For a full understanding of MSDP loop prevention mechanisms, refer
to the RFCs that define MSDP, such as 3618. In addition, loop prevention is one of the primary purposes of PIM–SM, and
all standard PIM–SM loop-prevention checks are also deployed.
Source actives (SAs) are the key to MSDP operations. Recall that in PIM–SM mechanics, when the first-hop router (FHR)
receives a multicast packet from a source, its first order of business is to register the source’s IP address with the RP, while
encapsulating the first few packets in generic routing encapsulation (GRE) and sending those to the RP as well. The RP
then takes that source information and builds a shared tree, with the RP as the root, branching toward any known
receivers. It stands to reason, then, that the RP(s) in a given domain can essentially function as an authority on any active
sources sending within the domain.
That’s where MSDP comes in to play. If the RP is also running MSDP, the router takes that source state information and
builds a special table, a record of all the active sources in the network. This table is called the SA cache, and an entry in
the table is known as a source active, hence the name SA. MSDP was designed not only to build the SA cache but also to share it
with other peers, as outlined earlier in this chapter.
This is why MSDP is so functional and was the original standard for the Anycast RP mechanism. Two or more RPs with
the same address collectively have the complete picture of all SAs in the network. If they share that SA information among
themselves, they all have a complete cache, which enables them to act in concert when necessary.
However, Anycast RP mechanics was not the original, functional intent of MSDP. Rather, it was a “discovery” of sorts,
made after its creation. MSDP was specifically designed to bridge IP Multicast domains over the public Internet. This
means that each ISP could independently control multicast update information without relying on another ISP as the
authority or relying on a single Internet-wide RP—which would have been an administrative and security nightmare.
It is important to note that MSDP in and of itself does not share multicast state information or create forwarding trees
between domains. That is still the job of PIM–SM, and PIM–SM is still an absolute requirement within a domain for those
functions. MSDP’s only role is to actively discover and share sources on the network. Let’s review MSDP mechanics,
including the process of updating the SA cache and sharing SA information with other MSDP peers.
The first step in the process of sharing SA information is for the FHR to register an active source with the RP. The RP,
running MSDP, creates an SA entry in the cache and immediately shares that entry with any of its known peers. Let’s take
a look at this in practice in the example network from Figure 1-15, where the router SP3-2 is the RP and MSDP source for
ISP Blue. You can use the border router in ISP Green, SP2-1, as the source by sending a ping sourced from its loopback
address, 172.22.0.1. Using the debug ip msdp details and debug ip pim 239.120.1.1 commands on SP3-2, you can
watch the active source being learned, populated into the SA cache, updating the PIM MRIB, and then being sent to the
MSDP peer R1. Example 1-16 displays the output of these commands.
Example 1-16 debug ip pim 239.120.1.1 and debug ip msdp details on RP SP3-2
*Feb 23 04:36:12.640: MSDP(0): Received 20-byte TCP segment from 172.22.0.1
*Feb 23 04:36:12.640: MSDP(0): Append 20 bytes to 0-byte msg 3432 from 172.22.0.1,
qs 1
*Feb 23 04:36:12.640: MSDP(0): WAVL Insert SA Source 172.22.0.1 Group 239.120.1.1 RP
172.22.0.1 Successful
*Feb 23 04:36:12.643: PIM(0): Join-list: (172.22.0.1/32, 239.120.1.1), S-bit set
*Feb 23 04:36:12.643: PIM(0): Check RP 172.23.0.2 into the (*, 239.120.1.1) entry
*Feb 23 04:36:12.643: PIM(0): Adding register decap tunnel (Tunnel1) as accepting
interface of (*, 239.120.1.1).
*Feb 23 04:36:12.643: PIM(0): Adding register decap tunnel (Tunnel1) as accepting
interface of (172.22.0.1, 239.120.1.1).
*Feb 23 04:36:12.643: PIM(0): Add Ethernet0/0/172.23.1.1 to (172.22.0.1,
239.120.1.1), Forward state, by PIM SG Join
*Feb 23 04:36:12.643: PIM(0): Insert (172.22.0.1,239.120.1.1) join in nbr
172.23.2.1's queue
*Feb 23 04:36:12.643: PIM(0): Building Join/Prune packet for nbr 172.23.2.1
*Feb 23 04:36:12.643: PIM(0): Adding v2 (172.22.0.1/32, 239.120.1.1), S-bit Join
*Feb 23 04:36:12.643: PIM(0): Send v2 join/prune to 172.23.2.1 (Ethernet0/1)
*Feb 23 04:36:12.658: PIM(0): Received v2 Join/Prune on Ethernet0/0 from 172.23.1.1,
to us
*Feb 23 04:37:22.773: MSDP(0): start_index = 0, mroute_cache_index = 0, Qlen = 0
*Feb 23 04:37:22.773: MSDP(0): Sent entire mroute table, mroute_cache_index = 0,
Qlen = 0
As you can see from the debugging output in Example 1-16, SP3-2 first learns of the MSDP SA from RP SP2-1, of ISP
Green, which also happens to be the source. SP3-2’s PIM process then adds the (*, G) entry and then the (S, G) entry, (*,
239.120.1.1) and (172.22.0.1, 239.120.1.1), respectively. Then, when the RP is finished creating the state, MSDP
immediately forwards the SA entry to all peers. There is no split-horizon type loop prevention in this process. You can see
the MSDP SA cache entry for source 172.22.0.1, or any entries for that matter, by issuing the command show ip msdp sa-
cache [x.x.x.x] (where the optional [x.x.x.x] is the IP address of either the group or the source you wish to examine).
Example 1-17 shows the output from this command on SP3-2, running IOS-XE.
Example 1-17 show ip msdp sa-cache on SP3-2
SP3-2# show ip msdp sa-cache
MSDP Source-Active Cache - 1 entries
(172.22.0.1, 239.120.1.1), RP 172.22.0.1, MBGP/AS 65002, 00:04:47/00:04:02, Peer
172.22.0.1
Learned from peer 172.22.0.1, RPF peer 172.22.0.1,
SAs received: 5, Encapsulated data received: 1
Note from this output that the SA cache entry includes the (S, G) state of the multicast flow, as well as the peer from which
it was learned and the MBGP AS in which that peer resides. This MBGP peer information in the entry comes from a
combination of the configured MSDP peer remote-as parameter and cross-checking that configuration against the actual
MBGP table. That is a loop-prevention mechanism built into MSDP because there is no split-horizon update control
mechanism. In fact, it goes a little deeper than what you see here on the surface.
Just like PIM, MSDP uses RPF checking to ensure that the MSDP peer and the SA entry are in the appropriate place in the
path. At the end of the SA cache entries is an RPF peer statement which indicates that the peer has been checked. You
know how a router RPF checks the (S, G) against the unicast RIB. But how exactly does a router RPF check an MSDP
peer? It consults the BGP table, looking for the MBGP next hop of the originating address; that is the RP that originated
the SA. The next-hop address becomes the RPF peer of the MSDP SA originator. In this example, that is router SP2-1,
with address 172.22.0.1. Any MSDP messages that are received from a peer that is not the RPF peer for that originator are
automatically dropped. This functionality is called peer-RPF flooding. If MBGP is not configured, the unicast IPv4 BGP
table is used instead. However, because of the peer-RPF flooding mechanism built into MSDP, MBGP or BGP is required
for proper cross-domain multicast forwarding. Without this information, you would essentially black hole all foreign
domain traffic, regardless of the PIM and MSDP relationships between domain edge peers.
Note: For MSDP peers within a single domain, such as those used for Anycast RP, there is no requirement for MBGP or
BGP prefixes. In this case, the peer-RPF flooding mechanism is automatically disabled by the router. There is also no
requirement for BGP or MBGP within an MSDP mesh group, as discussed later in this section.
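As a sketch of that intra-domain case, two Anycast RPs can peer over MSDP using only their unique loopbacks, with no BGP involved (all addresses here are hypothetical):
! RP-A
interface Loopback0
 description Shared Anycast RP address
 ip address 10.0.0.100 255.255.255.255
 ip pim sparse-mode
!
interface Loopback1
 description Unique address used as the MSDP peering source
 ip address 10.0.1.1 255.255.255.255
 ip pim sparse-mode
!
ip pim rp-address 10.0.0.100
ip msdp peer 10.0.1.2 connect-source Loopback1
ip msdp originator-id Loopback1
RP-B would mirror this configuration, using 10.0.1.2 as its unique loopback and peering back to 10.0.1.1.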
In addition to these checks, there are two other checks that the MSDP-enabled RP performs before installing an SA entry
in the cache and enabling PIM to complete the (S, G) tree toward the source. The first check is to make sure that there is a
valid (*, G) entry in the MRIB table and that the entry has valid interfaces included in the OIL. If there are group
members, the RP sends the (S, G) joins toward the remote source. Once this occurs, the router at the AS boundary
encapsulates the multicast packets from the joined stream and tunnels them to the RP for initial shared tree forwarding.
When the downstream last-hop router (LHR) receives the packets, standard PIM mechanics apply, and eventually a source
tree is formed between the FHR (in this case, the domain border router closest to the source) and the LHR.
The LHR could be a router directly connected to receivers subscribed to the group via IGMP. However, the LHR could
also simply be the downstream edge of the domain where the domain is only acting as transit for the multicast traffic,
bridging the (*, G) and (S, G) trees between two unconnected domains. This is exactly what ISP Blue is doing in the Mcast
Enterprises network, where SP3-1 is the LHR of ISP Green’s multicast domain and AS. In these cases, there may be no
local (*, G) entry at the local RP until the remote, downstream domain registers receivers and joins the (*, G). Until then,
the SA entry is invalid, and the RP does not initiate PIM processing. When a host in the remote downstream domain
subscribes to the group, the LHR in that domain sends a (*, G) to its local RP. Because that RP and the transit domain RP
have an existing valid SA entry, both domains can then join, as needed, the (S, G) all the way back to the remote domain.
The originating RP keeps track of the source. As long as the source continues sending packets, the RP sends SA messages
every 60 seconds. These SA messages maintain the SA state of downstream peers. The MSDP SA cache has an expiration
timer for SA entries. The timer is variable, but for most operating systems, it is 150 seconds. (Some operating systems, such
as IOS-XR, offer a configuration option to change this timer setting.) If a valid, peer-RPF checked SA message is not
received before the entry timer expires, the SA entry is removed from the cache and is subsequently removed from any
further SA messages sent to peers.
The configuration tasks involved with MSDP are fairly simple and straightforward. You start by configuring the MSDP
peering between two RP routers. The configuration commands are shown earlier in this chapter for peer configuration on
IOS-XE. Table 1-6 details the configuration commands and options for basic peering on IOS-XE, IOS-XR, and NX-OS.
For IOS XR, it is important to note that all MSDP configuration commands are entered globally under the router msdp
configuration mode. At each peer configuration, you enter the msdp-peer configuration mode, as indicated by the * in the
IOS-XR section of the table.
Table 1-6 MSDP Peer Commands
IOS/XE:  ip msdp peer {peer-name | peer-address} [connect-source type number] [remote-as as-number]
IOS XR:  peer peer-address
         *(config-msdp-peer)# remote-as as-number; connect-source type [interface-path-id]
NX-OS:   ip msdp peer peer-ip-address connect-source interface [remote-as as-number]
Note: It is very unlikely that NX-OS on any platform will be found at the AS edge of a domain. It is more common to see
NX-OS cross-domain forwarding at the data center edge. For this reason, many of the more sophisticated MSDP features
are not available in NX-OS. In addition, you must first enable the MSDP feature on a Nexus platform before it is
configurable. You use the feature msdp global configuration mode command to enable MSDP.
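On a Nexus switch, then, even the most basic peering is a two-step sketch (addresses and AS number are illustrative):
switch(config)# feature msdp
switch(config)# ip msdp peer 172.23.0.2 connect-source loopback0 remote-as 65003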
Note: MSDP configuration on IOS-XR requires, first, the installation and activation of the multicast package installation
envelope (PIE). After this is done, all PIM, multicast routing, and MSDP configuration commands are available for
execution. Remember that, compared to classic Cisco operating systems, IOS-XR commands are more structured, and
most parameters are entered one at a time rather than on a single line, as shown in Table 1-6.
Like their BGP counterparts, MSDP peers can have configured descriptions for easier identification, and they can use
password authentication with MD5 encryption to secure the TCP peering mechanism. Table 1-7 shows the commands
needed to implement peer descriptions.
Table 1-7 MSDP Peer Descriptions
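On IOS-XE, for example, a peer description is added with the ip msdp description command; the description text below is purely illustrative:
R1(config)# ip msdp description 172.23.0.2 ISP-Blue-RP-peering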
Table 1-8 shows the commands needed to configure peer security through password authentication.
Table 1-8 MSDP Peer Password Authentication and Encryption
IOS/XE:  ip msdp [vrf vrf-name] password peer {peer-name | peer-address} [encryption-type] string
IOS XR:  peer peer-address
         (config-msdp-peer)# password {clear | encrypted} password
NX-OS:   ip msdp password peer-address password
Note: Configuring a password on a peer does not encrypt the TCP data packets. The password is used to compute an MD5 digest that authenticates the TCP segments of the peering session. Each of the commands has an option for entering an unencrypted password.
That unencrypted password option (clear in IOS-XR and encryption-type 0 in IOS-XE) only enables the configuration
entry of a password in plaintext. Otherwise, the MD5 hash value of the password, which shows up after configuration, is
required. For first-time configuration of passwords, it is recommended that you use plaintext. In addition, it is important to
understand that changing a password after peer establishment does not immediately bring down an MSDP peering session.
Instead, the router continues to maintain the peering until the peer expiration timer has expired. Once this happens, the
password is used for authentication. If the peering RP has not been configured for authentication with the appropriate
password, then the local peer fails the TCP handshake process, and peering is not established.
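For example, a matching plaintext password on both sides of the R1-to-SP3-2 peering might look like the following sketch (the password string is illustrative):
R1(config)# ip msdp password peer 172.23.0.2 0 S3cr3tKey
SP3-2(config)# ip msdp password peer 10.0.0.1 0 S3cr3tKey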
Speaking of timers, it is possible to change the default TCP peer timers in MSDP, just as it is with BGP peers. The most
important timer is the MSDP peer keepalive timer. Recall from the discussion of the MSDP state machine that if there are
no SA messages to send to a peer, the TCP must still be maintained. This is done through the use of peer keepalives. If no
keepalive is received within the configured peering hold timer, the session is torn down. Each Cisco operating system has a
different set of defaults for these timers. Table 1-9 shows how to adjust the keepalive timer and the peer hold timer for
each operating system, along with default timer values. These commands are entered per peer.
Table 1-9 MSDP Peer Timer Commands
IOS/XE:  ip msdp [vrf vrf-name] keepalive {peer-address | peer-name} keepalive-interval hold-time-interval (default: 60-second keepalive-interval, 75-second hold-time-interval)
IOS XR:  No equivalent command (N/A).
NX-OS:   ip msdp keepalive peer-address interval timeout (default: 60-second interval, 90-second timeout)
In addition, by default, an MSDP peer waits a given number of seconds after an MSDP peer session is reset before
attempting to reestablish the MSDP connection. Table 1-10 shows the commands needed for each operating system to
change the default timer.
Table 1-10 Adjusting MSDP Peer Reset Timers
Note: With any peer timer, if a timer change is required for a given peering session, it is highly recommended that both
MSDP peers be configured with the same timer.
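Following that guidance, a sketch that tightens the IOS-XE defaults on both sides of the R1-to-SP3-2 peering to a 30-second keepalive and a 90-second hold time (values are illustrative):
R1(config)# ip msdp keepalive 172.23.0.2 30 90
SP3-2(config)# ip msdp keepalive 10.0.0.1 30 90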
The commands introduced so far are used to configure and establish peering between two RPs acting as MSDP peers. The
commands affect how the peer TCP session will behave. Now let’s explore additional commands that affect how an
MSDP-configured RP behaves once a peer is established. These additional commands are tuning options for MSDP, and many of them are similar to the options available for BGP peers, such as entry timers, origination IDs, and entry filtering.
Perhaps the most important MSDP tuning knob allows you to create a mesh group of MSDP peers. In networking, a mesh
exists when there is a link between each pair of peers, such that every peer has a direct path to every other peer, without
taking unnecessary hops. In some multicast interdomain networks, such as intra-AS interdomain deployments, it is
common to have MSDP peers connected together in a mesh. Such a design can significantly reduce the number of MSDP
messages that are needed between peers. Remember that any time an MSDP peer receives and validates an SA entry, it
forwards that entry to all its peers by default. If the peers are connected in a mesh, it is completely unnecessary for all the
mesh peers to duplicate messages that are already sent between peers on the network. The MSDP mesh group commands
configure RP routers to circumvent this behavior. An SA received from one member of the mesh group is not replicated
and sent to any other peers in the same mesh group, thus eliminating potentially redundant SA messages.
A mesh group has another potential advantage for interdomain operations. Because each RP has direct knowledge of the
other SAs in the mesh group, no MBGP is required for MSDP RPF checking. Thus, you can have MSDP without the added
complication of MBGP within an internal network scenario, such as Anycast RP. Table 1-11 shows the commands
required to configure a mesh group for MSDP.
Table 1-11 MSDP Mesh Group Commands
IOS/XE:  ip msdp [vrf vrf-name] mesh-group mesh-name {peer-address | peer-name}
IOS XR:  peer peer-address
         (config-msdp-peer)# mesh-group name
NX-OS:   ip msdp mesh-group peer-address name
Note: It is unlikely that you will find MSDP mesh groups outside an AS. Most Internet peering requires MBGP for Internet
multicast. Using mesh groups may be the most efficient way of bridging internal PIM domains using Anycast RP.
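The mesh group configuration itself is per peer. On R1, for example, the ENTERPRISE mesh group used later in this chapter is built like this:
R1(config)# ip msdp peer 10.0.0.2 connect-source Loopback0
R1(config)# ip msdp peer 10.0.0.3 connect-source Loopback0
R1(config)# ip msdp mesh-group ENTERPRISE 10.0.0.2
R1(config)# ip msdp mesh-group ENTERPRISE 10.0.0.3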
When an MSDP router creates a new SA entry, it includes the interface-configured IP address of the RP in the update as
the originator ID. The router uses the originator ID to perform the MSDP RPF check against the MSDP speaker. Usually,
the same interface is used as both the RP and the MSDP peering source. However, there are times when a logical RP is
required, and you need to change the MSDP originator ID to prevent an MSDP RPF failure. The originator-id command
allows an administrator to change the originator ID in advertised SA entries. By default, the originator ID should be the
address of the RP configured on the router. If there are multiple RPs configured on the same router, you need to set the
originator ID manually. If no originator ID is defined, routers use the highest IP configured as RP or, if there is no
configured RP, the highest loopback interface IP as the originator ID. Table 1-12 shows the originator ID commands. In
order to mitigate confusion, it is best practice to define the originator ID manually in all cases to ensure that the originator
ID and the MBGP source IP are the same.
Table 1-12 MSDP Originator ID Commands
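On IOS-XE, the originator ID is set by pointing the command at an interface; a sketch that aligns it with the Loopback0 peering source follows:
R1(config)# ip msdp originator-id Loopback0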
If you have ever configured BGP, you know there needs to be a way to shut down a peer without removing it from
configuration. The shutdown command accomplishes this. A similar command exists for MSDP, allowing someone to shut
down the MSDP peering, closing the TCP session with the remote MSDP peer, without removing the peer from
configuration. This simplifies basic MSDP operations. Table 1-13 shows the MSDP shutdown command for each OS. The
shutdown command is issued within the appropriate configuration mode for the peer.
Table 1-13 MSDP Shutdown Commands
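On IOS-XE, for example, an established peer can be administratively disabled and later re-enabled without losing its configuration (a sketch):
R1(config)# ip msdp shutdown 172.23.0.2
R1(config)# no ip msdp shutdown 172.23.0.2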
It may also be necessary at times to clear an MSDP peering session or other MSDP information. You do so by using the
clear command. Like its BGP counterpart, this is not a configuration command but is instead performed from the EXEC
mode of the router. The clear ip msdp peer command simply resets the MSDP TCP session for a specific peer, allowing
the router to immediately flush entries for the cleared peer. The clear ip msdp sa-cache command only flushes the SA
cache table without clearing peering sessions, allowing the router to rebuild the table as it receives ongoing updates.
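For example, on IOS-XE:
R1# clear ip msdp peer 172.23.0.2
R1# clear ip msdp sa-cache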
MSDP peering security through authentication, discussed earlier in this chapter, should be a requirement for any external
peering sessions. MSDP also has built-in capabilities for blocking specific incoming and outgoing SA advertisements. This
is accomplished using SA filters. An SA filter is configured and works very similarly to a BGP filter. The difference is that
an SA filter can specify sources, groups, or RPs to permit or deny. Tables 1-14 and 1-15 show the commands to filter
inbound and outbound by MSDP peer.
Table 1-14 SA Filter In Commands
IOS/XE:  ip msdp [vrf vrf-name] sa-filter in {peer-address | peer-name} [list access-list-name] [route-map map-name] [rp-list {access-list-range | access-list-name}] [rp-route-map route-map-reference]
IOS XR:  peer peer-address
         (config-msdp-peer)# sa-filter in {list access-list-name | rp-list access-list-name}
NX-OS:   ip msdp sa-policy peer-address policy-name in
Table 1-15 SA Filter Out Commands
IOS/XE:  ip msdp [vrf vrf-name] sa-filter out {peer-address | peer-name} [list access-list-name] [route-map map-name] [rp-list {access-list-range | access-list-name}] [rp-route-map route-map-reference]
IOS XR:  peer peer-address
         (config-msdp-peer)# sa-filter out {list access-list-name | rp-list access-list-name}
NX-OS:   ip msdp sa-policy peer-address policy-name out
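As a sketch on IOS-XE, an inbound filter that accepts only the enterprise 239.0.0.0/10 range from the ISP Blue peer could be built with an extended ACL (the ACL number and scope are illustrative):
R1(config)# access-list 124 permit ip any 239.0.0.0 0.63.255.255
R1(config)# ip msdp sa-filter in 172.23.0.2 list 124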
It is also possible to protect router resources on an MSDP speaker by limiting the total number of SA advertisements that
can be accepted from a specific peer. The sa-limit command is used for this purpose. The command is peer specific and is
detailed for each OS in Table 1-16.
Table 1-16 MSDP SA Limit
IOS/XE:  ip msdp [vrf vrf-name] sa-limit {peer-address | peer-name} [sa-limit]
IOS XR:  No equivalent command. Use the filter commands to control SA cache resource usage.
NX-OS:   ip msdp sa-limit peer-address limit
Note: The mechanics of an SA limit are very straightforward. The router keeps a tally of SA messages received from the
configured peer. Once the limit is reached, additional SA advertisements from a peer are simply ignored. It is highly
recommended that you use this command. The appropriate limit should depend on the usage. A transit Internet MSDP peer
needs a very high limit, whereas an Internet-connected enterprise AS is likely to use a very small limit that protects the
enterprise MSDP speaker from Internet multicast SA leakage.
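For example, an enterprise edge RP might cap a single Internet-facing peer at a few hundred entries (the limit shown is illustrative):
R1(config)# ip msdp sa-limit 172.23.0.2 500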
Many other MSDP commands are useful but are beyond the scope of this discussion. For more information about MSDP
configuration and operations, refer to the multicast command reference for the router and operating system in use. These
command references are available at www.cisco.com.
Let's look at a very basic network scenario using MSDP and selective MSDP filters across an enterprise multicast domain. In this use case, the network designer can deploy a single RP for the enterprise while keeping the SA filters local to each domain at the enterprise multicast boundary.
The main consideration for this design is the scalability of the total number of MSDP peers. If the enterprise has hundreds of local domains, the scalability of a mesh group with 100 or more peers against a single RP needs to be reviewed. The diagram in Figure 1-21 illustrates the configuration example for this use case.
r4# show ip mroute 239.2.2.2
Group(s) 224.0.0.0/4
RP 1.1.1.1 (?), v2
Info source: 1.1.1.1 (?), via bootstrap, priority 0, holdtime 25
Uptime: 05:24:36, expires: 00:00:17
r4#
R2, an MSDP peer that is part of the global RP domain (RP 1.1.1.1), does not receive the flow even though it has a local join for 239.2.2.2. This is illustrated in Example 1-21, using the show ip mroute and show ip msdp sa-cache commands; no SA message is received.
Example 1-21 R2 MSDP Has No SA-Cache Entry for 239.2.2.2
r2# sh ip mroute 239.2.2.2
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
r2# sh ip msdp sa
r2# sh ip msdp sa-cache
MSDP Source-Active Cache - 0 entries
r2#
Next, the host connected to R1 sends packets to group 239.1.1.1, which has receivers at R2 and R4. Since there are no filters for 239.1.1.1, it functions as a global enterprise group.
The show ip mroute command output at R2 shows the flow for 239.1.1.1, as displayed in Example 1-22.
Example 1-22 Flow for Group 239.1.1.1
Intra-AS multidomain forwarding is accomplished in multiple ways. Some of the options for structuring domains within an
AS are discussed earlier in this chapter (in the section “What Is a Multicast Domain? A Refresher”). This section shows the
most likely configuration for Mcast Enterprises and examines the configuration of such a network more closely.
For this example, assume that there is one large enterprisewide domain that encompasses all Mcast Enterprises groups,
represented by the multicast supernet 239.0.0.0/10. In addition, Mcast Enterprises has individual domain scopes for each
of its three locations, using 239.10.0.0/16, 239.20.0.0/16, and 239.30.0.0/16, respectively. BGP is configured in a
confederation with the global ASN 65100.
Each of the three routers, R1, R2, and R3, is acting as the local RP for its respective domain. The loopback 0 interface of
R1 is also acting as the RP for the enterprisewide domain. There is external MBGP peering between the BR and SP3-1, as
well as MSDP peering between the loopbacks of R1 and SP3-2. Internally, BGP and MBGP connections are part of a BGP
confederation. There is no single RP shared by the local domains; instead, each domain has its own RP, with R1 additionally covering the enterprisewide scope. To bridge the gaps
between domains, MSDP is configured between each peer in a mesh group called ENTERPRISE. This network design is
represented by the network diagram in Figure 1-22.
Figure 1-22 Network Diagram for the Mcast Enterprises Final Solution
Note: The design shown in Figure 1-22 depicts the intra-AS network as well as the previously configured external
connections for clarity.
Now that you have the design, you can configure the network. Figure 1-23 shows the physical topology of the network
with the connecting interfaces of each router.
Example 1-24 details the final configurations for each of these routers within the domain, using IOS-XE.
Example 1-24 Final Configurations for Mcast Enterprises
R1
ip multicast-routing
ip cef
!
interface Loopback0
ip address 10.0.0.1 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 10.1.4.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 10.10.0.1 255.255.255.0
ip pim sparse-mode
interface Ethernet0/2
ip address 10.1.2.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/3
ip address 10.1.3.1 255.255.255.0
ip pim sparse-mode
!
router ospf 10
network 10.0.0.0 0.255.255.255 area 0
router-id 10.0.0.1
!
router bgp 65101
bgp router-id 10.0.0.1
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65120 65130
neighbor 10.0.0.2 remote-as 65120
neighbor 10.0.0.2 ebgp-multihop 2
neighbor 10.0.0.2 update-source Loopback0
neighbor 10.0.0.3 remote-as 65103
neighbor 10.0.0.3 ebgp-multihop 2
neighbor 10.0.0.3 update-source Loopback0
neighbor 10.0.0.4 remote-as 65110
neighbor 10.0.0.4 update-source Loopback0
!
address-family ipv4
neighbor 10.0.0.2 activate
neighbor 10.0.0.3 activate
neighbor 10.0.0.4 activate
exit-address-family
!
address-family ipv4 multicast
neighbor 10.0.0.2 activate
neighbor 10.0.0.3 activate
neighbor 10.0.0.4 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 10.0.0.1 10 override
ip pim send-rp-announce Loopback0 scope 32 group-list 1
ip pim send-rp-discovery Loopback0 scope 32
ip msdp peer 172.23.0.2 connect-source Loopback0 remote-as 65003
ip msdp peer 10.0.0.2 connect-source Loopback0
ip msdp peer 10.0.0.3 connect-source Loopback0
ip msdp cache-sa-state
ip msdp mesh-group ENTERPRISE 10.0.0.2
ip msdp mesh-group ENTERPRISE 10.0.0.3
!
access-list 1 permit 239.0.0.0 0.63.255.255
access-list 10 permit 239.10.0.0 0.0.255.255
R2
ip multicast-routing
ip cef
!
interface Loopback0
ip address 10.0.0.2 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 10.20.2.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 10.1.2.2 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/3
ip address 10.2.3.2 255.255.255.0
ip pim sparse-mode
!
router ospf 10
network 10.0.0.0 0.255.255.255 area 0
router-id 10.0.0.2
!
router bgp 65102
bgp router-id 10.0.0.2
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65110 65130
neighbor 10.0.0.1 remote-as 65110
neighbor 10.0.0.1 ebgp-multihop 2
neighbor 10.0.0.1 update-source Loopback0
neighbor 10.0.0.3 remote-as 65130
neighbor 10.0.0.3 ebgp-multihop 2
neighbor 10.0.0.3 update-source Loopback0
!
address-family ipv4
neighbor 10.0.0.1 activate
neighbor 10.0.0.3 activate
exit-address-family
!
address-family ipv4 multicast
neighbor 10.0.0.1 activate
neighbor 10.0.0.3 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 10.0.0.2 20 override
ip msdp peer 10.0.0.3 connect-source Loopback0
ip msdp peer 10.0.0.1 connect-source Loopback0
ip msdp cache-sa-state
ip msdp mesh-group ENTERPRISE 10.0.0.3
ip msdp mesh-group ENTERPRISE 10.0.0.1
!
access-list 20 permit 239.20.0.0 0.0.255.255
R3
ip multicast-routing
ip cef
!
interface Loopback0
ip address 10.0.0.3 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
no ip address
ip pim sparse-mode
!
interface Ethernet0/1
ip address 10.1.3.3 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/2
ip address 10.2.3.3 255.255.255.0
ip pim sparse-mode
!
router ospf 10
network 10.0.0.0 0.255.255.255 area 0
router-id 10.0.0.3
!
router bgp 65103
bgp router-id 10.0.0.3
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65110 65120
neighbor 10.0.0.1 remote-as 65110
neighbor 10.0.0.1 ebgp-multihop 2
neighbor 10.0.0.1 update-source Loopback0
neighbor 10.0.0.2 remote-as 65120
neighbor 10.0.0.2 ebgp-multihop 2
neighbor 10.0.0.2 update-source Loopback0
!
address-family ipv4
neighbor 10.0.0.1 activate
neighbor 10.0.0.2 activate
exit-address-family
!
address-family ipv4 multicast
neighbor 10.0.0.1 activate
neighbor 10.0.0.2 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 10.0.0.3 30 override
ip msdp peer 10.0.0.1 connect-source Loopback0
ip msdp peer 10.0.0.2 connect-source Loopback0
ip msdp cache-sa-state
ip msdp mesh-group ENTERPRISE 10.0.0.1
ip msdp mesh-group ENTERPRISE 10.0.0.2
!
access-list 30 permit 239.30.0.0 0.0.255.255
BR
ip multicast-routing
ip cef
!
interface Loopback0
ip address 10.0.0.4 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 10.1.4.4 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 172.23.31.4 255.255.255.0
ip pim sparse-mode
!
router ospf 10
passive-interface Ethernet0/1
network 10.0.0.0 0.255.255.255 area 0
network 172.23.31.0 0.0.0.255 area 0
!
router bgp 65101
bgp router-id 10.0.0.4
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65110 65120 65130
neighbor 10.0.0.1 remote-as 65110
neighbor 10.0.0.1 update-source Loopback0
neighbor 172.23.31.1 remote-as 65003
!
address-family ipv4
network 10.0.0.0
neighbor 10.0.0.1 activate
neighbor 172.23.31.1 activate
neighbor 172.23.31.1 soft-reconfiguration inbound
exit-address-family
!
address-family ipv4 multicast
network 10.0.0.0
neighbor 10.0.0.1 activate
neighbor 172.23.31.1 activate
neighbor 172.23.31.1 soft-reconfiguration inbound
exit-address-family
!
ip bgp-community new-format
!
ip route 10.0.0.0 255.0.0.0 Null0
Note: Example 1-24 shows only the configuration commands that are relevant to the network diagram shown in Figure 1-
22. Additional configurations for connecting Mcast Enterprises to ISP Blue are covered earlier in this chapter.
In this configuration, each domain has its own RP. This RP structure does not isolate local domain groups, but it does
isolate domain resources. MBGP shares all necessary multicast RPF entries for the global domain. Because there is a full
OSPF and MBGP mesh, there is no requirement to add the remote-as command option to MSDP peering configuration
statements. The mesh group takes care of that while also reducing extra traffic on the internal network.
Note: The global domain 239.0.0.0/10 was chosen to keep configurations simple. In this configuration, there are no control
ACLs built in for the domains corresponding to AS numbers 65101 through 65103. This is why individual domain groups
are no longer isolated by the configuration. In practice, it is more appropriate to lock down the domains according to a
specific policy. This type of security policy is discussed later in this chapter.
You can verify that you have in fact achieved successful division of domains by using the show ip pim rp command with specific groups that are controlled by the different RPs. You can also use the show ip mroute command to verify that the trees have been built to the correct RPs. Look at R2, for example, to confirm that local group 239.20.2.100 maps to the RP on R2 and that global group 239.1.2.200 uses the RP on R1. Example 1-25 provides this output.
Example 1-25 Verify Proper Interdomain Segregation
R2# show ip pim rp 239.20.2.100
Group: 239.20.2.100, RP: 10.0.0.2, next RP-reachable in 00:00:19
R2#
R2#
R2#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route
Inter-AS interdomain multicast is very similar to intra-AS. The key difference is, of course, the requirement that MSDP,
MBGP, and PIM be fully operational in each connected domain. The edge of the AS is the control point for cross-domain
traffic.
We have already examined configurations for the Mcast Enterprises network. You can complete the solution by
configuring each of the ISP routers in the complete network. Figure 1-25 shows the interface details for the ISP routers.
SP3-2:
ip multicast-routing
ip cef
!
interface Loopback0
ip address 172.23.0.2 255.255.255.255
!
interface Ethernet0/0
ip address 172.23.1.2 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 172.23.2.2 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/2
ip address 172.21.13.2 255.255.255.0
ip pim sparse-mode
!
router ospf 1
passive-interface Ethernet0/1
passive-interface Ethernet0/2
network 172.21.0.0 0.0.255.255 area 0
network 172.23.0.0 0.0.255.255 area 0
!
router bgp 65003
bgp log-neighbor-changes
neighbor 172.21.13.1 remote-as 65001
neighbor 172.23.0.1 remote-as 65003
neighbor 172.23.0.1 update-source Loopback0
neighbor 172.23.2.1 remote-as 65002
!
address-family ipv4
network 172.23.0.0
neighbor 172.21.13.1 activate
neighbor 172.23.0.1 activate
neighbor 172.23.2.1 activate
exit-address-family
!
address-family ipv4 multicast
network 172.23.0.0
neighbor 172.21.13.1 activate
neighbor 172.23.0.1 activate
neighbor 172.23.2.1 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 172.23.0.2
ip msdp peer 10.0.0.1 connect-source Loopback0 remote-as 65100
ip msdp peer 172.22.0.1 connect-source Loopback0 remote-as 65002
ip msdp peer 172.21.0.2 connect-source Loopback0 remote-as 65001
ip msdp cache-sa-state
ip route 172.23.0.0 255.255.0.0 Null0
SP2-1:
ip multicast-routing
ip cef
!
interface Loopback0
ip address 172.22.0.1 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 172.23.2.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 172.21.12.1 255.255.255.0
ip pim sparse-mode
!
router bgp 65002
bgp log-neighbor-changes
neighbor 172.21.12.2 remote-as 65001
neighbor 172.23.2.2 remote-as 65003
!
address-family ipv4
network 172.22.0.0
neighbor 172.21.12.2 activate
neighbor 172.21.12.2 soft-reconfiguration inbound
neighbor 172.23.2.2 activate
neighbor 172.23.2.2 soft-reconfiguration inbound
exit-address-family
!
address-family ipv4 multicast
network 172.22.0.0
neighbor 172.21.12.2 activate
neighbor 172.21.12.2 soft-reconfiguration inbound
neighbor 172.23.2.2 activate
neighbor 172.23.2.2 soft-reconfiguration inbound
exit-address-family
!
ip bgp-community new-format
!
ip msdp peer 172.23.0.2 connect-source Loopback0 remote-as 65003
ip msdp peer 172.21.0.2 connect-source Loopback0 remote-as 65001
ip msdp cache-sa-state
ip route 172.22.0.0 255.255.0.0 Null0
SP1-1:
ip multicast-routing
ip cef
!
interface Loopback0
ip address 172.21.0.1 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 172.21.100.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 172.21.1.1 255.255.255.0
ip pim sparse-mode
!
router ospf 1
network 172.21.0.0 0.0.255.255 area 0
!
router bgp 65001
bgp log-neighbor-changes
neighbor 172.21.0.2 remote-as 65001
neighbor 172.21.0.2 update-source Loopback0
!
address-family ipv4
network 172.21.0.0
neighbor 172.21.0.2 activate
exit-address-family
!
address-family ipv4 multicast
network 172.21.0.0
neighbor 172.21.0.2 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 172.21.0.2
ip route 172.21.0.0 255.255.0.0 Null0
SP1-2:
ip multicast-routing
ip cef
no ipv6 cef
!
interface Loopback0
ip address 172.21.0.2 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
ip address 172.21.13.1 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/1
ip address 172.21.12.2 255.255.255.0
ip pim sparse-mode
!
interface Ethernet0/2
ip address 172.21.1.2 255.255.255.0
ip pim sparse-mode
!
router ospf 1
passive-interface Ethernet0/0
passive-interface Ethernet0/1
passive-interface Ethernet1/0
network 172.21.0.0 0.0.255.255 area 0
!
router bgp 65001
bgp log-neighbor-changes
neighbor 172.21.0.1 remote-as 65001
neighbor 172.21.0.1 update-source Loopback0
neighbor 172.21.12.1 remote-as 65002
neighbor 172.21.13.2 remote-as 65003
!
address-family ipv4
network 172.21.0.0
neighbor 172.21.0.1 activate
neighbor 172.21.12.1 activate
neighbor 172.21.13.2 activate
exit-address-family
!
address-family ipv4 multicast
network 172.21.0.0
neighbor 172.21.0.1 activate
neighbor 172.21.12.1 activate
neighbor 172.21.13.2 activate
exit-address-family
!
ip bgp-community new-format
!
ip pim rp-address 172.21.0.2
ip msdp peer 172.22.0.1 connect-source Loopback0 remote-as 65002
ip msdp peer 172.23.0.2 connect-source Loopback0 remote-as 65003
ip msdp cache-sa-state
ip route 172.21.0.0 255.255.0.0 Null0
Now that the ISP networks are configured to carry multicast across the mock Internet, you should be able to connect a
client to any of the ISPs and receive multicast from a server in the Mcast Enterprises network. Use Figure 1-26 as the end-
to-end visual for this exercise.
Figure 1-26 Internet Multicast from the Mcast Enterprises Server to the ISP-Red Connected Client
Make Server 2 with IP address 10.20.2.200 a source for group 239.1.2.200 (which resides in the global Mcast Enterprise
domain) by simply using the ping command from the server’s terminal. Notice the ping replies from the client connected to
ISP-1 with IP address 172.21.100.2. If successful, there should be a complete shared tree and a complete source tree at
each RP in the path (R1, SP3-2, and SP1-2), which you can see by using the show ip mroute command on each router.
Example 1-29 shows the execution of the ping on Server 2 and the command output from the RPs.
Example 1-29 Completed Multicast Tree from the Mcast Enterprise Server to the ISP-1 Connected Client
Server 2
Server2# ping 239.1.2.200
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.2.200, timeout is 2 seconds:
R1
R1# show ip mroute 239.1.2.200
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 239.1.2.200), 00:17:07/stopped, RP 10.0.0.1, flags: SP
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list: Null
SP3-2:
SP3-2# show ip mroute 239.1.2.200
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
SP1-2:
SP1-2# show ip mroute 239.1.2.200
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 239.1.2.200), 00:11:51/00:03:23, RP 172.21.0.2, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Ethernet0/2, Forward/Sparse, 00:11:51/00:03:23
(10.20.2.200, 239.1.2.200), 00:09:12/00:02:06, flags: MT
Incoming interface: Ethernet0/0, RPF nbr 172.21.13.2, Mbgp
Outgoing interface list:
Ethernet0/2, Forward/Sparse, 00:09:12/00:03:23
Success! As you can see, inter-AS interdomain multicast is fairly simple to understand and implement. These basic
principles can be applied across the global Internet or across any multidomain network in which AS boundaries exist.
No Internet-connected organization provides Internet users unfettered access to internal resources. This maxim is not just
true for common unicast IP traffic but is also true, sometimes especially so, for multicast traffic. Remember that multicast
traffic is still, at its core, IP traffic, and so it is vulnerable to nearly any standard IP attack vector. IP multicast may even be
more vulnerable to exploit than unicast. This increased vulnerability occurs because of the nature of multicast traffic, its
reliance on an underlying unicast network, and the additional service protocols enabled/required by multicast.
Multicast packets are very different from unicast packets. Intended receivers could be nearly anywhere geographically. In
a unicast framework, generally the sender has a very specific receiver—one that can be authorized for communications
through a two-way TCP exchange. No such mechanism exists for multicast. Senders rarely know who is subscribed to a
feed or where they may be located. This is known as a push data model, and there is rarely an exchange of data between
senders and receivers. It is very difficult to authenticate, encrypt, or control distribution in a push data model. Remember
that the main purpose of multicast is increased efficiency, and you must make sacrifices in certain other network aspects—
such as control, centralized resources, and security—to achieve it.
Furthermore, if the underlying IP unicast network is not secure, the multicast overlay is equally vulnerable to exploit or
error. Any time a multicast domain is implemented, every effort should be taken to secure the underlying unicast
infrastructure, and additional measures should be taken for reliability and resilience. An unsecured unicast network makes
for an unsecured multicast network.
Chapter 5 in IP Multicast, Volume 1 discusses at length how to internally secure a multicast domain, as well as how to
protect the domain border. However, most of those protections prevent leakage of multicast messages inside or outside the
domain or protect domain resources, such as RP memory, from overuse. Such protections are even more important in a
multidomain scenario, especially if internal domains are exposed to Internet resources. It is strongly suggested that you
review the material from IP Multicast, Volume 1 to ensure that internal domain resources are protected as well as possible.
In addition to these measures, you should account for further considerations at the domain border. One such
consideration is the use of firewalls in the multicast domain. Firewalls can simultaneously protect both the unicast and
multicast infrastructures inside a zone, domain, or autonomous system. You may have noticed that there is a transparent
firewall (FW1) in all the network designs for Mcast Enterprises. It is best practice to always have a firewall separating
secure zones, such as the public Internet from internal network resources. Also consider additional services that are
required for or enabled by multicast. This is especially true at the domain border. Clearly, if you are going to connect
multicast domains to other domains outside your immediate control, you must be very serious about securing that domain
and the domain border. The following sections examine some of these items more closely.
Firewalling IP Multicast
There are two ways to implement traffic handling in a network firewall: in L2 transparent mode or in L3 routed mode.
Each has different implications on multicast traffic handling. For example, routed mode on a Cisco ASA, by default, does
not allow multicast traffic to pass between interfaces, even when explicitly allowed by an ACL. Additional multicast
configuration is required.
Note: An in-depth discussion of the ASA and Firepower firewall is beyond the scope of this text. For more information
about multicast support on these devices, please look to Cisco’s published command references, design guides, and
configuration guides. They provide specific, in-depth sections dedicated to IP multicast and firewall configuration.
In transparent mode, firewalling multicast is much easier. L2 transparent mode can natively pass IP multicast traffic
without additional configuration. There are two major considerations when using transparent mode to secure a multicast
domain:
Multicast usage on any management interfaces of the firewall: Many network configuration and control tools can use
multicast for more efficient communications across the management plane of the network. If this is a requirement for your
transparent firewall, you need to explicitly permit multicast traffic on the management interface. Typically, the
management interface is a fully functional IP routed interface that is walled off from the other interfaces on the firewall,
and consequently it needs this additional configuration for multicast operation.
The passing of multicast packets between firewall zones: In an L2 configuration, this should be very simple to
accomplish. Each firewall manufacturer has different configuration parameters for permitting multicast traffic across the
zone boundaries. An architect or engineer needs to understand the role and capabilities of each firewall and what is
required for multicast to work both through the firewall and to the firewall.
As a rule, firewalling multicast between critical security zones is recommended, just as for any other traffic. Multicast is
still IP, which carries with it many of the same vulnerabilities as regular unicast traffic. Additional multicast-specific
vulnerabilities may also apply. These should be mitigated to the extent possible to secure vital infrastructure traffic. For
configuring multicast on Cisco firewalls (ASA and Firepower), look for multicast security toolkits and product-specific
design guides at www.cisco.com.
Perhaps the most important way to protect a multicast domain with external sources or receivers is to filter out certain
prefixes from participation. Remember that MSDP speakers are also RPs, which play a critical role in the network. You
need to protect MSDP routers from resource exhaustion and attack vectors such as denial of service (DoS). Proper filtering
of MSDP messages can also limit this type of exposure to vulnerability.
IP Multicast, Volume 1 discusses methods of closing a domain completely to outside multicast or PIM traffic. That, of
course, does not work the same in an interdomain scenario. Properly filtering traffic is, therefore, a must. There are many
ways to implement interdomain filtering, and the best security strategy incorporates all of them. You should focus on three
essential types of filters that should be deployed in every domain that allows cross-domain communications:
Domain boundary filters
MSDP filters
MBGP filters
The first and most obvious type of filter is applied at the domain border router, specifically on PIM border interfaces. PIM
uses boundary lists to block unwanted sources and groups. Table 1-17 shows the commands for configuring boundary lists.
Table 1-17 Multicast Boundary Commands
Operating System    Command
IOS/XE              ip multicast boundary access-list [ filter-autorp | block source | in | out ]
IOS XR              (config-mcast-default-ipv4-if)# boundary access-list
NX-OS               ip pim jp-policy policy-name [ in | out ]
Note: The boundary command acts differently on each OS, and the NX-OS equivalent is to set up a PIM join/prune
policy. The in and out keywords specify the direction in which the filter applies on an interface. For NX-OS and IOS-XE,
when the direction is not specified, both directions are assumed by default. For more information about this command in
each OS, please refer to the most recent published command reference.
These commands allow PIM to permit or deny tree building for specific (S, G) pairs at a specified interface. When an (S,
G) is denied, PIM does not allow that interface to become a part of the source tree. Additional consideration for IOS-XE
can be taken for shared trees, (*, G), by using an all-0s host ID in the access control list (ACL). However, this is only part
of a good boundary filter policy. In addition to implementing the boundary command, border interface ACLs should also
include basic filtering for packets destined to group addresses inside a domain. This is especially relevant on Internet- or
extranet-facing interfaces. Be sure your border ACLs block incoming multicast for any unauthorized group access.
Firewalls can also help in this regard.
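The following is a minimal IOS-XE sketch of such a boundary filter on a border interface; the interface name and the 239.192.0.0/16 scoped range are illustrative and are not taken from the Mcast Enterprises design:
ip access-list standard DOMAIN-BOUNDARY
 remark keep organization-scoped groups inside the domain (illustrative range)
 deny 239.192.0.0 0.0.255.255
 permit 224.0.0.0 15.255.255.255
!
interface Ethernet0/0
 ip multicast boundary DOMAIN-BOUNDARY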
Even with this protection at the edge, your network could still be vulnerable to other types of multicast-based attacks. For
example, if the network edge is ever misconfigured and MBGP is not properly filtered, the domain could quickly become
transit for traffic in which it should not be a forwarding participant. Resource exhaustion is another common attack vector
to which an administrator should pay special attention. You should consider standard BGP protections and protecting RP
resources on MSDP speakers that peer externally.
Standard MBGP filters can prevent learning unwanted source prefixes from specific networks. Remember that MBGP
prefix entries are simply RPF entries for a router to check against. Network engineers should configure filters that prevent
improper RPF checks. This type of filtering functions and is configured in exactly the same manner as any other BGP
filtering. You can simply deny incoming or outgoing advertisements of unwanted source prefixes. For more information on
BGP prefix filtering, refer to current published documentation at www.cisco.com.
Let’s look at this type of filtering in practice on the Mcast Enterprise network, with the added elements shown in Figure 1-
26 (in the previous section). In this scenario, the sole purpose of Mcast Enterprise’s participation in Internet interdomain
multicast is to make a multicast service available to the public Internet. The server connected to R2 with IP address
10.20.2.200 is the source of this traffic, and receivers can be located on the public Internet. The public stream from this
server is using group 239.1.2.200.
There is no reason for any outside, public, multicast stream to reach receivers within the Mcast Enterprises AS. Therefore,
Mcast Enterprises should implement the following filters to protect the domain and other internal infrastructure:
Place an inbound ACL blocking all incoming multicast traffic from the Internet on BR interface E0/0. (Be sure to allow
all PIM routers on 224.0.0.13.)
Place an ACL that allows multicast traffic only for group 239.1.2.200 and all PIM routers on the same BR interface in an
outbound direction.
Place an inbound route filter on the BR’s MBGP peer with SP3-2 that prevents any external source prefixes from being
learned.
Place an MBGP advertisement filter on the same peering on the BR, allowing advertisement of only the source prefix
required for group 239.1.2.200 (in this case, a summary route for 10.20.0.0/16).
Place an inbound MSDP SA filter on router R1 to prevent any inbound SA learning.
Place an outbound MSDP SA filter on R1 to allow the sharing of the SA cache entry for (10.20.2.200, 239.1.2.200) only.
Example 1-30 shows these additional configuration elements for routers BR and R1, using simple access lists and route maps.
Example 1-30 Filtering Configuration for Public Internet Services
BR
interface Ethernet0/1
ip address 172.23.31.4 255.255.255.0
ip access-group MCAST_IN in
ip access-group MCAST_OUT out
ip pim sparse-mode
!
router bgp 65101
!
address-family ipv4 multicast
neighbor 172.23.31.1 route-map MBGP_IN in
neighbor 172.23.31.1 route-map MBGP_OUT out
!
ip access-list extended MCAST_IN
permit ip any host 224.0.0.13
deny ip any 224.0.0.0 15.255.255.255
permit ip any any
ip access-list extended MCAST_OUT
permit ip any host 239.1.2.200
permit ip any host 224.0.0.13
deny ip any 224.0.0.0 15.255.255.255
permit ip any any
ip access-list standard PUBLIC_SOURCE
permit 10.20.0.0 0.0.255.255
!
!
route-map MBGP_OUT permit 10
match ip address PUBLIC_SOURCE
!
route-map MBGP_OUT deny 20
!
route-map MBGP_IN deny 10
R1
ip msdp peer 172.23.0.2 connect-source Loopback0 remote-as 65003
ip msdp sa-filter in 172.23.0.2 route-map MSDP_IN
ip msdp sa-filter out 172.23.0.2 list MSDP_OUT
!
ip access-list extended MSDP_OUT
permit ip any host 239.1.2.200
!
!
route-map MSDP_IN deny 10
Note: The transparent firewall discussed earlier can act as a secondary safeguard against unwanted inbound multicast
traffic if the filtering on the BR fails. In addition, the firewall can be implemented to help prevent internal multicast
leakage to the BR and beyond, protecting sensitive internal communications. An optional ACL could also be added at R1
to prevent the other internal domains of Mcast Enterprises from leaking to the firewall, thus easing the processing burden
on the firewall ASICs. In addition, the router ACLs given in this example are rudimentary in nature. In practice, these
elements would be added to much more effective and extensive ACLs that include standard AS border security elements.
As you can see, there are many places and ways to enable domain filtering. Each one serves a specific purpose and should
be considered part of a holistic protection strategy. Do not assume, for example, that protecting the domain border is
enough. Domain resources such as RP memory on MSDP speakers must also be protected.
Another element of filtering that will become obvious as it is implemented is the need to have concise and useful domain
scoping. Because the network in the example is properly scoped, writing policy to protect applications is relatively
straightforward. Poor scoping can make filtering extremely difficult—maybe even impossible—in a very large multidomain
implementation. Therefore, scoping should not only be considered an essential element of good domain and application
design but also an essential element of multicast security policy.
The final step in securing the multidomain multicast solution is to lock down any services at the edge. This keeps the
domain secure from resource overutilization and unauthorized service access attempts. You should consider killing IGMP
packets from external networks, limiting unwanted unicast TCP ACKs to a source, and using Time-to-Live (TTL) scoping
for multicast packets.
Locking down IGMP and eliminating TCP Acknowledgements (ACKs) is very straightforward. These should be added to
any inbound border ACLs at the edge of the AS or domain. TTL scoping is discussed at length in Chapter 5 in IP
Multicast, Volume 1.
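As a rough sketch, the first two items could be folded into the inbound border ACL from Example 1-30 (MCAST_IN); the exact entries depend on which external hosts legitimately need TCP sessions to the source, so treat these lines as illustrative:
ip access-list extended MCAST_IN
 remark drop IGMP arriving from external networks
 deny igmp any any
 remark drop unsolicited TCP ACKs aimed at the internal multicast source
 deny tcp any host 10.20.2.200 ack
 remark the remaining MCAST_IN entries from Example 1-30 follow these lines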
The only other thing to consider for service filtering at the edge is the use of Session Announcement Protocol (SAP). SAP is a legacy protocol that harkens back to the days of the MBONE. SAP and its sister protocol, Session Description Protocol (SDP), were used to provide directory information about a service that was offered via IP multicast. The idea was to advertise these services, making it easier for clients to subscribe to them. However, over time, the Internet
community found it was simply easier and safer to hard-code addresses.
In more recent years, SAP and SDP have become an attack vector for multicast-specific DDoS attacks. There is no reason to have SAP running in your network. Many Cisco operating systems shut it down by default. However, the authors feel it is better to be certain and deconfigure SAP services on multicast-enabled interfaces in the network, regardless of default settings. This should be considered a crucial step for any multicast design, regardless of domain scope. The
IOS-XE command for disabling SAP is no ip sap listen, and it is entered at the interface configuration mode prompt. For
the corresponding command in other operating systems, please see the command references at www.cisco.com.
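For example, on a border interface such as the BR's Ethernet0/1 from Example 1-30, the setting is applied at the interface level; this is a minimal IOS-XE sketch:
interface Ethernet0/1
 ip pim sparse-mode
 no ip sap listen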
As mentioned earlier in this chapter, most multicast networks are built using Any-Source Multicast (ASM), with PIM
Sparse-Mode (PIM–SM) acting as the multicast tree-builder. It is obvious that PIM–SM with MBGP and MSDP is the de
facto standard for Internet-based interdomain multicast. Is there a way, though, to achieve similar results without all the
additional source learning and active source sharing among domain RPs?
The answer is a resounding yes! Interdomain multicast can also be achieved by using the Source-Specific Multicast (SSM)
model without all the headaches of MBGP and MSDP. In addition, IPv6 includes ways to implement an ASM PIM–SM
model that does not require MSDP or MBGP. Let’s very quickly examine the differences in how to implement interdomain
multicast much more simply using these models.
SSM
Remember the three pillars of interdomain design? Here they are, listed again for your convenience:
The multicast control plane for source identification: The router must know a proper path to any multicast source,
either from the unicast RIB or learned (either statically or dynamically) through a specific RPF exception.
The multicast control plane for receiver identification: The router must know about any legitimate receivers that have
joined the group and where they are located in the network.
The downstream multicast control plane and MRIB: The router must know when a source is actively sending packets
for a given group. PIM–SM domains must also be able to build a shared tree from the local domain’s RP, even when the
source has registered to a remote RP in a different domain.
In an SSM-enabled domain, the third pillar is addressed inherently by the nature of the SSM PIM implementation. When a
receiver wants to join an SSM group, it must not only specify the group address but also specify the specific source it
wishes to hear from. This means that every time a receiver subscribes to a group, the last-hop router (LHR; the router
connected to the receiver) already knows where the source is located. It will only ever build a source tree directly to the
source of the stream. This means there is no RP required! It also means that SA caching is not needed for this type of
communication, and no shared trees are required either.
If the source is in another domain, PIM routers simply share the (S, G) join directly toward the source, regardless of its
location. If the domains are completely within an AS, it is also very unlikely that MBGP is necessary to carry RPF
information for sources as the source is generally part of a prefix entry in the IGP-based RIB of each internal router.
Multicast domains can still be segregated by scoping and border ACLs (which should be a requirement of any domain
border, regardless of PIM type), ensuring that you have security in place for multicast traffic.
Note: BGP or MBGP is still required, of course, if you are crossing IGP boundaries. Using MBGP may, in fact, still be the
best way of controlling source RPF checks between domains, but its use should be dictated by the overall design.
The SSM interdomain model is therefore substantially easier to implement. Consider what would happen to the intra-AS
design at Mcast Enterprises. Figure 1-27 redraws the final solution Mcast Enterprises network shown in Figure 1-22 but
using SSM. As you can see, this is a much simpler design, with no MBGP, no MSDP, and no RPs.
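The router configuration for such a design is also minimal. The following IOS sketch shows the essentials on any router in the domain; the interface is illustrative, and ip pim ssm range with an ACL could be substituted if groups outside the default 232.0.0.0/8 range are needed:
ip multicast-routing
ip pim ssm default
!
interface Ethernet0/2
 ip pim sparse-mode
 ! receivers signal (S, G) memberships directly, so IGMPv3 is required on receiver-facing interfaces
 ip igmp version 3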
Using IPv6 for interdomain multicast is very simple. For the most part, the configuration of the networks previously shown
is identical, only using IPv6 addressing and IPv6 address families to complete the configurations. IPv6-based intra-AS
interdomain multicast does have a big advantage over its IPv4 counterpart. IPv6 can simplify the deployment of RPs and
eliminate the need for MSDP by using embedded RP.
The embedded RP function of IPv6 allows the address of the RP to be embedded in an IPv6 multicast message. When a
downstream router or routers see the group address, the RP information is extracted, and a shared tree is immediately built.
In this way, a single centrally controlled RP can provide RP services for multiple domains. This solution works for both
interdomain and intra-domain multicast.
The format of the embedded RP address is shown in Figure 1-28 and includes the following:
Scope:
0000–0: Reserved
0001–1: Node-local scope
0010–2: Link-local scope
0011–3: Unassigned
0100–4: Unassigned
0101–5: Site-local scope
0110–6: Unassigned
0111–7: Unassigned
1000–8: Organization-local scope
1001–9: Unassigned
1010–A: Unassigned
1011–B: Unassigned
1100–C: Unassigned
1101–D: Unassigned
1110–E: Global scope
1111–F: Reserved
RIID (RP Interface ID): Anything except 0.
Plen (prefix length): Indicates the number of bits in the network prefix field and must not be equal to 0 or greater than
64.
To embed the RP address in the message, the prefix must begin with FF70::/12, as shown in Figure 1-29.
You need to copy the number of bits from the network prefix as defined by the value of plen. Finally, the RIID value is
appended to the last four least significant bits, as shown in Figure 1-30.
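As a quick worked example (using the 2001:DB8::/64 documentation prefix rather than the addresses used later in this chapter), the group address FF7E:140:2001:DB8::1234 decodes as follows: flags of 7 (embedded RP), scope of E (global), RIID of 1, and plen of 0x40 (64 bits). Copying the first 64 bits of the network prefix (2001:DB8:0:0) and appending the RIID in the low-order bits yields the RP address 2001:DB8::1.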
Example 1-31 shows the corresponding configuration for R1, R2, and R3.
!
interface Ethernet0/2
no ip address
load-interval 30
ipv6 address 2001:192:168:41::1/64
ipv6 enable
ipv6 ospf 65000 area 0
!
ipv6 pim rp-address 2001:192::1
!
ipv6 router ospf 65000
router-id 192.168.0.1
hostname R2
ipv6 unicast-routing
ipv6 multicast-routing
!
interface Loopback0
ip address 192.168.0.2 255.255.255.255
ipv6 address 2001:192:168::2/128
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/0
no ip address
ipv6 address 2001:192:168:21::2/64
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/1
no ip address
ipv6 address 2001:192:168:32::2/64
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/2
no ip address
ipv6 address 2001:192:168:52::2/64
ipv6 enable
ipv6 ospf 65000 area 0
!
ipv6 router ospf 65000
router-id 192.168.0.2
hostname R3
ipv6 unicast-routing
ipv6 multicast-routing
!
interface Loopback0
ip address 192.168.0.3 255.255.255.255
ipv6 address 2001:192:168::3/128
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/0
no ip address
load-interval 30
ipv6 address 2001:192:168:31::3/64
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/1
no ip address
ipv6 address 2001:192:168:32::3/64
ipv6 enable
ipv6 ospf 65000 area 0
!
interface Ethernet0/2
no ip address
ipv6 address 2001:192:168:63::3/64
ipv6 enable
ipv6 mld join-group FF73:105:2001:192::1
ipv6 ospf 65000 area 0
!
ipv6 router ospf 65000
router-id 192.168.0.3
As you can see from the configurations in Example 1-31, there isn't anything too fancy. The first highlighted command, on R1, defines R1 as the RP using the loopback 0 interface, with the ipv6 pim rp-address 2001:192::1 command; the second, on R3, statically defines a join group by using the ipv6 mld join-group FF73:105:2001:192::1 command.
Note: The ipv6 mld join-group command should be used only temporarily, for troubleshooting purposes only.
You may have noticed that the only router with an RP mapping is R1. Because you are embedding the RP information in
the multicast message, it is not necessary to define an RP on every router.
From R2, you can watch the behavior in action by using a simple ping command. As shown in Example 1-32, you can ping
the FF73:105:2001:192::1 address configured as a join group on R3.
Example 1-32 Embedded RP Example
Summary
This chapter reviews the fundamental requirements for interdomain forwarding of IP multicast flows. An understanding of
PIM domains and how they are built on the three pillars of interdomain design is critical for architecting this type of
forwarding. Remember that these are the three pillars:
The multicast control plane for source identification: The router must know a proper path to any multicast source,
either from the unicast RIB or learned (either statically or dynamically) through a specific RPF exception.
The multicast control plane for receiver identification: The router must know about any legitimate receivers that have
joined the group and where they are located in the network.
The downstream multicast control plane and MRIB: The router must know when a source is actively sending packets
for a given group. PIM–SM domains must also be able to build a shared tree from the local domain’s RP, even when the
source has registered to a remote RP in a different domain.
Multicast BGP, PIM, and MSDP satisfy the requirements of the three pillars. With these protocols, you should be able to
configure any multidomain or interdomain network, including designs that are both internal and cross the public Internet.
This chapter also reviews ways to eliminate the use of MSDP by using SSM or IPv6 embedded RP within the network.
References
RFC 3306
RFC 7371
RFC 5771
RFC 3956
RFC 4607
RFC 3446
RFC 3618
RFC 7606
RFC 4760
RFC 2283
RFC 1930
RFC 6996
Chapter 2
Multicast Scalability and Transport Diversification
Public cloud services are very commonly used in enterprise networks, and it is therefore important to understand how multicast
messages are carried to and from a cloud service provider. Transportation of multicast messages requires consideration of several
factors, especially when the cloud service providers do not support native multicast. This chapter introduces the key concepts of cloud
services and explains the elements required to support multicast services.
Enterprise customers tend to adopt cloud services for a number of reasons, including agile provisioning, lower cost of investment,
reduced operational and capital expense, proximity to the geographic user population, and speed of product development. Figure 2-1
illustrates the most common types of cloud models.
Connectivity for an enterprise to a cloud service provider can be achieved in three different ways:
Internet-based connectivity
Direct connectivity to the cloud provider
Cloud broker–based connectivity to the cloud provider
With Internet-based connectivity, the enterprise is connected to the cloud service provider via an Internet connection, as shown in
Figure 2-2. The enterprise normally uses an encrypted virtual private network (VPN) service to the virtual data center located at the
cloud provider. The enterprise customer may leverage the VPN-based service managed by the cloud service provider or a self-managed
service.
Many enterprise architects prefer having control of the services in their tenant space within the cloud infrastructure. The concept of
network functions virtualization (NFV) becomes a feasible solution, especially when using an IaaS cloud service.
NFV involves implementing in software network service elements such as routing, load balancing, VPN services, WAN optimization,
and firewalls. Each of these services is referred to as a virtual network function (VNF). The NFV framework stitches together VNFs by
using service chaining. In this way, provisioning of the network services can be aligned with service elements. The NFV elements can be
automated using the same workflow related to the application services, thus making them easier to manage. These cloud services can
then provide a solution for enterprise features to be available in an IaaS infrastructure.
Using a CSR 1000v device as a VNF element provides a customer with a rich set of features in the cloud. The CSR 1000v is an IOS-XE
software router based on the ASR 1001. The CSR 1000v runs within a virtual machine, which can be deployed on x86 server hardware.
There are a lot of commonalities between the system architecture for the CSR 1000v and the ASR 1000. As shown in Table 2-1, a CSR
1000v provides an enterprise feature footprint in the cloud.
Table 2-1 Enterprise Feature Footprint Covered by a CSR 1000v
Note: Use the Cisco feature navigator tool, at www.cisco.com/go/fn, to see the latest features available with the CSR 1000v.
Service reflection is a multicast feature that provides a translation service for multicast traffic. It allows you to translate externally
received multicast or unicast destination addresses to multicast or unicast addresses. This feature offers the advantage of completely
isolating the external multicast source information. The end receivers in the destination network can receive identical feeds from two
ingress points in the network. The end host can then subscribe to two different multicast feeds that have identical information. The
ability to select a particular multicast stream is dependent on the capability of the host, but provides a solution for highly available
multicast.
Cisco multicast service reflection runs in Cisco IOS software and processes packets that the software forwards to the Vif1 interface. The Vif1 interface is similar to a loopback interface; it is a logical IP interface that is always up when the router is active. The Vif1 interface has its own unique subnet, which must be advertised in the routing protocol, and it provides the private-to-public mgroup mapping and the source address of the translated packets. The Vif1 interface is key to the functionality of service reflection. Unlike IP Multicast Network Address Translation (NAT), which only translates the source IP address, service reflection translates both source and destination addresses. Figure 2-6 illustrates a multicast-to-multicast destination conversion use case.
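A minimal sketch of the Vif1 configuration for this multicast-to-multicast case might look like the following; it assumes the same addressing used in the rest of this section, with the external group 224.1.1.1 translated to the internal group 239.1.1.1 and the translated packets sourced from 10.5.1.2 on the Vif1 subnet:
interface Vif1
 ip address 10.5.1.1 255.255.255.0
 ip service reflect Ethernet0/0 destination 224.1.1.1 to 239.1.1.1 mask-len 32 source 10.5.1.2
 ip pim sparse-mode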
The show ip mroute command output at R2 confirms that the 224.1.1.1 stream is converted with service reflection to 239.1.1.1 and that the incoming interface (IIF) for 239.1.1.1 is the Vif1 interface, as shown in Example 2-4.
Example 2-4 show ip mroute at R2
Example 2-6 shows the configuration that converts the unicast flow destined to 10.5.1.3 into multicast flow 239.1.1.1 at R2. Because the Vif1 interface at R2 is configured (for service reflection) on the same subnet as 10.5.1.x, packets destined to 10.5.1.3 are directed to the Vif1 interface. At the Vif1 interface, the destination 10.5.1.3 is converted to 239.1.1.1, and the translated packets are sourced from 10.5.1.2.
Example 2-6 show interface vif1 at R2
R2# show run int vif1
Building configuration...
Current configuration : 203 bytes
!
interface Vif1
ip address 10.5.1.1 255.255.255.0
ip service reflect Ethernet0/0 destination 10.5.1.3 to 239.1.1.1 mask-len 32 source 10.5.1.2
ip pim sparse-mode
end
Prior to enabling the multicast source, with the IGMP join group configuration for 239.1.1.1 in place, verify that R4 has no (S, G) entry for 239.1.1.1 by using the show ip mroute command (see Example 2-7).
Example 2-7 show ip mroute at R4
R4# show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 239.1.1.1), 00:00:02/00:02:57, RP 192.168.2.2, flags: SJCL
Incoming interface: Ethernet0/0, RPF nbr 10.1.3.1
Outgoing interface list:
Loopback0, Forward/Sparse, 00:00:02/00:02:57
(*, 224.0.1.40), 00:00:02/00:02:57, RP 192.168.2.2, flags: SJCL
Incoming interface: Ethernet0/0, RPF nbr 10.1.3.1
Outgoing interface list:
Loopback0, Forward/Sparse, 00:00:02/00:02:57
There is no (S, G) entry for the multicast flow of 239.1.1.1 at R4.
Next, you enable the unicast stream from 10.1.1.1 and check the sniffer between the source (10.1.1.1) and R2, as shown in Example 2-
8. The unicast stream is generated via an Internet Control Message Protocol (ICMP) ping.
Example 2-8 Sniffer Capture of the Unicast Stream Before Conversion
=============================================================================
22:38:22.425 PST Thu Jan 5 2017 Relative Time: 27.684999
Packet 31 of 76 In: Ethernet0/0
Ethernet Packet: 114 bytes
Dest Addr: AABB.CC00.0200, Source Addr: AABB.CC00.0100
Protocol: 0x0800
The show ip mroute command output at R2 shows that the (S, G) entry for 239.1.1.1 has the Vif1 interface as its incoming interface (IIF). The Vif1 interface converts the unicast stream from 10.1.1.1 into a multicast stream with group address 239.1.1.1, as shown in Example 2-9.
Example 2-9 show ip mroute at R2
R2# show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
=============================================================================
22:49:43.470 PST Thu Jan 5 2017 Relative Time: 11:48.729999
Packet 1534 of 4447 In: Ethernet0/0
Ethernet Packet: 298 bytes
Dest Addr: 0100.5E01.0101, Source Addr: AABB.CC00.0100
Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 284, ID: 0x0000, Flags-Offset: 0x0000
TTL: 60, Protocol: 17 (UDP), Checksum: 0xE4A8 (OK)
Source: 10.1.1.1, Dest: 224.1.1.1
UDP Src Port: 0 (Reserved), Dest Port: 0 (Reserved)
Length: 264, Checksum: 0x64B5 (OK)
Data:
0 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
20 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
40 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
60 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
80 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
100 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
120 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
140 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
160 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
180 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
200 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
220 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 ....................
240 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
In Example 2-13, the current multicast state at R2 shows the (*, G) and (S, G) entries for the 224.1.1.1 multicast group.
Example 2-13 show ip mroute at R2
IP Multicast, Volume 1, Chapter 5, “IP Multicast Design Considerations and Implementation,” provides a good overview of various
methods of multicast traffic engineering, including the following:
Multipath feature
ip mroute statements or MBGP for multicast path selections
Feature: Multipath feature
Usage: Say that multiple unicast paths exist between two routers, and the administrator wants to load-split the multicast traffic. (The default behavior of RPF is to choose the highest IP address as next hop for all the (S, G) flow entries.) Configuring load splitting with the ip multicast multipath command causes the system to load-split multicast traffic across multiple equal-cost paths based on the source address, using the S-hash algorithm. This feature load-splits the traffic and does not load-balance the traffic. Based on the S-hash algorithm, the multicast stream from a source uses only one path. The PIM joins are distributed over the different equal-cost multipath (ECMP) links based on a hash of the source address. This enables streams to be divided across different network paths. The S-hash method is used to achieve a diverse path for multicast data flow that is split between two multicast groups to achieve redundancy in transport of the real-time packets. The redundant flow for the same data stream is achieved using an intelligent application that encapsulates the same data in two separate multicast streams.

Feature: ip mroute statements or MBGP for multicast path selections
Usage: Say that two equal-cost paths exist between two routers, and the administrator wishes to force the multicast traffic through one path. The administrator can use a static ip mroute statement to force the Reverse Path Forwarding (RPF) check through an interface of choice. In a large network with redundant links, to achieve the separation of the multicast traffic from the unicast, a dynamic way is more desirable. This is achieved by using the Border Gateway Protocol (BGP) multicast address family. With BGP address families, the multicast network needs to be advertised, and the next-hop prefix needs to be resolved via a recursive lookup in the Interior Gateway Protocol (IGP) to find the upstream RPF interface.

Feature: Multicast via tunnels
Usage: Multicast support using a tunnel overlay infrastructure can be used to add a non-enterprise-controlled network segment that does not support multicast.
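As a quick illustration of the first two rows, the following IOS commands enable S-hash load splitting and, alternatively, pin the RPF check for a source range to a specific next hop; the prefix and next-hop address are purely illustrative:
! load-split (S, G) joins across equal-cost paths based on a hash of the source address
ip multicast multipath
!
! or force the RPF check for sources in 10.1.1.0/24 toward a chosen next hop
ip mroute 10.1.1.0 255.255.255.0 172.16.1.1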
Multicast within the tunnel infrastructure is a key feature in current cloud deployments to support multicast over non-enterprise segments. Multicast across point-to-point GRE tunnels is simple; the only consideration is the RPF interface, and the RPF interface selected for the MRIB should be the tunnel interface, as shown in the sketch that follows this paragraph. The other overlay solution is Dynamic Multipoint VPN (DMVPN).
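The following is a minimal sketch of that GRE approach; the tunnel addressing, tunnel destination, and source prefix are illustrative:
interface Tunnel0
 ip address 10.99.0.1 255.255.255.252
 ip pim sparse-mode
 tunnel source Loopback0
 tunnel destination 198.51.100.2
!
! ensure the RPF check for the remote source range points at the tunnel
ip mroute 10.1.1.0 255.255.255.0 Tunnel0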
DMVPN is a Cisco IOS software solution for building scalable IPsec VPNs. DMVPN uses a centralized architecture to provide easier
implementation and management for deployments that require granular access controls for diverse user communities, including mobile
workers, telecommuters, and extranet users. The use of multipoint generic routing encapsulation (mGRE) tunnels with Next Hop
Resolution Protocol (NHRP) creates an overlay solution that can be used for adding various features such as encryption, policy-based
routing (PBR), and quality of service (QoS). The use of the NHRP feature with mGRE allows for direct spoke-to-spoke communication,
which is useful for voice over IP traffic or any other communication except multicast. With the DMVPN overlay solution, the multicast
communication must pass through the hub, which manages replication for the spokes. The replication at the hub is a design
consideration that needs to be contemplated for multicast designs on an overlay network. For example, if the multicast stream needs to
be transmitted to 100 spokes and the stream rate is 200 Mbps, the collective multicast stream after replication is 200 Mbps × 100 spokes, which is 20 Gbps. In this case, the outbound WAN link needs to accommodate 20 Gbps of multicast, and the platform at the hub should be able to replicate those multicast flows. The WAN link also needs to accommodate the replicated multicast stream, and this should be
considered during the capacity planning stage. Figure 2-9 illustrates the considerations.
Figure 2-10 Design to Facilitate Regional Multicast Replication and Unicast Interregional Direct Communication
The expected communication in this scenario includes the following:
Unicast (applicable to all branches in the region):
Branch A can communicate directly to Branch B (direct spoke-to-spoke communication within the region)
Branch A can communicate directly to Branch C without going through the hub (direct spoke-to-spoke interregional communication)
Branch A can directly communicate with the hub (regional spoke communication with the central data center connected to the hub)
Multicast (applicable to all branches in the region):
Branch C can send or receive multicast (region-specific multicast) to Branch D (localized replication at a regional hub)
Branches A and C can send or receive multicast but must traverse through the central hub and can be part of the global multicast
(interregional multicast communication via the DMVPN hub)
Branches A, B, C, and D can receive global multicast from the hub (with a data center connected to the global hub then to a regional
hub for multicast communication)
Figure 2-11 provides a configuration example of this DMVPN’s overlay design.
Figure 2-11 Lab Topology Design to Facilitate Regional Multicast Replication and Unicast Spoke-to-Spoke Communication
Example 2-16 gives the configuration for DMVPN with multicast at the regional hub R3.
Example 2-16 Configuration Snapshot of Regional Hub R3
To show unicast spoke-to-spoke intra-regional communication, this section reviews the NHRP configuration at the regional hub. The
NHRP mapping shows regional hub 10.0.1.3. (10.0.1.3 is the tunnel IP address at R3 and is the next-hop server.) Example 2-18 uses the
show ip nhrp command on R5 to reveal the spoke-to-spoke communication stats.
Example 2-18 show ip nhrp on R5
In Example 2-21, R7 (located in another region) initiates a ping to R5 (where loopback 100 at R5 is 10.0.100.101). Observe the
dynamic tunnel created for spoke-to-spoke communication.
Example 2-21 Dynamic Tunnel Creation on R7
From R5, a ping to 10.0.100.100 (loopback 100 at R2) shows a region communicating with the central hub through a dynamic tunnel
created within the same NHRP domain. The highlighted output in Example 2-22 shows the creation of the dynamic tunnel for the
unicast flow. 10.0.0.1 is the tunnel IP address at R2, 10.0.1.3 is the tunnel IP address at R3, 10.0.1.5 is the tunnel IP address at R5, and
10.0.1.6 is the tunnel IP address at R6.
Example 2-22 Dynamic Tunnel Usage
R5# ping 10.0.100.100 source lo 100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.100.100, timeout is 2 seconds:
Packet sent with a source address of 10.0.100.101
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
R5# show ip nhrp
10.0.0.1/32 via 10.0.0.1
Tunnel3 created 00:00:02, expire 00:05:57
Type: dynamic, Flags: router nhop rib
NBMA address: 192.168.2.2
10.0.1.3/32 via 10.0.1.3
Tunnel3 created 3d09h, never expire
Type: static, Flags: used
NBMA address: 192.168.3.33
10.0.1.6/32 via 10.0.1.6
Tunnel3 created 00:04:20, expire 01:55:39
Type: dynamic, Flags: router used nhop rib
NBMA address: 192.168.6.6
10.0.100.100/32 via 10.0.0.1
Tunnel3 created 00:00:02, expire 00:05:57
Type: dynamic, Flags: router rib nho
NBMA address: 192.168.2.2
10.0.100.101/32 via 10.0.1.5
Tunnel3 created 00:04:20, expire 01:55:39
Type: dynamic, Flags: router unique local
NBMA address: 192.168.5.5
(no-socket)
10.0.100.102/32 via 10.0.1.6
Tunnel3 created 00:04:20, expire 01:55:39
Type: dynamic, Flags: router rib nho
NBMA address: 192.168.6.6
The source from the central hub transmits a multicast stream (239.1.1.1), and the receiver for this flow is at R5 (a regional spoke
router). In Example 2-23, the show ip mroute command shows R5 receiving the multicast flow from the hub.
Example 2-23 R5 Receiving the Flow
R5# show ip mroute 239.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
Looking at the multicast stream from the source at R5 (239.192.1.1) to the receiver at R6, the show ip mroute command at R6 shows
the formation of (*, G) and (S, G) states. Example 2-24 shows the output of R6.
Example 2-24 R6 mroute Table
R6# show ip mroute 239.192.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 239.192.1.1), 4d03h/stopped, RP 10.0.100.103, flags: SJCLF
Incoming interface: Tunnel3, RPF nbr 10.0.1.3
Outgoing interface list:
Loopback100, Forward/Sparse, 4d03h/00:02:44
(10.0.100.101, 239.192.1.1), 00:00:10/00:02:49, flags: LJT
Incoming interface: Tunnel3, RPF nbr 10.0.1.3
Outgoing interface list:
Loopback100, Forward/Sparse, 00:00:10/00:02:49
One of the best ways to implement QoS in an overlay infrastructure is with DMVPN. With DMVPN, you can do traffic shaping at the
hub interface on a per-spoke or per-spoke-group basis. Using dynamic QoS policies, you can configure multiple branch locations in one
QoS template from the hub. The application of the policy to the spoke is dynamic when the spoke comes up.
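A minimal sketch of this per-spoke-group (per-tunnel) QoS follows, using the Tunnel3 interface from the earlier examples; the NHRP group name, policy name, and 200 Mbps shaper are illustrative:
! hub: shape traffic toward any spoke that registers with this NHRP group
policy-map BRANCH-200M
 class class-default
  shape average 200000000
!
interface Tunnel3
 ip nhrp map group BRANCH-200M service-policy output BRANCH-200M
!
! spoke: advertise its NHRP group when registering with the hub
interface Tunnel3
 ip nhrp group BRANCH-200M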
Note: This book does not aim to review the details of the current or future DMVPN implementations. Instead, it provides an
introduction to mGRE, which simplifies administration on the hub or spoke tunnels.
This section reviews how to enable multicast in a cloud service with a cloud broker. Figure 2-12 shows the first use case.
With a direct virtual circuit (VC), a dedicated point-to-point circuit provided by the network service provider (NSP), it is very simple to enable multicast. Segment A needs to
terminate at the colocation (COLO) facility provided at the carrier-neutral facility. The termination is in the COLO facility, where the
enterprise hosts network devices that offer enterprise-class features.
Segment B provides the transport of multicast from the COLO facility to the CSP. Multicast feature support is not available in the CSP
network. In this case, the design engineer uses the service reflection feature and converts the multicast stream to a unicast stream.
Instead of using a direct VC connection, the customer may leverage the SP-provided MPLS Layer 3 VPN service. Multicast support
with the MPLS-VPN SP needs to be verified. If it is supported, the customer can leverage the MPLS-VPN SP for transporting multicast
traffic. Segment A needs to terminate at the COLO facility provided at the carrier-neutral location. The termination is in the COLO
facility where the enterprise hosts network devices for providing enterprise-class features. The termination is based on IP features that
extend the connection to the MPLS-VPN cloud with control plane features such as QoS and routing protocols.
Segment B controls the transport of multicast traffic from the COLO facility to the CSP. Multicast feature support is generally not
available in the CSP. A similar design principle of using the service reflection feature to convert multicast to a unicast stream can be
used here.
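The following is a minimal sketch of that multicast-to-unicast translation on the COLO router; the syntax mirrors Example 2-6, and all addresses here are illustrative:
interface Vif1
 ip address 10.6.1.1 255.255.255.0
 ! translate internal group 239.1.1.1 into a unicast stream toward a receiver in the CSP
 ip service reflect Ethernet0/1 destination 239.1.1.1 to 10.100.1.10 mask-len 32 source 10.6.1.2
 ip pim sparse-mode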
In this use case, the enterprise data center does not have connectivity to a cloud broker. The connectivity to the CSP is direct via an
NSP. Connectivity for Segment A has two options: direct VC access and Internet access. Figure 2-13 illustrates this use case.
With direct VC access, the enterprise requires network services hosted in the virtual private cloud (VPC). This is accomplished by using
a VNF routing instance (such as a CSR 1000v). The enterprise can either buy these instances or rent them from the CSP. Here are two
options for the transport of multicast across the NSP:
Conversion of multicast to unicast can be done at the enterprise data center, using the service reflection feature. In this case, VNF in
the cloud is not necessary for multicast.
GRE tunnels can be used to transport multicast from the enterprise data center to the CSR 1000v hosted in the VPC. At the CSR
1000v, the administrator can use the service reflection feature to convert multicast to unicast. Using CSR 1000v provides visibility and
support of enterprise-centric features.
Note: There’s a third, non-Cisco-specific option, which is not scalable. It is possible to terminate GRE on the compute instance at both
ends. For example, Linux supports GRE tunnels directly in the AWS VPC. However, with this solution, you lose visibility of and
support for enterprise-centric features.
The connectivity for Segment A from the enterprise to the CSP is via the Internet. Multicast support is possible through two options:
Create an overlay GRE network from the data center to the enterprise VPC located at the CSP. The tunnel endpoints are between a
data center router and a CSR 1000v hosted at the enterprise VPC. Once the traffic hits the VPC, the administrator can use the multicast
service reflection feature to convert the multicast feed to unicast.
If the enterprise does not host a CSR 1000v instance at the VPC, the conversion of multicast to unicast is done at the enterprise data
center router.
Summary
Enterprises are increasingly adopting public cloud services. It is therefore important to understand the techniques and
considerations involved in transporting multicast to and from a cloud service provider—especially when the cloud service providers do
not support native multicast. This chapter reviews different cloud models that an enterprise may use (including SaaS, IaaS, and PaaS)
and the various network access types to the cloud. Service reflection is an important feature used to convert multicast to a unicast
stream or vice versa. Using NFV with a CSR 1000v is a powerful way for an enterprise to provide enterprise network features in an IaaS
public cloud environment. Because multicast functionality is currently not supported in public cloud, it is important to understand cloud
network access types and how to use NFV or physical hardware with enterprise features such as service reflection and DMVPN to help
provide multicast service in the public IaaS environment.
Chapter 3
Multicast MPLS VPNs
The ability to logically separate traffic on the same physical infrastructure has been possible for many years. Most service providers (SPs) and
many enterprise customers implement Multiprotocol Label Switching (MPLS) in order to be able to separate or isolate traffic into logical
domains or groups, generally referred to as a virtual private network (VPN). A VPN can separate traffic by customer, job function, security
level, and so on. MPLS uses an underlying protocol called Label Distribution Protocol (LDP) to encapsulate messages destined for a particular
VPN. A VPN can be made up of several types of devices, including switches, routers, and firewalls. Isolating messages on a switch creates a
virtual local area network (VLAN); on a router, Virtual Routing and Forwarding (VRF) instances are used; and a firewall separates traffic by
using virtual contexts. The litmus test for virtualization really boils down to one simple question: Do you have the ability to support overlapping
IP address space?
This chapter discusses the function of multicast in an MPLS VPN environment, focusing on routing or VRF. In order to establish a foundation, a
clear definition of terms is necessary:
A provider (P) device is also referred to as a label-switched router (LSR). A P device runs an Interior Gateway Protocol (IGP) and Label
Distribution Protocol (LDP). You may also find the term P device used regarding the provider signaling in the core of a network.
A provider edge (PE) device is also referred to as an edge LSR (eLSR). A PE device not only runs IGP and LDP but also runs Multiprotocol
Border Gateway Protocol (MP-BGP). A PE device imposes, removes, and/or swaps MPLS labels.
A customer edge (CE) device is a Layer 3 (L3) element that connects to a PE device for routing information exchange between the customer
and the provider.
Customer (C) or overlay refers to the customer network, messages, traffic flows, and so on. C also refers to customer signaling.
Note: Tag Distribution Protocol (TDP) is not covered in this chapter as it has not been the protocol of choice for many years.
Note: For additional information on LDP, see RFC 5036, “LDP Specification.”
Figure 3-1 shows how these elements are connected. It is used throughout the chapter to explain the concepts of multicast VPNs.
The first method developed for transporting multicast messages within an MPLS VPN network was based on GRE, and it is widely used in current multicast deployments. Eric Rosen, along with several others, developed this strategy for transporting multicast messages over a unicast MPLS VPN infrastructure. Today, the Rosen method encompasses two fundamental techniques for multicast transport, the first using GRE and the second using MLDP. What differentiates the Rosen methodology is the concept of a default multicast distribution tree (MDT) and a data MDT.
What defines the Rosen method is the use of an overlay to provide multicast over multicast. This method is also referred to as default MDT,
default MDT-GRE, the Rosen model, the draft Rosen model, and Profile 0. (We like to keep things confusing as it adds to job security.)
Note: For additional information, see “Cisco Systems’ Solution for Multicast in MPLS/BGP IP VPNs” (https://tools.ietf.org/html/draft-rosen-
vpn-mcast-15). Also see RFC 7441, “Encoding Multipoint LDP Forwarding Equivalence Classes (FECs) in the NLRI of BGP MCAST-VPN
Routes.”
Default MDT
The default MDT includes all the PE routers that participate in a specific VPN. This is accomplished by using the same route distinguisher (RD)
or through the use of importing and exporting routes with a route target (RT). If a group of routers within a VPN are exchanging unicast routing
information, they are also in the same default MDT, and all receive the same unicast messages.
The default MDT is used as a mechanism for the PE routers within a VPN to exchange multicast messages with each other. For this to occur, the
underlying infrastructure, which includes the IGP, MP-BGP, and Protocol Independent Multicast (PIM), must all be functioning correctly. CE
equipment uses the default MDT to exchange multicast control messages. Messages are encapsulated over the core network using IP in IP with
a GRE header, as shown in Figure 3-2, when using the default MDT.
Note: Example 3-12 shows the packet capture and provides additional detail.
Consider the default MDT as full-mesh or as a Multipoint-to-Multipoint (MP2MP) tree. When the default MDT has been configured, it is in an
active state, or always operational, and is used to transmit PIM control messages (hello, join, and prune) between routers. Any time a multicast
message is sent to the default MDT, all multicast routers participating in that VPN receive that message, as shown in Figure 3-3.
Data MDT
The Data MDT is the mechanism used to eliminate multicast data messages from being sent to every PE that participates in a specific VPN;
instead, only those PE routers interested in the multicast forwarding tree for the specific group receive those multicast messages. Consider the
data MDT as a Point-to-Multipoint (P2MP) tree with the source or root of the tree at the ingress PE device (that is, the PE router closest to the
source of the multicast stream generated from the customer network).
The data MDT is not a unique or standalone multicast implementation mechanism but is a subset of the default MDT. As shown in Figure 3-4,
the multicast sender is in the top-right corner, and the receiver is near the bottom-right side of the diagram. The data MDT establishes a separate
tunnel from PE source to PE receiver(s) in the fashion that you would expect from a proper multicast implementation. In addition, replication of
the multicast stream can also occur on a P device.
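A minimal IOS-XE sketch of how the default and data MDTs are defined per VRF follows, using the RED VRF and RD shown later in Example 3-5; the SSM group addresses, route targets, and threshold value are illustrative:
vrf definition RED
 rd 65000:1
 !
 address-family ipv4
  route-target export 65000:1
  route-target import 65000:1
  ! every PE in the VPN joins this group; it carries PIM control traffic and low-rate data
  mdt default 232.0.0.1
  ! streams exceeding the 2 kbps threshold move to a dedicated group from this pool
  mdt data 232.0.1.0 0.0.0.255 threshold 2
 exit-address-family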
The multicast tunnel interface (MTI) is the tunnel interface that is automatically created for each multicast VPN. With the GRE MDT model, multicast traffic within each
VPN is transported using a GRE tunnel. As shown in Example 3-5, when you use the show vrf RED command for IOS-XE, two GRE tunnels
are created. In this case, one is for the default MDT and the other for the data MDT. (The Data MDT is within the defined threshold.)
Example 3-5 Multicast Tunnel Interface Using IOS-XE
R4# show vrf RED
Name Default RD Protocols Interfaces
RED 65000:1 ipv4,ipv6 Et0/2
Tu1
Tu2
With IOS-XR, you can view the tunnel interface by using the show mfib vrf RED interface command. Example 3-6 shows the output from R4.
Example 3-6 show mfib Command Output
RP/0/0/CPU0:R4# show mfib vrf RED interface
Wed Feb 15 23:50:00.358 UTC
Interface : GigabitEthernet0/0/0/2 (Enabled)
SW Mcast pkts in : 23982, SW Mcast pkts out : 11376
TTL Threshold : 0
Ref Count : 6
Interface : mdtRED (Enabled)
SW Mcast pkts in : 0, SW Mcast pkts out : 0
TTL Threshold : 0
Ref Count : 11
For IOS-XR, the tunnel interface is mdtRED or mdt[VRF name].
Both the default MDT and the data MDT require that the provider network transport multicast messages. Three options can be used to build
PIM in the provider network:
PIM Sparse-Mode (PIM-SM)
Bidirectional PIM (Bidir-PIM)
PIM Source-Specific Multicast (PIM-SSM) or just SSM
This book only covers the configuration of PIM-SSM, which is the recommended method. PIM-SM and Bidir-PIM are explained in detail in IP
Multicast, Volume 1.
You may recall that the requirement for PIM-SSM is that all the clients must use Internet Group Management Protocol version 3 (IGMPv3) and
that there are no requirements for a rendezvous point (RP). Fortunately, all Cisco devices that support MPLS also support IGMPv3.
Implementing PIM-SSM is very simple when multicast routing is enabled, as shown in the following configurations.
On IOS-XE devices, use the following command: ip pim ssm default
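If you are building this out on a P or PE router, a minimal IOS-XE sketch might look like the following (the interface names are placeholders, and some platforms require the distributed keyword on ip multicast-routing):
ip multicast-routing
ip pim ssm default
!
interface Loopback0
 ip pim sparse-mode
!
interface Ethernet0/0
 ip pim sparse-mode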
Example 3-7 is a sample configuration using IOS-XR. SSM is enabled by default for the 232.0.0.0/8 range. You may choose to specify a range
by using an access control list (ACL), as shown.
Example 3-7 SSM Configuration Example
router pim
address-family ipv4
ssm range SSM-RANGE
!
ipv4 access-list SSM-RANGE
10 permit ipv4 232.0.0.0/8 any
What could be easier? No RP propagation mechanism, and you do not even have to build in RP high availability! Remember that you must
enable PIM-SSM on all P and PE devices in the network.
SSM is the only one of the three methods that does not have the ability to auto-discover the PIM neighbors in the default MDT using PIM.
Therefore, you need to use another auto-discovery mechanism: the Border Gateway Protocol (BGP) MDT subsequent address family
identifier (MDT-SAFI). After the neighbors have been discovered, SSM can then be used to send PIM messages to those devices. This is
configured under router bgp, using the MDT address family as shown in Example 3-8, which is for IOS-XE.
Example 3-8 BGP MDT Address Family
router bgp 65000
!
address-family ipv4 mdt
neighbor 192.168.0.1 activate
neighbor 192.168.0.1 send-community both
neighbor 192.168.0.2 activate
neighbor 192.168.0.2 send-community both
exit-address-family
Example 3-9 shows the same configuration using IOS-XR. Neighbor groups and session groups are used here for configuration simplicity, and
only pertinent information is shown. Remember also that a route policy must be configured in order to send or receive BGP information.
Example 3-9 BGP MDT Configuration Using IOS-XR
route-policy ALLOW-ALL
pass
end-policy
!
router bgp 65000
bgp router-id 192.168.0.4
address-family ipv4 unicast
!
address-family vpnv4 unicast
!
address-family ipv4 mdt
!
session-group AS65000
remote-as 65000
update-source Loopback0
!
neighbor-group AS65000
use session-group AS65000
address-family ipv4 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family vpnv4 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family ipv4 mdt
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
neighbor 192.168.0.1
use neighbor-group AS65000
!
neighbor 192.168.0.2
use neighbor-group AS65000
Because route reflectors are used in this network, R4 (ingress PE connected to the receiver) only needs to establish an adjacency to those
devices.
You verify the BGP adjacency within a specific VRF by using the show ip bgp ipv4 mdt vrf RED command, as shown for IOS-XE in Example
3-10.
Example 3-10 Verifying MDT BGP Adjacency Using IOS-XE
R4# show ip bgp ipv4 mdt vrf RED
*> 192.168.0.4/96 0.0.0.0 0 i
*>i192.168.0.5/96 192.168.0.5 100 0 i
* i 192.168.0.5 100 0 i
*>i192.168.0.6/96 192.168.0.6 100 0 i
* i 192.168.0.6 100 0 i
Processed 4 prefixes, 7 paths
Because the previous command was executed on R4, notice the loopback addresses of the other routers within the RED VPN.
Figure 3-5 provides a better understanding of the interaction between devices and the behavior of multicast in an MPLS environment.
The IP address of H20 (sender) is 172.16.12.20, the IP address of H24 (receiver) is 172.16.16.24, and the RP (R11) has an IP address of
172.16.3.11. H20 is sending a multicast stream to 224.1.1.20 at about 75 packets per second.
The sender (H20) begins to send traffic to the multicast group 224.1.1.20, but in this example, no receivers have registered to accept the stream.
The gateway router (R12) forwards a PIM register message to the RP, registering the source. When H24
requests the multicast stream of 224.1.1.20, R16 sends a join message to the RP (R11). After a series of multicast messages, the tree switches to
the source tree (S, G) (172.16.12.20, 224.1.1.20).
Note: Refer to Chapter 3, “IP Multicast at Layer 3,” in IP Multicast, Volume 1 for a detailed description of the shared-to-source tree process.
All the previous communication from the RED VPN occurs over the default MDT, which is sent to every PE device in the RED VPN.
Multicast traffic is now being sent from R4, the ingress router, to all PE devices in the RED VPN. The ingress device (R4) monitors the rate of
the session. When the preconfigured threshold is crossed, the ingress router (R4) sends a PIM message to all PE routers in the RED VPN,
indicating the (S, G) (C-(S, G)) entry and the provider (S, G) (P-(S, G)) entry. The egress PE routers in the RED VPN that are interested in
receiving the multicast stream send a join message to the ingress PE (R4). R4 then switches to the data MDT in three seconds unless it is
configured for an immediate switchover.
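As a quick check (output not shown here), the ingress and egress PEs can confirm which data MDT groups are being sourced and joined with the following IOS-XE commands, run against the RED VRF:
R4# show ip pim vrf RED mdt send
R6# show ip pim vrf RED mdt receive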
To observe the behavior of the default MDT, this section shows how to reconfigure R4 to switch over to the data MDT at the highest
configurable rate by using the mdt data threshold 4294967 command, as shown for IOS-XE in Example 3-12 and IOS-XR in Example 3-13.
Example 3-12 MDT Data Threshold Configuration Example Using IOS-XE
vrf definition RED
rd 65000:1
!
address-family ipv4
mdt default 232.0.0.1
mdt data 232.0.1.0 0.0.0.255 threshold 4294967
mdt data threshold 4294967
route-target export 65000:1
route-target import 65000:1
exit-address-family
Example 3-13 MDT Data Threshold Configuration Example Using IOS-XR
route-policy PIM-Default
set core-tree pim-default
end-policy
vrf RED
address-family ipv4
rpf topology route-policy PIM-Default
interface GigabitEthernet0/0/0/2
multicast-routing
vrf RED
address-family ipv4
interface all enable
mdt default ipv4 232.0.0.1
mdt data 232.0.1.0/24 threshold 4294967
!
!
!
This keeps all multicast streams in the default MDT, unless of course a stream exceeds the threshold of 4,294,967 Kbps, which is not likely in
this environment. You could also remove the mdt data commands altogether as an alternative solution.
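For example, the data MDT configuration could be backed out of VRF RED with the no forms of the same commands (an IOS-XE sketch):
vrf definition RED
 address-family ipv4
  no mdt data 232.0.1.0 0.0.0.255
  no mdt data threshold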
As shown in Figure 3-3, all routers in the same VPN are connected using the same default MDT. You can observe this behavior by using the
show ip pim vrf RED neighbor command for IOS-XE, as shown in Example 3-14.
Example 3-14 Showing the PIM Neighbor for VRF RED Using IOS-XE
R4# show ip pim vrf RED neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
P - Proxy Capable, S - State Refresh Capable, G - GenID Capable,
L - DR Load-balancing Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
172.16.4.12 Ethernet0/2 00:21:10/00:01:43 v2 1 / DR S P G
192.168.0.3 Tunnel1 00:20:19/00:01:35 v2 1 / S P G
192.168.0.6 Tunnel1 00:20:19/00:01:38 v2 1 / DR S P G
192.168.0.5 Tunnel1 00:20:19/00:01:37 v2 1 / S P G
The equivalent command for IOS-XR is show pim vrf RED neighbor, as shown in Example 3-15.
Example 3-15 Showing the PIM Neighbor for VRF RED Using IOS-XR
RP/0/0/CPU0:R4# show pim vrf RED neighbor
Thu Feb 16 00:26:01.310 UTC
PIM neighbors in VRF RED
Flag: B - Bidir capable, P - Proxy capable, DR - Designated Router,
E - ECMP Redirect capable
* indicates the neighbor created for this router
Neighbor Address Interface Uptime Expires DR pri Flags
172.16.4.4* GigabitEthernet0/0/0/2 3d23h 00:01:19 1 B E
172.16.4.12 GigabitEthernet0/0/0/2 3d00h 00:01:22 1 (DR) P
192.168.0.3 mdtRED 3d23h 00:01:24 1
192.168.0.4* mdtRED 3d23h 00:01:40 1
192.168.0.5 mdtRED 3d23h 00:01:32 1
192.168.0.6 mdtRED 3d23h 00:01:31 1 (DR)
From the output in Examples 3-14 and 3-15, notice that R3, R5, and R6 are PIM adjacent neighbors over Tunnel 1 for IOS-XE and mdtRED
for IOS-XR, and there is also a PIM neighbor on the VRF RED interface, which is CE router R12.
You can look at a traffic capture between R5 and R9, as shown in Example 3-16, to get a tremendous amount of information about this
exchange.
Example 3-16 Packet Capture on the Link Between R5 and R9
Ethernet Packet: 578 bytes
Dest Addr: 0100.5E00.0001, Source Addr: AABB.CC00.0920
Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 564, ID: 0xCA86, Flags-Offset: 0x0000
TTL: 252, Protocol: 47, Checksum: 0x4966 (OK)
Source: 192.168.0.4, Dest: 232.0.0.1
GRE Present: 0x0 ( Chksum:0, Rsrvd:0, Key:0, SeqNum:0 )
Reserved0: 0x000, Version: 0x0, Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 540, ID: 0x0000, Flags-Offset: 0x0000
TTL: 58, Protocol: 17 (UDP), Checksum: 0xE593 (OK)
Source: 172.16.12.20, Dest: 224.1.1.20
UDP Src Port: 37777, Dest Port: 7050
Length: 520, Checksum: 0xB384 ERROR: F071
Data: *removed for brevity*
The first item to note is that R5 is not in the receiving path between H20 and H24. Why is R5 even seeing the multicast stream? The behavior of
the default MDT is to send multicast messages to every PE in the VPN.
Starting at the top, notice that the L2 destination address is 0100.5E00.0001. This L2 multicast address maps to the default MDT multicast IP
address 232.0.0.1. The IP header shows the source IP address 192.168.0.4, which is the loopback IP address of R4. Next comes a GRE
header, and below that is the original IP Multicast message. The encapsulated IP datagram shows the source address 172.16.12.20 (H20),
destined to the multicast IP address 224.1.1.20. Finally, the UDP section shows the source port 37777 and the destination port 7050.
Very simply, the IP Multicast traffic from the RED VRF is encapsulated using the configured default MDT for VRF RED, hence the GRE
tunnel.
MTI Example
You can determine what MTI the multicast stream is using by checking the state of the multicast routing table in the RED VRF for 224.1.1.20
by using the show ip mroute command for IOS-XE. Example 3-17 displays this output for VRF RED on R4.
Example 3-17 Verifying MTI Using IOS-XE
R4# show ip mroute vrf RED 224.1.1.20
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.20), 01:05:53/stopped, RP 172.16.3.11, flags: SP
Incoming interface: Tunnel1, RPF nbr 192.168.0.3
Outgoing interface list: Null
(172.16.12.20, 224.1.1.20), 00:00:39/00:02:20, flags: T
Incoming interface: Ethernet0/2, RPF nbr 172.16.4.12
Outgoing interface list:
Tunnel1, Forward/Sparse, 00:00:39/00:02:55
In the output in Example 3-17, notice that the multicast stream (S, G) in (172.16.12.20, 224.1.1.20) is incoming on Ethernet0/2 and outgoing on
Tunnel 1.
You can look at the MTI of R4 by using the show interfaces tunnel 1 command and compare that with the previous packet capture. The tunnel
source IP address matches 192.168.0.4, and the tunnel protocol is multi-GRE/IP. Notice that the five-minute output rate also matches. Example 3-18
shows this output.
Example 3-18 Verifying MTI Details with show interfaces tunnel
R4# show interfaces tunnel 1
Tunnel1 is up, line protocol is up
Hardware is Tunnel
Interface is unnumbered. Using address of Loopback0 (192.168.0.4)
MTU 17916 bytes, BW 100 Kbit/sec, DLY 50000 usec,
reliability 255/255, txload 255/255, rxload 1/255
Encapsulation TUNNEL, loopback not set
Keepalive not set
Tunnel linestate evaluation up
Tunnel source 192.168.0.4 (Loopback0)
Tunnel Subblocks:
src-track:
Tunnel1 source tracking subblock associated with Loopback0
Set of tunnels with source Loopback0, 2 members (includes iterators), on interface <OK>
Tunnel protocol/transport multi-GRE/IP
Key disabled, sequencing disabled
Checksumming of packets disabled
Tunnel TTL 255, Fast tunneling enabled
Tunnel transport MTU 1476 bytes
Tunnel transmit bandwidth 8000 (kbps)
Tunnel receive bandwidth 8000 (kbps)
Last input 00:00:02, output 00:00:05, output hang never
Last clearing of "show interface" counters 1d02h
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/0 (size/max)
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 337000 bits/sec, 75 packets/sec
39021 packets input, 3028900 bytes, 0 no buffer
Received 0 broadcasts (3221 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
183556 packets output, 99558146 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
You can use the show pim vrf RED mdt interface command for IOS-XR, as shown in Example 3-19.
Example 3-19 Verifying MTI Using IOS-XR
RP/0/0/CPU0:R4# show pim vrf RED mdt interface
Fri Feb 17 23:04:12.788 UTC
GroupAddress Interface Source Vrf
232.0.0.1 mdtRED Loopback0 RED
This section examines the behavior of the data MDT, using the same example shown in Figure 3-4. R11 is the RP, H20 is the sender, and H24 is
the receiver for multicast group 224.1.1.20. In this scenario, H20 is sending traffic to group 224.1.1.20. Just as in any PIM-SM configuration,
R12 registers with the RP (R11). Because there are currently no receivers interested in the traffic, R12 does not forward any multicast
messages. When H24 joins the 224.1.1.20 group, R16 sends a join message toward the RP (R11). With an interested receiver, R12 begins to forward
the traffic to R4 (PE).
R4 sends multicast messages to the default MDT. This means that every router participating in the RED multicast VPN receives the message,
including R3, R5, and R6. Both R3 and R5 drop the traffic because they do not have any receivers interested in the multicast stream. R6, on the
other hand, has an interested receiver downstream, which is H24. R6 sends a join message to R4, as shown by the packet capture between R6
and R10. Also note that this is not a GRE encapsulated packet but is sent natively, as shown in Example 3-20.
Example 3-20 Native Default MDT
Ethernet Packet: 68 bytes
Dest Addr: 0100.5E00.000D, Source Addr: AABB.CC00.0600
Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0xC0 (Prec=Internet Contrl)
Length: 54, ID: 0x9913, Flags-Offset: 0x0000
TTL: 1, Protocol: 103 (PIM), Checksum: 0x14D2 (OK)
Source: 192.168.106.6, Dest: 224.0.0.13
PIM Ver:2 , Type:Join/Prune , Reserved: 0 , Checksum : 0x008B (OK)
Addr Family: IP , Enc Type: 0 , Uni Address: 192.168.106.10
Reserved: 0 , Num Groups: 1 , HoldTime: 210
Addr Family: IP , Enc Type: 0 , Reserved: 0 , Mask Len: 32
Group Address:232.0.1.0
Num Joined Sources: 1 , Num Pruned Sources: 0
Joined/Pruned Srcs: Addr Family: IP , Enc Type: 0 , Reserved: 0
S: 1 , W: 0, R:0, Mask Len: 32
Source Address:192.168.0.4
R4 continues to send multicast messages to the default MDT (232.0.0.1) until the threshold is reached, at which time it moves the stream from
the default MDT to the data MDT, which in this example is the multicast address 232.0.1.0.
It is important to understand that there are two multicast operations happening simultaneously (both can be checked with the commands sketched after this list):
The CE overlay
The P/PE underlay
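For example, on the ingress PE (R4) you can view the overlay state for the customer group inside the VRF and the underlay state for the MDT group in the global multicast routing table (IOS-XE commands; output not shown here):
R4# show ip mroute vrf RED 224.1.1.20
R4# show ip mroute 232.0.0.1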
Example 3-21 provides some practical configuration, debugging information, and packet captures that assist in explaining the process in greater
detail. In order to switch to the data MDT, you need to reconfigure R4 to switch over to the data MDT at a lower rate. As shown in Example 3-
21, which uses IOS-XE, you can use 2 Kbps and configure R4 with the mdt data threshold 2 command.
Example 3-21 Configuring the Data MDT Threshold Using IOS-XE
vrf definition RED
rd 65000:1
!
address-family ipv4
mdt default 232.0.0.1
mdt data 232.0.1.0 0.0.0.255 threshold 2
mdt data threshold 2
Example 3-22 shows the same configuration using IOS-XR.
Example 3-22 Configuring the Data MDT Threshold Using IOS-XR
multicast-routing
!
vrf RED
address-family ipv4
interface all enable
mdt default ipv4 232.0.0.1
mdt data 232.0.1.0/24 threshold 2
The expectation now is that the multicast stream sent from H20 will use the following path: R12, R4, R8, R10, R6, R16, H24. The switchover
process takes three seconds.
You can verify the existence of the default MDT with the show ip pim vrf RED mdt command on R4, as shown in Example 3-23, which uses
IOS-XE. In this example, the * implies the default MDT.
Example 3-23 PIM Default MDT with IOS-XE
R4# show ip pim vrf RED mdt
* implies mdt is the default MDT, # is (*,*) Wildcard,
> is non-(*,*) Wildcard
MDT Group/Num Interface Source VRF
* 232.0.0.1 Tunnel1 Loopback0 RED
Example 3-24 shows the cache entries for the default MDT for IOS-XR, using the show pim vrf RED mdt cache command.
Example 3-24 PIM Default MDT Cache
RP/0/0/CPU0:R4# show pim vrf RED mdt cache
Fri Feb 17 21:41:52.136 UTC
Example 3-27 PIM VRF Data MDT IP Address
RP/0/0/CPU0:R4# show pim vrf RED mdt cache
Fri Feb 17 23:10:08.583 UTC
Core Source Cust (Source, Group) Core Data Expires
192.168.0.4 (172.16.12.20, 224.1.1.20) 232.0.1.6 00:02:56
Now that the SPT has been established, there is optimal multicast traffic flow from source to receiver. Looking at a packet capture between R6
and R10 with the IOS-XE example, you verify that the data MDT is in operation, as shown in Example 3-28.
Example 3-28 Data MDT in Operation
Ethernet Packet: 578 bytes
Dest Addr: 0100.5E00.0100, Source Addr: AABB.CC00.0A20
Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 564, ID: 0x4447, Flags-Offset: 0x0000
TTL: 253, Protocol: 47, Checksum: 0xCDA6 (OK)
Source: 192.168.0.4, Dest: 232.0.1.0
GRE Present: 0x0 ( Chksum:0, Rsrvd:0, Key:0, SeqNum:0 )
Reserved0: 0x000, Version: 0x0, Protocol: 0x0800
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 540, ID: 0x0000, Flags-Offset: 0x0000
TTL: 58, Protocol: 17 (UDP), Checksum: 0xE593 (OK)
Source: 172.16.12.20, Dest: 224.1.1.20
UDP Src Port: 37777, Dest Port: 7050
Length: 520, Checksum: 0xB384 ERROR: EF72
Data: *removed for brevity*
From the packet capture, you can see that the destination IP address is 232.0.1.0, which is the first multicast IP address defined in the pool. Also
notice the GRE information and the encapsulated IP Multicast header that is part of VRF RED. Remember that the data MDT is now using SPT
for optimal routing.
You can use yet another IOS-XE command to view the functionality of the default MDT and the data MDT with the command show ip mfib vrf
RED 224.1.1.20, as shown in Example 3-29.
Example 3-29 Data MDT Packet Count Using IOS-XE
R4# show ip mfib vrf RED 224.1.1.20
Entry Flags: C - Directly Connected, S - Signal, IA - Inherit A flag,
ET - Data Rate Exceeds Threshold, K - Keepalive
DDE - Data Driven Event, HW - Hardware Installed
ME - MoFRR ECMP entry, MNE - MoFRR Non-ECMP entry, MP - MFIB
MoFRR Primary, RP - MRIB MoFRR Primary, P - MoFRR Primary
MS - MoFRR Entry in Sync, MC - MoFRR entry in MoFRR Client.
I/O Item Flags: IC - Internal Copy, NP - Not platform switched,
NS - Negate Signalling, SP - Signal Present,
A - Accept, F - Forward, RA - MRIB Accept, RF - MRIB Forward,
MA - MFIB Accept, A2 - Accept backup,
RA2 - MRIB Accept backup, MA2 - MFIB Accept backup
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per second
Other counts: Total/RPF failed/Other drops
I/O Item Counts: FS Pkt Count/PS Pkt Count
VRF RED
(*,224.1.1.20) Flags: C
SW Forwarding: 0/0/0/0, Other: 0/0/0
Tunnel1, MDT/232.0.0.1 Flags: A
(172.16.12.20,224.1.1.20) Flags: ET
SW Forwarding: 5547/77/540/328, Other: 0/0/0
Ethernet0/2 Flags: A
Tunnel1, MDT/232.0.1.0 Flags: F NS
Pkts: 4784/0
With IOS-XR the equivalent command is show mfib vrf RED route 224.1.1.20, as shown in Example 3-30.
Example 3-30 Data MDT Packet Count Using IOS-XR
RP/0/0/CPU0:R4# show mfib vrf RED route 224.1.1.20
Thu Feb 16 00:47:58.929 UTC
IP Multicast Forwarding Information Base
Entry flags: C - Directly-Connected Check, S - Signal, D - Drop,
IA - Inherit Accept, IF - Inherit From, EID - Encap ID,
ME - MDT Encap, MD - MDT Decap, MT - MDT Threshold Crossed,
MH - MDT interface handle, CD - Conditional Decap,
DT - MDT Decap True, EX - Extranet, RPFID - RPF ID Set,
MoFE - MoFRR Enabled, MoFS - MoFRR State, X - VXLAN
Interface flags: F - Forward, A - Accept, IC - Internal Copy,
NS - Negate Signal, DP - Don't Preserve, SP - Signal Present,
EG - Egress, EI - Encapsulation Interface, MI - MDT Interface,
EX - Extranet, A2 - Secondary Accept
Forwarding/Replication Counts: Packets in/Packets out/Bytes out
Failure Counts: RPF / TTL / Empty Olist / Encap RL / Other
(172.16.12.20,224.1.1.20), Flags:
Up: 01:09:21
Last Used: 00:00:00
SW Forwarding Counts: 165863/165863/89566020
SW Replication Counts: 165863/0/0
SW Failure Counts: 0/0/0/0/0
mdtRED Flags: F NS MI, Up:00:34:52
GigabitEthernet0/0/0/2 Flags: A, Up:00:34:52
The commands in the preceding examples provide a wealth of information for troubleshooting, such as the packet count and the specific tunnel
interface.
MLDP is an extension to LDP used to facilitate the transportation of multicast messages in an MPLS network. MLDP supports P2MP and
MP2MP label-switched paths (LSPs). With MLDP, you can use the same encapsulation method as with the unicast messages, which reduces the
complexity of the network. MLDP is a true pull-model implementation in that the PE closest to the receiver is the device to initiate the LSP.
Figure 3-6 illustrates the MLDP topology. Receivers send traffic upstream, toward the root, and the source sends traffic downstream.
FEC Elements
A Forwarding Equivalence Class (FEC) describes a set of packets with similar forwarding characteristics that are bound to the same
outbound MPLS label. With MLDP, the FEC carries the multicast tree information that the control plane needs to function correctly. MLDP uses the label
mapping message to build the MP-LSP hop by hop toward the root ingress PE device. This path is established using the IGP to find the most efficient
path to the root, based on the appropriate routing metrics. The label mapping message carries additional information in the form of TLVs. The FEC TLV
contains FEC elements, which actually define the set of packets that will use the LSP. This FEC TLV for multicast contains the following
information:
Tree type: Point-to-multipoint (P2MP) or multipoint-to-multipoint (MP2MP) tree
Address family: The type of stream (IPv4 or IPv6) the tree is replicating, which defines the root address type.
Address length: The length of the root address.
Root node address: The actual root address of the MP-LSP within the MPLS core (IPv4 or IPv6).
Opaque value: The stream information that uniquely identifies this tree to the root. The opaque value contains additional information that
defines the (S, G) for PIM-SSM transit, or it can be an LSP identifier that defines the default/data MDTs in a multicast VPN (MVPN) application.
Currently, four multicast mechanisms are supported, each with a unique opaque value:
IPv4 PIM-SSM transit: This allows global PIM-SSM streams to be transported across the MPLS core. The opaque value contains the actual
(S, G), which reside in the global mroute table of the ingress and egress PE routers.
IPv6 PIM-SSM transit: This is similar to IPv4 PIM-SSM but for IPv6 streams in the global table.
Multicast VPN: This allows VPNv4 traffic to be transported across the default MDT or the data MDT using label switching. The current
method is to use mGRE tunneling (which is reviewed earlier in this chapter). Use MLDP to replace the mGRE tunnel with an MP-LSP tunnel.
Multicast VPN is independent of the underlying tunnel mechanism.
Direct MDT or VPNv4 transit: This opaque value allows VPNv4 streams to be directly built without the need for the default MDT to exist.
This is useful for high-bandwidth streams with selective PEs requiring the multicast stream. Currently, this is not a supported feature.
Figure 3-8 Out-of-Band Signaling Operation
If only one root is configured, what happens when that device fails? No multicast messages flow across the network. There are two options to
address this. The first is to use anycast root node redundancy (RNR). This is accomplished by configuring a primary root and a backup root,
which involves creating an additional loopback on the primary root and the backup root that advertise the same IP address. The difference is
that the IP address of the primary root advertises the route with a longer mask. For example, referring to Figure 3-5, you could add a loopback
address to R8 as 192.168.0.254 with subnet mask 255.255.255.255 (/32) and also add a loopback address to R9 with the same address
192.168.0.254, but with the shorter subnet mask 255.255.255.254 (/31).
Remember that routes are chosen by longest match first. This makes R8 the root, and in the event of a failure, R9 takes over. The issue with this
implementation is that when R8 fails, you have to wait for IGP to reconverge to make R9 the new root.
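A rough sketch of the anycast root addressing described above follows (IOS-XE syntax; the loopback number is hypothetical, and some platforms warn when a /31 mask is applied to a loopback interface):
! R8 (primary root): longer /32 match
interface Loopback254
 ip address 192.168.0.254 255.255.255.255
!
! R9 (backup root): shorter /31 match
interface Loopback254
 ip address 192.168.0.254 255.255.255.254
The PE routers would then reference the anycast address (for example, mdt default mpls mldp 192.168.0.254) so that MLDP follows whichever root currently owns the longest match.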
An alternative solution is to configure two devices as the root. Figure 3-9 shows an example in which R8 and R9 are chosen as the root. The
upside to this implementation is that immediate failover occurs in the event that the root is unavailable, but the downside is that it creates
additional MLDP state in the core of the network. As the network architect, you need to determine if the additional overhead is worth the
reduced failover time.
MLDP in Action
This section shows how to configure VRF BLU for MLDP, with R8 and R9 as the root of the tree. In this example, VRF BLU is configured for
auto-RP, using R13 as the RP. H22 (172.17.14.22) is configured to send traffic to multicast address 224.2.2.22. H25 and H26 are set up as
receivers, as shown in Figure 3-9.
As shown in Examples 3-31 and 3-32, PIM MPLS is configured for Loopback0, the VPN ID is 65000:2, and two roots are configured for high
availability: R8 (192.168.0.8) and R9 (192.168.0.9).
Note: All core devices must also be configured to support MPLS MLDP.
Example 3-31 Default MDT MLDP Configuration Using IOS-XE
PE routers#
vrf definition BLU
rd 65000:2
vpn id 65000:2
!
address-family ipv4
mdt default mpls mldp 192.168.0.8
mdt default mpls mldp 192.168.0.9
route-target export 65000:2
route-target import 65000:2
exit-address-family
Example 3-32 Default MDT MLDP Configuration Using IOS-XR
PE routers#
mpls ldp
mldp
!
vrf BLU
vpn id 65000:2
address-family ipv4 unicast
import route-target
65000:2
export route-target
65000:2
!
route-policy Data-MDT-mLDP
set core-tree mldp
end-policy
!
multicast-routing
vrf BLU
address-family ipv4
interface all enable
mdt default mldp ipv4 192.168.0.8
mdt default mldp ipv4 192.168.0.9
!
router pim
vrf BLU
address-family ipv4
rpf topology route-policy Data-MDT-mLDP
interface GigabitEthernet0/0/0/1
You need to verify MLDP neighbor relationships by using the show mpls mldp neighbors command, as shown in Examples 3-33 and 3-34,
from R7. This is one of the few commands that are consistent across IOS-XE and IOS-XR.
Example 3-33 MPLS MLDP Neighbors Using IOS-XE
R7# show mpls mldp neighbors
MLDP peer ID : 192.168.0.1:0, uptime 1w2d Up,
Target Adj : No
Session hndl : 1
Upstream count : 0
Branch count : 0
Path count : 1
Path(s) : 192.168.71.1 LDP Ethernet0/3
Nhop count : 0
MLDP peer ID : 192.168.0.8:0, uptime 1w2d Up,
Target Adj : No
Session hndl : 3
Upstream count : 1
Branch count : 0
Path count : 1
Path(s) : 192.168.87.8 LDP Ethernet0/1
Nhop count : 1
Nhop list : 192.168.87.8
MLDP peer ID : 192.168.0.9:0, uptime 1w2d Up,
Target Adj : No
Session hndl : 4
Upstream count : 1
Branch count : 0
Path count : 1
Path(s) : 192.168.97.9 LDP Ethernet0/0
Nhop count : 1
Nhop list : 192.168.97.9
MLDP peer ID : 192.168.0.3:0, uptime 5d21h Up,
Target Adj : No
Session hndl : 6
Upstream count : 0
Branch count : 2
Path count : 1
Path(s) : 192.168.73.3 LDP Ethernet0/2
Nhop count : 0
Example 3-34 MPLS MLDP Neighbors Using IOS-XR
RP/0/0/CPU0:R7# show mpls mldp neighbors
Sat Feb 18 22:26:00.867 UTC
mLDP neighbor database
MLDP peer ID : 192.168.0.1:0, uptime 00:31:53 Up,
Capabilities : Typed Wildcard FEC, P2MP, MP2MP
Target Adj : No
Upstream count : 0
Branch count : 0
Label map timer : never
Policy filter in :
Path count : 1
Path(s) : 192.168.71.1 GigabitEthernet0/0/0/3 LDP
Adj list : 192.168.71.1 GigabitEthernet0/0/0/3
Peer addr list : 192.168.71.1
: 192.168.0.1
MLDP peer ID : 192.168.0.3:0, uptime 00:31:53 Up,
Capabilities : Typed Wildcard FEC, P2MP, MP2MP
Target Adj : No
Upstream count : 0
Branch count : 2
Label map timer : never
Policy filter in :
Path count : 1
Path(s) : 192.168.73.3 GigabitEthernet0/0/0/2 LDP
Adj list : 192.168.73.3 GigabitEthernet0/0/0/2
Peer addr list : 192.168.73.3
: 192.168.0.3
MLDP peer ID : 192.168.0.8:0, uptime 00:31:53 Up,
Capabilities : Typed Wildcard FEC, P2MP, MP2MP
Target Adj : No
Upstream count : 1
Branch count : 1
Label map timer : never
Policy filter in :
Path count : 1
Path(s) : 192.168.87.8 GigabitEthernet0/0/0/1 LDP
Adj list : 192.168.87.8 GigabitEthernet0/0/0/1
Peer addr list : 192.168.108.8
: 192.168.87.8
: 192.168.84.8
: 192.168.82.8
: 192.168.0.8
MLDP peer ID : 192.168.0.9:0, uptime 00:31:53 Up,
Capabilities : Typed Wildcard FEC, P2MP, MP2MP
Target Adj : No
Upstream count : 1
Branch count : 1
Label map timer : never
Policy filter in :
Path count : 1
Path(s) : 192.168.97.9 GigabitEthernet0/0/0/0 LDP
Adj list : 192.168.97.9 GigabitEthernet0/0/0/0
Peer addr list : 192.168.97.9
: 192.168.109.9
: 192.168.95.9
: 192.168.0.9
The output in Examples 3-33 and 3-34 shows that R7 has established MLDP neighbor relationships. You can run the show mpls mldp
neighbors command from any P or PE device to determine the appropriate neighbors.
You also need to verify that the roots of the default MDT trees are established by using the show mpls mldp root command for both IOS-XE
and IOS-XR, as shown in Examples 3-35 and 3-36.
Example 3-35 MPLS MLDP Root Using IOS-XE
R7# show mpls mldp root
Root node : 192.168.0.8
Metric : 11
Distance : 110
Interface : Ethernet0/1 (via unicast RT)
FEC count : 1
Path count : 1
Path(s) : 192.168.87.8 LDP nbr: 192.168.0.8:0 Ethernet0/1
Root node : 192.168.0.9
Metric : 11
Distance : 110
Interface : Ethernet0/0 (via unicast RT)
FEC count : 1
Path count : 1
Path(s) : 192.168.97.9 LDP nbr: 192.168.0.9:0 Ethernet0/0
Example 3-36 MPLS MLDP Root Using IOS-XR
RP/0/0/CPU0:R7# show mpls mldp root
Sat Feb 18 22:27:11.592 UTC
mLDP root database
Root node : 192.168.0.8
Metric : 11
Distance : 110
FEC count : 1
Path count : 1
Path(s) : 192.168.87.8 LDP nbr: 192.168.0.8:0
Root node : 192.168.0.9
Metric : 11
Distance : 110
FEC count : 1
Path count : 1
Path(s) : 192.168.97.9 LDP nbr: 192.168.0.9:0
One item that is interesting to note is that the P devices are all aware of where the roots are in the tree. The output in Example 3-36 is from R7,
one of the P devices.
With the command show mpls mldp bindings, you can view the labels associated with the MP2MP trees, as shown in Examples 3-37 and 3-38.
Example 3-37 MPLS MLDP Bindings Using IOS-XE
R7# show mpls mldp bindings
System ID: 7
Type: MP2MP, Root Node: 192.168.0.8, Opaque Len: 14
Opaque value: [mdt 65000:2 0]
lsr: 192.168.0.8:0, remote binding[U]: 32, local binding[D]: 32 active
lsr: 192.168.0.3:0, local binding[U]: 31, remote binding[D]: 38
System ID: 8
Type: MP2MP, Root Node: 192.168.0.9, Opaque Len: 14
Opaque value: [mdt 65000:2 0]
lsr: 192.168.0.9:0, remote binding[U]: 35, local binding[D]: 34 active
lsr: 192.168.0.3:0, local binding[U]: 33, remote binding[D]: 39
Example 3-38 MPLS MLDP Bindings Using IOS-XR
RP/0/0/CPU0:R7# show mpls mldp bindings
Sat Feb 18 22:42:44.968 UTC
mLDP MPLS Bindings database
LSP-ID: 0x00001 Paths: 3 Flags:
0x00001 MP2MP 192.168.0.9 [mdt 65000:2 0]
Local Label: 24017 Remote: 24018 NH: 192.168.97.9 Inft: GigabitEthernet0/0/0/0 Active
Local Label: 24015 Remote: 24023 NH: 192.168.73.3 Inft: GigabitEthernet0/0/0/2
Local Label: 24019 Remote: 24018 NH: 192.168.87.8 Inft: GigabitEthernet0/0/0/1
LSP-ID: 0x00002 Paths: 3 Flags:
0x00002 MP2MP 192.168.0.8 [mdt 65000:2 0]
Local Label: 24018 Remote: 24017 NH: 192.168.87.8 Inft: GigabitEthernet0/0/0/1 Active
Local Label: 24016 Remote: 24022 NH: 192.168.73.3 Inft: GigabitEthernet0/0/0/2
Local Label: 24020 Remote: 24019 NH: 192.168.97.9 Inft: GigabitEthernet0/0/0/0
Traffic is being generated from the sender, H22 (172.17.14.22), to the multicast address 224.2.2.22, with source port 38888 and destination port
7060. There are two receivers, H25 and H26. On R4 there are both (*, G) and (S, G) entries, using the show ip mroute vrf BLU 224.2.2.22
IOS-XE command, as shown in Example 3-39.
Example 3-39 Multicast Routing Table for VRF BLU Using IOS-XE
R4# show ip mroute vrf BLU 224.2.2.22
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.2.2.22), 00:06:38/stopped, RP 172.17.3.13, flags: SP
Incoming interface: Lspvif1, RPF nbr 192.168.0.3
Outgoing interface list: Null
(172.17.14.22, 224.2.2.22), 00:03:04/00:03:25, flags: T
Incoming interface: Ethernet0/1, RPF nbr 172.17.4.14
Outgoing interface list:
Lspvif1, Forward/Sparse, 00:03:04/00:03:10
Notice that the (*, G) entry has a null outgoing interface list, which is expected because it is in a pruned state. The (S, G) outgoing interface
entry is using Lspvif1, which is an LSP virtual interface (LSP-VIF).
For IOS-XR, you use the show mrib vrf BLU route 224.2.2.22 command, as shown in Example 3-40.
Example 3-40 Multicast Routing Table for VRF BLU Using IOS-XR
RP/0/0/CPU0:R4# show mrib vrf BLU route 224.2.2.22
Sat Feb 18 22:59:50.937 UTC
IP Multicast Routing Information Base
Entry flags: L - Domain-Local Source, E - External Source to the Domain,
C - Directly-Connected Check, S - Signal, IA - Inherit Accept,
IF - Inherit From, D - Drop, ME - MDT Encap, EID - Encap ID,
MD - MDT Decap, MT - MDT Threshold Crossed, MH - MDT interface handle
CD - Conditional Decap, MPLS - MPLS Decap, EX - Extranet
MoFE - MoFRR Enabled, MoFS - MoFRR State, MoFP - MoFRR Primary
MoFB - MoFRR Backup, RPFID - RPF ID Set, X - VXLAN
Interface flags: F - Forward, A - Accept, IC - Internal Copy,
NS - Negate Signal, DP - Don't Preserve, SP - Signal Present,
II - Internal Interest, ID - Internal Disinterest, LI - Local Interest,
LD - Local Disinterest, DI - Decapsulation Interface
EI - Encapsulation Interface, MI - MDT Interface, LVIF - MPLS Encap,
EX - Extranet, A2 - Secondary Accept, MT - MDT Threshold Crossed,
MA - Data MDT Assigned, LMI - mLDP MDT Interface, TMI - P2MP-TE MDT Interface
IRMI - IR MDT Interface
(172.17.14.22,224.2.2.22) RPF nbr: 172.17.4.14 Flags: RPF
Up: 00:40:17
Incoming Interface List
GigabitEthernet0/0/0/1 Flags: A, Up: 00:40:17
Outgoing Interface List
LmdtBLU Flags: F NS LMI, Up: 00:40:17
With the output shown in Example 3-40, only the (S, G) entry is listed with the outgoing interface LmdtBLU.
To determine the MPLS labels assigned to the MDT entries, you use the IOS-XE show mpls mldp database command, as shown in Example
3-41.
Example 3-41 MPLS MLDP Database Using IOS-XE
R4# show mpls mldp database
* Indicates MLDP recursive forwarding is enabled
LSM ID : 1 (RNR LSM ID: 2) Type: MP2MP Uptime : 1w2d
FEC Root : 192.168.0.8
Opaque decoded : [mdt 65000:2 0]
Opaque length : 11 bytes
Opaque value : 02 000B 0650000000000200000000
RNR active LSP : (this entry)
Candidate RNR ID(s): 6
Upstream client(s) :
192.168.0.8:0 [Active]
Expires : Never Path Set ID : 1
Out Label (U) : 31 Interface : Ethernet0/0*
Local Label (D): 38 Next Hop : 192.168.84.8
Replication client(s):
MDT (VRF BLU)
Uptime : 1w2d Path Set ID : 2
Interface : Lspvif1
LSM ID : 6 (RNR LSM ID: 2) Type: MP2MP Uptime : 02:01:58
FEC Root : 192.168.0.9
Opaque decoded : [mdt 65000:2 0]
Opaque length : 11 bytes
Opaque value : 02 000B 0650000000000200000000
RNR active LSP : 1 (root: 192.168.0.8)
Upstream client(s) :
192.168.0.8:0 [Active]
Expires : Never Path Set ID : 6
Out Label (U) : 34 Interface : Ethernet0/0*
Local Label (D): 39 Next Hop : 192.168.84.8
Replication client(s):
MDT (VRF BLU)
Uptime : 02:01:58 Path Set ID : 7
Interface : Lspvif1
The output shown in Example 3-41 also shows that the active RNR is R8 (192.168.0.8). The Out Label (U), which is upstream toward the root,
has the label 31 assigned, and the (D) downstream label, away from the root, has the value 38.
You can validate that multicast traffic is using the upstream label with the show mpls forwarding-table labels 31 command, as shown with
Example 3-42 on R8.
Example 3-42 MPLS Forwarding Table Using IOS-XE
R8# show mpls forwarding-table labels 31
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
31 33 [mdt 65000:2 0] 16652686 Et0/0 192.168.108.10
32 [mdt 65000:2 0] 16652686 Et0/1 192.168.87.7
From the output in Example 3-42, notice that the number of label-switched bytes is 16,652,686 and that the multicast stream is being replicated
to R7 through Et0/1 and R10 through Et0/0. Because R8 is a P device, multicast replication is occurring in the core.
For IOS-XR, you use a series of commands to validate the behavior, starting with the show mpls mldp database command in Example 3-43.
Example 3-43 MPLS MLDP Database Using IOS-XR
RP/0/0/CPU0:R4# show mpls mldp database
Sat Feb 18 23:07:07.917 UTC
mLDP database
LSM-ID: 0x00001 (RNR LSM-ID: 0x00002) Type: MP2MP Uptime: 01:31:35
FEC Root : 192.168.0.8
Opaque decoded : [mdt 65000:2 0]
RNR active LSP : (this entry)
Candidate RNR ID(s): 00000003
Upstream neighbor(s) :
192.168.0.8:0 [Active] Uptime: 01:29:54
Next Hop : 192.168.84.8
Interface : GigabitEthernet0/0/0/0
Local Label (D) : 24024 Remote Label (U): 24016
Downstream client(s):
PIM MDT Uptime: 01:31:35
Egress intf : LmdtBLU
Table ID : IPv4: 0xe0000011 IPv6: 0xe0800011
HLI : 0x00002
RPF ID : 1
Local Label : 24000 (internal)
LSM-ID: 0x00003 (RNR LSM-ID: 0x00002) Type: MP2MP Uptime: 01:31:35
FEC Root : 192.168.0.9
Opaque decoded : [mdt 65000:2 0]
RNR active LSP : LSM-ID: 0x00001 (root: 192.168.0.8)
Candidate RNR ID(s): 00000003
Upstream neighbor(s) :
192.168.0.8:0 [Active] Uptime: 01:29:54
Next Hop : 192.168.84.8
Interface : GigabitEthernet0/0/0/0
Local Label (D) : 24025 Remote Label (U): 24015
Downstream client(s):
PIM MDT Uptime: 01:31:35
Egress intf : LmdtBLU
Table ID : IPv4: 0xe0000011 IPv6: 0xe0800011
RPF ID : 1
Local Label : 24001 (internal)
Notice that the multicast type is MP2MP, which is indicative of the default MDT, and you can see the associated labels. Use the command show
mpls mldp database root 192.168.0.8 to show only the trees that are rooted at R8 (192.168.0.8).
The show mpls mldp bindings IOS-XR command verifies the label associated with each tree, as shown in Example 3-44.
Example 3-44 MPLS MLDP Bindings Using IOS-XR
RP/0/0/CPU0:R4# show mpls mldp bindings
Sat Feb 18 23:32:34.197 UTC
mLDP MPLS Bindings database
LSP-ID: 0x00001 Paths: 2 Flags: Pk
0x00001 MP2MP 192.168.0.8 [mdt 65000:2 0]
Local Label: 24020 Remote: 24016 NH: 192.168.84.8 Inft: GigabitEthernet0/0/0/0 Active
Local Label: 24000 Remote: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
LSP-ID: 0x00003 Paths: 2 Flags: Pk
0x00003 MP2MP 192.168.0.9 [mdt 65000:2 0]
Local Label: 24021 Remote: 24015 NH: 192.168.84.8 Inft: GigabitEthernet0/0/0/0 Active
Local Label: 24001 Remote: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
Yes, everything is working as expected—but not very efficiently because multicast messages are being sent to every PE in the VPN. If you look
at a packet capture between R7 and R3, where there are not any multicast receivers, you see the multicast messages shown in Example 3-45.
Example 3-45 MPLS MLDP Packet Capture
Ethernet Packet: 558 bytes
Dest Addr: AABB.CC00.0300, Source Addr: AABB.CC00.0720
Protocol : 0x8847
MPLS Label: 38, CoS: 0, Bottom: 1, TTL: 56
IP Version: 0x4, HdrLen: 0x5, TOS: 0x00
Length: 540, ID: 0x0000, Flags-Offset: 0x0000
TTL: 58, Protocol: 17 (UDP), Checksum: 0xE291 (OK)
Source: 172.17.14.22, Dest: 224.2.2.22
UDP Src Port: 38888, Dest Port: 7060
Length: 520, Checksum: 0xAC21 (OK)
Data: *removed for brevity*
The multicast everywhere problem can be solved with the data MDT by using MLDP. In Figure 3-10 the commands from Example 3-46 for
IOS-XE and Example 3-47 for IOS-XR are added to every PE router that participates in the BLU VPN. The goal is to build the most efficient
multicast transport method.
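As a rough IOS-XE sketch of the kind of per-VRF configuration involved (the data MDT group count and threshold values here are illustrative assumptions, not taken from the examples):
vrf definition BLU
 vpn id 65000:2
 !
 address-family ipv4
  mdt default mpls mldp 192.168.0.8
  mdt default mpls mldp 192.168.0.9
  mdt data mpls mldp 255
  mdt data threshold 2
 exit-address-family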
Local Label (D): 38 Next Hop : 192.168.84.8
Replication client(s):
MDT (VRF BLU)
Uptime : 1w2d Path Set ID : 2
Interface : Lspvif1
LSM ID : 6 (RNR LSM ID: 2) Type: MP2MP Uptime : 03:38:05
FEC Root : 192.168.0.9
Opaque decoded : [mdt 65000:2 0]
Opaque length : 11 bytes
Opaque value : 02 000B 0650000000000200000000
RNR active LSP : 1 (root: 192.168.0.8)
Upstream client(s) :
192.168.0.8:0 [Active]
Expires : Never Path Set ID : 6
Out Label (U) : 34 Interface : Ethernet0/0*
Local Label (D): 39 Next Hop : 192.168.84.8
Replication client(s):
MDT (VRF BLU)
Uptime : 03:38:05 Path Set ID : 7
Interface : Lspvif1
The output in Example 3-48 shows some very valuable information. The tree type is P2MP, and the FEC root shows that R4 is the root of the
tree. This is as it should be because R4 is originating the multicast stream. Notice that the Opaque decoded value is [mdt 65000:2 1]. The last
value in the string is a 1, which indicates the change from the default MDT to the data MDT. Finally, the Out label (D) value is 36. The
downstream (from a multicast perspective) router R8 shows that the incoming label is 36, and the outgoing labels are 35 and 38 because H25
and H26 are both receivers.
As shown in Example 3-49, you can use the show mpls forwarding-table labels 36 IOS-XE command to see the associated labels for the
P2MP interface.
Example 3-49 MPLS MLDP P2MP Using IOS-XE
R8# show mpls forwarding-table labels 36
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
36 38 [mdt 65000:2 1] 63819576 Et0/0 192.168.108.10
35 [mdt 65000:2 1] 63819576 Et0/1 192.168.87.7
With IOS-XR, you get a similar output with the show mpls mldp database p2mp root 192.168.0.4 command, as shown in Example 3-50.
Example 3-50 MPLS MLDP P2MP Using IOS-XR
RP/0/0/CPU0:R4# show mpls mldp database p2mp root 192.168.0.4
Sun Feb 19 00:17:56.420 UTC
mLDP database
LSM-ID: 0x00004 Type: P2MP Uptime: 00:08:50
FEC Root : 192.168.0.4 (we are the root)
Opaque decoded : [mdt 65000:2 1]
Upstream neighbor(s) :
None
Downstream client(s):
LDP 192.168.0.8:0 Uptime: 00:08:50
Next Hop : 192.168.84.8
Interface : GigabitEthernet0/0/0/0
Remote label (D) : 24022
PIM MDT Uptime: 00:08:50
Egress intf : LmdtBLU
Table ID : IPv4: 0xe0000011 IPv6: 0xe0800011
HLI : 0x00004
RPF ID : 1
Ingress : Yes
Local Label : 24026 (internal)
You can see the label bindings in IOS-XR with the show mpls mldp bindings command, as shown in Example 3-51.
Example 3-51 MPLS MLDP Bindings Using IOS-XR
RP/0/0/CPU0:R4# show mpls mldp bindings
Sun Feb 19 00:22:42.951 UTC
mLDP MPLS Bindings database
LSP-ID: 0x00001 Paths: 2 Flags: Pk
0x00001 MP2MP 192.168.0.8 [mdt 65000:2 0]
Local Label: 24020 Remote: 24016 NH: 192.168.84.8 Inft: GigabitEthernet0/0/0/0 Active
Local Label: 24000 Remote: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
LSP-ID: 0x00003 Paths: 2 Flags: Pk
0x00003 MP2MP 192.168.0.9 [mdt 65000:2 0]
Local Label: 24021 Remote: 24015 NH: 192.168.84.8 Inft: GigabitEthernet0/0/0/0 Active
Local Label: 24001 Remote: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
LSP-ID: 0x00008 Paths: 2 Flags:
0x00008 P2MP 192.168.0.3 [mdt 65000:2 2]
Local Label: 24029 Active
Remote Label: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
LSP-ID: 0x00004 Paths: 2 Flags:
0x00004 P2MP 192.168.0.4 [mdt 65000:2 1]
Local Label: 24026 Remote: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
Remote Label: 24022 NH: 192.168.84.8 Inft: GigabitEthernet0/0/0/0
LSP-ID: 0x00007 Paths: 2 Flags:
0x00007 P2MP 192.168.0.3 [mdt 65000:2 1]
Local Label: 24028 Active
Remote Label: 1048577 Inft: LmdtBLU RPF-ID: 1 TIDv4/v6: 0xE0000011/0xE0800011
To verify the number of clients, you can use the show mpls mldp database summary command in IOS-XE, as shown in Example 3-52.
Example 3-52 MPLS MLDP Database Summary Using IOS-XE
R4# show mpls mldp database summary
LSM ID Type Root Decoded Opaque Value Client Cnt.
7 P2MP 192.168.0.4 [mdt 65000:2 1] 2
1 MP2MP 192.168.0.8 [mdt 65000:2 0] 1
6 MP2MP 192.168.0.9 [mdt 65000:2 0] 1
The output from R4 in Example 3-52 shows two clients in the data MDT.
A very similar command is used for IOS-XR, as shown in Example 3-53.
Example 3-53 MPLS MLDP Database Brief Using IOS-XR
RP/0/0/CPU0:R4# show mpls mldp database brief
Sun Feb 19 00:23:21.638 UTC
LSM ID Type Root Up Down Decoded Opaque Value
0x00007 P2MP 192.168.0.3 1 1 [mdt 65000:2 1]
0x00004 P2MP 192.168.0.4 0 2 [mdt 65000:2 1]
0x00008 P2MP 192.168.0.3 1 1 [mdt 65000:2 2]
0x00001 MP2MP 192.168.0.8 1 1 [mdt 65000:2 0]
0x00003 MP2MP 192.168.0.9 1 1 [mdt 65000:2 0]
You can look at R7 to validate that multicast messages are going only to the correct locations. The show mpls mldp database summary IOS-
XE command reveals only one client, as shown in Example 3-54.
Example 3-54 MPLS MLDP Database Summary Validation Using IOS-XE
R7# show mpls mldp database summary
LSM ID Type Root Decoded Opaque Value Client Cnt.
9 P2MP 192.168.0.4 [mdt 65000:2 1] 1
7 MP2MP 192.168.0.8 [mdt 65000:2 0] 1
8 MP2MP 192.168.0.9 [mdt 65000:2 0] 1
You can validate the outgoing interface on R7 with the show mpls forwarding-table | include 65000:2 1 IOS-XE command, as shown in
Example 3-55.
Example 3-55 MPLS Forwarding Table Using IOS-XE
R7# show mpls forwarding-table | inc 65000:2 1
35 37 [mdt 65000:2 1] 85357260 Et0/0 192.168.97.9
The only outgoing interface for the multicast stream is R9 (192.168.97.9), which is in the downstream direction, toward H25. It is no longer
flooding multicast to every PE in the BLU VPN. Traffic is flowing according to Figure 3-5. Mission accomplished!
Profiles
Profiles are essentially the different methods to implement MVPN. Each profile provides unique capabilities, whether they are core transport or
signaling methods or how the interaction with the customer is accomplished. Using profile numbers is much easier than having to say, “I am
implementing partitioned MDT with MLDP and MP2MP transport using BGP-AD in the core with BGP C-multicast signaling” when you could
just say “I am using Profile 14.”
Currently you can select from 27 different profiles. Previous sections of this chapter mention only 2 of these implementation strategies. As you
can imagine, it is well beyond the scope of this book to explain them all, but here is a list of the available options:
Profile 0 Default MDT - GRE - PIM C-mcast Signaling
Profile 1 Default MDT - MLDP MP2MP PIM C-mcast Signaling
Profile 2 Partitioned MDT - MLDP MP2MP - PIM C-mcast Signaling
Profile 3 Default MDT - GRE - BGP-AD - PIM C-mcast Signaling
Profile 4 Partitioned MDT - MLDP MP2MP - BGP-AD - PIM C-mcast Signaling
Profile 5 Partitioned MDT - MLDP P2MP - BGP-AD - PIM C-mcast Signaling
Profile 6 VRF MLDP - In-band Signaling
Profile 7 Global MLDP In-band Signaling
Profile 8 Global Static - P2MP-TE
Profile 9 Default MDT - MLDP - MP2MP - BGP-AD - PIM C-mcast Signaling
Profile 10 VRF Static - P2MP TE - BGP-AD
Profile 11 Default MDT - GRE - BGP-AD - BGP C-mcast Signaling
Profile 12 Default MDT - MLDP - P2MP - BGP-AD - BGP C-mcast Signaling
Profile 13 Default MDT - MLDP - MP2MP - BGP-AD - BGP C-mcast Signaling
Profile 14 Partitioned MDT - MLDP P2MP - BGP-AD - BGP C-mcast Signaling
Profile 15 Partitioned MDT - MLDP MP2MP - BGP-AD - BGP C-mcast Signaling
Profile 16 Default MDT Static - P2MP TE - BGP-AD - BGP C-mcast Signaling
Profile 17 Default MDT - MLDP - P2MP - BGP-AD - PIM C-mcast Signaling
Profile 18 Default Static MDT - P2MP TE - BGP-AD - PIM C-mcast Signaling
Profile 19 Default MDT - IR - BGP-AD - PIM C-mcast Signaling
Profile 20 Default MDT - P2MP-TE - BGP-AD - PIM - C-mcast Signaling
Profile 21 Default MDT - IR - BGP-AD - BGP - C-mcast Signaling
Profile 22 Default MDT - P2MP-TE - BGP-AD BGP - C-mcast Signaling
Profile 23 Partitioned MDT - IR - BGP-AD - PIM C-mcast Signaling
Profile 24 Partitioned MDT - P2MP-TE - BGP-AD - PIM C-mcast Signaling
Profile 25 Partitioned MDT - IR - BGP-AD - BGP C-mcast Signaling
Profile 26 Partitioned MDT - P2MP TE - BGP-AD - BGP C-mcast Signaling
Note: Where do you start looking for information on all these profiles? The Cisco website is a good place to find information on each profile:
www.cisco.com/c/en/us/support/docs/ip/multicast/118985-configure-mcast-00.html.
Another implementation that is worth noting is partitioned MDT, which is a more efficient method of multicast communication across the
provider network than using the default MDT method. The partitioned MDT technique does not require the use of the default MDT, which
means a reduction in the amount of unwanted multicast traffic sent to PE devices within a VPN. Today, Profile 2 is only supported using IOS-
XR, but if you have both IOS-XE and IOS-XR devices in your network, you can use Profile 4, which uses BGP-Auto Discovery (AD) for
signaling.
Luc De Ghein from Cisco TAC created an overview framework that really helps with understanding the different profiles (see Figure 3-11).
Table 3-1 MVPN Operating System Support Matrix
Next, you need to determine what you need to accomplish and the available resources. Items to consider include the following:
Core and customer multicast requirements
Scalability—that is, how many core and customer routes will be needed
Hardware limitations
Operational experience, which may determine which requirements you can support
High-availability requirements
If one of your requirements is high availability, consider using a profile that supports MLDP. With this implementation, you can take advantage
of Loop-Free Alternate (LFA) routing and minimize the complexity of traffic engineering tunnels.
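As a rough illustration (IOS-XR syntax; the OSPF process name and interface are placeholders), per-prefix LFA is enabled under the IGP, which gives the unicast paths that MLDP relies on a precomputed backup:
router ospf 1
 area 0
  interface GigabitEthernet0/0/0/0
   fast-reroute per-prefix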
After you have made a decision on which profile to implement, you should build a lab and test, test, test to become familiar with the operation
and troubleshooting of your particular environment.
With IOS-XE, you can configure support for multiple profiles within the address family of the VRF definition. In Example 3-56, both Profile 0
and Profile 1 are supported simultaneously, but Profile 0 is preferred unless the mdt preference mldp command is configured, in which case
Profile 1 is selected. Using two profiles at once certainly helps to ease the migration from some profiles, but the capabilities are limited.
Example 3-56 Migrating Between Profiles Using IOS-XE
vrf definition RED
rd 65000:1
vpn id 65000:1
!
address-family ipv4
mdt preference mldp
mdt default mpls mldp 192.168.0.8
mdt default mpls mldp 192.168.0.9
mdt default 232.0.0.1
mdt data 232.0.1.0 0.0.0.255 threshold 2
mdt data threshold 2
exit-address-family
IOS-XR, on the other hand, allows you to map individual flows to a unique profile by using Routing Policy Language (RPL). This is a very powerful
feature, as shown in Example 3-57.
Example 3-57 Migrating Between Profiles Using IOS-XR and RPL
RP/0/0/CPU0:R4#
route-policy MCast-Policy-RED
if source in (172.16.12.20/32) then
set core-tree pim-default
endif
if destination in (224.1.3.0/24) then
set core-tree mldp
else
set core-tree mldp-partitioned-mp2mp
endif
end-policy
In the command set shown in Example 3-57, the first if statement looks at the source address. A match to 172.16.12.20 means Profile 0, or default
MDT with GRE, is used. The second if statement looks for a destination match. If the destination multicast address falls in the subnet range
224.1.3.0/24, MLDP, or Profile 1, is used. The final else statement is a catch-all that means the partitioned MLDP, or Profile 2, is used. The
capabilities are almost endless.
Be sure to apply the statement as shown in Example 3-58.
Example 3-58 Applying Profiles Using IOS-XR and RPL
router pim
vrf RED
address-family ipv4
rpf topology route-policy MCast-Policy-RED
Transporting multicast traffic flows across the provider network is accomplished using any one of or a combination of these three methods:
PIM: With PIM in the core, customer multicast messages are mapped to provider multicast messages using IP in IP with a GRE header, as
shown in Example 3-3 with Profile 0.
MLDP: Both P2MP and MP2MP transport are supported.
TE tunnels: Profiles 8, 10, and 18 use traffic engineering (TE) tunnels. Profile 8 is unique in that it does not send traffic through a VPN but
rather uses the global routing table.
Multicast routing between multiple customer (C) locations requires the integration of the provider edge (PE) and customer edge (CE)
equipment. The universal requirement is to have PIM enabled and functional between the PE and the CE, as shown in Figure 3-12.
With ASM, an RP is required. The placement of the RP and how it is connected does provide some options. In previous examples, one of the
CE devices was configured as an RP for the entire VPN, but there are two more options. The first option is to use a PE device as the RP (in the
case of a self-managed MPLS solution). High availability can be configured using Anycast RP. The second solution is to build an RP at each CE
location and connect the RPs using MSDP. This solution adds a tremendous amount of configuration overhead. It would be a good choice if the
environment has an SP-managed MPLS Layer 3 VPN solution.
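As a sketch of the first option, two PE routers could share an anycast RP address inside the VRF and synchronize sources over MSDP (IOS-XE syntax; all addresses and loopback numbers here are hypothetical):
! Shared anycast RP address (same on both PEs)
interface Loopback100
 vrf forwarding RED
 ip address 172.16.100.1 255.255.255.255
 ip pim sparse-mode
!
! Unique per-PE address used for the MSDP peering
interface Loopback101
 vrf forwarding RED
 ip address 172.16.101.1 255.255.255.255
 ip pim sparse-mode
!
ip pim vrf RED rp-address 172.16.100.1
ip msdp vrf RED peer 172.16.101.2 connect-source Loopback101
ip msdp vrf RED originator-id Loopback101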
PE–PE ingress replication enables you to create a P2MP tree where ingress devices replicate the multicast packets, encapsulate the packets in a
unicast tunnel, and send the packets to egress devices. PE–PE ingress replication is a great tool to have in the toolbox, but it’s not one that
should be used all the time or even very often. PE–PE ingress replication works well when the core of the network does not support any type of
multicast routing. For example, you might have equipment in the core that will not support multicast, or you may be integrating with another
provider that does not allow multicast. Whatever the reason, PE–PE ingress replication is a workaround solution, and using it really defeats the
purpose of multicast in the first place because it involves converting multicast streams to unicast so they can be propagated over the core
network.
In Figure 3-13, the PE router in the top-right corner of the diagram is receiving multicast traffic from the sender, converting it into two unicast
flows, and sending it across the network. The following IOS-XR configuration enables BGP auto-discovery and ingress replication for the BLU VRF:
multicast-routing
!
vrf BLU
address-family ipv4
interface all enable
bgp auto-discovery ingress-replication
!
mdt default ingress-replication
mdt data ingress-replication
router pim
!
vrf BLU
address-family ipv4
rpf topology route-policy IR-Default
mdt c-multicast-routing bgp
!
!
!
!
You can look at the MRIB table by using the show mrib vrf BLU route 224.2.2.22 command. The interface list in Example 3-60 shows
IRmdtBLU, which indicates that it is using ingress replication.
Example 3-60 IRmdtBLU Ingress Replication Using IOS-XR
RP/0/0/CPU0:R4# show mrib vrf BLU route 224.2.2.22
Fri Feb 24 23:08:26.136 UTC
IP Multicast Routing Information Base
Entry flags: L - Domain-Local Source, E - External Source to the Domain,
C - Directly-Connected Check, S - Signal, IA - Inherit Accept,
IF - Inherit From, D - Drop, ME - MDT Encap, EID - Encap ID,
MD - MDT Decap, MT - MDT Threshold Crossed, MH - MDT interface handle
CD - Conditional Decap, MPLS - MPLS Decap, EX - Extranet
MoFE - MoFRR Enabled, MoFS - MoFRR State, MoFP - MoFRR Primary
MoFB - MoFRR Backup, RPFID - RPF ID Set, X - VXLAN
Interface flags: F - Forward, A - Accept, IC - Internal Copy,
NS - Negate Signal, DP - Don't Preserve, SP - Signal Present,
II - Internal Interest, ID - Internal Disinterest, LI - Local Interest,
LD - Local Disinterest, DI - Decapsulation Interface
EI - Encapsulation Interface, MI - MDT Interface, LVIF - MPLS Encap,
EX - Extranet, A2 - Secondary Accept, MT - MDT Threshold Crossed,
MA - Data MDT Assigned, LMI - mLDP MDT Interface, TMI - P2MP-TE MDT Interface
IRMI - IR MDT Interface
(172.19.14.22,224.2.2.22) RPF nbr: 172.19.4.4 Flags: RPF
Up: 00:02:06
Incoming Interface List
IRmdtBLU Flags: A IRMI, Up: 00:02:06
To verify the PE devices that are participating in the BLU MVPN, you use the show mvpn vrf BLU pe command, as shown in Example 3-61.
Example 3-61 MVPN VRF PEs Using IOS-XR
RP/0/0/CPU0:R4# show mvpn vrf BLU pe
Sat Feb 25 23:03:16.548 UTC
MVPN Provider Edge Router information
VRF : BLU
PE Address : 192.168.0.3 (0x12be134c)
RD: 65000:2 (valid), RIB_HLI 0, RPF-ID 9, Remote RPF-ID 0, State: 1, S-PMSI: 0
PPMP_LABEL: 0, MS_PMSI_HLI: 0x00000, Bidir_PMSI_HLI: 0x00000, MLDP-added: [RD 0, ID 0, Bidir ID 0, Remote
bgp_i_pmsi: 0,0/0 , bgp_ms_pmsi/Leaf-ad: 0/0, bgp_bidir_pmsi: 0, remote_bgp_bidir_pmsi: 0, PMSIs: I 0x0, 0
IIDs: I/6: 0x0/0x0, B/R: 0x0/0x0, MS: 0x0, B/A/A: 0x0/0x0/0x0
Bidir RPF-ID: 10, Remote Bidir RPF-ID: 0
I-PMSI: (0x0)
I-PMSI rem: (0x0)
MS-PMSI: (0x0)
Bidir-PMSI: (0x0)
Remote Bidir-PMSI: (0x0)
BSR-PMSI: (0x0)
A-Disc-PMSI: (0x0)
A-Ann-PMSI: (0x0)
RIB Dependency List: 0x121696a4
Bidir RIB Dependency List: 0x0
Sources: 1, RPs: 0, Bidir RPs: 0
PE Address : 192.168.0.5 (0x12c9f7a0)
RD: 0:0:0 (null), RIB_HLI 0, RPF-ID 11, Remote RPF-ID 0, State: 0, S-PMSI: 0
PPMP_LABEL: 0, MS_PMSI_HLI: 0x00000, Bidir_PMSI_HLI: 0x00000, MLDP-added: [RD 0, ID 0, Bidir ID 0, Remote
bgp_i_pmsi: 1,0/0 , bgp_ms_pmsi/Leaf-ad: 0/0, bgp_bidir_pmsi: 0, remote_bgp_bidir_pmsi: 0, PMSIs: I 0x1289
IIDs: I/6: 0x1/0x0, B/R: 0x0/0x0, MS: 0x0, B/A/A: 0x0/0x0/0x0
Bidir RPF-ID: 12, Remote Bidir RPF-ID: 0
I-PMSI: Unknown/None (0x12897d48)
Most service providers generally do not allow multicast from one customer network into another customer network. An extranet VPN, or inter-VRF multicast, is useful when a customer is deploying a self-managed MPLS VPN network. This feature allows multicast messages to flow from one VRF to another. An example use case is a customer that places its video cameras in a unique VPN to protect those resources from attack but wants other departments, groups, or organizations to be able to view multicast video streams from the cameras. Several options can be used, including the following:
Route leaking
VRF fallback
VRF select
Fusion router
VRF-Aware Service Infrastructure (VASI)
Route Leaking
Although route leaking is a possible solution, it is not one that is recommended. With route leaking, routes from one VRF are imported/exported
into another VRF, as shown in Figure 3-14.
The configuration for route leaking using IOS-XR is shown in Example 3-63.
Example 3-63 Route Leaking Using IOS-XR
vrf BLU
address-family ipv4 unicast
import route-target
65000:1
65000:2
!
export route-target
65000:1
65000:2
!
!
vrf RED
address-family ipv4 unicast
import route-target
65000:1
65000:2
!
export route-target
65000:1
65000:2
router bgp 65000
bgp router-id 192.168.0.4
address-family ipv4 unicast
!
address-family ipv4 multicast
!
address-family vpnv4 unicast
!
address-family ipv6 unicast
!
address-family vpnv6 unicast
!
address-family ipv4 mdt
!
session-group AS65000
remote-as 65000
update-source Loopback0
!
neighbor-group AS65000
use session-group AS65000
address-family ipv4 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family ipv4 multicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family vpnv4 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family ipv6 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family vpnv6 unicast
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
address-family ipv4 mdt
route-policy ALLOW-ALL in
route-policy ALLOW-ALL out
!
!
multicast-routing
!
vrf BLU
address-family ipv4
mdt source Loopback0
interface all enable
mdt default ipv4 232.0.0.2
mdt data 232.0.2.0/24 threshold 2
!
vrf RED
address-family ipv4
mdt source Loopback0
interface all enable
mdt default ipv4 232.0.0.1
mdt data 232.0.1.0/24 threshold 2
!
!
!
router pim
!
vrf BLU
address-family ipv4
interface GigabitEthernet0/0/0/1
!
!
!
vrf RED
address-family ipv4
!
interface GigabitEthernet0/0/0/2
!
!
!
!
The primary challenge with this solution is that you have limited control over the traffic flow, and it works only if you do not have overlapping address space. You are essentially making one VRF instance out of two, and this largely defeats the purpose of having multiple VRF instances.
VRF Fallback
When all else fails, you can use a static route! VRF fallback gives you the ability to configure a static multicast route that directs multicast
messages to another VRF or the global routing table. As shown in Figure 3-15, there are three VRF instances: SVCS, RED, and BLU. The RP is
in the SVCS VRF with IP address 10.0.0.5, and you want the RED and BLU VRF instances to be able to access the sender in the SVCS VRF.
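The following is a minimal IOS-XE sketch of the VRF fallback configuration for this topology; the RP address (10.0.0.5) and the 10.0.0.0/24 source subnet come from Figure 3-15, and the rest is assumed for illustration:
ip multicast-routing vrf SVCS
ip multicast-routing vrf RED
ip multicast-routing vrf BLU
!
! RED and BLU use the RP that lives in the SVCS VRF
ip pim vrf RED rp-address 10.0.0.5
ip pim vrf BLU rp-address 10.0.0.5
!
! RPF lookups toward the 10.0.0.0/24 sources fall back to the SVCS table
ip mroute vrf RED 10.0.0.0 255.255.255.0 fallback-lookup vrf SVCS
ip mroute vrf BLU 10.0.0.0 255.255.255.0 fallback-lookup vrf SVCS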
The corresponding mroute entry (output abbreviated) shows the RPF lookup being performed in the SVCS VRF:
Incoming interface: Ethernet0/0, RPF nbr 0.0.0.0, using vrf SVCS
Outgoing interface list:
Ethernet0/1, Forward/Sparse, 00:03:03/00:03:11
Finally, the show ip rpf vrf RED 10.0.0.1 and show ip rpf vrf BLU 10.0.0.1 commands show that the extranet RPF rule is in use, as indicated
by the highlighted output in Example 3-67.
Example 3-67 Extranet RPF Rule Using IOS-XE
S1# show ip rpf vrf RED 10.0.0.1
RPF information for ? (10.0.0.1)
RPF interface: Ethernet0/0
RPF neighbor: ? (10.0.0.1) - directly connected
RPF route/mask: 10.0.0.0/24
RPF type: multicast (connected)
Doing distance-preferred lookups across tables
Using Extranet RPF Rule: Static Fallback Lookup, RPF VRF: SVCS
RPF topology: ipv4 multicast base
S1# show ip rpf vrf BLU 10.0.0.1
RPF information for ? (10.0.0.1)
RPF interface: Ethernet0/0
RPF neighbor: ? (10.0.0.1) - directly connected
RPF route/mask: 10.0.0.0/24
RPF type: multicast (connected)
Doing distance-preferred lookups across tables
Using Extranet RPF Rule: Static Fallback Lookup, RPF VRF: SVCS
RPF topology: ipv4 multicast base
VRF Select
The VRF select feature is similar to VRF fallback except that with VRF select, the multicast address can be specified. This feature provides
additional security because you can easily limit the multicast groups. The IOS-XE configuration is shown in Example 3-68.
Example 3-68 VRF Select Using IOS-XE
ip mroute vrf BLU 10.0.0.0 255.255.255.0 fallback-lookup vrf SVCS
ip mroute vrf RED 10.0.0.0 255.255.255.0 fallback-lookup vrf SVCS
ip multicast vrf BLU rpf select vrf SVCS group-list 1
ip multicast vrf RED rpf select vrf SVCS group-list 1
!
access-list 1 permit 224.1.1.1
Using the show ip mroute vrf SVCS 224.1.1.1 command, validate the extranet receivers in the RED and BLU VRF instances, as shown in
Example 3-69.
Example 3-69 VRF Select Validation Using IOS-XE
S1# show ip mroute vrf SVCS 224.1.1.1
IP Multicast Routing Table
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 224.1.1.1), 00:20:42/stopped, RP 10.0.0.5, flags: SJCE
Incoming interface: Ethernet0/0, RPF nbr 10.0.0.5
Outgoing interface list: Null
Extranet receivers in vrf BLU:
(*, 224.1.1.1), 01:08:40/stopped, RP 10.0.0.5, OIF count: 1, flags: SJC
Extranet receivers in vrf RED:
(*, 224.1.1.1), 01:08:31/stopped, RP 10.0.0.5, OIF count: 1, flags: SJC
(10.0.0.1, 224.1.1.1), 00:20:42/00:02:03, flags: TE
Incoming interface: Ethernet0/0, RPF nbr 0.0.0.0
Outgoing interface list: Null
Extranet receivers in vrf BLU:
(10.0.0.1, 224.1.1.1), 00:15:15/stopped, OIF count: 1, flags: T
Extranet receivers in vrf RED:
(10.0.0.1, 224.1.1.1), 00:55:31/stopped, OIF count: 1, flags: T
Fusion Router
A fusion router acts as a fusion or connector point for multiple VPNs. It is fundamentally an L3 device where all VPNs connect. This design has
several benefits, such as better control of routing and multicast routing and the ability to perform Network Address Translation (NAT)
functionality in the event of overlapping IP addresses. In addition, a fusion router can act as a central RP for all the VPNs, as shown in Figure 3-
16.
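As a point of reference, the following is a minimal IOS sketch of a fusion router acting as the central RP for two attached VPNs; interface names and addresses are hypothetical:
ip multicast-routing
!
interface Loopback0
 description Central RP for all attached VPNs
 ip address 192.0.2.100 255.255.255.255
 ip pim sparse-mode
!
interface GigabitEthernet0/0
 description Link to the RED VPN CE
 ip address 192.0.2.1 255.255.255.252
 ip pim sparse-mode
!
interface GigabitEthernet0/1
 description Link to the BLU VPN CE
 ip address 192.0.2.5 255.255.255.252
 ip pim sparse-mode
!
ip pim rp-address 192.0.2.100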
VRF-Aware Service Infrastructure (VASI)
VASI is essentially a software cable that connects VRF instances. It supports NAT, firewalling, IPsec, IPv4, IPv6, multiple routing protocols, and IPv4 and IPv6 multicast. This design is very similar to the fusion router concept except that the CE routing and fusion capability happen all in the same device, as shown in Figure 3-18.
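The following is a minimal IOS-XE sketch of a VASI pair stitching the RED and SVCS VRF instances together for multicast; the addressing is hypothetical, and each vasileft/vasiright pair is connected internally in software:
interface vasileft1
 vrf forwarding RED
 ip address 10.99.1.1 255.255.255.252
 ip pim sparse-mode
!
interface vasiright1
 vrf forwarding SVCS
 ip address 10.99.1.2 255.255.255.252
 ip pim sparse-mode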
IPv6 MVPN
Fortunately, making the transition to IPv6 is easy when it comes to MVPNs. Most of the features and functionality you have already learned about
in this chapter apply to IPv6. (Check the latest documentation for current feature support.)
Just as you would enable address families for IPv4, you can do this for IPv6, as shown in the following snippet from IOS-XR:
address-family ipv6 unicast
!
address-family vpnv6 unicast
!
address-family ipv4 mvpn
!
address-family ipv6 mvpn
Summary
MPLS VPN has been around for many years, implemented by service providers and enterprise customers because it provides the ability to
logically separate traffic using one physical infrastructure. The capability of supporting multicast in an MPLS VPN environment has been
available using default MDT. As technology has evolved, so has the capability of supporting multicast in an MPLS network. Today you can still
use the traditional default MDT method, but there are now also 26 other profiles to choose from—from IP/GRE encapsulation using PIM, to
traffic engineering tunnels, to MLDP, to BIER.
The requirements for extranet MVPNs, or the capability of sharing multicast messages between VPNs, have also brought about the development of new capabilities, such as VRF fallback, VRF select, and VASI. The combinations of solutions are almost endless, and this chapter was
specifically written to provide a taste of the different solutions and capabilities. Please do not stop here; a search engine can help you make your
way through the mass of information available online to continue your education.
References
“Configure mVPN Profiles Within Cisco IOS,” www.cisco.com/c/en/us/support/docs/ip/multicast/118985-configure-mcast-00.html
RFC 2365, “Administratively Scoped IP Multicast”
RFC 6037, “Cisco Systems’ Solution for Multicast in BGP/MPLS IP VPNs”
RFC 6514, “BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs”
RFC 6516, “IPv6 Multicast VPN (MVPN) Support Using PIM Control Plane and Selective Provider Multicast Service Interface (S-PMSI) Join
Messages”
RFC 7358, “Label Advertisement Discipline for LDP Forwarding Equivalence Classes (FECs)”
RFC 7441, “Encoding Multipoint LDP (mLDP) Forwarding Equivalence Classes (FECs) in the NLRI of BGP MCAST-VPN Routes”
“Encapsulating MPLS in IP or GRE,” https://tools.ietf.org/html/draft-rosen-mpls-in-ip-or-gre-00
“Encapsulation for Bit Index Explicit Replication in MPLS and non-MPLS Networks,” https://tools.ietf.org/html/draft-ietf-bier-mpls-encapsulation-06
Chapter 4
Multicast in Data Center Environments
Most of this book discusses multicast in LANs and WANs, but one of the most critical components of
any organization is its data center. Understanding the nuances of how multicast functions in myriad
solutions is extremely critical to the success of an organization. The goal of this chapter is to provide
insight into the operation of multicast, using the most popular methods for data center implementation,
including virtual port channel (VPC), Virtual Extensible LAN (VXLAN), and Application Centric
Infrastructure (ACI).
Consider the example in Figure 4-2, in which the source is connected to the L3 network (upstream router
via router A), and the receiver is connected to the VPC environment (downstream). Figure 4-2 shows the
control-plane flow, which involves the following steps:
Step 7. Only the elected forwarder responds to the (S, G) state toward the source and adds the relevant state changes to the (S, G) entry:
SwitchA# show ip mroute
IP Multicast Routing Table for VRF "default"
(10.20.0.10/32, 239.1.1.1/32), uptime: 00:00:55, ip pim
Incoming interface: Vlan100, RPF nbr: 10.20.0.20
Outgoing interface list: (count: 1)
port-channel30, uptime: 00:00:49, pim
SwitchB# show ip mroute
IP Multicast Routing Table for VRF "default"
(10.20.0.10/32, 239.1.1.1/32), uptime: 00:01:12, ip pim
Incoming interface: Vlan100, RPF nbr: 10.20.0.20
Outgoing interface list: (count: 0)
Switch A is the VPC primary, and the VPC role is used as the tiebreaker for the multicast forwarding path (the source is connected upstream).
Step 8. Data flows from the source down to the forwarding VPC peer.
VXLAN
VXLAN provides an overlay network to transport L2 packets on an L3 network. VXLAN uses MAC-in-
UDP encapsulation, which extends Layer 2 segments. MAC-in-UDP encapsulation is illustrated in Figure
4-3.
VTEP
The IP encapsulation of an L2 frame is accomplished with VXLAN, which uses VXLAN Tunnel
Endpoints (VTEPs) to map a tenant’s end devices to VXLAN segments and to perform VXLAN
encapsulation and de-encapsulation. A VTEP has two functions: It is a switch interface on the local LAN
segment to support local endpoint communication via bridging, and it is also an IP interface to the
transport IP network toward the remote endpoint. The VTEP IP interface identifies the VTEP on the
transport IP network. The VTEPs use this IP address to encapsulate Ethernet frames as shown in Figure
4-4.
A VTEP device holds the VXLAN forwarding table, which maps a destination MAC address to a remote
VTEP IP address where the destination MAC is located, and also a local table that contains the VLAN-
to-VXLAN map. As shown in Figure 4-6, the data plane is represented by the VNI interface (NVE).
Now that you understand the basic building blocks of VXLAN, you’re ready to look at these VXLAN
deployment types:
VXLAN flood and learn
VXLAN with EVPN
In flood and learn mode, the source and destination frames are encapsulated with the VTEP source and
destination IP addresses. The source IP address is the IP address of the encapsulating VTEP, and the
destination IP address is either a multicast or unicast address of the remote VTEP that hosts the
destination. VXLAN uses VTEPs to map tenants’ end devices to VXLAN segments. The communication
between VTEPs depends on whether the end host address is known to the remote VTEP connected to the
source; this is maintained in the VXLAN forwarding table. If the destination is known, the process
leverages unicast data plane overlay communication between the VTEPs, as shown in Figure 4-7. This unicast data plane communication occurs when the VTEP receives an ARP reply from a connected system and therefore knows the destination MAC address; the local MAC-to-IP mapping is used to determine the IP address of the destination VTEP connected to the destination host. The original packet is encapsulated
in a unicast tunnel using User Datagram Protocol (UDP), with the source IP address of the local VTEP
and the destination IP address of the remote VTEP.
vlan 2901
vn-segment 2901
Note: In flood and learn mode, the spine only needs to support routing paths between the leafs. The RP configuration (PIM sparse mode with Anycast RP distribution) needs to be added to the spines.
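As a point of reference, a minimal NX-OS flood and learn NVE configuration on a leaf might look like the following sketch; VNI 2901 matches the vn-segment shown above, and the underlay group 239.1.1.1 matches the group seen in the later output:
feature nv overlay
feature vn-segment-vlan-based
!
interface nve1
  no shutdown
  source-interface loopback0
  ! BUM traffic for this VNI rides the 239.1.1.1 underlay group
  member vni 2901 mcast-group 239.1.1.1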
Example 4-1 shows the show commands you can use to verify that the NVE peers are using multicast.
Example 4-1 Leaf VTEP 1: show nve peers Command Output
VTEP-1# show nve peers
Interface Peer-IP State LearnType Uptime Router-Mac
--------- --------------- ----- --------- -------- -----------------
nve1 10.50.1.1 Up DP 00:05:09 n/a
VTEP-1# show ip mroute
IP Multicast Routing Table for VRF "default"
Using Border Gateway Protocol (BGP) and Ethernet virtual private network (EVPN) extensions with VXLAN reduces the problems related to learning via flooding, which is especially important in large-scale environments. VXLAN then learns the MAC addresses of all hosts in the VXLAN fabric without flooding.
BGP is a standardized protocol for Network Layer Reachability Information (NLRI) to exchange host-to-
host reachability and learning within the fabric. BGP NLRI provides complete visibility of all the MAC
and IP address combinations of the hosts behind the VTEP in order to complete VXLAN fabric
connectivity by providing the MAC/IP info to all VTEPs. The EVPN extensions that are part of BGP (or Multiprotocol BGP [MP-BGP]) add enough information within these NLRI to standardize the control plane for
communication. Through integrated routing and switching, BGP EVPN facilitates transport for both L2
and L3, using known workload addresses present within the VXLAN network. This is particularly useful
in multitenant environments as MAC address and IP information can be separated by BGP NLRI for
individual tenants. Figure 4-9 shows this VXLAN capability.
The EVPN address family allows the host MAC, IP, network, Virtual Route Forwarding (VRF), and
VTEP information to be carried over MBGP. In this way, within the fabric, the VTEP learns about hosts
connected using BGP EVPN. BGP EVPN does not eliminate the need for flooding of BUM traffic, but unknown unicast flooding is eliminated (except for silent hosts). However, broadcast and multicast data communication between the hosts still needs to be transported. This transport is implemented by the multicast underlay, which is similar to flood and learn, but the multicast traffic is optimized because all hosts’ IP/MAC addresses are learned via BGP EVPN; the only exception is silent hosts. In BGP EVPN
or flood and learn mode, the multicast deployments are either Any-Source Multicast (ASM) (using
Anycast) or bidirectional (with a phantom RP). With ASM, the multicast design uses the spine as the RP
for the underlay traffic traversal. Multicast implementation involves one group to manage a set of VNIs
(grouped under a category that is tied to multitenancy, user groups, and so on). This reduces the multicast
state table in the underlay.
You can also have each VNI tied to a separate multicast group; this is adequate for a small deployment,
but you should not consider it for larger deployments that have scale constraints due to the number of
multicast group entries mapped with each VNI. VXLAN supports up to 16 million logical L2 segments,
using the 24-bit VNID field in the header. With one-to-one mapping between VXLAN segments and IP
Multicast groups, an increase in the number of VXLAN segments causes a parallel increase in the
required multicast address space and some forwarding states on the core network devices. Packets
forwarded to the multicast group for one tenant are sent to the VTEPs of other tenants that are sharing
the same multicast group. This is an inefficient use of multicast data plane resources.
Therefore, the solution is a trade-off between control plane scalability and data plane efficiency.
To understand the configuration of VXLAN with BGP EVPN, let’s review the use case of BGP EVPN
with a multicast underlay configuration. The diagram in Figure 4-10 illustrates eBGP established between
spine and leaf, using Nexus 9000 devices.
Spine 1 Configuration
interface loopback0
ip address 10.10.1.1/32
ip pim sparse-mode
interface loopback1
ip address 10.100.1.1/32
ip pim sparse-mode
Step 4. Configure Anycast RP:
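A minimal NX-OS Anycast RP sketch for spine 1 follows, assuming loopback1 (10.100.1.1) is the shared Anycast RP address and 10.10.2.1 is the second spine's loopback0 address (an assumption for illustration):
ip pim rp-address 10.100.1.1 group-list 224.0.0.0/4
! Anycast RP set members: this spine and its peer spine
ip pim anycast-rp 10.100.1.1 10.10.1.1
ip pim anycast-rp 10.100.1.1 10.10.2.1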
disable-peer-as-check
send-community extended
route-map permitANY out
Step 7. Specify the BGP underlay configurations (with the direct IP address of the interface between
the spine and leaf):
neighbor 172.16.1.2 remote-as 20
address-family ipv4 unicast
allowas-in
disable-peer-as-check
<..>
Leaf Configuration
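As a point of reference, a minimal leaf NVE configuration for BGP EVPN with a multicast underlay might look like the following sketch, assuming VNI 2901, an L3 VNI of 900001, and underlay group 239.1.1.1:
interface nve1
  no shutdown
  source-interface loopback0
  host-reachability protocol bgp
  member vni 2901
    suppress-arp
    mcast-group 239.1.1.1
  member vni 900001 associate-vrf
Example 4-2, later in this section, shows the same NVE interface configured for ingress replication instead of a multicast underlay group.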
You can then view the NVE peers by using the show nve peers command:
9396-B# show nve peers
Interface Peer-IP Peer-State
--------- --------------- ----------
nve1 10.30.1.1 Up
To verify the status of the VNI, use the show nve vni command:
9396-B# show nve vni
Codes: CP - Control Plane DP - Data Plane
UC - Unconfigured SA - Suppress ARP
The control plane for the VNI is built and can be viewed by using the show ip mroute command at the VTEP (leaf); the highlighted flags show the nve control state:
VTEP-1# sh ip mroute
IP Multicast Routing Table for VRF "default"
(*, 232.0.0.0/8), uptime: 5d10h, pim ip
Incoming interface: Null, RPF nbr: 0.0.0.0, uptime: 5d10h
Outgoing interface list: (count: 0)
(*, 239.1.1.1/32), uptime: 00:05:18, nve ip pim
Incoming interface: Ethernet1/58, RPF nbr: 10.19.9.2, uptime: 00:05:18
Outgoing interface list: (count: 1)
nve1, uptime: 00:05:18, nve
Ingress Replication
In this case, the BUM traffic is not sent via the multicast underlay. Instead, data packets are replicated by
the ingress VTEP to other neighboring VTEPs that are part of the same VNI. The resources the ingress
VTEP needs to allocate for BUM traffic are tied to the number of VTEPs associated with the VNI in the
fabric. The ingress replication method can also be applied to BGP EVPN and VXLAN flood and learn, as
shown in Example 4-2 (using the topology from Figure 4-10).
Example 4-2 Leaf Configuration for nve1 for Ingress Replication
interface nve1
no shutdown
source-interface loopback0
host-reachability protocol bgp
member vni 2901
suppress-arp
ingress-replication protocol static
peer-ip 10.50.1.1
member vni 900001 associate-vrf
Example 4-3 shows the command to verify ingress replication.
Example 4-3 Verifying Ingress Replication
VTEP-1# show nve vni ingress-replication
Interface VNI Replication List Source Up Time
--------- -------- ----------------- ------- -------
nve1 2901 10.50.1.1 CLI 00:00:33
VTEP-1#
With ingress replication, the spine and leaf do not need any multicast configuration. Leaf IP address 10.50.1.1 (loopback 0, used for remote VTEP identification) is referenced in this configuration, and the example shows only two VTEPs participating in the fabric. The list of ingress replication peer addresses grows with the total number of leafs in the fabric and must be configured for all the VNI segments on the leaf.
Host-to-Host Multicast Communication in VXLAN
Enabling IGMP snooping helps deliver multicast traffic only to the hosts in the VNI that are interested in it. With standard IGMP snooping, however, the VTEP interface is added to the outgoing interface list for multicast traffic, so even if no receiver is connected to a VTEP, that VTEP receives the traffic and drops it. To optimize this behavior so that only VTEPs with multicast receivers in the L2 VNI receive the traffic, add the configuration shown in Example 4-4 (under the bridge domain).
Example 4-4 Configuration to Optimize IGMP Snooping for VXLAN
VTEP-2(config)# ip igmp snooping vxlan
VTEP-2 (config)# int vlan 01
VTEP-2 (config-if)# ip igmp snooping disable-nve-static-router-port
VTEP-2 (config)#
You can configure ip igmp snooping disable-nve-static-router-port globally or per VLAN to learn
snooping states dynamically, as shown in Figure 4-11.
Figure 4-11 L2 Communication Within the Boundary of the VNI
With this command, the VTEP interface is added to the Layer 2 outgoing interface list for the multicast group based on the presence of a receiver. In Figure 4-11, the multicast source is connected to VTEP2. The traffic does not get flooded to all the VTEPs where the VNI has an instance. Instead, communication is optimized between VTEP2 and VTEP1, and VTEP3 does not get the traffic because no receiver for multicast group 239.1.1.1 is connected to VTEP3.
The routed multicast capability over VXLAN fabric is achieved by extending the bridge domain over
VXLAN to the edge router. A per-tenant peering needs to be configured to establish a Protocol Independent Multicast (PIM) relationship with the external L3 environment. Unfortunately, this per-tenant peering is not efficient.
Figure 4-12 shows a centralized model of a multicast control plane for a VXLAN fabric. The VXLAN
fabric leverages an external router. This method is similar to using Layer 2 multicast in a given bridge
domain sent to an external router. All bridge domain instances are present in the router. The default
gateway for the VXLAN is still through the unicast distributed anycast gateways. The designated router
for PIM is at the external router, outside the VXLAN fabric. Incoming packets will need to pass the
Reverse Path Forwarding (RPF) checks on the external router. The external router knows all the sources
or receivers in the fabric and is the conduit to exchange unicast RIB information with the L3
environment outside the fabric. However, with this approach, you need a dedicated external router
infrastructure to support multicast. A future VXLAN solution with multicast may evolve with more
distributed approaches for Layer 3 deployment.
Each of the leaf and border leaf switches in the ACI topology is acting as a VTEP. Thus, many of the
same rules for multicast routing over VXLAN apply. The major differences in an ACI deployment are
related to the way the network handles network overlay elements. These elements include endpoint
groups (EPGs), bridge domains (BDs), policies, and Virtual Route Forwarding (VRF) instances. These
elements allow ACI to isolate tenants, networks, and applications within the underlying VXLAN fabric.
All these elements are configured with the APIC, like the one shown in Figure 4-13.
Regardless of the virtual network topologies overlaid on the fabric, ACI uses a specific, somewhat unique forwarding model for all multidestination traffic. All multidestination traffic in an ACI fabric is encapsulated in an IP Multicast packet. These packets then follow a forwarding tag (FTAG) tree, which is
built between leafs and fabric spines so that traffic is load balanced over all available bandwidth. The
idea is that if the leaf must send all traffic north, regardless of its east–west or north–south path, load
splitting traffic between the spines improves the overall efficiency of the fabric. FTAGs are built into the
MAC address of each packet that uses the tree. The spine switches manage the tree and then forward the
packets to any VTEPs in the tree that need the packet. Each FTAG has one FTAG tree associated with it.
Between any two switches, only one link forwards per FTAG. Because there are multiple FTAGs,
parallel links are used, with each FTAG choosing a different link for forwarding. A larger number of
FTAG trees in the fabric means better load balancing potential. The ACI fabric supports up to 12 FTAGs.
Figure 4-14 depicts the FTAG tunneling matrix for a two-spine, three-leaf fabric.
ACI supports IGMP snooping by enabling an IGMP router function in software. The IGMP router
function is enabled first within a bridge domain and is used to discover EPG ports that have attached
hosts that are multicast clients. ACI uses the port information obtained through IGMP snooping to reduce
bandwidth consumption in a multi-access bridge domain environment. If the leaf switch knows where the
clients are located, there is no need to flood multicast flows across the entire bridge domain. Instead, only
leaf ports with attached subscribers receive multicast flow packets for a given flow. IGMP snooping is
enabled on each bridge domain by default.
When IGMP snooping is enabled, the leaf switch snoops the IGMP membership reports and leave
messages as they enter the fabric from attached hosts. The leaf switch records the group subscriptions
and then forwards them to the IGMP router function—only if L3 processing is required. Figure 4-16
shows the IGMP router function and IGMP snooping functions both enabled on an ACI leaf switch.
ACI is an advanced data center networking platform that uses SDN. ACI, therefore, supports both Layer
2 and Layer 3 multicast configurations within the configurable virtual network overlays. PIM is
supported in these constructs; however, it is important to understand the following:
PIM-enabled interfaces: The border leaf switches run the full PIM protocol. This allows the ACI
fabric to peer with other PIM neighbors outside the fabric.
Passive mode PIM-enabled interfaces: The PIM-enabled interfaces do not peer with any PIM router
outside the fabric. This is configured on all non-border leaf interfaces.
The fabric interface: This interface is used within the fabric for multicast routing. This interface is a
software representation of a segment/node in ACI fabric for multicast routing. The interface is similar to
a tunnel interface with the destination being GIPo (Group IP Outer Address). VRF GIPo is allocated
implicitly based on the configuration of the APIC. There is one GIPo for the VRF and one GIPo for
every bridge domain under that VRF. Each interface is tied to a separate multitenant domain (VRF) in the
same node. Within the fabric, if the border leaf has an outgoing interface for multicast group 239.1.1.1,
the fabric interface is tied to the VRFs. This is accomplished by using a unique loopback address on each
border leaf on each VRF that enables multicast routing. Figure 4-17 gives a quick depiction of how the
logical fabric interfaces are viewed by the leaf switches.
Summary
There are numerous network architecture solutions in the data center, and each of them includes a unique
method to support multicast. A solution may be as simple as VPC, which uses a hashing method to
determine the path that messages use to traverse the network. Another solution is VXLAN, which uses an
underlay to encapsulate messages and propagate those messages across the fabric. Finally, ACI also takes
advantage of VXLAN for extensibility but also adds functionality to specifically control traffic flow
based on the application.
Applications within the data center (server to server) and those used to support the customers (client to
server) may have unique multicast requirements. It is best practice to delve deeply into those multicast
technologies and the applications that are most critical to the success of your organization.
Chapter 5
Multicast Design Solutions
The previous chapters of this book discuss some rather advanced types of multicast networks. Typically, as the
requirements of a network grow in complexity, so do the number and type of protocols needed to meet those
requirements. The chapters so far in this book and IP Multicast, Volume 1 have provided the tools needed to meet
most of these requirements. However, there are some additional elements to consider when designing complex
multicast networks.
Items to cogitate include desired replication points, multitenancy and virtualization properties, and scalability of the
multicast overlay in relation to the unicast network. In addition, every network architect must be concerned with the
redundancy, reliability, and resiliency of a network topology. Multicast network design must consider these elements
as well. The best way to discuss these elements of multicast design is through the examination of specific network
archetypes that employ good design principles.
This chapter examines several archetypical network design models. These models might represent a specific network
strategy that meets a specific commercial purpose, such as a trade floor. The examined model may also be a general
design for a specific industry, such as the deployment of multicast in a hospital environment. The intent is to provide
a baseline for each type of design, while providing examples of best practices for multicast deployments.
This chapter looks at the following design models:
Multicast-enabled hospital networks, with an emphasis on multicast in wireless access networks
Multicast multitenancy data centers
Software-defined networks with multicast
Multicast applications in utility networks
Multicast-enabled market applications and trade floors
Multicast service provider networks
Note The information provided in this chapter is not intended to be a comprehensive list of everything an architect
needs to know about a particular design type, nor is it meant to be an exercise in configuration of these elements.
Rather, this chapter provides a baseline of best practices and principles for specific types of networks. Before
completing a network implementation, consult the most current design documents for multicast networks at
www.cisco.com.
Some newcomers to networking may be unacquainted with the access/distribution/core design model. Modern
network devices and protocols allow architects to collapse much of the three-tier hierarchy into segments or, in many
cases, single devices. For most campuses, even if the physical segmentation between the layers does not exist, there
is at minimum a logical segmentation between the access and distribution/core layers of the design. For example,
there is a routing and switching layer at the collapsed distribution/core of the network. Additional resources connect
to this collapsed core, including data center or WAN connections. Figure 5-2 shows a basic campus network with
collapsed distribution and core layers.
Note: The ramifications of this hierarchy discussion and the effects on campus design are beyond the scope of this
text but provide context for discussing design models as the parlance has not changed. For more information on this
topic and full campus design, see the latest network design guides at www.cisco.com.
Because network access can be obtained by a myriad of device types, the access layer design usually includes a very
robust wireless and wired infrastructure. Most isolation and segmentation in the access layer is provided by virtual
local area network (VLAN) segmentation on Layer 2 switches. Layer 3 distribution routers provide inter-VLAN
routing, virtual private network (VPN) services, and transit to centralized resources. Campus core switches and
routers move traffic across all segments of the network quickly and efficiently.
In addition, many elements of hospital network design are completely unique. A typical campus network provides
employees access to resources such as Internet services, centralized employee services (print, file, voice, and so on),
bring-your-own-device (BYOD) services, and some minor guest services. A hospital network needs to deliver these
to employees as well. However, the main purpose of a hospital is to provide services to patients and guests.
Therefore, a hospital network needs to accommodate devices that service patients. Devices in a hospital network
include badge systems, medical monitors, interpersonal communications, media services, and biomedical equipment.
Data generated by these devices needs to be stored locally and remotely, and other data center type services are
required as well. In addition, the hospital is likely to have extensive guest services, mostly focused on providing
wireless guest Internet access.
Keeping all these devices secure and properly securing and segmenting data paths is critical. Government regulations,
such as the well-known Health Insurance Portability and Accountability Act (HIPAA) in the United States, require
that these security measures be comprehensive and auditable. This means hospital devices must be segregated from
both users and personnel networks.
All these unique elements make network design very complex. For this reason, there is a great deal of variability in
hospital network deployments. Regardless of this variability, almost every hospital network offers some level of
multicast service. Most medical devices and intercommunications media can be configured with IP Multicast to make
service delivery and data collection more efficient. Providing medical device and intercommunications services is the
primary focus of this design discussion.
If you have ever been to a hospital, it is likely that you have noticed that many medical devices are used by doctors
and staff to treat and monitor patients. What you may not have realized is that many of these devices are connected
to the hospital campus network. These devices can include anything from a smart pump, to a heart monitor, to
imaging equipment such as x-ray machines.
The benefits of connecting these devices to the network are enormous. The most important benefit is that data
generated by these devices can be accessed remotely and, in many cases, in real time. In addition, network-
connected biomedical devices—as well as non-medical devices such as badges—can be moved to any room on the
hospital campus. When connected to a wireless network, these devices can be small and follow staff and patients
about the hospital as required.
Note: Badge systems in hospitals are critical infrastructure. Because of the intense security requirements of human
care and regulations such as HIPAA, hospitals need to know where patients are, where staff are, and where each
person should be. Badge systems also control access to medical devices, medication, and data. This not only secures
the devices, medication, and data but protects patients from their own curiosity.
Generally, the way in which these devices communicate is largely controlled by the device manufacturer. However,
most modern hospital devices can at a minimum communicate using Layer 2 Ethernet, and many communicate using
Internet Protocol (IP) at Layer 3. Most medical devices of a particular kind need to connect to a centralized
controller or database that also needs access to the same network resources.
Manufacturers use three prototypical communication models to accomplish this:
Layer 2 only
Layer 2 and Layer 3 hybrid
Layer 3 only
The first type of model uses Layer 2 connections only. In this model, it may be a requirement that all devices be on
the same Layer 2 domain (segment). In an Ethernet network, this means they must be on the same VLAN.
Centralized device controllers and data collectors may also need to be on the same segment. Some manufacturers
require that the same wireless network, identified by the same Service Set Identification (SSID), be used for the
monitors and controllers alike. Figure 5-3 depicts this type of model, with all the networked devices and their
centralized resources connected to the same SSID (OneHospital) and VLAN (VLAN 10).
Note: In Figure 5-3, the campus distribution/core switch is more than a single switch. In a modern network, the
collapsed core is typically made of at least one pair of switches, acting as a singular switch. For example, this might
be Cisco’s Virtual Switching System (VSS) running on a pair of Cisco Catalyst 6800 switches.
Even though there are three distinct models, you cannot assume that all devices fit neatly within one model—or even
within the examples as they are shown here. They are, after all, only models. It is also likely that a hospital campus
network has to accommodate all three models simultaneously to connect all medical devices.
Medical devices that make full use of IP networking and multicast communications are very similar to other network
devices. They need to obtain an IP address, usually via Dynamic Host Configuration Protocol (DHCP), and have a
default gateway to communicate outside their local VLAN. One major difference, as shown in the models in this
section, is that medical devices typically connect to some type of controller or station. This is where multicast
becomes very practical: allowing these devices to communicate with the central station in a very efficient manner, as
there are many devices that connect to a single controller. In these scenarios, either the central station or the device
(or both) may be the multicast source. This is a typical many-to-many multicast setup. Figure 5-6 shows this
communication in action for the Layer 3 model shown in Figure 5-5.
environment where severe segmentation at both Layer 2 and Layer 3 is required. When multiple RPs need to
coexist, a standard PIM Sparse-Mode (PIM-SM) implementation may be a better choice. This is essentially a
multidomain configuration. If interdomain multicast forwarding is required, PIM-SM with Multicast Source
Discovery Protocol (MSDP), and perhaps Border Gate Protocol (BGP), is ideal.
PIM Source-Specific Multicast (SSM) provides the benefits of both Bidir-PIM and PIM-SM without the need for a
complex interdomain or multidomain design. SSM does not require an RP for pathing, nor does it require additional
protocols for interdomain communications. The biggest drawback to using SSM is that many medical devices,
especially those that are antiquated, may not support IGMPv3. Recall from IP Multicast, Volume 1 that IGMPv3
subscribes a host to a specific source/group combination and is required for the last-hop router (LHR) to join the
appropriate source-based tree. In addition, SSM may not be able to scale with a very large many-to-many design.
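As a point of reference, enabling SSM on an IOS last-hop router is a small configuration change; the following sketch assumes the default 232.0.0.0/8 SSM range and a hypothetical device VLAN interface:
ip multicast-routing
ip pim ssm default
!
interface Vlan20
 description Example medical device VLAN
 ip pim sparse-mode
 ! IGMPv3 is required so hosts can signal (S,G) memberships
 ip igmp version 3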
Selecting the best PIM implementation for a hospital campus is an exercise in comparing the advantages and
disadvantages of each with the capabilities of both the network devices and the devices connecting to the campus
network. It is very likely that no single implementation will perfectly address all the needs of the network. Table 5-1
compares some of the advantages and disadvantages of the different PIM implementations in a hospital network.
Table 5-1 Comparison for PIM Implementation Selection for Hospital Networks
This table is useful for any campus design but considers the specific needs of a hospital campus. It is very likely that
the deployment of more than one implementation is the best choice. For example, localized medical device multicast
that does not leave the local VLANs can use a simple Bidir-PIM implementation with a centralized Phantom RP.
VLANs with devices that need fully routed IP Multicast with possible interdomain communication can use a single
Anycast RP solution, with PIM-SM providing the tree building. Core network services may use an SSM
implementation to improve the reliability of the infrastructure by eliminating the need for RPs. All three PIM
implementations can coexist within the same campus if properly configured as separate, overlapping, or intersecting
domains. PIM-SM and PIM-SSM can be configured on the same router interfaces, whereas PIM-SM and Bidir-PIM
are mutually exclusive.
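As a point of reference for the Bidir-PIM option mentioned earlier, the following is a minimal IOS sketch of a centralized phantom RP; all addressing is hypothetical:
ip multicast-routing
ip pim bidir-enable
!
interface Loopback10
 ! This router owns .1; the RP address (.2) is a phantom owned by no device
 ip address 10.255.1.1 255.255.255.252
 ip pim sparse-mode
 ip ospf network point-to-point
!
ip pim rp-address 10.255.1.2 bidir
A second candidate advertises the same loopback subnet with a shorter mask (for example, /29), so the longer /30 prefix is preferred until it is withdrawn.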
This brings up an important point: When purchasing network equipment for a medical campus, make sure it
optimizes support for these implementations. Whenever possible, select routers, switches, APs, WLCs, and other
infrastructure that support the following:
All versions of IGMP (versions 1, 2, and 3)
The appropriate scale of IGMP joins and IGMP snooping for the number of expected devices (This is the most
basic requirement at the edge of the network.)
The appropriate PIM implementation(s) (Bidir-PIM, PIM-SM, and PIM-SSM)
Optimal packet replication techniques to improve throughput and resource consumption
Simplified multicast configuration and troubleshooting
Multicast path optimization features, such as multipath multicast and spanning-tree optimization, or elimination, for
all traffic flows
Ability to secure multicast devices and domains
Ability to provide quality of service (QoS) to multicast flows, including rate-limiting and/or storm control to limit
overconsumption of resources by malfunctioning devices
For wired switches, multicast is rudimentary, a basic component of Layers 2 and 3. A wired switch uses IGMP
snooping, and each VLAN is configured with at least one routed interface that supports IGMP and multicast
forwarding. This ensures that the devices connected to the switch port have access to the larger campus multicast
overlay. Wireless Ethernet functions in a similar manner but introduces additional concerns. Ensuring that the
wireless network efficiently and effectively transports all traffic is perhaps the most important consideration in
designing multicast in a medical campus.
The first point to understand when considering multicast data transmission in wireless networking is that wireless
networks use IP Multicast in the wireless management plane. A campus network is likely to have many APs. To
ensure that these APs function in sync with one another, one or more WLCs are used. A WLC acts very much like
the medical device central station discussed in the previous section. It connects to and oversees the configuration of
the APs in the network. To enable multicast within the management plane, the switched management network must
be properly configured for a multicast overlay. This may very well be an MVPN using multicast Virtual Route
Forwarding (VRF) instances, as discussed in Chapter 3, “Multicast MPLS VPNs.”
More important is the flow of multicast traffic to and from wireless LAN (WLAN) connected devices. This is
especially relevant when sources and receivers are connected via either the wired or wireless infrastructure and they
must be able to complete a forwarding tree, regardless of the connection type or location. Cisco WLANs must work
seamlessly with the physical wired LAN. Another way of saying this is that a campus WLAN must be a functional
extension of the VLAN, making the two functionally equivalent.
In an archetypical campus, each SSID is equivalent to a VLAN and is essentially configured as such. The VLAN may
only be delivered wirelessly, or it could be accessible through both the wireless and the wired infrastructure. Figure
5-4 (shown earlier) illustrates just such a network, where patient heart monitors and the central station for these
monitors are all part of the same VLAN, VLAN 11. SSID MonitorTypeTwo extends the VLAN to the individual
wireless client monitors.
This design works well in environments where multicast is rarely forwarded past the local VLAN and where
individual monitor functions need to be on the same Layer 2 domain. However, many modern hospitals use
segmentation that is based on the patient’s location rather than the type of device deployed. Figure 5-7 shows a
design that uses a room grouping (perhaps based on the hospital floor, or even the type of room) to divide VLANs
and SSIDs. In this design, the central monitor stations are part of the same VLAN, terminated on the access layer
switches of the hospital.
Figure 5-7 SSIDs Deployed by Room Groups with Local Control Stations
It is also possible that a routed network separates the monitoring station servers, which are then accessed by hospital
staff through an application. These stations are generally centralized in a local data center to prevent WAN failures
from causing service interruption. Figure 5-8 depicts the same room grouping design, with centralized control
stations. Wireless APs provide access to the monitors, while a fully routed Layer 3 network extends between the
monitor devices and the data center, crossing the collapsed campus core.
Figure 5-8 SSIDs Deployed by Room Groups with Centralized Central Stations
Regardless of how the SSIDs are deployed, how does the AP know what to do with the multicast packets sourced
from either the wireless clients or other sources on the network? This question is especially relevant in a Layer 2 and
3 hybrid design or in a fully routed Layer 3 device communications model, like the one shown in Figure 5-8. The APs
need to be able to replicate and forward packets upstream to where Layer 3 is terminated.
Wireless networks are, of course, more than just a collection of APs, SSIDs, and VLANs. Campus WLANs also use
WLCs to manage the APs and SSIDs of the network. WLCs in a hospital campus almost always function as local
management, meaning that APs are not only configured by the WLC but also tunnel traffic to the WLC for advanced
switching capabilities, such as multicast packet replication. The tunneling mechanism used in this type of campus is
the IETF standard Control and Provisioning of Wireless Access Points (CAPWAP), as defined by IETF RFC 5415.
Figure 5-9 shows this relationship between the APs and the WLC(s) in a campus with a fully routed Layer 3 device
communications model.
Step 1. The AP receives a multicast packet from a wireless client—in this case, for group 239.1.1.110. The AP
encapsulates the original multicast packet inside the CAPWAP unicast tunnel and forwards it to the WLC. This is the
same process for any packet that needs additional routing in a wireless CAPWAP network.
Step 2. The WLC receives the packet, strips the unicast CAPWAP header, and forwards the original multicast
packet upstream, toward the routed network. The standard routed multicast overlay ensures that the packets are
forwarded toward any wired subscribed clients, regardless of their location (assuming that a complete multicast tree
can be built). The WLAN controller also replicates the original multicast packet, encapsulates it inside a single
CAPWAP multicast packet with the appropriate bitmap ID, and sends it toward the routed network. From there, the
process is the same as when the packet comes from a wired client, eventually being forwarded to the appropriate
APs and then to the SSID. This means the originating SSID, including the originating client, gets a copy of the same
packet. Wireless multicast clients are designed to account for this phenomenon and should not process the replicated
packet.
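As a point of reference, enabling multicast-multicast mode on an AireOS WLC is a short task; the following sketch uses a hypothetical CAPWAP multicast group of 239.100.100.100:
(Cisco Controller) > config network multicast global enable
(Cisco Controller) > config network multicast mode multicast 239.100.100.100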
One of the primary uses of a multicast design in a hospital is to enable a hospital badging system. Clinical badge
systems are not like the standard swipe card badge systems that many users are familiar with. Badge systems for
hospitals, such as those provided by system manufacturer Vocera, are more forward thinking. They use IP integration
and Voice over IP (VoIP) to establish communications between badge users and other hospital systems and
applications, much like the futuristic badges used on your favorite sci-fi TV series.
Clinical badge systems allow hospital staff to communicate quickly and efficiently across specific teams, hospital
wards, or other organizations. Many have unique paging features and are also used to track the locations of wearers.
This location tracking is integrated into hospital security systems for quick access to specific areas of the hospital.
Other medical badge systems can be used to track location for system lockdowns, like those used to monitor infant
location and prevent infants from leaving the wards to which they are assigned. If you have ever been to a hospital,
you may have seen badge systems like these.
Because badges are worn by hospital staff, it is critical that wireless infrastructure be fast, resilient, and reliable.
Medical devices also need similar network resiliency. That is the primary reason that WLCs are installed locally at
the distribution or access layer of the network and use CAPWAP tunnels instead of remotely deployed WLCs. As
mentioned earlier, these badge systems rely on multicast as well. This means the multicast infrastructure should also
be fast, resilient, and reliable. Achieving this means that designing a robust Layer 2 and Layer 3 multicast
implementation is critical; the implementation must follow the best practices described in IP Multicast, Volume 1.
In addition, for PIM-SM networks, the placement of the RP is a critical consideration. If the RP is located on a
remote WAN separated segment, or if there are too many failure points between the WLC and the RP, there will
likely be outages for multicast clients and sources. To reduce the number of failures, place the WLCs and RPs close
to each other and near the multicast network, with few failure points between.
Two typical multicast domain models are used to deploy RPs:
Domain model 1: The first model is used for isolated domains that do not extend beyond specific geographic
locations. In this model, RPs are placed locally in each domain. Often the nearest Layer 3 switch that is acting as the
first-hop router (FHR) and last-hop router (LHR) is also configured as the RP. This limits the failure scope of the
domain to that specific Layer 2 or 3 domain.
Domain model 2: The second model is used when multidomain or interdomain communications are required. In
this model, multiple redundant RPs may be used, usually placed by domain. This is similar to the first model but with
the addition of hospitalwide resources.
Figure 5-12 shows a comparison of the domain setup for the two models. Domain model 1 uses one controller and
the distribution/core switch pair as the RP for each group of rooms. The Layer 3 FHR and RP are configured on the
appropriate VLAN interfaces. Domain model 2 expands on Domain model 1 by adding a hospitalwide domain with
Anycast RPs and a stack of hospitalwide WLCs in the local hospital data center.
Note: Remember that the example using a RoomGroup for VLAN and WLAN assignment is simply one option for
device grouping. The domains in each model can be organized in any logical way, typically based on the
requirements of the application. The grouping used in this chapter simply makes the design elements easier to
consume. In addition, a larger Layer 3 PIM-SM model can link domains together by using MSDP and/or BGP to
provide the ability to forward between multicast domains. In addition, more intense segmentation may exist within
the hospital, and multicast VRF instances may be a requirement that must also be met in the design.
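To make domain model 2 more concrete, the following is a minimal IOS-style sketch of an Anycast RP pair for the hospitalwide domain; the interface names, loopback addresses, and peer addresses are hypothetical and would be adapted to the hospital's actual addressing plan.
! Configuration on the first hospitalwide RP (the second RP mirrors this,
! with a unique Loopback0 address but the same shared Loopback1 address)
interface Loopback0
 description Unique router ID used for MSDP peering
 ip address 10.10.0.1 255.255.255.255
 ip pim sparse-mode
!
interface Loopback1
 description Shared Anycast RP address (identical on both RPs)
 ip address 10.10.0.100 255.255.255.255
 ip pim sparse-mode
!
ip pim rp-address 10.10.0.100
!
! MSDP peering between the two Anycast RPs so that active-source
! state is exchanged and either RP can serve any domain
ip msdp peer 10.10.0.2 connect-source Loopback0
ip msdp originator-id Loopback0
Because both RPs advertise the same Anycast address, a failure of one RP leaves the other able to serve every domain without reconfiguration.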
No matter what domain model is used, domains must be secured and segmented properly. Wireless and multicast
architects should be well versed in the operation of these networks and should design clinical multicast with
resiliency and reliability as key foundational elements. They need to ensure that clinical staff do not experience
interruptions to service, which produces a superior patient experience.
Cisco ACI enhances the segmentation capabilities of multitenancy. ACI, as you learned in Chapter 4, is an SDN data
center technology that allows administrators to deconstruct the segmentation and policy elements of a typical
multitenant implementation and deploy them as an overlay to an automated Layer 2/Layer 3 underlay based on
VXLAN. All principal policy elements in ACI, including VRF instances, bridge domains (BDs), endpoint groups
(EPGs), and subnets, are containerized within a hierarchical policy container called a tenant. Consequently,
multitenancy is an essential foundation for ACI networks. Figure 5-15 shows the relationship between these overlay
policy elements.
denied by default. EPGs can then align to private and public zones and can be designated as front-end or back-end
zones by the policies and contracts applied to the application profile in which the EPG resides.
When using non-routed L3 multicast within a zone, the BD that is related to the EPG natively floods multicasts
across the BD, much as a switched VLAN behaves. Remember from Chapter 4 that the default VRF is enabled with
IGMP snooping by default, and all BDs flood the multicast packets on the appropriate ports with members. Link-local multicast (the 224.0.0.x group range) is flooded to all ports in the BD by using the BD GIPo (Group IP outer) address, along one of the FTAG (forwarding tag) trees.
ACI segmentation and policy elements can be configured to support multicast routing, as covered in Chapter 4. For
east–west traffic, border leafs act as either the FHR or LHR when PIM is configured on the L3 Out or VRF instance.
Border leafs and spines cannot currently act as RPs. Thus, the real question to answer for ACI multitenancy is the
same as for VMDC architectures: What is the scope of the domain? The answer to this question drives the placement
of the RP.
Because the segmentation in ACI is cleaner than with standard segmentation techniques, architects can use a single
RP model for an entire tenant. Any domains are secured by switching VRF instances on or off from multicast or by
applying specific multicast contracts at the EPG level. Firewall transport of L3 multicast may not be a concern in
these environments. A data center–wide RP domain for shared services can also be used with ACI, as long as the RP
is outside the fabric as well. Segmenting between a shared public zone and a tenant public or private zone using
different domains and RPs is achieved with VRF instances on the outside L3 router that correspond to internal ACI
VRF instances. EPG contracts further secure each domain and prevent message leakage, especially if EPG contracts
encompass integrated L3 firewall policies.
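As a rough illustration of that last point, the following IOS-style sketch shows per-VRF RP definitions on a hypothetical outside L3 router; the VRF names and RP addresses are examples only and must match the ACI VRF instances and L3 Out design actually in use.
! Hypothetical external router: separate RP per tenant-facing VRF
ip multicast-routing vrf TENANT-A
ip multicast-routing vrf SHARED-SERVICES
!
ip pim vrf TENANT-A rp-address 10.255.1.1
ip pim vrf SHARED-SERVICES rp-address 10.255.0.1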
ETRs publish EID-to-RLOC mappings for their site to the map server and respond to map-request messages.
ETRs and ITRs de-encapsulate and deliver LISP-encapsulated packets to a local end host within a site.
During operation, an ETR sends periodic map-register messages to all its configured map servers. These messages
contain all the EID-to-RLOC entries for the EID-numbered networks that are connected to the ETR’s site.
An ITR is responsible for finding EID-to-RLOC mappings for all traffic destined for LISP-capable sites. When a site
sends a packet, the ITR receives a packet destined for an EID (IPv4 address in this case); it looks for the EID-to-
RLOC mapping in its cache. If the ITR finds a match, it encapsulates the packet inside a LISP header, with one of its
RLOCs as the IP source address and one of the RLOCs from the mapping cache entry as the IP destination. If the
ITR does not have a match, then the LISP request is sent to the mapping resolver and server.
Both map resolvers (MR) and map servers (MS) connect to the LISP topology. The function of the LISP MR is to
accept encapsulated map-request messages from ITRs and then de-encapsulate those messages. In many cases, the
MR and MS are on the same device. Once the message is de-encapsulated, the MS matches the request against its mapping database and forwards it to the ETR whose RLOC is authoritative for the requested EIDs.
An MS maintains the aggregated distributed LISP mapping database; it accepts registrations from each ETR, which provide the EID-to-RLOC mappings for the EID prefixes connected to that site. The MS/MR functionality is similar to that of a DNS server.
LISP PETRs/PITRs
A LISP proxy egress tunnel router (PETR) implements ETR functions for non-LISP sites. It sends traffic to non-
LISP sites (which are not part of the LISP domain). A PETR is called a border node in SDA terminology.
A LISP proxy ingress tunnel router (PITR) implements mapping database lookups used for ITR and LISP
encapsulation functions on behalf of non-LISP-capable sites that need to communicate with the LISP-capable site.
Now that you know the basic components of a LISP domain, let's review the simple unicast packet flow in a LISP domain, which will help you visualize the function of each node described. Figure 5-17 illustrates this flow.
The LISP multicast feature introduces support for carrying multicast traffic over a LISP overlay. This support
currently allows for unicast transport of multicast traffic with headend replication at the root ITR site.
The implementation of LISP multicast includes the following:
At the time of this writing, LISP multicast supports only IPv4 EIDs or IPv4 RLOCs.
LISP supports only PIM-SM and PIM-SSM at this time.
LISP multicast does not support group-to-RP mapping distribution mechanisms, Auto-RP, or Bootstrap Router
(BSR). Only Static-RP configuration is supported at the time of this writing.
LISP multicast does not support LISP VM mobility deployment at the time of this writing.
In an SDA environment, it is recommended to deploy the RP outside the fabric.
Example 5-1 is a LISP configuration example.
Example 5-1 LISP Configuration Example
xTR Config
ip multicast-routing
!
interface LISP0
ip pim sparse-mode
!
interface e1/0
ip address 10.1.0.2 255.255.255.0
ip pim sparse-mode
!
router lisp
database-mapping 192.168.1.0/24 10.2.0.1 priority 1 weight 100
ipv4 itr map-resolver 10.140.0.14
ipv4 itr
ipv4 etr map-server 10.140.0.14 key password123
ipv4 etr
exit
!
!
Routing protocol config <..>
!
ip pim rp-address 10.1.1.1
MR/MS Config
ip multicast-routing
!
interface e3/0
ip address 10.140.0.14 255.255.255.0
ip pim sparse-mode
!
!
router lisp
site Site-A
authentication-key password123
eid-prefix 192.168.0.0/24
exit
!
site Site-B
authentication-key password123
eid-prefix 192.168.1.0/24
exit
!
ipv4 map-server
ipv4 map-resolver
exit
!
Routing protocol config <..>
!
ip pim rp-address 10.1.1.1
In this configuration, the MS/MR is in the data plane path for multicast sources and receivers.
For SDA, the multicast configuration based on the latest version of DNA Center is automated from the GUI. You do not need to configure interfaces or PIM modes using the CLI. The configuration shown here is provided simply to give you an understanding of what happens under the hood when multicast is configured using the LISP control plane. For the data plane, you should consider the VXLAN details covered in Chapter 3; with DNA Center, no CLI configuration is needed for the SDA access data plane.
control centers have very different requirements for security and connections to real-time systems. The use of
firewalls at different tiers for regulatory compliance and protection of grid assets creates a requirement for multicast
transport across the firewalls.
The data center network, or corporate office, follows traditional enterprise environments and design components. To design multicast in a utility environment, you need to understand a couple of key applications that are unique to utility environments, as discussed in the following sections.
PMU
Phasor measurement units (PMUs) allow for granular collection of important operational data to provide high-quality
observation and control of the power system as it responds to supply and demand fluctuations. PMU data is useful
for early detection of disturbances and needs to be collected at a significantly higher frequency (typically 120 to 200 times per second); it requires a high degree of performance in collection, aggregation, dissemination, and management. This in turn requires higher bandwidth and data transmission with minimal latency. The bandwidth ranges between 1 Mb/s and 2 Mb/s of continuous streaming, and the latency requirement can be anywhere between 60 ms and 160 ms.
Most networks in the utility space are not designed or positioned to deal with the explosion of data that PMUs
generate. PMU data is a multicast feed that is sourced from sensors and sent to the control grid. The design for PIM-
SM or PIM-SSM depends on the number of multicast state entries and the type of IGMP driver supported in the host.
The critical design factor to consider is the firewall’s support for multicast; most of the sensor data is within the
substation domain, which is considered a North American Electric Reliability Corporation (NERC) asset. The control
grid is also protected by a firewall infrastructure. For PMU application, multicast transport across the firewall
infrastructure is a key design factor.
In the utility environment, the Radio over IP design is a key building block required for communication. The use of
multicast needs to be understood based on the design of Radio over IP systems. The communication details are as
follows:
Individual calls between two radios from different sites (Unicast could be a possible communication mechanism.)
Group calls from radios from different sites (Multicast could be a possible communication mechanism.)
Dispatcher calls between different sites
Radio to telephone calls, where the radio and the private automatic branch exchange (PABX) are at different sites
Normally, Radio over IP design is considered any-to-any multicast. You are correct if you are thinking about
bidirectional multicast design.
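As a minimal sketch of that idea (the addresses and group range are hypothetical), a bidirectional RP scoped to the radio group range might look like the following on an IOS router:
ip multicast-routing
ip pim bidir-enable
!
ip access-list standard RADIO-GROUPS
 permit 239.195.0.0 0.0.255.255
!
! RP for the any-to-any radio groups, operating in bidirectional mode
ip pim rp-address 10.0.0.10 RADIO-GROUPS bidir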
In addition to the energy grid applications, the utility infrastructure hosts corporate applications such as music on hold (MoH), wireless multicast, multicast applications for imaging, data center requirements for multicast, and so on. The utility multicast design must accommodate different types of multicast traffic in separate domains. These domains
can be virtual and separated by a control plane (MPLS Layer 3 VPNs using MVPNs or other methods) or can be
logical, existing in the same control plane and overlapping. This is generally applicable to the utility WAN, control
grid, and data center design blocks. It is important to configure multicast applications with a proper multicast
addressing plan that can be rolled out enterprisewide.
Multicast-Enabled Markets
The use of multicast in financial environments is very prevalent, and it is extremely important for business applications in the trading environment. This section introduces multicast design and the fundamentals of financial networks. The elements of the financial network are stock exchanges, financial service providers, and brokerages (see Figure 5-19). These domains are not only separate multicast networks but also separate entities; for example, NASDAQ is a stock exchange, Bloomberg is a financial service provider, and JPMorgan is a brokerage.
Financial service provider (FSP): The FSP is the provider network that extends over a regional or global area. Whether the design is regional or global is tied to the FSP's business model and customers. The network design is similar to that of a traditional enterprise WAN.
An additional element of the FSP is the service edge network, which includes access points of presence (POPs) to
support necessary service policies for positioning services for end brokerage customers. Some FSPs also offer a
service that involves normalization of market data feeds to reduce errors (if any) in the transmission by identifying
any missing parameters in the ordering process. Such a service may be provided by the brokerages themselves or may be tied to a service from the FSP. FSPs also provide a business-to-business market data feed service for brokerages that deal with high volume. This necessitates a network transport system built for differentiated services.
Brokerage network: This network is divided into two components:
The back-office network: This network includes trading and security, trading record keeping, trade confirmation, trade settlement, regulatory compliance, and so on. This network is highly secured and always protected by firewalls.
The front-end brokerage network: This is where the feeds (A and B) are collapsed by an application (such as
TIBCO) in a messaging bus architecture for arbitration. These streams are distributed to the traders.
This section provides an overview of market data multicast design considerations. For more information, see IP
Multicast, Volume 1 and Chapter 1, “Interdomain Routing and Internet Multicast,” Chapter 2, “Multicast Traffic
Engineering, Scalability, and Reliability,” and Chapter 3, “Multicast MPLS VPNs,” of this book. Figure 5-20
illustrates multicast domains from the exchange, FSP, and brokerage environment.
Figure 5-21 provides an overview of the FSP network and connectivity. This is the network on which the multicast
domain is overlaid.
The design requirements for an FSP are similar to those of a regular service provider. The key difference is that an
FSP may require higher reliability and strict latency. The market data streams are typically forwarded statically into
the FSP, as noted earlier in this chapter. Path separation for the multicast streams is created by traffic engineering
methods, or the FSP can have a dual-core architecture to create complete isolation and separation of the control and
data planes. For segmentation of traffic, MPLS VPN service is commonly utilized. Multicast transport schemes
applicable to WAN, traffic engineering, and MPLS VPN are applied in the design.
The multicast design of the front-office brokers transports the multicast feed to the application message bus. Each
feed terminates at a different data pod infrastructure; this pod consists of a message bus for handling the feeds. The
design is simple, but multiple geographic locations or sites for the pod distribution require multicast to be enabled
over a WAN or metro network. Generally, Bidir-PIM is best in this scenario. Figure 5-22 illustrates the multicast data feed to a brokerage network and shows the multicast brokerage design.
The back-end office that hosts the traders has separate RPs for the new multicast streams available from the different
application handlers and should use Bidir-PIM. If Bidir-PIM is not a viable option due to network platform restrictions, PIM-SM with the shortest-path tree (SPT) threshold set to infinity should be considered. Multiple RPs can be deployed,
each tied to a multicast group range based on the application use case.
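A minimal IOS-style sketch of that fallback, assuming hypothetical group ranges and RP addresses, could look like the following:
! Keep receivers on the shared tree when Bidir-PIM is not available
ip pim spt-threshold infinity
!
! One RP per application group range
ip access-list standard FEED-A-GROUPS
 permit 239.100.1.0 0.0.0.255
ip access-list standard FEED-B-GROUPS
 permit 239.100.2.0 0.0.0.255
!
ip pim rp-address 10.1.255.1 FEED-A-GROUPS
ip pim rp-address 10.1.255.2 FEED-B-GROUPS
Scoping each RP to a group range keeps a failure in one application feed's RP from affecting the others.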
The most important decisions that service providers have to make are similar to the decisions required for multitenant
data centers: What mode of PIM needs to be selected, and where should RPs be placed? Chapter 3 discusses the
numerous options that service providers use to transport multicast messages using MPLS, since using MPLS is
currently the preferred method of moving both unicast and multicast messages across a shared service cloud.
Remember that to deploy an MPLS service that supports multicast, the provider network needs a multicast-enabled
core network that supports multicast distribution trees (MDTs).
The provider multicast domain that services the MDTs is completely isolated from any customer domains. This
means the provider network needs to choose a PIM method (ASM or SSM) and, if necessary, configure RPs for the
network. Service provider network architects should consider using SSM as the infrastructure delivery method. It
greatly simplifies the multicast deployment at scale. SSM does not require an RP and has a separate and specific
multicast range, 232.0.0.0/8, that is easy to scope and manage. Examine the network diagram in Figure 5-23. This
network provides an MPLS MVPN service to the customer between Site 1 and Site 2. The customer is using PIM-SM
and is providing its own RP on the customer premises equipment router at Site 1. The provider network is using SSM
to transport the MDT for the customer VPN with route distinguisher (RD) 100:2.
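A minimal, classic Rosen-style IOS sketch of the provider-side pieces just described might look like the following on a PE router; the VRF name, MDT groups, and data threshold are hypothetical, and the route-target and interface PIM configuration are omitted for brevity:
ip vrf CUSTOMER
 rd 100:2
 mdt default 232.1.1.1
 mdt data 232.1.2.0 0.0.0.255 threshold 10
!
ip multicast-routing
ip multicast-routing vrf CUSTOMER
!
! SSM in the provider core means no RP is needed for the MDT groups
ip pim ssm default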
Figure 5-24 PIM-SM Provider Domain
When a provider grows beyond the ability to support every customer in a single routed domain, BGP confederations
or even multiple public domains are often used to carry traffic. Remember from Chapter 1 that using multiple IGP
and BGP domains for unicast requires corresponding multicast domains, with interdomain routing, to complete
multicast transport across the network. In such situations, providers should consider nesting RPs, using anycast for
redundancy, and building a mesh of MSDP peerings between them. This completes the transport across the provider
domain. Figure 5-25 shows a very high-level diagram of this type of provider multicast domain design.
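A minimal sketch of such an RP mesh on one of the provider RPs, with hypothetical peer addresses, is shown below; each RP would carry the equivalent configuration pointing at the other mesh members.
ip msdp peer 10.255.0.2 connect-source Loopback0
ip msdp peer 10.255.0.3 connect-source Loopback0
!
! Mesh group membership suppresses SA re-flooding among the member RPs
ip msdp mesh-group PROVIDER-RPS 10.255.0.2
ip msdp mesh-group PROVIDER-RPS 10.255.0.3
ip msdp originator-id Loopback0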
For networks that also provide multicast services to customers, there are other considerations. In particular, IP
Television (IPTV) delivery requires some unique design elements, which are discussed next.
Figure 5-26 provides a 10,000-foot view of the components of multicast in the cable environment.
Service interface: Consists of the signaling mode between the service plane and the network plane
Network plane: Consists of IP network configuration to support multicast (control and data plane), resiliency, and
high availability of the network transport
The network consists of a shared platform needed for all services, like QoS (DiffServ or RSVP based on
applicability) or QoS-based Call Admission Control (CAC) systems for transport of multiple types of content. IP multicast gateways consist of ad splicers, converters, and so on.
The service plane communication needs of connected devices should drive the choice of multicast transport protocols. Based on protocol requirements for the content providers, such as CAC, IGMPv3 or IGMPv2 support, and application redundancy, the multicast network technology selected for the transport layer should be able to support all required application services. The WAN technology generally consists of an MPLS L3 VPN or L2 VPN solution that connects to the end-host access technology. A Layer 2 pseudowire could also be considered, using a protected pseudowire deployment. This provides subsecond convergence by leveraging features such as Fast Reroute (FRR) with RSVP-TE LSPs. It also gives network operators service-level agreement (SLA) guidelines for multicast transport. The items to consider in the design are as follows:
The need for global, national, or regional content sources
Fast convergence and availability
Requirements for different media content
Other factors to keep in mind during the design stage relate to the type of feed. The feed could be any or all of the
following:
Broadcast feed: Including static forwarding to a DSLAM
Switched digital video: Static PIM tree to the PE-AGG router
The service interface consideration in this section includes multicast signaling support with IGMPv3, applications
built to SLA requirements, and applications built using CAC methods. The PIM-SSM model generally fits this design, with one-to-many communication built on individual sources. This method is best suited to handling
join/prune latency requirements. PIM-SSM will also help with the localization of services based on unicast IP
addresses of different host servers using the same multicast group. Techniques using different masks for source IP
addresses can be used for redundancy for the source of the multicast service. SSM multicast technology can be
aligned with channel feeds, and source IP address spoofing is mitigated based on built-in application support for
IGMPv3. The transport design also covers path separation across the WAN transport segment.
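As a rough sketch of these service interface ideas (the interface, group, and source addresses are hypothetical), an aggregation router delivering an SSM channel toward a DSLAM might be configured as follows:
ip multicast-routing
ip pim ssm default
!
interface GigabitEthernet0/1
 description Link toward the DSLAM
 ip pim sparse-mode
 ip igmp version 3
 ! Static join keeps the channel flowing even without downstream IGMP reports
 ip igmp static-group 232.1.1.1 source 10.254.1.10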
It is critical to understand multicast VLAN registration (MVR) and its features in cable environments. MVR is
designed for multicast applications deployed in Ethernet ring-based service provider networks. The broadcast of
multiple television channels over a service provider network is one typical example. MVR allows a subscriber on a
port to subscribe and unsubscribe to a multicast stream on the networkwide multicast VLAN, thereby enabling a
single multicast VLAN to be shared in the network while subscribers (receivers) remain in separate VLANs. It also
optimizes stream delivery by providing the ability to send continuous multicast streams in the multicast VLAN rather
than send separate streams from the subscriber VLANs.
MVR assumes that subscriber ports subscribe and unsubscribe to these multicast streams by sending IGMPv2 join
and leave messages on the VLAN. MVR uses IGMP snooping, but MVR and IGMP snooping can be enabled or
disabled without impacting each other. The MVR feature only intercepts the IGMP join and leave messages from
multicast groups configured under MVR. The MVR feature has the following functions:
Categorizes the multicast streams configured under the MVR feature and ties the associated IP multicast groups to entries in the Layer 2 forwarding table
Modifies the Layer 2 forwarding table to include or exclude the receiver of the multicast stream (not constrained by
VLAN boundaries)
MVR has two port types:
Source: A port configured to receive and send multicast data. Subscribers cannot be directly connected to source ports; all source ports on a switch belong to the single multicast VLAN.
Receiver: A port configured as a receiver port, which only receives multicast data.
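The following is a minimal Catalyst-style sketch of MVR with one source port and one receiver port; the VLAN, group, and interface numbers are hypothetical, and the exact syntax varies by platform and software release.
mvr
mvr vlan 100
mvr group 239.1.1.1
!
interface GigabitEthernet0/1
 description Uplink carrying the shared multicast VLAN
 mvr type source
!
interface GigabitEthernet0/2
 description Subscriber port in its own access VLAN
 mvr type receiver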
Summary
When designing a multicast infrastructure, items to consider are desired replication points, multitenancy, and
scalability. In addition, redundancy, reliability, and resiliency are primary design elements.
Transporting multicast messages is critical in health care, wireless networks, data centers, utilities, market exchanges,
service provider networks, and, of course, LANs and WANs. The better you understand the different multicast
service offerings and the desired outcomes for your organization, the better you will be able to provide a service that
is redundant, reliable, and resilient.
References
“Cisco Virtualized Multi-Tenant Data Center Cloud Consumer Models,”
www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/VMDC/2-2/collateral/vmdcConsumerModels.html
RFC 4761, “Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling”
RFC 7623, “Provider Backbone Bridging Combined with Ethernet VPN (PBB-EVPN)”
Chapter 6
Advanced Multicast Troubleshooting
IP Multicast, Volume 1 introduces a basic methodology for troubleshooting IP multicast networks. This methodology concentrates on
implementations of Any-Source Multicast (ASM) using PIM Sparse-Mode (PIM-SM). To review quickly, there are three ordered steps in
this methodology:
1. Receiver check
2. Source check
3. State verification
These fundamental steps of troubleshooting never change, even in advanced multicast designs. As discussed in this book, there are many
new protocols and nuances involved in advanced designs. Cross-domain forwarding of multicast over the Internet is a great example.
Interdomain multicast introduces Multicast Source Discovery Protocol (MSDP) and Border Gateway Protocol (BGP) into the multicast
design. In addition, source, receiver, and state checks can be complicated if there are multiple domains, some of which are under the control
of entities other than your own.
Despite these additional protocols and elements, with troubleshooting, you always start at the beginning. The following is a breakdown of
this three-element methodology into high-level steps for troubleshooting multicast in a single domain:
Note: Certain protocol and router checks have been added to each of these steps to help identify where the source of the problem exists.
These high-level steps are found in IP Multicast, Volume 1 Chapter 7, “Operating and Troubleshooting IP Multicast Networks.”
Step 1. Receiver check. Make sure a receiver is subscribed via Internet Group Management Protocol (IGMP) and that a (*, G) to the
rendezvous point (RP) exists (if using PIM-SM):
Check the group state on the last-hop router (LHR).
Check IGMP membership on the last-hop PIM-designated router (DR).
Verify the (*, G) state at the LHR and check the RP for the (*, G) entry and Reverse Path Forwarding (RPF).
Step 2. Source check. Make sure you have an active source before trying to troubleshoot:
Verify that the source is sending the multicast traffic to the first-hop router (FHR).
Confirm that the FHR has registered the group with the RP.
Determine that the RP is receiving the register messages.
Confirm that the multicast state is built on the FHR.
Step 3. State verification. Ensure that each router in the path has correct RPF information by using the show ip rpf <IP_address>
command:
Verify the RP and shortest-path tree (SPT) state entries across the path:
Check the MSDP summary to verify that peering is operational.
Verify the group state at each active RP.
Verify SPT changes.
Verify the mroute state information for the following elements:
Verify that the incoming interface list (IIF) is correct.
Verify that the outgoing interface list (OIF) is correct.
Ensure that the flags for (*, G) and (S, G) entries are correct and that the RP information is correct.
Does this align with the information in the mroute entry?
Is this what you would expect when looking at the unicast routing table?
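On IOS-XE, the three steps above typically map to a small set of show commands; the following summary uses the source and group addresses from the example that follows and is only a sketch of where to look at each step.
! Step 1: Receiver check (on the LHR / PIM DR)
show ip igmp groups
show ip mroute 239.1.2.200
! Step 2: Source check (on the FHR and the RP)
show ip mroute active
show ip pim rp mapping
! Step 3: State verification (on each router in the path)
show ip rpf 10.20.2.200
show ip msdp summary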
It is helpful to examine the troubleshooting steps for some of the different types of advanced designs introduced in this book. As shown
here, you need to adjust the steps to include additional checks for the protocols in use with each technology. However, the basic three-step
process is always used. This chapter covers these concepts and expands on your knowledge of advanced multicasting protocols, starting with
interdomain multicast forwarding.
Is the next-hop router in the NLRI entry a PIM neighbor of the local domain?
Verify the mroute state information for the following elements at every router in the path:
Verify that the IIF is correct.
Verify that the OIF is correct.
Ensure that the flags for (*, G) and (S, G) entries are correct and that the RP information is correct.
Does this align with the information in the mroute entry?
Is this what you would expect when looking at the unicast routing table?
Let’s put the methodology to work, using the example of the final ASM multidomain network from Chapter 1, “Interdomain Routing and
Internet Multicast,” with a couple problems introduced into the configuration so that you can use the methodology to sniff them out and
repair them. Figure 6-1 depicts the high-level ASM network as completed in Chapter 1, and Figure 6-2 shows the interface map for the
routers in the path between the source and the client.
Remember that in this design, Enterprise Company is offering up content to clients on the public Internet via group 239.1.2.200. Enterprise
Company routers are configured as a single domain, and the BR router has an MSDP and MBGP relationship with router SP3-1 in Internet
service provider (ISP) Blue. The ISPs have interconnected multicast domains. At the end of this configuration example, the client at public
IP 172.21.100.2 was able to receive multicast packets from the source at 10.20.2.200 (see Example 6-1).
Example 6-1 Multicast Reachability from Source to Client for 10.20.2.100, 239.1.2.200
Reply to request 0 from 172.21.100.2, 4 ms
Sometime after the completion of this configuration, the client stops receiving multicast. As shown in Example 6-2, a simple ping test
indicates that the client is no longer responding to multicast pings.
Example 6-2 Multicast Reachability Broken from Source to Client for 10.20.2.200
Example 6-5 Checking for Source State and RP Registration on the FHR
Is each MSDP-enabled RP also an MBGP peer, to prevent black-holing?
Is the MSDP-enabled RP running BGP receiving an NLRI entry that covers the source IP?
Is the next-hop router in the NLRI entry a PIM neighbor of the local domain?
Verify the mroute state information for the following elements at every router in the path:
Verify that the IIF is correct.
Verify that the OIF is correct.
Ensure that the flags for (*, G) and (S, G) entries are correct and that the RP information is correct.
Does this align with the information in the mroute entry?
Is this what you would expect when looking at the unicast routing table?
The first part of step 3 is to ensure that there is a proper PIM neighborship at each interface in the path. Use the show ip pim neighbor
command at each router, starting with the FHR and moving toward the LHR, as shown in Example 6-6.
Example 6-6 Checking for PIM Neighborships Along the Path
FOR ENTERPRISE:
R1# show ip msdp summary
MSDP Peer Status Summary
Peer Address AS State Uptime/ Reset SA Peer Name
Downtime Count Count
172.23.0.2 65003 Up 1w0d 0 0 ?
10.0.0.2 65102 Up 1w0d 0 1 ?
10.0.0.3 65103 Up 1w0d 0 0 ?
tree. You need to check the MSDP configuration on SP3-2 and look for any configuration issues. You can use the show running-config |
begin ip msdp command to see only the relevant configuration, as shown in Example 6-13.
router bgp 65102
bgp router-id 10.0.0.2
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65101 65103
neighbor 10.0.0.1 remote-as 65101
neighbor 10.0.0.1 ebgp-multihop 2
neighbor 10.0.0.1 update-source Loopback0
neighbor 10.0.0.3 remote-as 65103
neighbor 10.0.0.3 ebgp-multihop 2
neighbor 10.0.0.3 update-source Loopback0
!
address-family ipv4
network 10.20.2.0 mask 255.255.255.0
neighbor 10.0.0.1 activate
neighbor 10.0.0.3 activate
exit-address-family
!
address-family ipv4 multicast
neighbor 10.0.0.1 activate
neighbor 10.0.0.3 activate
exit-address-family
!
ip pim rp-address 10.0.0.1 override
R3
hostname R3
!
ip multicast-routing
ip cef
no ipv6 cef
!
interface Loopback0
ip address 10.0.0.3 255.255.255.255
ip pim sparse-mode
!
interface Ethernet0/0
no ip address
ip pim sparse-mode
!
interface Ethernet0/1
ip address 10.1.3.3 255.255.255.0
ip ospf cost 1
!
interface Ethernet0/2
ip address 10.2.3.3 255.255.255.0
ip ospf cost 1
!
router ospf 10
network 10.0.0.3 0.0.0.0 area 0
network 10.1.3.3 0.0.0.0 area 0
network 10.2.3.3 0.0.0.0 area 0
!
router bgp 65103
bgp router-id 10.0.0.3
bgp log-neighbor-changes
bgp confederation identifier 65100
bgp confederation peers 65101 65102
neighbor 10.0.0.1 remote-as 65101
neighbor 10.0.0.1 ebgp-multihop 2
neighbor 10.0.0.1 update-source Loopback0
neighbor 10.0.0.2 remote-as 65102
neighbor 10.0.0.2 ebgp-multihop 2
neighbor 10.0.0.2 update-source Loopback0
!
address-family ipv4
neighbor 10.0.0.1 activate
neighbor 10.0.0.2 activate
exit-address-family
!
address-family ipv4 multicast
neighbor 10.0.0.1 activate
neighbor 10.0.0.2 activate
exit-address-family
!
ip pim rp-address 10.0.0.1 override
Notice from the configuration in Example 6-15 that the direct network path from R2 to R1 has a configured OSPF cost of 1,000, whereas the path R2–R3–R1 has a configured cost of 1. This causes the routers to prefer the lower-cost path for all traffic between the BGP
distributed networks 10.10.0.0/24 (where the client is located) and 10.20.2.0/24 (where the source is connected). The interfaces in the R2–
R3–R1 path do not have any PIM configurations.
This should cause an incomplete tree between the source and the client. You can prove this with a simple ping to the group 239.20.2.100 on
Server 2 from the source (10.20.2.100). Example 6-18 shows this ping failing.
Example 6-18 Failed PIM Path for (10.20.2.100, 239.20.2.100)
Server2# ping 239.20.2.100
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.20.2.100, timeout is 2 seconds:
. . .
The reason for this break is obvious, but let’s apply the troubleshooting elements from step 3 in the model to prove why the break in the
stream is occurring. Check the PIM path between the source and client, starting at the router closest to the receiver, R1 (which also happens
to be the RP for this network). Then move to R2, the next router in the path, and the one closest to the source. The show ip mroute group
and show ip rpf neighbor source commands are sufficient to meet the checks in step 3. The command output for both routers is shown in
Example 6-19.
Example 6-19 Discovering the Broken Path Between R1 and R2
Ethernet0/1, Forward/Sparse, 00:10:39/00:02:02
(10.20.2.100, 239.20.2.100), 00:00:02/00:02:57, flags:
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Ethernet0/1, Forward/Sparse, 00:00:02/00:02:57
R2# config t
R2(config)# ip mroute 10.0.0.1 255.255.255.255 10.1.2.1
(*, 239.20.2.100), 00:42:07/stopped, RP 10.0.0.1, flags: SJC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Ethernet0/1, Forward/Sparse, 00:42:07/00:02:33
(10.20.2.100, 239.20.2.100), 00:01:10/00:01:49, flags:
Incoming interface: Ethernet0/2, RPF nbr 10.1.2.2, Mroute
Outgoing interface list:
Ethernet0/1, Forward/Sparse, 00:01:10/00:02:33
R1# show ip rpf 10.20.2.100
RPF information for ? (10.20.2.100)
RPF interface: Ethernet0/2
RPF neighbor: ? (10.1.2.2)
RPF route/mask: 10.20.2.0/24
RPF type: multicast (static)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base
R2# show ip mroute 239.20.2.100
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
Network Next Hop Metric LocPrf Weight Path
*> 10.20.2.0/24 10.0.0.2 0 100 0 (65102) i
BGP on R2 is now sharing the multicast prefix for 10.20.2.0/24 with R1. R1 uses this prefix as the RPF source (as shown by MBGP in the
mroute entry and bgp 65102 in the RPF output) and completes the shared and source trees. This should result in a successful ping.
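The exact change is not captured in the output above, but based on R2's configuration shown earlier in Example 6-15, one plausible way to originate the source prefix into the multicast address family is the following sketch; treat it as an illustration rather than the book's verbatim fix.
router bgp 65102
 address-family ipv4 multicast
  network 10.20.2.0 mask 255.255.255.0
 exit-address-family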
As you can see, the adjustments given for step 3 in the troubleshooting methodology help identify when and where traffic engineering might
be needed or where in the network configured traffic engineering may be broken. Multicast traffic engineering implementations can be very
complicated. Some in-depth understanding of the protocols is certainly going to make troubleshooting and repair easier. But even without this knowledge, an operator should be able to follow the outlined methodology and make significant progress. The additional checks for static
mroute entries and MBGP entries are necessary for identifying where pathing may be breaking down.
You should also be able to see now that the three-step methodology proposed at the beginning of this chapter is universal for all
troubleshooting exercises. The more in-depth your understanding of the protocols in use, the more checks you can add to each step. The
following sections introduce additional troubleshooting tips for other advanced designs, but only the additional checks and commands for
each are provided to avoid repetition. Let’s start with multicast VPN (MVPN) designs.
Troubleshooting MVPN
MVPN is significantly more difficult to troubleshoot than traditional ASM networks because there are multiple layers (like an onion or a parfait) that rely upon one another to function correctly. The preceding examples have already covered step 1 (receiver check) and step 2 (source check). Therefore, this section begins at step 3, using the same troubleshooting methodology as before but with additional changes to the
checks and substeps.
This example uses the diagram shown in Figure 6-4 (which you might recognize from Chapter 3, “Multicast MPLS VPNs”).
Taking a bottom-up approach, let’s start with the core or transport network and follow these steps:
Step 1. Receiver check. Make sure a receiver is subscribed via IGMP and that the correct (*, G) is present.
Step 2. Source check. Make sure you have an active source before trying to troubleshoot.
Step 3. State verification. Ensure that the core network is functioning:
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override, p - overrides from PfR
Gateway of last resort is not set
Is multicast established in the core? Are there MDT default and data trees?
IOS-XE troubleshooting commands:
show ip mroute
R10# show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
G - Received BGP C-Mroute, g - Sent BGP C-Mroute,
N - Received BGP Shared-Tree Prune, n - BGP C-Mroute suppressed,
Q - Received BGP S-A Route, q - Sent BGP S-A Route,
V - RD & Vector, v - Vector, p - PIM Joins on route,
x - VxLAN group
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
Incoming interface: Ethernet0/1, RPF nbr 192.168.109.9
Outgoing interface list:
Ethernet0/2, Forward/Sparse, 00:59:45/00:02:46
Checksumming of packets disabled
Tunnel TTL 255, Fast tunneling enabled
Tunnel transport MTU 1476 bytes
Tunnel transmit bandwidth 8000 (kbps)
Tunnel receive bandwidth 8000 (kbps)
Last input 00:00:01, output 00:00:00, output hang never
Last clearing of "show interface" counters 01:11:52
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/0 (size/max)
5 minute input rate 12000 bits/sec, 12 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
102124 packets input, 12627804 bytes, 0 no buffer
Received 0 broadcasts (101511 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
258 packets output, 20872 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
The input and output packet counters indicate that the tunnel is passing traffic. This is a good thing!
IOS-XR troubleshooting commands:
show pim [ vrf vrf-name ] mdt interface
For profiles using MLDP:
IOS-XE and IOS-XR troubleshooting commands:
show mpls mldp neighbors
Ensure that the VPN is functioning.
Verify that the unicast routing protocol is working correctly and that you are able to ping from PE to PE within a VRF.
Verify the communication between PE devices. This could be PIM, BGP, MLDP, and so on. The troubleshooting options are dependent
on the specific profile you are using. Please refer to Chapter 3 and specific operating system documentation for additional details.
Ensure interoperability between PE and CE.
Determine whether PIM is enabled between the PE interface or subinterface and the CE device.
IOS-XE troubleshooting commands:
show ip pim interface
show ip pim vrf vrf-name interface
show ip pim neighbor
show ip pim vrf vrf-name neighbor
172.16.6.16 Ethernet0/2 00:35:33/00:01:35 v2 1 / DR S P G
192.168.0.5 Tunnel1 00:33:40/00:01:31 v2 1 / S P G
192.168.0.4 Tunnel1 00:33:40/00:01:30 v2 1 / S P G
192.168.0.3 Tunnel1 00:33:40/00:01:33 v2 1 / S P G
The output from R6 shows a neighbor adjacency to R15 on the Ethernet0/2 interface; this is in VRF RED. The adjacencies on the Tunnel1
interface connect to other PE devices in the network that are participating in the same MVPN.
IOS-XR troubleshooting commands:
show pim [ vrf vrf-name ] [ ipv4 | ipv6 ] interface
VTEP-1# show nve peers
Interface Peer-IP State LearnType Uptime Router-Mac
--------- --------------- ----- --------- -------- -----------------
nve1 10.50.1.1 Up DP 00:05:09 n/a
Now you need to check the multicast group aligned to the active VNIs (as reviewed in the second check). This is accomplished using the
show nve vni command. Example 6-24 gives a generic output from this command.
Example 6-24 Showing Active VNI Groups
VTEP-1# sh nve vni
Codes: CP - Control Plane DP - Data Plane
UC - Unconfigured SA - Suppress ARP
Interface VNI Multicast-group State Mode Type [BD/VRF] Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1 2901 239.1.1.1 Up DP L2 [2901]
In flood and learn mode, the NVE interface is aligned with the underlay multicast; in this case, VNI 2901 on nve1 is mapped to 239.1.1.1. In this scenario, VTEP-1 is the source for 239.1.1.1, and the other VTEPs in the VXLAN domain are the receivers. Broadcast, unknown unicast, and multicast (BUM) traffic uses this multicast group in the data plane.
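For reference, the VNI-to-group mapping seen in this output is typically created by NVE configuration along the lines of the following NX-OS sketch; the feature set and loopback source shown here are assumptions, not output from the example device.
feature nv overlay
feature vn-segment-vlan-based
!
interface nve1
  no shutdown
  source-interface loopback0
  ! Flood-and-learn: BUM traffic for VNI 2901 rides underlay group 239.1.1.1
  member vni 2901
    mcast-group 239.1.1.1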
Now check the multicast group information and make sure the VTEP peer information is present and the outgoing interface is tied to the nve1 interface (as reviewed in the third check). This confirms that the multicast group has (*, G) and (S, G) entries in the control plane. Confirm this by using a simple show ip mroute command on the VTEP. Example 6-25 shows this output from the generic VTEP.
Example 6-25 Verifying Mroute with VNI Interface
Summary
Troubleshooting large multicast environments can be daunting. As with eating an elephant, it involves determining where to start—trunk,
tail, or somewhere in the middle. Now that you are familiar with the three fundamental troubleshooting steps, you know where to start:
Step 1. Receiver check
Step 2. Source check
Step 3. State verification
You can use this methodology to troubleshoot any multicast problem, including those explained in this chapter (interdomain, traffic
engineering, MVPN, and VXLAN) and those that are not.
The key to quickly and effectively determining the root cause of a multicast problem is to be familiar with the different technologies and
follow the methodology outlined in this chapter.
Index
Symbols
(*, G) state entry, 10, 49
checking, 50, 96, 289–290
data MDT traffic flow, 148–149
learning, 32–34
multicast reachability failure, 34–35
(S, G) state entry, 10, 49
checking, 96
completing, 50
data MDT traffic flow, 148–149
A
access control
firewalls, 83
prefix filtering, 84–88
access points (APs), 244
ACI (Application Centric Infrastructure), 258–260
data center networks, 227–228
fabrics and overlay elements, 229–230
Layer 2 IGMP snooping in, 231–235
ACKs (acknowledgements), eliminating, 87
actions (MSDP), 41–42
activate command, 27
AD (Auto Discovery), 182–183
address-family information (AFI)
BGP (Border Gateway Protocol), 28
MBGP (Multiprotocol Border Gateway Protocol), 27–28
advertisements
PIM auto-RP advertisements, 235
SAs (source actives), 40, 47–50
AFI. See address-family information (AFI)
Application Centric Infrastructure. See ACI (Application Centric Infrastructure)
APs (access points), 244
ASM (Any-Source Multicast), 6
ASNs (autonomous system numbers), 3
ASs (autonomous systems), 3–6. See also interdomain design
ASNs (autonomous system numbers), 3
borders of, 22–24
intra-AS multidomain design, 62–71
Auto Discovery (AD), 182–183
B
badging systems, 253
best-effort forwarding, 1–4
BGP (Border Gateway Protocol), 2, 5
AD (Auto Discovery), 182–183
importance of, 7
inter-AS routing with, 22–24
MBGP (Multiprotocol Border Gateway Protocol)
advantages of, 24
configuring for multicast, 25–32
MDT (Multicast Distribution Tree) configuration, 145–148
prefix table
clearing, 37
multicast prefix acceptance, 36
bidir-neighbor-filter command, 272
Bidir-PIM (Bidirectional PIM), 244
BIER (Bit Index Explicit Replication), 202–205
bindings, displaying
data MDT MLDP, 180
default MDT MLDP, 170–171, 175
Bit Index Explicit Replication (BIER), 202–205
Border Gateway Protocol. See BGP (Border Gateway Protocol)
border nodes, 264–265
borders/boundaries. See also multicast domains
AS (autonomous system) borders, 22–24
BGP (Border Gateway Protocol), 2, 5
AD (Auto Discovery), 182–183
importance of, 7
inter-AS routing with, 22–24
MDT (Multicast Distribution Tree) configuration, 145–148
prefix table, 36, 37
configuration of, 32–37
configured multicast boundaries, 32–37
MBGP (Multiprotocol Border Gateway Protocol)
advantages of, 24
configuring for multicast, 25–32
scoped multicast domains
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
boundary command, 84
brokerage multicast design, 273–274
BYOD (bring-your-own-device) services, 240
C
CAC (Call Admission Control), 280
cache
MDT (Multicast Distribution Tree), 157
MSDP (Multicast Source Discovery Protocol), 298–299
Call Admission Control (CAC), 280
campus design, multicast-enabled clinical networks, 238–240
CAPWAP (Control and Provisioning of Wireless Access Points), 248–251
CE (customer edge) devices
CE-CE multicast routing, 187
CE-PE multicast routing, 186–187
definition of, 137
CFS (Cisco Fabric Services), 210
Cisco ACI (Application Centric Infrastructure). See ACI (Application Centric
Infrastructure)
Cisco Design Zone, 255
Cisco DNA (Digital Network Architecture), 260–261
Cisco Fabric Services (CFS), 210
Cisco ISE (Identity Services Engine), 262
Cisco LISP. See LISP (Locator/ID Separation Protocol)
Cisco TrustSec infrastructure, 262
Cisco UA (Unified Access), 249–250
Cisco Validated Design (CVD), 255
Cisco Viptela, 261
Cisco VMDC (Virtualized Multi-Tenant Data Center), 255–256
Cisco VSS (Virtual Switching System), 241
clear ip bgp * command, 37
clear ip msdp peer command, 55
clear ip msdp sa-cache command, 55
clearing
BGP table, 37
MSDP peering sessions, 55
clinical network design
campus design, 238–240
medical device communications, 240–246
three-tier network hierarchy, 238–239
wireless networks, 246–254
cloud environments, 101
cloud broker-based connectivity, 101–103
cloud connectivity to enterprise, 101–103
CSPs (cloud service providers), 100
enterprise adoption of, 99–100
IaaS (infrastructure as a service), 100
multicast traffic engineering, 117–118
DMVPN (Dynamic Multipoint VPN), 118–126
intra-regional multicast flow between two spokes, 130–132
multicast flow from hub, 129–130
unicast spoke-to-central hub communication, 128–129
unicast spoke-to-spoke interregional communication, 128
unicast spoke-to-spoke intra-regional communication, 126–127
native multicast, lack of, 99
PaaS (platform as a service), 100
service reflection, 105
multicast-to-multicast destination conversion, 105–109
multicast-to-unicast destination conversion, 113–117
unicast-to-multicast destination conversion, 109–113
use cases, 132–135
virtual services in, 103–105
cloud service providers (CSPs), 100
COLO (colocation) facilities, 133–134
commands, 312
activate, 27
bidir-neighbor-filter, 272
boundary, 84
clear ip bgp *, 37
clear ip msdp peer, 55
clear ip msdp sa-cache, 55
connect-source, 39
debug ip msdp detail, 296–297, 298–299
debug ip msdp details, 48
debug ip msdp peer, 43–45
debug ip pim, 34
debug ip pim rp, 48
feature msdp, 51
ip mroute, 24
ip msdp keepalive, 52
ip msdp mesh-group, 54
ip msdp originator-id, 54
ip msdp peer, 38–39, 51–52
ip msdp reconnect-interval, 53
ip msdp sa-filter in, 55
ip msdp sa-filter out, 55
ip msdp sa-limit, 56
ip msdp sa-policy, 55
ip msdp shutdown, 55
ip msdp timer, 53
ip multicast boundary, 84
ip pim autorp listener, 148
ip pim jp-policy, 84
ip pim sparse-mode, 36, 294
ip pim ssm default, 145
ip pim vrf RED autorp listener, 148
ipv6 mld join-group, 95
ipv6 pim rp-address, 95
mdt data 254 threshold 2, 177
mdt data group-address-range wildcard-bits, 142
mdt data mpls mldp, 177
mdt data threshold, 149, 156
mdt default, 140–141
mdt preference mldp, 185
msdp filter-out, 59
no ip sap listen, 87–88
peer, 51–52, 54–56
ping
ping 10.0.100.100 source lo 100, 126, 128–129
ping 239.1.2.200, 79–82, 286–287, 300
ping 239.20.2.100, 71, 307, 312
ping FF73:105:2001:192::1, 95
remote-as, 39
router bgp, 27
router msdp originator-id, 54
sa-limit, 56
set core-tree ingress-replication-default, 188
show bgp ipv4 mdt vrf RED, 147–148
show interfaces tunnel 1, 153–154, 320–321
show ip bgp, 31
show ip bgp ipv4 mdt vrf RED, 146–147
show ip bgp ipv4 multicast, 31, 36–37, 312–313
show ip bgp neighbors, 28–31
show ip igmp groups, 209, 288–289
show ip mfib vrf RED 224.1.1.20, 159
show ip mroute, 9, 61, 318, 323–324
failed PIM path between source and client, 307–309
group flow for 239.1.1.1, 60–61
interdomain segregation, verifying, 70–71
multicast flow over VPC, 209–211
routing table for VRF “default”, 216
service reflection examples, 106–116
show ip mroute 239.1.1.1, 129–130
show ip mroute 239.1.2.200, 79–82, 288–290, 294, 299–300
show ip mroute 239.120.1.1, 33, 34–35
show ip mroute 239.192.1.1, 130–132
show ip mroute 239.20.2.100, 310–312
show ip mroute 239.2.2.2, 59, 60
show ip mroute vrf BLU 224.1.1.1, 197–198
show ip mroute vrf BLU 224.2.2.22, 171–172
show ip mroute vrf RED 224.1.1.1, 197–198
show ip mroute vrf RED 224.1.1.20, 152–153
show ip mroute vrf SVCS 224.1.1.1, 196–197, 199
show ip msdp sa-cache, 49, 60, 61, 71, 295–296, 298–299
show ip msdp sa-cache rejected-sa, 297–298
show ip msdp summary, 295–296
show ip nhrp, 126–127, 129
show ip pim, 70
show ip pim neighbor, 150–151
show ip pim rp mapping, 59
show ip pim rp mappings, 288–289
show ip pim vrf RED mdt, 156–157
show ip pim vrf RED mdt send, 157
show ip pim vrf RED neighbor, 150, 321
show ip route, 315–316
show ip rpf, 284, 309, 311–312
show ipv6 mroute, 96
show mfib vrf RED interface, 144
show mfib vrf RED route 224.1.1.20, 159–160
show mpls forwarding-table, 181, 316–317
show mpls forwarding-table labels 31, 174
show mpls forwarding-table labels 36, 179
show mpls ldp neighbor, 317–318
show mpls mldp bindings, 170–171, 175, 180
show mpls mldp database, 173, 174–175, 177–178
show mpls mldp database brief, 181
show mpls mldp database p2mp root 192.168.0.4, 179–180
show mpls mldp database summary, 180, 181
show mpls mldp neighbors, 167–169
show mpls mldp root, 169–170
show mrib vrf BLU route 224.2.2.22, 172, 189
show mrib vrf RED mdt-interface detail, 157
show mvpn vrf BLU ipv4 database ingress-replication, 189–191
show mvpn vrf BLU pe, 189–191
show nve peers, 216, 222, 323
show nve vni, 216, 223, 323
show nve vni ingress-replication, 224
show pim vrf RED mdt cache, 157, 158
show pim vrf RED mdt interface, 154–155
show pim vrf RED neighbor, 150–151
show run interface vif1, 106, 110
show running-config | begin ip msdp, 298
show running-config interface e0/1 command, 293–294
show vrf RED, 143–144
static-rpf, 24
configuration. See also network design models; troubleshooting
BIER (Bit Index Explicit Replication), 202–205
CE-CE multicast routing, 187
CE-PE multicast routing, 186–187
configured multicast boundaries, 32–37
data MDT (Multicast Distribution Tree)
basic configuration, 142–143
example of, 155–160
MTI (multicast tunnel interface), 143–144, 152–155
multicast signaling in the core, 144–148
traffic flow, 148–149
default MDT (Multicast Distribution Tree)
basic configuration, 139–142
example of, 149–152
sample network scenario, 148
EVPN VXLAN
ingress replication, 224
leaf, 220–223
spine, 219–220
inter-AS and Internet design, 72–82
completed multicast tree, 79–82
ISP multicast configurations, 72–79
interdomain multicast without active source learning, 88
IPv6 with embedded RP, 90–96
SSM (Source-Specific Multicast), 88–90
intra-AS multidomain design, 62–71
IPv6 MVPN, 202
LISP (Locator/ID Separation Protocol), 265–267
MBGP (Multiprotocol Border Gateway Protocol), 25–32
BGP address-family configuration, 28
MBGP address-family configuration, 27–28
show ip bgp command, 31
show ip bgp ipv4 multicast command, 31
show ip bgp neighbors command, 28–31
standard BGP configuration, 25
MLDP (Multicast LDP)
data MDT MLDP, 176–181
default MDT MLDP, 164–176
FEC (Forwarding Equivalence Class) elements, 161–162
in-band signaling operation, 162–163
out-of-band signaling operation, 163
topology, 160–161
MSDP (Multicast Source Discovery Protocol), 50–56
mesh groups, 54
originator IDs, 54
peers, 50–53
SAs (source actives), 55–56
shutdown commands, 55
multicast boundaries, 32–37
multicast extranet VPNs, 192
fusion router, 199–200
route leaking, 192–195
VASI (VRF-aware service infrastructure), 200–202
VRF fallback, 195–198
VRF select, 198–199
PE-PE ingress replication, 187–191
profiles
available options, 182
migrating between, 185–186
operating system support matrix, 183–185
provider multicast transport, 186
scoped multicast domains
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
security
firewalls, 83
prefix filtering, 84–87
service filtering, 87–88
service reflection, 105
multicast-to-multicast destination conversion, 105–109
multicast-to-unicast destination conversion, 113–117
unicast-to-multicast destination conversion, 109–113
static interdomain forwarding, 19–22
traffic engineering
DMVPN (Dynamic Multipoint VPN), 118–126
intra-regional multicast flow between two spokes, 130–132
multicast flow from hub, 129–130
router configurations, 303–306
unicast spoke-to-central hub communication, 128–129
unicast spoke-to-spoke interregional communication, 128
unicast spoke-to-spoke intra-regional communication, 126–127
connect-source command, 39
Control and Provisioning of Wireless Access Points (CAPWAP), 248–249
cross-domain packet forwarding, verifying, 71
CSPs (cloud service providers), 100
CSR 1000v devices, 104
customer edge devices. See CE (customer edge) devices
customers
CE (customer edge) devices
CE-CE multicast routing, 187
CE-PE multicast routing, 186–187
definition of, 137
definition of, 137
CVD (Cisco Validated Design), 255
D
data center environments, 207
ACI (Application Centric Infrastructure), 227–228
fabrics and overlay elements, 229–230
Layer 2 IGMP snooping in, 231–232
Layer 3 multicast in, 232–235
VPC (virtual port channel)
conceptual design, 207–208
multicast flow over, 208–211
orphan ports, 208
VPC peer links, 208
VXLAN (Virtual Extensible LAN), 211
EVPNs (Ethernet virtual private networks), 216–224
flood and learn mode, 213–216
host-to-host multicast communication in, 224–226
VTEPs (VXLAN Tunnel Endpoints), 211–213
data MDT (Multicast Distribution Tree). See also default MDT (Multicast Distribution Tree)
basic configuration, 142–143
data MDT MLDP (Multicast LDP), 176–181
example of, 155–160
MTI (multicast tunnel interface), 143–144, 152–155
multicast signaling in the core, 144–148
traffic flow, 148–149
data threshold configuration (MDT), 149–150
databases, displaying
data MDT MLDP, 177–178, 180–181
default MDT MLDP
IOS-XE, 173
IOS-XR, 174–175
De Ghein, Luc, 183
debug commands
debug ip msdp detail, 296–297, 298–299
debug ip msdp details, 48
debug ip msdp peer, 43–45
debug ip pim, 34
debug ip pim rp, 48
debugging. See troubleshooting
default MDT MLDP (Multicast LDP)
example of, 165–176
basic configuration, 165–166
bindings, displaying, 170–173, 175
database, displaying, 173–175
forwarding tables, 174
neighbors, displaying, 167–169
packet capture, 176
root of default MDT trees, verifying, 169–170
explanation of, 164
root high availability, 164–165
default MDT (Multicast Distribution Tree). See also data MDT (Multicast Distribution Tree)
basic configuration, 139–142
default MDT MLDP (Multicast LDP)
example of, 165–176
explanation of, 164
root high availability, 164–165
example of, 149–152
sample network scenario, 148
design, 237–238
market applications, 269–271
brokerage multicast design, 273–274
FSP (financial service provider) multicast design, 273
market data environments, 271–272
multicast-enabled clinical networks
campus design, 238–240
medical device communications, 240–246
three-tier network hierarchy, 238–239
wireless networks, 246–254
multitenant data centers
ACI (Application Centric Infrastructure), 258–260
tenant overlays, 254–258
service provider multicast, 274–275
IPTV delivery, 279–282
PIM-type selection, 275–279
RP placement, 275–279
software-defined networking
Cisco DNA (Digital Network Architecture), 260–261
DMVPN (Dynamic Multipoint VPN), 261
IWAN (Intelligent WAN), 261
LISP (Locator/ID Separation Protocol), 262–267
SD-Access, 261–262
Viptela, 261
utility networks
design blocks, 267
Distribution Level tiers, 268
PMUs (phasor measurement units), 268–269
Radio over IP design, 269
SCADA (Supervisory Control and Data Acquisition), 267
Substation tier, 268
Design Zone (Cisco), 255
designated forwarder (DF) election, 272
destination conversion, 105
multicast-to-multicast destination conversion, 105–109
packet capture before conversion, 107
show ip mroute output at R2, 107–108
show ip mroute output at R4, 106–107, 109
VIF configuration at R2, 106
multicast-to-unicast destination conversion, 113–117
show ip mroute output at R2, 115–116
sniffer capture after conversion, 114–115
sniffer output after conversion, 116–117
VIF configuration at R2, 114
unicast-to-multicast destination conversion, 109–113
show ip mroute output at R2, 112
show ip mroute output at R4, 110–111, 113
sniffer capture before conversion, 111–112
VIF configuration at R2, 110
DF (designated forwarder) election, 272
DHCP (Dynamic Host Configuration Protocol), 243
Digital Network Architecture (DNA), 260–261
direct cloud connectivity, 101
Distribution Level tiers, 268
DMVPN (Dynamic Multipoint VPN), 118–126, 261
configuration snapshot from hub R2, 123
configuration snapshot of regional hub R3, 123–125
configuration snapshot of regional spoke R6, 125–126
DNA (Digital Network Architecture), 260–261
domain boundaries. See borders/boundaries
domains. See multicast domains
draft Rosen model. See default MDT (Multicast Distribution Tree)
Dynamic Host Configuration Protocol (DHCP), 243
Dynamic Multipoint VPN. See DMVPN (Dynamic Multipoint VPN)
E
ECMP (equal-cost multipath), 117
edge LSRs. See PE (provider edge) devices
EGPs (External Gateway Protocols), 5. See also BGP (Border Gateway Protocol)
egress tunnel routers (ETRs), 263–265
EIDs (endpoint identifiers), 263
embedded RPs (rendezvous points), 90–96
endpoints
EIDs (endpoint identifiers), 263
EPGs (endpoint groups), 258–259
VTEPs (VXLAN Tunnel Endpoints), 211–213
EPGs (endpoint groups), 258–259
equal-cost multipath (ECMP), 117
Ethernet virtual private networks. See EVPNs (Ethernet virtual private networks)
Ethernet VPN (PBB-EVPN), 277
ETRs (egress tunnel routers), 263–265
events (MSDP), 41–42
EVPNs (Ethernet virtual private networks), 216–218
Ethernet VPN (PBB-EVPN), 277
ingress replication, 224
leaf configuration, 220–223
spine configuration, 219–220
External Gateway Protocols (EGPs), 5. See also BGP (Border Gateway Protocol)
extranet VPNs, 192
route leaking, 192–195
VASI (VRF-aware service infrastructure), 200–202
VRF fallback, 195–198
VRF select, 198–200
F
fabrics (ACI), 229–230
failed PIM path between source and client, troubleshooting, 307–309
fallback (VRF), 195–198
FANs (field area networks), 268
Fast Reroute (FRR), 280–281
feature msdp command, 51
FEC (Forwarding Equivalence Class) elements, 161–162
FHRs (first-hop routers), 253
FIB (forwarding information base), 309–310
field area networks (FANs), 268
filtering
prefixes, 84–87
SAs (source actives), 55–56
services at edge, 87–88
financial applications, 269–271
brokerage multicast design, 273–274
FSP (financial service provider) multicast design, 273
market data environments, 271–272
firewalls
routed mode, 83
transparent mode, 83
VFW (virtual firewall), 255
first-hop routers (FHRs), 253
flood and learn mode (VXLAN), 213–216
flooding, peer-RPF, 49
Forwarding Equivalence Class (FEC) elements, 161–162
forwarding tables
data MDT MLDP, 179, 181
default MDT MLDP, 174
forwarding tag (FTAG) trees, 228
forwarding trees
building, 6–12
completed multicast tree, 79–82
FTAG (forwarding tag) trees, 228
OIL (outgoing interface list), 9
sample completed tree, 9
shared trees, 8–11
shortest path trees, 8–11
FRR (Fast Reroute), 280–281
FSP (financial service provider) multicast design, 273
FTAG (forwarding tag) trees, 228
fusion routers, 199–200
G
GIPo (Group IP outer), 259
GRE (generic routing encapsulation), 47
groups
IGMP (Internet Group Management Protocol), 8, 231–232, 244
Layer 3 communication, 8
scope, 14–15
H
HA (high availability), 164–165
Halabi, Sam, 28
Health Insurance Portability and Accountability Act (HIPAA), 240
hellos (PIM), 234
HIPAA (Health Insurance Portability and Accountability Act), 240
hospital network design
campus design, 238–240
medical device communications, 240–246
three-tier network hierarchy, 238–239
wireless networks, 246–254
host mobility, LISP (Locator/ID Separation Protocol) support for, 263
host-to-host multicast communication
Layer 2 communication, 224–226
Layer 3 communication, 226
I
IaaS (infrastructure as a service), 100, 254–255
IANA (Internet Assigned Numbers Authority), 3
Identity Services Engine (ISE), 262
IDs
EIDs (endpoint identifiers), 263
MSDP originator IDs, 54
IETF (Internet Engineering Task Force), 4
IGMP (Internet Group Management Protocol), 8, 231–232, 244
IGPs (Interior Gateway Protocols), 5
in-band signaling operation, 162–163
infrastructure as a service (IaaS), 100, 254–255
ingress replication
EVPN VXLAN, 224
PE-PE ingress replication, 188–189
ingress tunnel routers (ITRs), 263–265
Intelligent WAN (IWAN), 261
interdomain design
BGP (Border Gateway Protocol), 2, 5
importance of, 7
inter-AS routing with, 22–24
MBGP (Multiprotocol Border Gateway Protocol), 24–32
prefix table, 36, 37
characteristics of, 6–14
configured multicast boundaries, 32–37
inter-AS and Internet design, 72–82
completed multicast tree, 79–82
ISP multicast configurations, 72–79
interdomain segregation, verifying, 70–71
MSDP (Multicast Source Discovery Protocol)
advantages of, 38
configuration of, 50–56
debug ip msdp peer command, 43–45
peers, 38–40, 50–53
RPF (reverse path forwarding) checks, 46–47
SAs (source actives), 40, 47–50
state machine events and actions, 41–42
scoped multicast domains
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
security, 82–83
static interdomain forwarding, 19–22
troubleshooting
design interface map, 287
high-level ASM network design, 286–287
methodology overview for, 285–286
multicast reachability from source to client, 286–287
receiver checks, 288–290
source checks, 290–292
state verification, 292–301
without active source learning, 88
IPv6 with embedded RPs, 90–96
SSM (Source-Specific Multicast), 88–90
interdomain segregation, verifying, 70–71
Interior Gateway Protocols (IGPs), 5
Intermediate System-to-Intermediate System (IS-IS), 5
Internet Assigned Numbers Authority (IANA), 3
Internet Engineering Task Force (IETF), 4
Internet Group Management Protocol. See IGMP (Internet Group Management Protocol)
Internet Routing Architectures (Halabi), 28
Internet service providers (ISPs), 2, 72–79
Internet-based cloud connectivity, 101
intra-AS multidomain design, 62–71
intra-regional multicast flow between two spokes, 130–132
IOS-XE
CSR 1000v device, 104
data MDT MLDP (Multicast LDP)
data threshold configuration, 177
database, displaying, 177–178
database summary, 180
forwarding tables, 179, 181
data MDT (Multicast Distribution Tree)
basic configuration, 143
BGP MDT address family, 145
cache entries, displaying, 157
data MDT in operation, 158
data threshold configuration, 149, 156
MDT BGP adjacency, verifying, 146–147
MTI (multicast tunnel interface), 144, 152–153
packet count, 159
PIM neighbor for VRF RED, 150
PIM VRF data, 157–158
verifying, 156–157
default MDT MLDP (Multicast LDP)
basic configuration, 166
bindings, displaying, 170
database, displaying, 173
forwarding table, displaying, 174
neighbors, displaying, 167
root of default MDT trees, verifying, 169
show ip mroute vrf BLU 224.2.2.22 command, 171–172
default MDT (Multicast Distribution Tree) configuration, 140–141
MBGP (Multiprotocol Border Gateway Protocol)
address-family configuration, 26–27
BGP IPv4 multicast prefix acceptance, 36
show ip bgp neighbors command, 28–31
MSDP (Multicast Source Discovery Protocol)
debug ip msdp peer command, 43–44
final configuration for Mcast Enterprises, 64–69
mesh group commands, 54
originator ID commands, 54
peer configuration commands, 38, 50–51
peer description commands, 51
peer password authentication and encryption, 52
peer password timer commands, 52–53
SA filter in/out commands, 55–56
SA limit commands, 56
show ip msdp sa-cache command, 49
shutdown commands, 55
multicast boundary configuration, 84
MVPN support matrix, 183–184
profiles, migrating between, 185
static entry configuration, 24
VASI (VRF-aware service infrastructure) configuration, 201
VRF (Virtual Route Forwarding) fallback
configuration, 196
extranet RPF rule, 198
fallback interface, 197–198
verification, 197
VRF (Virtual Route Forwarding) select
configuration, 199
validation, 199
IOS-XR
data MDT MLDP (Multicast LDP)
bindings, displaying, 180
data threshold configuration, 177
database brief, 181
database summary validation, 181
P2MP, 179–180
data MDT (Multicast Distribution Tree)
basic configuration, 143
BGP MDT configuration, 146
data threshold configuration, 150, 156
MDT BGP adjacency, verifying, 147–148
MRIB VRF RED, 157
MTI (multicast tunnel interface), 144, 154–155
packet count, 159–160
PIM neighbor for VRF RED, 150–151
PIM VRF data, 157–158
SSM configuration, 145
default MDT MLDP (Multicast LDP)
basic configuration, 166
bindings, displaying, 171, 175
database, displaying, 174–175
neighbors, displaying, 168–169
root of default MDT trees, verifying, 170
show mrib vrf BLU route 224.2.2.22 command, 172
default MDT (Multicast Distribution Tree) configuration, 141
MBGP (Multiprotocol Border Gateway Protocol), 27–28
MSDP (Multicast Source Discovery Protocol)
mesh group commands, 54
originator ID commands, 54
peer configuration commands, 39, 50–51
peer description commands, 51
peer password authentication and encryption, 52
peer password timer commands, 52
peer reset timers, 53
SA filter in/out commands, 55–56
SA limit commands, 56
shutdown commands, 55
multicast boundary configuration, 84
MVPN support matrix, 183–184
PE-PE ingress replication, 188–191
profiles
applying, 185
migrating between, 185
route leaking, 193–195
static entry configuration, 24
ip mroute command, 24
ip msdp keepalive command, 52
ip msdp mesh-group command, 54
ip msdp originator-id command, 54
ip msdp peer command, 38–39, 51–52
ip msdp reconnect-interval command, 53
ip msdp sa-filter in command, 55
ip msdp sa-filter out command, 55
ip msdp sa-limit command, 56
ip msdp sa-policy command, 55
ip msdp shutdown command, 55
ip msdp timer command, 53
ip multicast boundary command, 84
ip pim autorp listener command, 148
ip pim jp-policy command, 84
ip pim sparse-mode command, 36, 294
ip pim ssm default command, 145
ip pim vrf RED autorp listener command, 148
IPTV (IP Television) delivery, 279–282
IPv6 embedded RPs (rendezvous points), 90–96
ipv6 mld join-group command, 95
IPv6 MVPN, 202
ipv6 pim rp-address command, 95
ISE (Identity Services Engine), 262
IS-IS (Intermediate System-to-Intermediate System), 5
ISPs (Internet service providers), 2, 72–79
ITRs (ingress tunnel routers), 263–265
IWAN (Intelligent WAN), 261
J-K-L
Label Distribution Protocol. See MLDP (Multicast LDP)
label-switched routers (LSRs), 137
last-hop routers (LHRs), 50, 245, 253
LDP (Label Distribution Protocol). See MLDP (Multicast LDP)
leaf configuration (EVPN VXLAN), 220–223
leaking, route, 192–195
LFA (Loop-Free Alternate), 184
LHRs (last-hop routers), 50, 245, 253
limits on SAs (source actives), 55–56
LISP (Locator/ID Separation Protocol), 262–263
configuration example, 265–267
MRs (map resolvers), 263–264
MSs (map servers), 263–264
PETRs (proxy egress tunnel routers), 264–265
PITRs (proxy ingress tunnel routers), 264–265
Loop-Free Alternate (LFA), 184
LSRs (label-switched routers), 137
M
map resolvers (MRs), 263–264
map servers (MSs), 263–264
market applications, 269–271
brokerage multicast design, 273–274
FSP (financial service provider) multicast design, 273
market data environments, 271–272
MBGP (Multiprotocol Border Gateway Protocol)
advantages of, 24
configuring for multicast, 25–32
BGP address-family configuration, 28
MBGP address-family configuration, 27–28
show ip bgp command, 31
show ip bgp ipv4 multicast command, 31
show ip bgp neighbors command, 28–31
standard BGP configuration, 25
Mcast Enterprises example
BGP (Border Gateway Protocol), 2, 5
importance of, 7
inter-AS routing with, 22–24
MBGP (Multiprotocol Border Gateway Protocol), 24–32
prefix table, 36, 37
configured multicast boundaries, 32–37
inter-AS and Internet design, 72–82
completed multicast tree, 79–82
ISP multicast configurations, 72–79
interdomain segregation, verifying, 70–71
intra-AS multidomain design, 62–71
MSDP (Multicast Source Discovery Protocol)
advantages of, 38
configuration of, 50–56
debug ip msdp peer command, 43–45
peers, 38–40, 50–53
RPF (reverse path forwarding) checks, 46–47
SAs (source actives), 40, 47–50
state machine events and actions, 41–42
scoped multicast domains
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
static interdomain forwarding, 19–22
troubleshooting
design interface map, 287
high-level ASM network design, 286–287
methodology overview for, 285–286
multicast reachability from source to client, 286–287
receiver checks, 288–290
source checks, 290–292
state verification, 292–301
traffic engineering problems, 301–313
MDT (Multicast Distribution Tree), 139, 275. See also profiles
data MDT
basic configuration, 142–143
example of, 155–160
MTI (multicast tunnel interface), 143–144, 152–155
multicast signaling in the core, 144–148
traffic flow, 148–149
data MDT MLDP, 176–181
default MDT
basic configuration, 139–142
example of, 149–152
sample network scenario, 148
default MDT MLDP
example of, 165–176
explanation of, 164
root high availability, 164–165
MDT-SAFI, 145
partitioned MDT, 182–183
service provider multicast, 275–277
mdt data 254 threshold 2 command, 177
mdt data group-address-range wildcard-bits command, 142
mdt data mpls mldp command, 177
mdt data threshold 2 command, 156
mdt data threshold command, 149
mdt default command, 140–141
mdt preference mldp command, 185
medical device communications, 240–246
mesh groups (MSDP), 54
MFIB (Multicast Forwarding Information Base), 8–9
mGRE (multipoint generic routing encapsulation), 118
migrating between profiles, 185–186
MLDP (Multicast LDP)
data MDT MLDP, 176–181
default MDT MLDP
example of, 165–176
explanation of, 164
root high availability, 164–165
FEC (Forwarding Equivalence Class) elements, 161–162
in-band signaling operation, 162–163
out-of-band signaling operation, 163
topology, 160–161
MLDP (Multicast Label Distribution Protocol), 138
MP2MP (Multipoint-to-Multipoint) trees, 140
MPLS (Multiprotocol Label Switching) VPNs, 137–138
BIER (Bit Index Explicit Replication), 202–205
CE (customer edge) devices, 137
CE-CE multicast routing, 187
CE-PE multicast routing, 186–187
customers, 137
data MDT (Multicast Distribution Tree)
basic configuration, 142–143
data MDT MLDP, 176–181
example of, 155–160
MTI (multicast tunnel interface), 143–144, 152–155
multicast signaling in the core, 144–148
traffic flow, 148–149
default MDT (Multicast Distribution Tree)
basic configuration, 139–142
default MDT MLDP, 164–176
example of, 149–152
sample network scenario, 148
IPv6 MVPN, 202
MLDP (Multicast LDP)
data MDT MLDP, 176–181
default MDT MLDP, 164–176
FEC (Forwarding Equivalence Class) elements, 161–162
in-band signaling operation, 162–163
out-of-band signaling operation, 163
topology, 160–161
multicast extranet VPNs, 192
fusion router, 199–200
route leaking, 192–195
VASI (VRF-aware service infrastructure), 200–202
VRF (Virtual Route Forwarding) fallback, 195–198
VRF (Virtual Route Forwarding) select, 198–199
MVR (multicast VLAN registration), 281–282
PE (provider edge) devices, 137
PE-PE ingress replication, 187–191
profiles
available options, 182
migrating between, 185–186
operating system support matrix, 183–185
provider multicast transport, 186
providers, 137
troubleshooting, 314–322
MRIB (Multicast Routing Information Base), 10
MRs (map resolvers), 263–264
MSDP (Multicast Source Discovery Protocol)
advantages of, 38
configuration of, 50–56
mesh groups, 54
originator IDs, 54
peers, 50–53
SAs (source actives), 55–56
deployment use case, 56–61
peers
clearing, 55
configuration of, 38–40, 50–53
mesh groups, 54
peer-RPF flooding, 49
timers, 52–53
RPF (reverse path forwarding) checks, 46–47
SAs (source actives)
definition of, 40
explanation of, 47–50
filters, 55–56
limits, 56
shutdown commands, 55
state machine events and actions, 41–42
troubleshooting
cache repair, 298–299
configuration check, 298
debugging, 43–45, 296–297
peering check, 295–296
msdp filter-out command, 59
msdp-peer configuration mode, 50
MSs (map servers), 263–264
MTI (Multicast Tunnel Interface), 143–144, 152–155
multi-address family support, 262–263
Multicast Distribution Tree. See MDT (Multicast Distribution Tree)
multicast domains. See also borders/boundaries
BGP (Border Gateway Protocol), 2, 5
importance of, 7
inter-AS routing with, 22–24
MBGP (Multiprotocol Border Gateway Protocol), 24–32
prefix table, 36, 37
characteristics of, 6–14
configured multicast boundaries, 32–37
hospital network design, 253
inter-AS and Internet design, 72–82
completed multicast tree, 79–82
ISP multicast configurations, 72–79
inter-AS multicast, 72–82
interdomain segregation, verifying, 70–71
intra-AS multidomain design, 62–71
MSDP (Multicast Source Discovery Protocol)
advantages of, 38
configuration of, 50–56
debug ip msdp peer command, 43–45
peers, 38–40, 50–53
RPF (reverse path forwarding) checks, 46–47
SAs (source actives), 40, 47–50
state machine events and actions, 41–42
scoped multicast domains, 258
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
security, 82–83
firewalls, 83
prefix filtering, 84–87
push data model, 82
service filtering, 87–88
static interdomain forwarding, 19–22
without active source learning, 88
IPv6 with embedded RPs, 90–96
SSM (Source-Specific Multicast), 88–90
Multicast Forwarding Information Base (MFIB), 8–9
Multicast Label Distribution Protocol (MLDP), 138
Multicast Routing Information Base (MRIB), 10
Multicast Source Discovery Protocol. See MSDP (Multicast Source Discovery Protocol)
Multicast Tunnel Interface (MTI), 143–144, 152–155
multicast VLAN registration (MVR), 281–282
multicast VPNs. See MPLS (Multiprotocol Label Switching) VPNs
multicast-to-multicast destination conversion, 105–109
packet capture before conversion, 107
show ip mroute output at R2, 107–108
show ip mroute output at R4, 106–107, 109
VIF configuration at R2, 106
multicast-to-unicast destination conversion, 113–117
show ip mroute output at R2, 115–116
sniffer capture after conversion, 114–115
sniffer output after conversion, 116–117
VIF configuration at R2, 114
multipoint generic routing encapsulation (mGRE), 118
Multipoint Label Distribution Protocol. See MLDP (Multicast Label Distribution Protocol)
Multipoint-to-Multipoint (MP2MP) trees, 140
Multiprotocol Label Switching. See MPLS (Multiprotocol Label Switching) VPNs
multitenant data centers
ACI (Application Centric Infrastructure), 258–260
software-defined networking
Cisco DNA (Digital Network Architecture), 260–261
DMVPN (Dynamic Multipoint VPN), 261
IWAN (Intelligent WAN), 261
LISP (Locator/ID Separation Protocol), 262–267
SD-Access, 261–262
Viptela, 261
tenant overlays, 254–258
MVPNs. See MPLS (Multiprotocol Label Switching) VPNs
MVR (multicast VLAN registration), 281–282
N
NAN (neighborhood area network) level, 268
National Institute of Standards and Technology (NIST), 259
NDP (Network Data Platform), 262
neighborhood area network (NAN) level, 268
NERC (North American Electric Reliability Corporation) assets, 268–269
Network Data Platform (NDP), 262
network design models, 237–238
market applications, 269–271
brokerage multicast design, 273–274
FSP (financial service provider) multicast design, 273
market data environments, 271–272
multicast-enabled clinical networks
campus design, 238–240
medical device communications, 240–246
three-tier network hierarchy, 238–239
wireless networks, 246–254
multitenant data centers
ACI (Application Centric Infrastructure), 258–260
tenant overlays, 254–258
service provider multicast, 274–275
IPTV delivery, 279–282
PIM-type selection, 275–279
RP placement, 275–279
software-defined networking
Cisco DNA (Digital Network Architecture), 260–261
DMVPN (Dynamic Multipoint VPN), 261
IWAN (Intelligent WAN), 261
LISP (Locator/ID Separation Protocol), 262–267
SD-Access, 261–262
Viptela, 261
utility networks
design blocks, 267
Distribution Level tiers, 268
PMUs (phasor measurement units), 268–269
Radio over IP design, 269
SCADA (Supervisory Control and Data Acquisition), 267
Substation tier, 268
Network Virtualization Endpoint (NVE), 323
NFV (network functions virtualization), 103
NHRP (Next Hop Resolution Protocol), 118
NIST (National Institute of Standards and Technology), 259
no ip sap listen command, 87–88
North American Electric Reliability Corporation (NERC) assets, 268–269
NVE (Network Virtualization Endpoint), 323
NX-OS
MSDP (Multicast Source Discovery Protocol)
mesh group commands, 54
originator ID commands, 54
peer configuration commands, 39, 50–51
peer description commands, 51
peer password authentication and encryption, 52
peer password timer commands, 52–53
SA filter in/out commands, 55–56
SA limit commands, 56
shutdown commands, 55
multicast boundary configuration, 84
MVPN support matrix, 183–184
static entry configuration, 24
O
OIF (outgoing interface list), 233
OIL (outgoing interface list), 9
Open Shortest Path First (OSPF), 5
originator IDs (MSDP), 54
orphan ports, 208
OSPF (Open Shortest Path First), 5
outgoing interface list (OIF), 233
outgoing interface list (OIL), 9
out-of-band signaling operation, 163
overlapping domain scopes, 17–19
overlays, 229–230, 254–258
P
P (provider) multicast transport, 186
PaaS (platform as a service), 100
package installation envelope (PIE), 51
packet count (MDT), 159–160
partitioned MDT (Multicast Distribution Tree), 182–183
PBB-EVPN (Ethernet VPN), 277
PBR (policy-based routing), 118
PE (provider edge) devices
CE-PE multicast routing, 186–187
definition of, 137
PE-PE ingress replication, 187–191
peer command, 51–52, 54–56
peer links (VPC), 208
peer-RPF flooding, 49
peers (MSDP)
clearing, 55
configuration of, 38–40, 50–53
mesh groups, 54
peer-RPF flooding, 49
state verification, 295–296
timers, 52–53
PETRs (proxy egress tunnel routers), 264–265
phasor measurement units (PMUs), 268–269
PIE (package installation envelope), 51
PIM (Protocol Independent Multicast), 1
auto-RP advertisements, 235
Bidir-PIM (Bidirectional PIM), 244
forwarding trees
building, 6–12
completed multicast tree, 79–82
OIL (outgoing interface list), 9
RPs (rendezvous points), 10
sample completed tree, 9
shared trees, 8–11
shortest path trees, 8–11
hellos, 234
multicast domains
characteristics of, 6–14
configured multicast boundaries, 32–37
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
static interdomain forwarding, 19–22
neighbors, checking for, 150–151, 292–293
RPF (reverse path forwarding), 7–8
selection for hospital networks, 244–246
service provider multicast, 275–279
SM (sparse mode), 8
SSM (Source-Specific Multicast), 140, 144–145, 245
troubleshooting, 301–302
debugging, 34
failed PIM path between source and client, 307–309
R1 unicast RIB and FIB entries, 309–310
static mroute, adding to R1, 310–312
static mroute, adding to R2, 312–314
traffic engineering router configurations, 303–306
verification, 293–294
ping command
ping 10.0.100.100 source lo 100
unicast spoke-to-central hub communication, 128–129
unicast spoke-to-spoke interregional communication, 128
unicast spoke-to-spoke intra-regional communication, 126
ping 239.1.2.200, 79–82, 286–287, 300
ping 239.20.2.100, 71, 307, 312
ping FF73:105:2001:192::1, 95
PITRs (proxy ingress tunnel routers), 264–265
platform as a service (PaaS), 100
PMUs (phasor measurement units), 268–269
points of presence (POPs), 271
policy-based routing (PBR), 118
POPs (points of presence), 271
port channels, virtual
conceptual design, 207–208
multicast flow over, 208–211
orphan ports, 208
VPC peer links, 208
prefix filtering, 84–87
prefix table
clearing, 37
multicast prefix acceptance, 36
profiles
available options, 182
migrating between, 185–186
operating system support matrix, 183–185
Protocol Independent Multicast. See PIM (Protocol Independent Multicast)
Provider Backbone Bridging, 277
provider edge devices. See PE (provider edge) devices
providers
definition of, 137
PE (provider edge) devices
CE-PE multicast routing, 186–187
definition of, 137
PE-PE ingress replication, 187–191
provider multicast transport, 186
proxy egress tunnel routers (PETRs), 264–265
proxy ingress tunnel routers (PITRs), 264–265
Pseudowire services, 277
public cloud services. See cloud environments
push data model, 82
Q-R
Radio over IP design, 269
receiver checks, 284
interdomain multicast network
checking RP for (*, G), 289–290
receiver checking output on SP1–1, 288–289
MVPN support matrix, 314
VXLAN (Virtual Extensible LAN), 322
reflection. See service reflection
remote-as command, 39
rendezvous points. See RPs (rendezvous points)
replication
BIER (Bit Index Explicit Replication), 202–205
ingress replication
EVPN VXLAN, 224
PE-PE ingress replication, 188–189
reverse path forwarding (RPF), 6, 7–8, 118
RFCs (Requests for Comments)
RFC 2283, 24
RFC 3306, 91
RFC 3446, 38, 59
RFC 3618, 41, 47
RFC 3956, 96
RFC 4364, 138
RFC 4607, 90
RFC 4760, 24
RFC 5036, 138
RFC 5415, 248–249
RFC 5771, 90
RFC 6996, 3
RFC 7371, 91
RFC 7441, 139
RFC 7606, 24
RIB (routing information base), 7, 309–310
RLOCs (routing locators), 263
Rosen, Eric, 139
Rosen model. See default MDT (Multicast Distribution Tree)
route leaking, 192–195
route policy language (RPL), 185
router bgp command, 27
router msdp configuration mode, 50
router msdp originator-id command, 54
routing information base (RIB), 7
routing locators (RLOCs), 263
RPF (reverse path forwarding), 6, 7–8, 118
checks, 46–47
peer-RPF flooding, 49
state verification, 292
RPL (route policy language), 185
RPs (rendezvous points). See also MSDP (Multicast Source Discovery Protocol)
checking for (*, G), 289–290
definition of, 10
IPv6 with embedded RPs, 90–96
market data environments, 272–274
medical device communications, 245
MSDP peers
clearing, 55
configuration of, 38–40, 50–53
mesh groups, 54
peer-RPF flooding, 49
state verification, 295–296
timers, 52–53
multitenant data centers, 258–260
scope of, 16–17
service provider multicast, 275–279
traffic engineering, 122
wireless networks, 253
S
SAFI (subsequent address family identifiers), 145
sa-limit command, 56
SAP (Session Announcement Protocol), 87
SAs (source actives)
definition of, 40
explanation of, 47–50
filters, 55–56
limits, 56
SCADA (Supervisory Control and Data Acquisition), 267
scalability. See cloud environments
scoped multicast domains, 258
group scope, 14–15
overlapping scopes, 17–19
RP scope, 16–17
SD-Access, 261–262
SDDCs (software-defined data centers), 260
SDP (Session Description Protocol), 87
SD-WAN. See software-defined networking
security, 82–83
firewalls, 83
routed mode, 83
transparent mode, 83
VFW (virtual firewall), 255
prefix filtering, 84–87
service filtering, 87–88
Security Group Access Control Lists (SGACL), 262
Security Group Tags (SGTs), 262
select feature (VRF), 198–199
service filtering, 87–88
service provider multicast, 274–275
IPTV delivery, 279–282
PIM-type selection, 275–279
RP placement, 275–279
service reflection, 105
multicast-to-multicast destination conversion, 105–109
packet capture before conversion, 107
show ip mroute output at R2, 107–108
show ip mroute output at R4, 106–107, 109
VIF configuration at R2, 106
multicast-to-unicast destination conversion, 113–117
show ip mroute output at R2, 115–116
sniffer capture after conversion, 114–115
sniffer output after conversion, 116–117
VIF configuration at R2, 114
unicast-to-multicast destination conversion, 109–113
show ip mroute output at R2, 112
show ip mroute output at R4, 110–111, 113
sniffer capture before conversion, 111–112
VIF configuration at R2, 110
Service Set Identifier (SSID), 241, 247–248
Session Announcement Protocol (SAP), 87
Session Description Protocol (SDP), 87
set core-tree ingress-replication-default command, 188
SGACL (Security Group Access Control Lists), 262
SGTs (Security Group Tags), 262
shared trees, 8–11
shared-to-source tree process, 149
shortest path trees, 8–11
show commands
show bgp ipv4 mdt vrf RED, 147–148
show interfaces tunnel 1, 153–154, 320–321
show ip bgp, 31
show ip bgp ipv4 mdt vrf RED, 146–147
show ip bgp ipv4 multicast, 31, 36–37, 312–314
show ip bgp neighbors, 28–31
show ip igmp groups, 209, 288–289
show ip mfib vrf RED 224.1.1.20, 159
show ip mroute, 9, 61, 318, 323–324
failed PIM path between source and client, 307–309
group flow for 239.1.1.1, 60–61
interdomain segregation, verifying, 70–71
multicast flow over VPC, 209–211
routing table for VRF “default”, 216
service reflection examples, 106–116
show ip mroute 239.1.1.1, 129–130
show ip mroute 239.1.2.200, 79–82, 288–290, 294, 299–300
show ip mroute 239.120.1.1, 33, 34–35
show ip mroute 239.192.1.1, 130–132
show ip mroute 239.20.2.100, 310–313
show ip mroute 239.2.2.2, 59, 60
show ip mroute vrf BLU 224.1.1.1, 197–198
show ip mroute vrf BLU 224.2.2.22, 171–172
show ip mroute vrf RED 224.1.1.1, 197–198
show ip mroute vrf RED 224.1.1.20, 152–153
show ip mroute vrf SVCS 224.1.1.1, 196–197, 199
show ip msdp sa-cache, 49, 60, 61, 71, 295–296, 298–299
show ip msdp sa-cache rejected-sa, 297–298
show ip msdp summary, 295–296
show ip nhrp, 126–127, 129
show ip pim, 70
show ip pim neighbor, 150–151
show ip pim rp mapping, 59
show ip pim rp mappings, 288–289
show ip pim vrf RED mdt, 156–157
show ip pim vrf RED mdt send, 157
show ip pim vrf RED neighbor, 150, 321
show ip route, 315–316
show ip rpf, 284
show ip rpf 10.20.2.100, 309, 311–312, 313
show ipv6 mroute, 96
show mfib vrf RED interface, 144
show mfib vrf RED route 224.1.1.20, 159–160
show mpls forwarding-table, 181, 316–317
show mpls forwarding-table labels, 174, 179
show mpls ldp neighbor, 317–318
show mpls mldp bindings, 170–171, 175, 180
show mpls mldp database, 173, 174–175, 177–178
show mpls mldp database brief, 181
show mpls mldp database p2mp root 192.168.0.4, 179–180
show mpls mldp database summary, 180, 181
show mpls mldp neighbors, 167–169
show mpls mldp root, 169–170
show mrib vrf BLU route 224.2.2.22, 172, 189
show mrib vrf RED mdt-interface detail, 157
show mvpn vrf BLU ipv4 database ingress-replication, 189–191
show mvpn vrf BLU pe, 189–191
show nve peers, 216, 222, 323
show nve vni, 216, 223, 323
show nve vni ingress-replication, 224
show pim vrf RED mdt cache, 157, 158
show pim vrf RED mdt interface, 154–155
show pim vrf RED neighbor, 150–151
show run interface vif1, 106, 110
show running-config | begin ip msdp, 298
show running-config interface e0/1, 293–294
show vrf RED, 143–144
shutdown commands, 55
signaling
in-band, 162–163
out-of-band, 163
SM (sparse mode), 8
snooping (IGMP), 244
software-defined data centers (SDDCs), 260
software-defined networking
Cisco DNA (Digital Network Architecture), 260–261
DMVPN (Dynamic Multipoint VPN), 261
IWAN (Intelligent WAN), 261
LISP (Locator/ID Separation Protocol), 262–263
configuration example, 265–267
MRs (map resolvers), 263–264
MSs (map servers), 263–264
PETRs (proxy egress tunnel routers), 264–265
PITRs (proxy ingress tunnel routers), 264–265
SD-Access, 261–262
Viptela, 261
source actives. See SAs (source actives)
source checks, 284
interdomain multicast network, 290–292
MPLS (Multiprotocol Label Switching) VPNs, 284
traffic engineering, 301
VXLAN (Virtual Extensible LAN), 322
source trees, 8–11
Source-Specific Multicast. See SSM (Source-Specific Multicast)
sparse mode (PIM), 8
spine configuration (EVPN VXLAN), 219–220
SSID (Service Set Identifier), 241, 247–248
SSM (Source-Specific Multicast), 6, 88–90, 140, 144–145, 245
state machine (MSDP), 41–42
state verification, 284
interdomain multicast network
LHR RP MSDP SA state, 295
mroute state on LHR, 299–300
MSDP cache repair, 298–299
MSDP configuration check, 298
MSDP debugging, 296–297
MSDP peering check, 295–296
PIM configuration, 293–294
PIM neighborships, 292–293
ping test, 300
rejected SA cache entries, 297–298
RPF information, 292
MPLS (Multiprotocol Label Switching) VPNs, 315–322
traffic engineering, 301–302
failed PIM path between source and client, 307–309
R1 unicast RIB and FIB entries, 309–310
router configurations, 303–306
static mroute, adding to R1, 310–312
static mroute, adding to R2, 312–314
VXLAN (Virtual Extensible LAN), 322–325
static interdomain forwarding, 19–22
static mroute, adding
to R1, 310–312
to R2, 312–314
static-rpf command, 24
subsequent address family identifiers (SAFI), 145
Substation tier, 268
Supervisory Control and Data Acquisition (SCADA), 267
T
TCP (Transmission Control Protocol), 38, 87
TDP (Tag Distribution Protocol), 138
tenants. See multitenant data centers
TIBCO Rendezvous, 270
timers (MSDP), 52–53
time-to-live (TTL), 87
trade floors, multicast in, 269–271
brokerage multicast design, 273–274
FSP (financial service provider) multicast design, 273
market data environments, 271–272
traffic engineering, 19, 117–118
DMVPN (Dynamic Multipoint VPN), 118–126
configuration snapshot from hub R2, 123
configuration snapshot of regional hub R3, 123–125
configuration snapshot of regional spoke R6, 125–126
intra-regional multicast flow between two spokes, 130–132
multicast flow from hub, 129–130
troubleshooting, 301–302
failed PIM path between source and client, 307–309
R1 unicast RIB and FIB entries, 309–310
router configurations, 303–306
static mroute, adding to R1, 310–312
static mroute, adding to R2, 312–314
unicast spoke-to-central hub communication, 128–129
unicast spoke-to-spoke interregional communication, 128
unicast spoke-to-spoke intra-regional communication, 126–127
traffic flow (MDT), 148–149
translation. See service reflection
Transmission Control Protocol (TCP), 38, 87
transport diversification. See cloud environments
trees. See forwarding trees
troubleshooting
interdomain multicast networks
design interface map, 287
high-level ASM network design, 286–287
methodology overview for, 285–286
multicast reachability from source to client, 286–287
receiver checks, 288–290
source checks, 290–292
state verification, 292–301
MSDP (Multicast Source Discovery Protocol), 296–297
cache repair, 298–299
configuration check, 298
debugging, 43–45, 48, 296–297
peering check, 295–296
MVPN, 314–322
PIM (Protocol Independent Multicast), 301–302
debugging, 34
failed PIM path between source and client, 307–309
R1 unicast RIB and FIB entries, 309–310
static mroute, adding to R1, 310–312
static mroute, adding to R2, 312–314
traffic engineering router configurations, 303–306
three-step methodology, 283–284
VXLAN (Virtual Extensible LAN), 322–325
TrustSec infrastructure, 262
TTL (time-to-live), 87
U
UA (Unified Access), 249–250
unicast spoke-to-central hub communication, 128–129
unicast spoke-to-spoke interregional communication, 128
unicast spoke-to-spoke intra-regional communication, 126–127
unicast-to-multicast destination conversion, 109–113
show ip mroute output at R2, 112
show ip mroute output at R4, 110–111, 113
sniffer capture before conversion, 111–112
VIF configuration at R2, 110
Unified Access (UA), 249–250
use cases
MSDP (Multicast Source Discovery Protocol), 56–61
multicast in cloud environments, 132–135
utility networks
design blocks, 267
Distribution Level tiers, 268
PMUs (phasor measurement units), 268–269
Radio over IP design, 269
SCADA (Supervisory Control and Data Acquisition), 267
Substation tier, 268
V
Validated Design, 255
VASI (VRF-aware service infrastructure), 200–202
VFW (virtual firewall), 255
VIFs (virtual interfaces), configuring for service reflection
multicast-to-multicast destination conversion, 106
multicast-to-unicast destination conversion, 114
unicast-to-multicast destination conversion, 110
Viptela, 261
Virtual Extensible LAN. See VXLAN (Virtual Extensible LAN)
virtual firewall (VFW), 255
virtual port channel. See VPC (virtual port channel)
Virtual Private LAN Service (VPLS), 277
virtual private networks. See VPNs (virtual private networks)
Virtual Route Forwarding. See VRF (Virtual Route Forwarding)
virtual services, 103–105
Virtual Switching System (VSS), 241
VMDC (Virtualized Multi-Tenant Data Center), 255–256
VNIs (VXLAN network identifiers), 323–324
VPC (virtual port channel)
conceptual design, 207–208
multicast flow over, 208–211
orphan ports, 208
VPC peer links, 208
VPLS (Virtual Private LAN Service), 277
VPNs (virtual private networks). See also MPLS (Multiprotocol Label Switching) VPNs
DMVPN (Dynamic Multipoint VPN), 118–126, 261
configuration snapshot from hub R2, 123
configuration snapshot of regional hub R3, 123–125
configuration snapshot of regional spoke R6, 125–126
EVPNs (Ethernet virtual private networks), 216–218
ingress replication, 224
leaf configuration, 220–223
spine configuration, 219–220
VRF (Virtual Route Forwarding), 262
VRF fallback, 195–198
VRF select, 198–199
VSS (Virtual Switching System), 241
VTEPs (VXLAN Tunnel Endpoints), 211–213
VXLAN (Virtual Extensible LAN), 211
EVPNs (Ethernet virtual private networks), 216–218
ingress replication, 224
leaf configuration, 220–223
spine configuration, 219–220
flood and learn mode, 213–216
host-to-host multicast communication in
Layer 2 communication, 224–226
Layer 3 communication, 226
troubleshooting, 322–325
VNIs (VXLAN network identifiers), 323–324
VTEPs (VXLAN Tunnel Endpoints), 211–213
W-X-Y-Z
wireless LAN controllers (WLCs), 246–247
wireless networks, multicast considerations for, 246–254
WLCs (wireless LAN controllers), 242, 246–247
zones, 256–258