A Comparison of Scaling Techniques For BGP
Rohit Dube
High Speed Networks Research
Room 4C-508, Bell Labs, Lucent Technologies
101 Crawfords Corner Road, Holmdel, NJ 07733
Email: [email protected]
Abstract

BGP is the inter-domain routing protocol used in the Internet today. During the course of its evolution, the Internet has gone from being a simple and small network to one that is run at its core by large service providers constantly battling with bigger and bigger topologies, forcing the routing community to invent ways of scaling both interior and exterior routing protocols. Route-reflectors and confederations have turned out to be the weapons of choice in scaling BGP to these large topologies. This paper takes a close look at these two mechanisms and seeks to compare them.

1 Introduction

The Border Gateway Protocol (BGP) [1], [2], [3] is the pervasive inter-domain routing protocol in the Internet today. Before the recent explosive growth of the service providers' topologies, BGP was typically used in a configuration where all the border routers imported routes from external Autonomous Systems (ASes) and then distributed them to all the routers within their own AS. This distribution was accomplished using a full-mesh of Internal-BGP (IBGP) peerings amongst all the routers in the AS. Once this flat topology hit the scaling limit (both administrative and the CPU/memory ceiling), mechanisms were devised to reduce the number of peering sessions per router. There are three such mechanisms deployed in the Internet today: route-reflectors [4], confederations [5] and route-servers [6]. Of these three, route-reflectors and confederations are the dominant mechanisms, having been implemented by multiple vendors and deployed by the biggest Internet and Network Service Providers (ISPs and NSPs). In this paper we analyze and compare route-reflectors and confederations. We start by describing these mechanisms, followed by a detailed comparison. We conclude with a summary of our observations and pointers for future work.

2 IBGP, Route-reflectors and Confederations

Consider the scaled down BGP topology depicted in figure 1. Routers R1 through R6 form an ISP backbone. In order to provide consistent loop free routing, each of these routers maintains IBGP peering sessions with all the others. When one of these routers learns a prefix, say from an External-BGP (EBGP) peer, it runs the BGP decision algorithm and installs the best route to the prefix into its routing table. If this best route is in turn not learnt from an IBGP peer, the prefix is propagated to all the IBGP peers. In the stable state this provides all the BGP routers in the network with all possible routes to a prefix. (Further details on this can be found elsewhere in the literature [1], [2], [3]).

Figure 1: Full-mesh IBGP (routers R1 to R6; sessions labelled IBGP and EBGP)

Given that each BGP router has to peer with all the BGP routers within the AS, it is easy to see that as the number of BGP routers in the AS grows, the number of peering sessions each router needs to maintain increases to $(n - 1)$, with a total of $n(n - 1)/2$ IBGP peerings in the network, where $n$ is the number of BGP routers. Maintaining these peering sessions quickly gets out of hand with increasing $n$, both for the network administrators and the router hardware.

2.1 Route-reflection

Route-reflectors tackle this scaling problem by dividing the IBGP topology into clusters. A cluster consists of one or more BGP routers acting as server(s) and the remaining as client(s). The servers are fully meshed with each other and also have peering sessions with all the clients. The clients may or may not peer with each other. Further, clients in a cluster can act as servers for sub-clusters provided a strict ancestor-descendant relationship is maintained between the cluster and its sub-clusters and that the servers of all the sub-clusters of a cluster are fully-meshed in a peer-peer relationship. The sub-clusters can have their own sub-sub-clusters and so on. Note that the servers of the top-level clusters of the hierarchy form a full-mesh amongst themselves (i.e., they are in a peer-peer relationship with each other).

The client-server relationship described above is used to break the "don't propagate IBGP routes" rule on the route-reflector servers. The server is allowed to reflect routes from a non-client (i.e., an IBGP router in a peer-peer relationship) to all its clients and from a client to all the other clients as well as non-clients. It is helpful to think of the server as a
proxy agent which disseminates routes between its servers and peers on one side and clients on the other (in both directions).

Figure 2: Hub and Spoke topology (a central HUB surrounded by four SPOKEs)

ISPs typically deploy route-reflectors in a two-level hierarchy similar to the hub-and-spoke network in figure 2. The hub consists of all the route-reflector servers arranged in a full-mesh. These servers are physically located in a point-of-presence (POP) facility, typically in pairs for redundancy. Each of these facilities also contains the client routers as shown in figure 3, which represents a scaled down version of a large ISP's POP.

Figure 3: Route-reflector based POP (two route-reflector servers (RR), six route-reflector clients (C), and links to non-client peers in other clusters)

2.2 Confederations

Confederations tackle the same scaling problem by dividing an AS into sub-ASes. Each of these sub-ASes is fully meshed inside with regular IBGP sessions and is a flat BGP network. At the boundary between two sub-ASes, a modified form of EBGP is used (called confederation-EBGP) which adds the local sub-AS to the AS Path for loop detection within the confederation (the AS Path is a record of the ASes a prefix has traversed and is ordinarily used by EBGP to detect looping updates). To the outside world, this confederation of sub-ASes looks like a single regular BGP network. At the boundary between a confederation of sub-ASes and a regular AS, the peering is a standard EBGP session, except that the router in the confederation does some extra work in

With respect to deployment, confederations are typically made to fit the hub-and-spoke topology of figure 2. A central sub-AS forms the hub and spans the geography of the ISP's network. Metropolitan or larger areas typically form the spoke sub-ASes. Hierarchy within the hub or the spoke is typically not used.

3 Similarities and Differences

As may be evident by now, route-reflection and confederations solve the IBGP scaling problem in ways which are very similar on some counts but dissimilar on others. In this section we analyze the two approaches with respect to their underlying philosophies, deployment scenarios, problems unique to these approaches and scalability.

3.1 Underlying Philosophy

Route-reflection primarily works by changing the behavior of IBGP sessions. The main idea is that of selectively propagating updates over IBGP sessions from the routers designated to be route-reflector servers. On the other hand, confederations work by breaking up an AS into smaller, more manageable sub-ASes, in the process changing the behavior of EBGP sessions.

3.2 Deployment

In the field, route-reflectors have proven to be more popular than confederations. This is probably because deploying route-reflectors requires a software upgrade only on the routers which are to be designated as servers. The clients can be oblivious of the fact that some of the updates they receive are reflected. Confederations, on the other hand, require all routers to be able to process the segment type extensions to the AS Path attribute. This forces a topology moving from a full-mesh IBGP network to confederations to perform a fork-lift software upgrade of all the routers. For most existing networks, this is likely too high a barrier to entry. (For details on attribute extensions related to route-reflectors and confederations see [4] and [5]. In the interest of brevity and clarity, we have deliberately culled the details from this manuscript).

Interestingly, both route-reflectors and confederations are typically deployed in the hub-and-spoke topology discussed earlier in figure 2. In both cases, hierarchy is not used within the hub or the spokes. The only difference is that while for route-reflectors the boundary of the hub and the spokes is made up of routers (i.e., route-reflector servers), with confederations the boundary is actually a confederation-EBGP session between routers in different sub-ASes.

3.3 Unique Problems
Solid lines denote physical connections. Dashed lines denote BGP sessions.

Figure 5: Sub-optimal Routing

Use of confederations on the other hand can lead to sub-optimal routing within an AS. Consider the topology in figure 5. R1, R2 and R3 are routers in the same confederation and each of them belongs to a separate sub-AS. Router E is in an AS of its own and has a regular EBGP session with R1. E advertises the network N to R1, which readvertises it to R2 and R3. R2 and R3 also readvertise the route to N to each other. So R3 has a route to N from both R2 and R1. All other things being equal, R3 may choose the longer route through R2 to reach N. This is because while tie-breaking between routes to the same prefix, most BGP implementations do not take into account the length of the sub-AS path (some vendors solve this problem by providing a knob to take the sub-AS path length into account).

3.4 Scalability

Currently, large ISP networks run between 300 and 500 routers using one of these two approaches to reduce peering requirements. In the following paragraphs the maximum number of peering sessions is calculated for a hub-and-spoke network of approximately 400 routers.

Assume that a route-reflector based network has 20 POPs each with 2 servers and 18 clients, for a total of 400 routers. Each server therefore sees $18 + 1 + 19 \times 2 = 57$ IBGP peering sessions: 18 to the clients in its POP, 1 to the other server in its POP, and 2 to the servers in each of the other 19 POPs. The clients in each POP (assuming that they are fully meshed) see 19 IBGP sessions, one to each router in the POP.

Similarly assume that a confederation based network of 398 routers has 20 sub-ASes, one of which is the hub containing 18 routers and the remaining 19 are spokes each containing 20 routers. Further assume that 2 routers from each spoke sub-AS peer with 2 routers of the central sub-AS. Each router on the spoke sub-AS boundary therefore sees 19 IBGP sessions and 2 confederation-EBGP sessions for a total of 21 BGP sessions. Each router in the hub has 17 IBGP sessions and 4 confederation-EBGP sessions for a total of 21 sessions. The routers not on the boundary of the spoke sub-ASes see only the 19 IBGP sessions.

Both route-reflectors and confederations have proven themselves in the field and look very similar when deployed, but they have distinct advantages over each other. Route-reflectors are backward compatible and can therefore be deployed in a network incrementally without requiring a fork-lift upgrade. Confederations on the other hand reduce the number of BGP peering sessions much better (at least for the canonical hub-and-spoke topology).

Several questions remain unanswered in this article and present an opportunity for extensive simulation. For instance, how far in terms of the number of BGP sessions and the total number of BGP updates can the two-level hierarchy for route-reflectors and the similar hub-and-spoke for confederations scale? Or, how do the two techniques compare in terms of convergence time in the face of failures? In addition, the effect of these mechanisms on the stability of the network as a whole is not clear and should be looked at more closely. References [8] and [9] analyze the general problem of instability in the Internet, but they don't specifically identify the role of network architecture with respect to this instability. As the size of the ISP networks increases, the importance of this particular problem will grow.

Acknowledgements

We would like to thank Vab Goel for describing the Sprint Network, Joe Malcolm for describing the UUNET network and Jeff Young for describing the Cable and Wireless (formerly MCI) network and Tony Przgyienda and the CCR reviewers for reviewing this paper.

References

[1] Y. Rekhter and T. Li. A Border Gateway Protocol (BGP-4), March 1995. IETF RFC 1771.
[2] B. Halabi. Internet Routing Architectures. Cisco-Press, 1997.
[3] J.W. Stewart III. BGP4: Inter-Domain Routing in the Internet. Addison-Wesley, 1998.
[4] T. Bates and R. Chandra. BGP Route Reflection: An alternative to full mesh IBGP, June 1996. IETF RFC 1966.
[5] P. Traina. Autonomous System Confederations for BGP, June 1996. IETF RFC 1965.
[6] D. Haskin. A BGP/IDRP Route Server Alternative to a full mesh routing, October 1995. IETF RFC 1863.
[7] R. Dube and J.G. Scudder. Route Reflection Considered Harmful, November 1998. IETF Draft draft-dube-route-reflection-harmful-00.txt.
[8] C. Labovitz, G.R. Malan, and F. Jahanian. Internet Routing Instability. In SIGCOMM Conference. ACM, 1997.
[9] C. Labovitz, G.R. Malan, and F. Jahanian. Origins of Internet Routing Instability. In INFOCOM Conference. IEEE, 1999.
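
As a rough cross-check of the per-router session counts quoted in Section 3.4, the following Python sketch recomputes them from the stated topology parameters. It is only an illustration: the function names and parameters are hypothetical and not part of the paper, and the hub figure assumes that the confederation-EBGP sessions are spread evenly over the hub routers, which the text does not state.

```python
def full_mesh_sessions(n):
    """Total IBGP sessions when n routers are fully meshed: n(n - 1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(pops, servers_per_pop, clients_per_pop):
    """Per-router IBGP session counts (server, client) in a two-level hierarchy.

    Assumes all servers are fully meshed across POPs and that the clients
    of a POP are fully meshed with every router in that POP.
    """
    total_servers = pops * servers_per_pop
    server = clients_per_pop + (total_servers - 1)
    client = servers_per_pop + (clients_per_pop - 1)
    return server, client


def confederation_sessions(spokes, spoke_size, hub_size,
                           border_per_spoke, hub_peers_per_border):
    """Per-router session counts (spoke interior, spoke border, hub average).

    Assumption (not in the paper): confederation-EBGP sessions are spread
    evenly over the hub routers, so the hub value is an average.
    """
    spoke_interior = spoke_size - 1
    spoke_border = spoke_interior + hub_peers_per_border
    confed_ebgp_total = spokes * border_per_spoke * hub_peers_per_border
    hub_average = (hub_size - 1) + confed_ebgp_total / hub_size
    return spoke_interior, spoke_border, hub_average


print(full_mesh_sessions(400))
print(route_reflector_sessions(20, 2, 18))
print(confederation_sessions(19, 20, 18, 2, 2))
```

Under the even-spread assumption the hub routers average slightly more than 4 confederation-EBGP sessions each, which is consistent with the rounded figure of 21 sessions used in Section 3.4.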