Cisco BGP
Border Gateway Protocol (BGP4), defined in RFC 1771, allows you to create loop-free interdomain routing between one autonomous system (AS) and another. An AS is a set of routers under a single technical administration. Routers in an AS can use multiple interior gateway protocols to exchange routing information inside the AS and an exterior gateway protocol to route packets outside the AS.
BGP uses TCP as its transport protocol (port 179). Two BGP routers form a TCP connection between one another (peer routers) and exchange messages to open and confirm the connection parameters. BGP routers exchange network-reachability information. This information is mainly an indication of the full paths (BGP AS numbers) that a route should take in order to reach the destination network. This information helps in constructing a graph of ASs that are loop free and where routing policies can be applied in order to enforce some restrictions on the routing behavior.
What Are Peers (Neighbors)? Any two routers that have formed a TCP connection in order to exchange BGP routing information are called peers, or neighbors. BGP peers initially exchange their full BGP routing tables. After this exchange, incremental updates are sent as the routing table changes. BGP keeps a version number of the BGP table, which should be the same for all of its BGP peers. The version number changes whenever BGP updates the table because of routing information changes. Keepalive packets are sent to ensure that the connection is alive between the BGP peers and notification packets are sent in response to errors or special conditions.
If an AS has multiple BGP speakers, it could be used as a transit service for other ASs. As you see below, AS200 is a transit AS for AS100 and AS300. It is necessary to ensure reachability for networks within an AS before sending the information to external ASs. This is done by a combination of internal BGP (iBGP) peering between routers inside an AS and by redistributing BGP information to Interior Gateway Protocols (IGP) running in the AS. As far as this course is concerned, when BGP is running between routers belonging to two different ASs, we call this external BGP (eBGP). When BGP is running between routers in the same AS, we call this internal BGP (iBGP).
The following series of steps and examples will enable and configure BGP routing. Let us assume there are two routers, RTA and RTB, both communicating by using BGP. In the first example, RTA and RTB are in different ASs and in the second example, both routers belong to the same AS. We start by defining the router process and the AS number to which the routers belong. Use the following command to enable BGP on a router: router bgp autonomous-system
RTA#
router bgp 100

RTB#
router bgp 200

The above statements indicate that RTA is running BGP and belongs to AS100, and that RTB is running BGP and belongs to AS200. The next step in the configuration process is to define the BGP neighbors, that is, the routers that will talk BGP to each other.
Two BGP routers become neighbors when they establish a TCP connection between each other. The TCP connection is essential in order for the two peer routers to start exchanging routing updates. When the TCP connection is up, the routers send open messages in order to exchange values such as:
• the BGP version number
• the AS number
• the hold time
• the BGP router ID
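To make the open-message exchange concrete, here is a minimal Python sketch that packs and unpacks the fixed part of a BGP OPEN message (16-byte marker, length, type, then version, AS number, hold time, router ID, and an optional-parameter length of zero). The function names are hypothetical and optional parameters are deliberately not handled; treat it as an illustration of the field layout rather than a usable BGP speaker.

```python
import ipaddress
import struct

BGP_MARKER = b"\xff" * 16   # marker field: 16 bytes, all bits set
BGP_TYPE_OPEN = 1           # message type code for OPEN
HEADER_LEN = 19             # marker (16) + length (2) + type (1)


def build_open(my_as: int, hold_time: int, router_id: str, version: int = 4) -> bytes:
    """Pack an OPEN message that carries no optional parameters."""
    body = struct.pack(
        "!BHH4sB",
        version,
        my_as,
        hold_time,
        ipaddress.IPv4Address(router_id).packed,
        0,  # optional parameters length
    )
    header = BGP_MARKER + struct.pack("!HB", HEADER_LEN + len(body), BGP_TYPE_OPEN)
    return header + body


def parse_open(message: bytes) -> dict:
    """Unpack the fixed OPEN fields; optional parameters are ignored."""
    marker, length, msg_type = struct.unpack("!16sHB", message[:HEADER_LEN])
    if marker != BGP_MARKER or msg_type != BGP_TYPE_OPEN or length != len(message):
        raise ValueError("not a well-formed OPEN message")
    version, my_as, hold_time, raw_id, _opt_len = struct.unpack(
        "!BHH4sB", message[HEADER_LEN:HEADER_LEN + 10]
    )
    return {
        "version": version,
        "as": my_as,
        "hold_time": hold_time,
        "router_id": str(ipaddress.IPv4Address(raw_id)),
    }
```

For example, parse_open(build_open(100, 180, "175.220.12.1")) returns the same four values that went in, which is what each peer inspects before the session can move to the established state.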
After these values are confirmed and accepted, the neighbor connection is established. Any state other than "established" is an indication that the two routers didn't become neighbors, and BGP updates won't be exchanged.

Use this neighbor command to establish a TCP connection:

    neighbor ip-address remote-as number

The remote-as number is the AS number of the router we're trying to connect to using BGP. The ip-address is the next-hop directly connected address for eBGP and any IP address on the other router for iBGP.

It's essential that the two IP addresses used in the neighbor command of the peer routers are able to reach one another. One sure way to verify reachability is an extended ping between the two IP addresses. The extended ping allows the source IP address to be changed to any IP address on the router. Therefore the extended ping can be used with the IP address in the neighbor command to verify reachability between two peer routers.

It is important to reset the neighbor connection in case any BGP configuration changes are made in order for the new parameters to take effect:

    clear ip bgp address    (where address is the neighbor address)
    clear ip bgp *          (clear all neighbor connections)

By default, BGP sessions begin using BGP Version 4 and negotiating downward to earlier versions if necessary. To prevent negotiations and force the BGP version used to communicate with a neighbor, perform the following task in router configuration mode:

    neighbor {ip address|peer-group-name} version value

An example of the neighbor command configuration follows:
RTB#
router bgp 200
 neighbor 129.213.1.2 remote-as 100
 neighbor 175.220.1.2 remote-as 200

RTC#
router bgp 200
 neighbor 175.220.212.1 remote-as 200

In the above example, RTA and RTB are running eBGP. RTB and RTC are running iBGP. The difference between eBGP and iBGP is shown by having the remote-as number pointing to either an external (eBGP) or an internal (iBGP) AS. Also, the eBGP peers are directly connected whereas the iBGP peers are not. iBGP routers don't have to be directly connected, as long as there is some IGP running that allows the two neighbors to reach one another.

The following is an example of the information that the show ip bgp neighbors command displays. Pay special attention to the following:
• BGP state: Anything other than state "established" indicates the peers are not up.
• BGP version: 4.
• Remote router ID: This is the highest IP address on the router, or the highest loopback address if one exists.
• Table version: This is the state of the table. Any time new information comes in, the table version increases; a version that keeps incrementing indicates that some route is flapping, causing routes to be updated continuously.
#show ip bgp neighbors
BGP neighbor is 129.213.1.1, remote AS 200, external link
 BGP version 4, remote router ID 175.220.12.1
 BGP state = Established, table version = 3, up for 0:10:59
 Last read 0:00:29, hold time is 180, keepalive interval is 60 seconds
 Minimum time between advertisement runs is 30 seconds
 Received 2828 messages, 0 notifications, 0 in queue
 Sent 2826 messages, 0 notifications, 0 in queue
 Connections established 11; dropped 10
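When many neighbors have to be checked, the fields highlighted above can be pulled out of the display text with a few regular expressions. The following Python sketch assumes the output format shown in this module; the function name and dictionary keys are hypothetical, and other software releases may print the fields differently.

```python
import re

# Sample text copied from the display shown in this section.
SAMPLE_OUTPUT = """\
BGP neighbor is 129.213.1.1, remote AS 200, external link
 BGP version 4, remote router ID 175.220.12.1
 BGP state = Established, table version = 3, up for 0:10:59
 Last read 0:00:29, hold time is 180, keepalive interval is 60 seconds
"""

FIELD_PATTERNS = {
    "neighbor": r"BGP neighbor is ([\d.]+)",
    "remote_as": r"remote AS (\d+)",
    "version": r"BGP version (\d+)",
    "router_id": r"remote router ID ([\d.]+)",
    "state": r"BGP state = (\w+)",
    "table_version": r"table version = (\d+)",
}


def summarize_neighbor(output: str) -> dict:
    """Extract the fields worth watching from one neighbor's display text."""
    info = {}
    for field, pattern in FIELD_PATTERNS.items():
        found = re.search(pattern, output)
        if found is None:
            raise ValueError(f"field not found: {field}")
        info[field] = found.group(1)
    for field in ("remote_as", "version", "table_version"):
        info[field] = int(info[field])
    # Any state other than Established means the peering is not up.
    info["is_up"] = info["state"] == "Established"
    return info
```

Calling summarize_neighbor(SAMPLE_OUTPUT) on two successive captures and comparing table_version is one simple way to notice a flapping route.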
Internal BGP (iBGP) is used if an autonomous system (AS) wants to act as a transit system to other ASs. You might ask, why can't we do the same thing by learning via external BGP (eBGP) redistributing into Interior Gateway Protocol (IGP) and then redistributing again into another AS? You can, but iBGP offers more flexibility and more efficient ways to exchange information within an AS; for example iBGP provides ways to control what is the best exit point out of the AS by using local preference (local preference is discussed in a subsequent section, "BGP Path Selection").
In the above diagram, RTA and RTB are running iBGP and RTA and RTD are running iBGP also. The BGP updates coming from RTB to RTA will be sent to RTE (outside of the AS) but not to RTD (inside of the AS). This is why an iBGP peering should be made between RTB and RTD in order not to break the flow of the updates.

RTA#
router bgp 100
 neighbor 190.10.50.1 remote-as 100
 neighbor 170.10.20.2 remote-as 300
 network 150.10.0.0

RTB#
router bgp 100
 neighbor 150.10.30.1 remote-as 100
 neighbor 175.10.40.1 remote-as 400
 network 190.10.50.0

RTC#
router bgp 400
 neighbor 175.10.40.2 remote-as 100
 network 175.10.0.0

It is important to remember that when a BGP speaker receives an update from other BGP speakers in its own AS (iBGP), the receiving BGP speaker will not redistribute that information to other BGP speakers in its own AS. The receiving BGP speaker will redistribute that information to other BGP speakers outside of its AS. That is why it is important to sustain a full mesh between the iBGP speakers within an AS.

Synchronization

The synchronization command enables synchronization between Border Gateway Protocol (BGP) and your Interior Gateway Protocol (IGP) system. To enable the Cisco IOS software to advertise a network route without waiting for the IGP, use the no form of this command. Before we discuss synchronization, let's look at the following scenario.
RTC in AS300 is sending updates about 170.10.0.0. RTA and RTB are running iBGP, so RTB will get the update and will be able to reach 170.10.0.0 via next-hop 2.2.2.1 (the next hop is carried via iBGP). In order to reach the next hop, RTB will have to send the traffic to RTE. Assume that RTA has not redistributed network 170.10.0.0 into IGP, so at this point RTE has no idea that 170.10.0.0 even exists. If RTB starts advertising to AS400 that it can reach 170.10.0.0, then traffic coming from RTD to RTB with destination 170.10.0.0 will flow in and get dropped at RTE.

The synchronization rule states: if your AS is passing traffic from another AS to a third AS, BGP should not advertise a route before all routers in your AS have learned about the route via IGP. BGP waits until IGP has propagated the route within the AS and then advertises it to external peers. This is called synchronization.

In the above example, RTB waits to hear about 170.10.0.0 via IGP before it starts sending the update to RTD. You can fool RTB into thinking that IGP has propagated the information by adding a static route in RTB pointing to 170.10.0.0. Care should be taken to ensure that other routers can reach 170.10.0.0; otherwise you will have a problem reaching that network.

Disabling Synchronization

In some cases you do not need synchronization. If you will not be passing traffic from a different AS through your AS, or if all routers in your AS will be running BGP, you can disable synchronization. Disabling this feature can allow you to carry fewer routes in your IGP and allow BGP to converge more quickly.

Where synchronization is enabled by default, it is not disabled automatically (on modern routers, it is disabled by default). If you have all your routers in the AS running BGP and you are not running any IGP, the router has no way of knowing that, and it will wait forever for an IGP update about a certain route before sending it to external peers.
Disable synchronization manually in this case in order for routing to work correctly by using the no synchronization command:

router bgp 100
 no synchronization

Make sure you do a clear ip bgp address to reset the session.
RTB#
router bgp 100
 network 150.10.0.0
 neighbor 1.1.1.2 remote-as 400
 neighbor 3.3.3.3 remote-as 100
 no synchronization
!-- RTB puts 170.10.0.0 in its IP routing table and advertises it to RTD
!-- even if it does not have an IGP path to 170.10.0.0

RTD#
router bgp 400
 neighbor 1.1.1.1 remote-as 100
 network 175.10.0.0

RTA#
router bgp 100
 network 150.10.0.0
 neighbor 3.3.3.4 remote-as 100
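The synchronization rule can be summarized as a single decision. The Python sketch below is a simplified model, not router code: the function name is hypothetical, IGP routes are given as a plain list of prefixes, and a route counts as known to the IGP only when the prefix matches exactly.

```python
import ipaddress


def may_advertise_externally(prefix, igp_routes, synchronization=True):
    """Decide whether an iBGP-learned prefix may be sent to eBGP peers.

    With synchronization on, the prefix must already be present in the
    IGP-learned routes; with it off, BGP does not wait for the IGP.
    """
    if not synchronization:
        return True
    target = ipaddress.ip_network(prefix)
    return any(ipaddress.ip_network(route) == target for route in igp_routes)
```

With RTB's situation from the scenario, may_advertise_externally("170.10.0.0/16", []) is False, so nothing is sent to RTD; passing synchronization=False, or adding "170.10.0.0/16" to the IGP routes (which is what the static-route trick achieves), makes it True.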
Load Balancing and Loopback Interfaces

Using a loopback interface to define neighbors is common with internal BGP (iBGP), but not with eBGP. Normally the loopback interface is used to ensure the IP address of the neighbor stays up and is independent of hardware functioning properly. In the case of eBGP, peer routers are frequently directly connected and loopback doesn't apply.

If you use the IP address of a loopback interface in the neighbor command, you need some extra configuration on the neighbor router. The neighbor router needs to tell BGP it's using a loopback interface rather than a physical interface to initiate the BGP neighbor TCP connection. The command used to indicate a loopback interface follows:

    neighbor ip-address update-source interface

The following example illustrates the use of this command.
RTA#
router bgp 100
 neighbor 190.225.11.1 remote-as 100
 neighbor 190.225.11.1 update-source loopback 1

RTB#
router bgp 100
 neighbor 150.212.1.1 remote-as 100

In the above example, RTA and RTB are running iBGP inside autonomous system (AS) 100. RTB is using in its neighbor command the loopback interface of RTA (150.212.1.1); in this case, RTA has to force BGP to use the loopback IP address as the source in the TCP neighbor connection. RTA does this by using the neighbor ip-address update-source interface type configuration command (neighbor 190.225.11.1 update-source loopback 1). This statement forces BGP to use the IP address of its loopback interface when talking to neighbor 190.225.11.1. Note that RTA has used the physical interface IP address (190.225.11.1) of RTB as a neighbor, which is why RTB doesn't need any special configuration.
In some cases, a Cisco router can run eBGP with a third-party router that doesn't allow the two external peers to be directly connected. To achieve this, you can use eBGP multihop, which allows the neighbor relationship to be established between two external peers that are not directly connected. Multihop is used only for eBGP and not for iBGP. The following example illustrates the use of eBGP multihop.
RTA#
router bgp 100
 neighbor 180.225.11.1 remote-as 300
 neighbor 180.225.11.1 ebgp-multihop

RTB#
router bgp 300
 neighbor 129.213.1.2 remote-as 100

RTA is indicating an external neighbor that isn't directly connected. RTA needs to indicate that it's using eBGP-multihop. On the other hand, RTB is indicating a neighbor that is directly connected (129.213.1.2), which is why it doesn't need the neighbor ebgp-multihop command. You should also configure an Interior Gateway Protocol (IGP) or static routing to allow the non-connected neighbors to reach each other.
The following example shows how to achieve load balancing with BGP in the particular case where we have eBGP over parallel serial lines. It combines loopback interfaces, neighbor update-source and neighbor ebgp-multihop as a workaround to achieve load balancing between the two eBGP speakers.
RTA#
int loopback 0
 ip address 150.10.1.1 255.255.255.0
router bgp 100
 neighbor 160.10.1.1 remote-as 200
 neighbor 160.10.1.1 ebgp-multihop
 neighbor 160.10.1.1 update-source loopback 0
 network 150.10.0.0
ip route 160.10.0.0 255.255.0.0 1.1.1.2
ip route 160.10.0.0 255.255.0.0 2.2.2.2

RTB#
int loopback 0
 ip address 160.10.1.1 255.255.255.0
router bgp 200
 neighbor 150.10.1.1 remote-as 100
 neighbor 150.10.1.1 update-source loopback 0
 neighbor 150.10.1.1 ebgp-multihop
 network 160.10.0.0
ip route 150.10.0.0 255.255.0.0 1.1.1.1
ip route 150.10.0.0 255.255.0.0 2.2.2.1

In a normal situation, BGP picks one of the lines to send packets on, and load balancing wouldn't happen. By introducing loopback interfaces, the next hop for eBGP is the loopback interface. You can use static routes (or an IGP) to introduce two equal-cost paths to reach the destination. RTA has two choices to reach next-hop 160.10.1.1: one via 1.1.1.2; the other via 2.2.2.2, and the same for RTB.
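The reason this workaround yields two usable paths is that the BGP next hop (the peer's loopback) is itself resolved through the routing table, where two equal static routes cover it. The Python sketch below models that lookup for RTA using the two static routes from the configuration; the function name and the tuple representation of routes are assumptions made for illustration.

```python
import ipaddress

# RTA's two static routes from the configuration above: (prefix, gateway).
RTA_STATIC_ROUTES = [
    ("160.10.0.0/16", "1.1.1.2"),
    ("160.10.0.0/16", "2.2.2.2"),
]


def resolve_next_hops(bgp_next_hop, routes):
    """Return every gateway of the longest-prefix routes covering the next hop."""
    address = ipaddress.ip_address(bgp_next_hop)
    covering = [
        (ipaddress.ip_network(prefix), gateway)
        for prefix, gateway in routes
        if address in ipaddress.ip_network(prefix)
    ]
    if not covering:
        return []  # next hop unreachable
    longest = max(network.prefixlen for network, _ in covering)
    return [gateway for network, gateway in covering if network.prefixlen == longest]
```

Here resolve_next_hops("160.10.1.1", RTA_STATIC_ROUTES) yields both 1.1.1.2 and 2.2.2.2, which is what lets traffic toward the single BGP next hop be shared across the two lines.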
Route maps are used heavily with BGP. In the BGP context, a route map is a method used to control and modify routing information. This is done by defining conditions for redistributing routes from one routing protocol to another or controlling routing information when injected in and out of BGP. The format of the route-map command follows: route-map map-tag [[permit | deny] | [sequence-number]]
The map tag is just a name you give to the route map. Multiple instances of the same route map (same name tag) can be defined. The sequence number is just an indication of the position a new route map is to have in the list of route maps already configured with the same name. For example, if two instances of the route map are defined (call it MYMAP), the first instance will have a sequence number of 10, and the second will have a sequence number of 20.

route-map MYMAP permit 10
 (first set of conditions goes here)
route-map MYMAP permit 20
 (second set of conditions goes here)

When applying route map MYMAP to incoming or outgoing routes, the first set of conditions will be applied via instance 10. If the first set of conditions is not met, then proceed to a higher instance of the route map.
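The way instances are walked in sequence order can be modeled in a few lines of Python. This is only an illustrative sketch, not how IOS implements route maps: the tuple layout, the helper names, and the two MYMAP conditions (a prefix test and the metric values) are assumptions chosen for the example. A route is represented as a plain dictionary.

```python
def apply_route_map(instances, route):
    """Walk the instances in sequence order; the first match decides.

    Each instance is (sequence, action, match, set_action), where match and
    set_action are callables taking the route dictionary. Returns the
    (possibly modified) route on a permit, or None when the route is denied
    or no instance matches.
    """
    for _sequence, action, match, set_action in sorted(instances, key=lambda i: i[0]):
        if match(route):
            if action == "permit":
                return set_action(dict(route))
            return None  # matched a deny instance
    return None  # end of list reached without a match


def set_metric(value):
    """Build a set action that stamps the given metric on the route."""
    def action(route):
        route["metric"] = value
        return route
    return action


# Hypothetical conditions for the two MYMAP instances.
MYMAP = [
    (20, "permit", lambda route: True, set_metric(5)),
    (10, "permit", lambda route: route["prefix"] == "170.10.0.0/16", set_metric(2)),
]
```

Even though instance 20 is listed first here, apply_route_map(MYMAP, {"prefix": "170.10.0.0/16"}) is handled by instance 10 and gets metric 2, while any other prefix falls through to instance 20 and gets metric 5.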
match and set Configuration Commands

Each route map will consist of a list of match and set configurations. The match command specifies a match criteria; set specifies a set action if the criteria enforced by the match command are met. For example, a route map could be defined that checks outgoing updates. If there is a match for IP address 1.1.1.1, then the metric for that update will be set to 5. This example can be illustrated by the following commands:

 match ip address 1.1.1.1
 set metric 5

Now, if the match criteria are met and there is a permit, then the routes will be redistributed or controlled as specified by the set action, and we break out of the list. If the match criteria are met and there is a deny, then the route will not be redistributed or controlled, and we break out of the list. If the match criteria are not met and there is either a permit or deny, then the next instance of the route map (instance 20 for example) will be checked, and so on until we either break out or finish all the instances of the route map. If the list is finished without a match, then the route will be neither accepted nor forwarded.

One restriction on route maps is that when used for filtering BGP updates (described later in the module) rather than when redistributing between protocols, you cannot filter on the inbound when using a "match" on the IP address. Filtering on the outbound is OK.

The related commands for match follow:

 match as-path
 match community-list
 match clns route-hop
 match clns route-source
 match interface
 match ip address
 match ip next-hop
 match ip route-source
 match metric
 match route-type
 match tag
The related commands for set follow:

 set as-path
 set automatic-tag
 set community
 set interface
 set default interface
 set ip default next-hop
 set level
 set local-preference
 set metric
 set metric-type
 set next-hop
 set origin
 set tag
 set weight

Let's look at some route-map examples:
Example 1: Assume RTA and RTB are running Routing Information Protocol (RIP), and RTA and RTC are running BGP. RTA is getting updates via BGP and redistributing them to RIP. If RTA wants to redistribute routes for 170.10.0.0 to RTB with a metric of 2, and all other routes with a metric of 5, then the following configuration could be used:

RTA#
router rip
 network 3.0.0.0
 network 2.0.0.0
 network 150.10.0.0
 passive-interface Serial0
 redistribute bgp 100 route-map SETMETRIC
router bgp 100
 neighbor 2.2.2.3 remote-as 300
 network 150.10.0.0
route-map SETMETRIC permit 10
 match ip address 1
 set metric 2
route-map SETMETRIC permit 20
 set metric 5
access-list 1 permit 170.10.0.0 0.0.255.255

In the above example, if a route matches the IP address 170.10.0.0, it will have a metric of 2, and then we break out of the route map list. If there is no match, then we go down the route-map list, which says to set everything else to metric 5. What will happen to routes that do not match any of the match statements? They will be dropped by default.

Example 2: Suppose in the above example, we did not want the autonomous system AS100 to accept updates about 170.10.0.0. Because route maps cannot be applied on the inbound when matching based on an IP address, you have to use an outbound route map on RTC:
RTC#
router bgp 300
 network 170.10.0.0
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-map STOPUPDATES out
route-map STOPUPDATES permit 10
 match ip address 1
access-list 1 deny 170.10.0.0 0.0.255.255
access-list 1 permit 0.0.0.0 255.255.255.255

Now that you feel more comfortable with how to start BGP and how to define a neighbor, let's look at how to start exchanging network information. There are multiple ways to send network information using BGP. These methods are covered one by one in the next section.
Network Command

The format of the network command follows:

    network network-number [mask network-mask]

The network command controls what networks are originated by this box. This is a different concept from configuring with Interior Gateway Routing Protocol (IGRP) and RIP. With this command, we are not trying to run BGP on a certain interface; rather, we are trying to indicate to BGP what networks it should originate from this box. The mask portion is used because BGP4 can handle subnetting and supernetting. A maximum of 200 entries of the network command are accepted.

The network command will work if the network you are trying to advertise is known to the router, whether connected, static, or learned dynamically. An example of the network command follows:

RTA#
router bgp 1
 network 192.213.0.0 mask 255.255.0.0
ip route 192.213.0.0 255.255.0.0 null 0

The above example indicates that RTA will generate a network entry for 192.213.0.0/16. The /16 indicates that we are using a supernet of the Class C address and we are advertising the first two octets (the first 16 bits). Note that we need the static route to get the router to generate 192.213.0.0 because the static route will put a matching entry in the routing table.
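The dependence of the network command on the routing table can be sketched as a simple membership test. The Python model below assumes, as the static-route remark above suggests, that a network statement only takes effect when an entry with exactly the same prefix and mask exists in the routing table; the function name and the list-of-prefixes representation are hypothetical.

```python
import ipaddress


def networks_to_originate(network_statements, routing_table):
    """Return the configured networks that have a matching routing-table entry."""
    table = {ipaddress.ip_network(entry) for entry in routing_table}
    return [
        statement
        for statement in network_statements
        if ipaddress.ip_network(statement) in table
    ]
```

With the static route to null 0 in place, networks_to_originate(["192.213.0.0/16"], ["192.213.0.0/16"]) returns the supernet; if the table only held a more specific entry such as 192.213.1.0/24, nothing would be originated.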
Redistribution The network command is one way to advertise your networks via BGP. Another way is to redistribute your Interior Gateway Protocol (IGP) (such as IGRP, Open Shortest Path First [OSPF], RIP, Enhanced Interior Gateway Routing Protocol [EIGRP], and so on) into BGP. This sounds complicated because now, all of the internal routes are being dumped into BGP, yet some of these routes might have been learned via BGP and you do not need to send them out again. Careful filtering should be applied to ensure that only routes that you want to advertise are being sent to the Internet, rather than every route you have. Let's look at the example below. RTA is announcing 129.213.1.0 and RTC is announcing 175.220.0.0. Look at the RTC configuration:
If you use a network command, you will have:

RTC#
router eigrp 10
 network 175.220.0.0
 redistribute bgp 200
 default-metric 1000 100 250 100 1500
router bgp 200
 neighbor 1.1.1.1 remote-as 300
 network 175.220.0.0 mask 255.255.0.0
!-- this will limit the networks originated by your AS to 175.220.0.0

If you use redistribution instead, you will have:

RTC#
router eigrp 10
 network 175.220.0.0
 redistribute bgp 200
 default-metric 1000 100 250 100 1500
router bgp 200
 neighbor 1.1.1.1 remote-as 300
 redistribute eigrp 10
!-- here, EIGRP will inject 129.213.1.0 again into BGP

This will cause 129.213.1.0 to be originated by your AS. This is misleading because you are not the source of 129.213.1.0; AS100 is. So you would have to use filters to prevent that network from being sourced out by your AS. The correct configuration would be the following:

RTC#
router eigrp 10
 network 175.220.0.0
 redistribute bgp 200
 default-metric 1000 100 250 100 1500
router bgp 200
 neighbor 1.1.1.1 remote-as 300
 neighbor 1.1.1.1 distribute-list 1 out
 redistribute eigrp 10
access-list 1 permit 175.220.0.0 0.0.255.255

The access list is used to control what networks are to be originated from AS200.
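To see why access list 1 lets 175.220.0.0 through and blocks 129.213.1.0, it helps to model the wildcard-mask comparison: bits set in the wildcard are ignored, all other bits must equal the base address, and an address that matches no entry is denied. The Python sketch below is an illustrative model with hypothetical function names, with the access list written as a list of tuples.

```python
import ipaddress


def wildcard_match(address, base, wildcard):
    """Compare address and base on every bit the wildcard mask does not ignore."""
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    address_bits = int(ipaddress.IPv4Address(address))
    base_bits = int(ipaddress.IPv4Address(base))
    return (address_bits & care_bits) == (base_bits & care_bits)


def acl_permits(acl, address):
    """Return True if the first matching entry is a permit; default is deny."""
    for action, base, wildcard in acl:
        if wildcard_match(address, base, wildcard):
            return action == "permit"
    return False  # no entry matched: implicit deny


# access-list 1 permit 175.220.0.0 0.0.255.255
ACCESS_LIST_1 = [("permit", "175.220.0.0", "0.0.255.255")]
```

acl_permits(ACCESS_LIST_1, "175.220.0.0") is True, while acl_permits(ACCESS_LIST_1, "129.213.1.0") is False, so only the locally owned network is sent to neighbor 1.1.1.1.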
Static Routes and Redistribution Static routes could always be used to originate a network or a subnet. The only difference is that BGP will consider these routes as having an origin of incomplete (unknown). In the above example, the same could have been accomplished by doing the following:
RTC#
router eigrp 10
 network 175.220.0.0
 redistribute bgp 200
 default-metric 1000 100 250 100 1500
router bgp 200
 neighbor 1.1.1.1 remote-as 300
 redistribute static
...
ip route 175.220.0.0 255.255.255.0 null0

The null 0 interface means disregard the packet. So if the packet is received and there is a more specific match than 175.220.0.0 (which exists, of course) the router will send it to the specific match; otherwise it will disregard it. This is a nice way to advertise a supernet.

We have discussed how we can use different methods to originate routes out of our AS. Please remember that these routes are generated in addition to other BGP routes that BGP has learned via neighbors (internal or external). BGP passes on information that it learns from one peer to other peers. The difference is that routes generated by the network command, or redistribution or static routes, will indicate your AS as the origin for these networks. Injecting BGP into IGP is always done by redistribution.

Example:
RTA#
router bgp 100
 neighbor 150.10.20.2 remote-as 300
 network 150.10.0.0

RTB#
router bgp 200
 neighbor 160.10.20.2 remote-as 300
 network 160.10.0.0

RTC#
router bgp 300
 neighbor 150.10.20.1 remote-as 100
 neighbor 160.10.20.1 remote-as 200
 network 170.10.0.0

Note that you do not need network 150.10.0.0 or network 160.10.0.0 in RTC unless you want RTC to also generate these networks on top of passing them on as they come in from AS100 and AS200. Again, the difference is that the network command adds an extra advertisement for these same networks, indicating that AS300 is also an origin for these routes.

It is important to remember that BGP will not accept updates that have originated from its own AS. This is to ensure a loop-free interdomain topology. For example, assume AS200 had a direct BGP connection into AS100. RTA generates a route 150.10.0.0 and sends it to AS300; then RTC passes this route to AS200 with the origin kept as AS100. RTB passes 150.10.0.0 to AS100, with the origin still AS100. RTA notices that the update has originated from its own AS, and ignores it.
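The loop check in this example comes down to looking for the local AS number in the AS path of a received update. The Python sketch below traces the 150.10.0.0 advertisement around the AS100, AS300, AS200 triangle; the function names are hypothetical and the AS path is modeled as a plain list with the most recent AS first.

```python
def advertise(local_as, as_path):
    """Return the AS path as sent to an external peer: local AS prepended."""
    return [local_as] + list(as_path)


def accept_update(local_as, as_path):
    """Reject any update whose AS path already contains the local AS."""
    return local_as not in as_path


# 150.10.0.0 leaves AS100, passes through AS300, then AS200.
path_seen_by_as300 = advertise(100, [])
path_seen_by_as200 = advertise(300, path_seen_by_as300)
path_seen_by_as100 = advertise(200, path_seen_by_as200)
```

AS300 and AS200 each accept the update because their own number is not yet in the path, but accept_update(100, path_seen_by_as100) is False, which is RTA ignoring its own route when it comes back.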
After BGP receives updates about different destinations from different autonomous systems (ASs), the protocol will have to decide which paths to choose in order to reach a specific destination. BGP will choose only a single path to reach a specific destination. However, note that if you have multiple physical connections between eBGP neighbors, using a loopback interface and static routes to the loopback interface allows load balancing across the connections. The decision process is based on different attributes, such as next hop, administrative weights, local preference, the route origin, path length, origin code, metric, and so on. This section explains the decision process BGP uses to propagate the best path to its neighbors, called the Best Path Selection Algorithm.
Description

BGP routers typically receive multiple paths to the same destination. The BGP best path algorithm decides which is the best path to install in the IP routing table and to use for forwarding traffic. Let's begin by assuming that all received paths for a particular prefix are arranged in a list, similar to the output of the show ip bgp longer-prefixes command. Some paths received by the router aren't considered as candidates for the best path. Such paths typically don't have the valid flag in the output of the show ip bgp longer-prefixes command. The following is a list of reasons that cause routers to ignore paths.
• Paths marked as "not synchronized" in the show ip bgp longer-prefixes output. If BGP synchronization is enabled, which it is by default in Cisco IOS Software, there must be a match for the prefix in the IP routing table in order for an internal BGP (iBGP) path to be considered a valid path. If the matching route is learned from an OSPF neighbor, its OSPF router ID must match the BGP router ID of the iBGP neighbor. Most users prefer to disable synchronization using the no synchronization BGP subcommand.
• Paths for which the NEXT_HOP is inaccessible. This is why it's important to have an IGP route to the NEXT_HOP associated with the path.
• Paths from an external (eBGP) neighbor if the local autonomous system (AS) appears in the AS_PATH. Such paths are denied upon ingress into the router, and are not even installed in the BGP routing-information base (RIB). The same applies to any path denied by routing policy implemented via access, prefix, AS-PATH, or community lists, unless you've configured soft-reconfiguration inbound for the neighbor. If you enabled bgp enforce-first-as and the UPDATE doesn't contain the AS of the neighbor as the first AS number in the AS_Sequence, the router sends a notification and closes the session.
• Paths marked as "(received-only)" in the show ip bgp longer-prefixes output. These paths have been rejected by policy, but have been stored by the router because soft-reconfiguration inbound has been configured for the neighbor sending the path.
How the Best Path Algorithm Works

BGP assigns the first valid path as the current best path. It then compares the best path with the next path in the list, until it reaches the end of the list of valid paths. Following is a list of rules used to determine the best path:

1. Prefer the path with the largest WEIGHT. WEIGHT is a Cisco-specific parameter, local to the router on which it's configured.
2. Prefer the path with the largest Local_Preference.
3. Prefer the path that was locally originated via a network or aggregate BGP subcommand, or through redistribution from an IGP. Local paths sourced by network and redistribute commands are preferred over local aggregates sourced by the aggregate-address command.
4. Prefer the path with the shortest AS-PATH. Note the following:
   o This step is skipped if bgp bestpath as-path ignore is configured.
   o An AS-SET counts as 1, no matter how many ASs are in the set. Recall that an AS-SET attribute is a mathematical set of ASs.
   o The AS_Confed_Sequence is not included in the AS-PATH length.
5. Prefer the path with the lowest origin type: IGP is lower than EGP, and EGP is lower than INCOMPLETE.
6. Prefer the path with the lowest Multi-Exit Discriminator (MED), also called the metric attribute. Note the following:
   o This comparison is only done if the first (neighboring) AS is the same in the two paths; any confederation sub-ASs are ignored. In other words, MEDs are compared only if the first AS in the AS_Sequence is the same for multiple paths. Any preceding AS_Confed_Sequence is ignored.
   o If bgp always-compare-med is enabled, MEDs are compared for all paths. This option needs to be enabled over the entire AS, otherwise routing loops can occur.
   o If bgp bestpath med-confed is enabled, MEDs are compared for all paths that consist only of AS_Confed_Sequence (paths originated within the local confederation).
   o Paths received from a neighbor with a MED of 4,294,967,295 will have the MED changed to 4,294,967,294 before insertion into the BGP table.
   o Paths received with no MED are assigned a MED of 0, unless bgp bestpath missing-as-worst is enabled, in which case they are assigned a MED of 4,294,967,294.
   o The bgp deterministic-med command can also influence this step, as demonstrated in How BGP Routers Use the Multi-Exit Discriminator for Best Path Selection.
7. Prefer external (eBGP) over internal (iBGP) paths. Paths containing AS_Confed_Sequence are local to the confederation, and therefore treated as internal paths. There is no distinction between Confederation External and Confederation Internal.
8. Prefer the path with the lowest IGP metric to the BGP next hop.
9. If maximum-paths n is enabled, and there are multiple external or confederation-external paths from the same neighboring AS or sub-AS, BGP inserts up to n most recently received paths in the IP routing table. This allows eBGP multipath load sharing. The maximum value of n is currently 6. The default value, when this option is disabled, is 1. The oldest received path is marked as the best path in the output of show ip bgp longer-prefixes, and the equivalent of next-hop-self is performed before forwarding this best path to internal peers.
10. If both paths are external, prefer the path that was received first (the oldest one). This step minimizes route-flap, since a newer path won't displace an older one, even if it was the preferred route based on the additional decision criteria below. It's better practice to apply the additional decision steps below to iBGP paths only, in order to ensure a consistent best path decision within the network, and thereby avoid loops. This step is skipped if any of the following is true:
   o The bgp bestpath compare-routerid command is enabled.
   o The router ID is the same for multiple paths, since the routes were received from the same router.
   o There is no current best path. An example of losing the current best path occurs when the neighbor offering the path goes down.
11. Prefer the route coming from the BGP router with the lowest router ID. The router ID is the highest IP address on the router, with preference given to loopback addresses. It can also be set manually using the bgp router-id command. (If a path contains route-reflector (RR) attributes, the originator ID is substituted for the router ID in the path selection process.)
12. If the originator or router ID is the same for multiple paths, prefer the path with the minimum cluster ID length. This will only be present in BGP route-reflector environments. It allows clients to peer with RRs or clients in other clusters. In this scenario, the client must be aware of the RR-specific BGP attribute.
13. Prefer the path coming from the lowest neighbor address. This is the IP address used in the BGP neighbor configuration, and corresponds to the remote peer used in the TCP connection with the local router.
Subsequent sections of this module discuss these attributes one by one, and contain Configuration Lab exercises that let you practice various path selection techniques.
Now that you are familiar with the BGP attributes and terminology, the following summary indicates how BGP selects the best path for a particular destination. Remember that BGP selects one path only as the best path. That path is placed in the routing table and propagated to the BGP neighbors. Path selection is based on the following:

1. If NextHop is inaccessible, do not consider it.
2. Prefer the largest weight.
3. If the same weight, prefer the largest local preference.
4. If the same local preference, prefer the route that the specified router has originated.
5. If no route was originated, prefer the shorter AS path.
6. If all paths have the same AS path length, prefer the lowest origin code (IGP<EGP<INCOMPLETE).
7. If origin codes are the same, prefer the path with the lowest MULTI_EXIT_DISC.
8. If the MULTI_EXIT_DISC values are the same, prefer the external path over internal.
9. If Interior Gateway Protocol (IGP) synchronization is disabled and only internal paths remain, prefer the path through the closest IGP neighbor.
10. Prefer the route with the lowest IP address value for BGP router ID.
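The ten-step summary above can be read as an ordered comparison. The following is a minimal Python sketch of that idea, not Cisco's implementation: the Path class, its field names, and the sample values are hypothetical, and the MED is compared across all paths for simplicity (as if bgp always-compare-med were enabled).

```python
import ipaddress
from dataclasses import dataclass

# Lower rank is preferred: IGP < EGP < INCOMPLETE.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    """Hypothetical container for the attributes used in the summary."""
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def preference_key(p: Path):
    """Sort key: the smallest tuple wins, so 'prefer largest' fields are negated."""
    return (
        -p.weight,                 # prefer the largest weight
        -p.local_pref,             # then the largest local preference
        not p.locally_originated,  # then routes this router originated
        len(p.as_path),            # then the shorter AS path
        ORIGIN_RANK[p.origin],     # then the lowest origin code
        p.med,                     # then the lowest MED (simplified: always compared)
        not p.is_ebgp,             # then external over internal
        p.igp_metric,              # then the closest IGP neighbor
        int(ipaddress.IPv4Address(p.router_id)),  # finally the lowest router ID
    )


def best_path(paths):
    """Return the preferred path, or None if no next hop is reachable."""
    usable = [p for p in paths if p.next_hop_reachable]
    return min(usable, key=preference_key) if usable else None


via_rta = Path("via RTA", weight=200, as_path=(100, 400))
via_rtb = Path("via RTB", weight=100, as_path=(200, 400))
print(best_path([via_rta, via_rtb]).name)
```

Because weight is the first field of the key, the path with weight 200 wins here regardless of the later attributes; a later field only matters when every earlier field is tied.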
The weight attribute is a Cisco-defined attribute. The weight is assigned locally to the router. It is a value that makes sense only to the specific router and that is not propagated or carried through any of the route updates. A weight can be a number from 0 to 65535. Paths that the router originates have a weight of 32768 by default, and other paths have a weight of zero.
Routes with a higher weight are preferred when multiple routes exist to the same destination. Look at the above example:

1. RTA has learned about network 175.10.0.0 from autonomous system 400 (AS400) and will propagate the update to RTC.
2. RTB has also learned about network 175.10.0.0 from AS400 and will propagate it to RTC.
3. RTC now has two ways of reaching 175.10.0.0 and has to decide which way to go.
If, on RTC, we set the weight of the updates coming from RTA higher than the weight of the updates coming from RTB, we force RTC to use RTA as the next hop to reach 175.10.0.0. This can be achieved in several ways:
- Using the neighbor command:
     neighbor {ip-address|peer-group} weight weight
- Using AS path access-lists:
     o ip as-path access-list access-list-number {permit|deny} as-regular-expression
     o neighbor ip-address filter-list access-list-number weight weight
- Using route-maps.

RTC#
router bgp 300
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 weight 200
 !-- route to 175.10.0.0 from RTA has 200 weight
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 weight 100
 !-- route to 175.10.0.0 from RTB will have 100 weight

Routes with higher weight are preferred when multiple routes exist to the same destination. RTA is preferred as the next hop.

The same outcome can be achieved using IP AS-path and filter lists. Because 175.10.0.0 originates in AS400, its AS path as seen by RTC is 100 400 or 200 400, so the expressions below match on the first AS in the path rather than on the whole path.

RTC#
router bgp 300
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 filter-list 5 weight 200
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 filter-list 6 weight 100
...
ip as-path access-list 5 permit ^100_
!-- this only permits paths whose first AS is 100
ip as-path access-list 6 permit ^200_
...

The same outcome as above can be achieved by using route maps.

RTC#
router bgp 300
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 route-map setweightin in
 neighbor 2.2.2.2 remote-as 200
 neighbor 2.2.2.2 route-map setweightin in
...
ip as-path access-list 5 permit ^100_
...
route-map setweightin permit 10
 match as-path 5
 set weight 200
 !-- anything that matches access-list 5, such as updates from AS100, gets weight 200
route-map setweightin permit 20
 set weight 100
 !-- anything else gets weight 100
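How an AS-path expression is anchored decides which updates receive the higher weight. The sketch below is an illustration in Python, not router behavior: it assumes that "_" in a Cisco-style expression stands for a delimiter, models AS paths as space-separated strings, and uses hypothetical helper names. It contrasts an expression that matches only the path 100 with one that matches any path whose first AS is 100.

```python
import re

# Assumption: "_" in a Cisco AS-path expression matches a delimiter
# (start of string, end of string, space, comma, brace or parenthesis).
DELIMITER = r"(?:^|$|[ ,{}()])"


def as_path_regex(cisco_pattern: str) -> re.Pattern:
    """Translate a Cisco-style AS-path expression into a Python regex."""
    return re.compile(cisco_pattern.replace("_", DELIMITER))


def weight_for(as_path: str, rules, default: int = 0) -> int:
    """Return the weight of the first (pattern, weight) rule matching as_path."""
    for pattern, weight in rules:
        if as_path_regex(pattern).search(as_path):
            return weight
    return default


# AS paths for 175.10.0.0 as RTC would see them (origin AS400 on the right).
received = {"RTA": "100 400", "RTB": "200 400"}

exact = [("^100$", 200), (".*", 100)]     # matches only the path "100"
first_as = [("^100_", 200), (".*", 100)]  # matches any path starting with AS100

for label, rules in (("^100$", exact), ("^100_", first_as)):
    weights = {peer: weight_for(path, rules) for peer, path in received.items()}
    print(label, weights)
```

With the exact expression both updates fall through to the catch-all rule and receive the same weight, so no preference results; with the first-AS expression the update received from RTA gets weight 200.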
This configuration lab covers the weight attribute for BGP networks. Refer to the lab topology diagram below. For the weight attribute, a higher value is preferred. The range for the weight attribute is 0 through 65535. The default value for paths that a router originates is 32768; the default value for all other paths is 0.
- Become familiar with all initial router configurations for the connected routers in this lab.
- Use the weight attribute to change the path from the one that was chosen.
- Identify which path R6 chooses for AS 4 networks and why.
Example: On R6:

neighbor 132.108.4.7 route-map SET_WEIGHT in
route-map SET_WEIGHT permit 10
 match as-path 1
 set weight 10
route-map SET_WEIGHT permit 20          [1]
 match as-path 2                        [2]
ip as-path access-list 1 permit _4$
ip as-path access-list 2 permit .*      [2]

The code example above needs some further explanation:

1. route-map SET_WEIGHT permit 20 is necessary so that paths that do not match as-path 1 are still accepted by the router. Without this entry in the route-map, only paths matching ip as-path access-list 1 and coming from neighbor 132.108.4.7 would be allowed in the BGP table.
2. The commands match as-path 2 and ip as-path access-list 2 permit .* are not strictly required, because a route-map entry with no match clause matches everything. They play the same role as writing an explicit "deny any" at the end of an access-list (which is the default behavior anyway): these lines just make explicit what is going to happen.
The topology graphic shows how the network is configured for BGP connectivity.
Following is a list of the Cisco IOS commands that are available to use in this lab:

Configuration Commands:
- enable
- configure terminal
- exit
- ip as-path access-list access-list-number {permit|deny} as-regular-expression
- neighbor ip-address route-map
- router bgp autonomous-system number
- route-map
- match as-path
- set weight
- clear ip bgp *
- help (?)

Show Commands:
- show running-config
- show ip bgp
- show ip bgp neighbors
- show ip bgp summary
- show ip route
- show ip route bgp
- show ip interface brief
- show version
This lab is complete when:

1. The weight attribute is successfully configured for this BGP network.
2. For R6, paths are changed from the chosen path to a new path by using the weight attribute.
3. For R6, the AS 4 networks are learned through R2 and R7.
Local preference is an indication to the autonomous system (AS) about which path is preferred to exit the AS in order to reach a certain network. A path with a higher local preference is more preferred. The default value for local preference is 100. Unlike the weight attribute, which is relevant only to the local router, local preference is an attribute that is exchanged among routers in the same AS.
Local preference is set via the bgp default local-preference command or with route maps, as illustrated in the following example:
The bgp default local-preference command will set the local preference on the updates out of the router going to peers in the same AS. In the above diagram, AS256 is receiving updates about 170.10.0.0 from two different sides of the organization. Local preference will help us determine which way to exit AS256 in order to reach that network. Let us assume that Router D (RTD) is the preferred exit point. The following configuration sets the local preference for updates coming from AS300 to 200, and those coming from AS100 to 150.

RTC#
router bgp 256
 neighbor 1.1.1.1 remote-as 100
 neighbor 128.213.11.2 remote-as 256
 bgp default local-preference 150

RTD#
router bgp 256
 neighbor 3.3.3.4 remote-as 300
 neighbor 128.213.11.1 remote-as 256
 bgp default local-preference 200

In the above configuration, RTC sets the local preference of all updates to 150. RTD sets the local preference of all updates to 200. Because local preference is exchanged within AS256, both RTC and RTD will realize that network 170.10.0.0 has a higher local preference when coming from AS300 rather than when coming from AS100. All traffic in AS256 addressed to that network is sent to RTD as an exit point.

More flexibility is provided by using route maps. In the above example, all updates received by RTD are tagged with local preference 200 when they reach RTD. This means that updates coming from AS34 will also be tagged with the local preference of 200. This might not be needed. This is why we can use route maps to specify what specific updates need to be tagged with a specific local preference, as shown below:

RTD#
router bgp 256
 neighbor 3.3.3.4 remote-as 300
 neighbor 3.3.3.4 route-map setlocalin in
 neighbor 128.213.11.1 remote-as 256
....
ip as-path access-list 7 permit ^300$
...
route-map setlocalin permit 10
 match as-path 7
 set local-preference 200
route-map setlocalin permit 20
 set local-preference 150

With this configuration, any update coming from AS300 will be set with a local preference of 200. Any other updates such as those coming from AS34 will be set with a value of 150.
This section covers the metric attribute. The metric attribute, which is also called the Multi-Exit Discriminator (MED) in Border Gateway Protocol 4 (BGP4) or Inter-AS in BGP3, is a hint to external neighbors about the preferred path into an autonomous system (AS). It is a dynamic way to influence how another AS chooses to reach a certain route when there are multiple entry points into that AS. A path with a lower metric value is preferred. Unlike local preference, a metric is exchanged between ASs. A metric is carried into an AS but does not leave the AS. When an update enters the AS with a certain metric, that metric is used for decision-making inside the AS. When the same update is passed on to a third AS, that metric is set back to 0, as shown in the illustration below. The default metric value is 0.
Unless otherwise specified, a router will compare metrics for paths from neighbors in the same AS. In order for the router to compare metrics from neighbors coming from different ASs, the special configuration command bgp always-compare-med should be configured on the router. In the illustration, AS100 is getting information about network 180.10.0.0 via three different routers: RTC, RTD, and RTB. RTC and RTD are in AS300, and RTB is in AS400. Assume that the metric coming from RTC has been set to 120, the metric coming from RTD is set to 200 and the metric coming from RTB is set to 50. Given that by default a router compares metrics coming from neighbors in the same AS, RTA can compare only the metric coming from RTC to the metric coming from RTD, and will pick RTC as the best next hop because 120 is less than 200. When RTA gets an update from RTB with metric 50, it cannot compare it to 120 because RTC and RTB are in different ASs (RTA has to compare based on other attributes). In order to force RTA to compare the metrics from different ASs we have to configure bgp always-compare-med on RTA. This is illustrated in the configurations below:

RTA#
router bgp 100
 neighbor 2.2.2.1 remote-as 300
 neighbor 3.3.3.3 remote-as 300
 neighbor 4.4.4.3 remote-as 400
....

RTC#
router bgp 300
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-map setmetricout out
 neighbor 1.1.1.2 remote-as 300
route-map setmetricout permit 10
 set metric 120

RTD#
router bgp 300
 neighbor 3.3.3.2 remote-as 100
 neighbor 3.3.3.2 route-map setmetricout out
 neighbor 1.1.1.1 remote-as 300
route-map setmetricout permit 10
 set metric 200

RTB#
router bgp 400
 neighbor 4.4.4.4 remote-as 100
 neighbor 4.4.4.4 route-map setmetricout out
route-map setmetricout permit 10
 set metric 50

With the above configuration examples, RTA will pick RTC as next hop, assuming all other attributes are the same. In order to have RTB included in the metric comparison, we have to configure RTA as follows:

RTA#
router bgp 100
 neighbor 2.2.2.1 remote-as 300
 neighbor 3.3.3.3 remote-as 300
 neighbor 4.4.4.3 remote-as 400
 bgp always-compare-med

In this case, RTA will pick RTB as the best next hop in order to reach network 180.10.0.0.

A metric can also be set while redistributing routes into BGP using the default-metric command. In this case, RTB now injects a static network into AS100:

RTB#
router bgp 400
 redistribute static
 default-metric 50
ip route 180.10.0.0 255.255.0.0 null 0
!-- Causes RTB to send out 180.10.0.0 with a metric of 50
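The effect of bgp always-compare-med on the example above can be sketched as a grouping rule. This is a simplified Python model of the MED step only, under the assumption that paths are otherwise tied; the function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def survive_med_step(paths, always_compare_med=False):
    """Return the names of the paths that are not eliminated by the MED step.

    paths: iterable of (name, neighbor_as, med) tuples.
    By default MEDs are compared only among paths from the same neighboring
    AS; with always_compare_med every path lands in a single group.
    """
    groups = defaultdict(list)
    for name, neighbor_as, med in paths:
        key = None if always_compare_med else neighbor_as
        groups[key].append((name, med))

    survivors = []
    for members in groups.values():
        lowest = min(med for _, med in members)
        survivors += [name for name, med in members if med == lowest]
    return sorted(survivors)


# Updates for 180.10.0.0 as received by RTA in the example.
updates = [("RTC", 300, 120), ("RTD", 300, 200), ("RTB", 400, 50)]

print(survive_med_step(updates))                           # default behavior
print(survive_med_step(updates, always_compare_med=True))  # compare across ASs
```

In the default case RTD is eliminated by RTC, while RTB stays in the running because its MED is never compared with the AS300 values; the choice between RTC and RTB is then left to the other attributes. With always-compare-med only RTB survives.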
This Configuration Lab covers the Multi-Exit Discriminator (MED) attribute, which is also called the metric attribute. This attribute acts as a hint to external neighbors about the preferred path into an autonomous system (AS) when there are multiple entry points into the AS. A lower MED value is preferred over a higher MED value. The default value of the MED attribute is 0. The MED attribute gets reset to 0 when leaving the AS. Refer to the lab topology diagram below. You need to use the command bgp always-compare-med if you want to compare MEDs for the same path coming from different ASs.
This Configuration Lab has the following objectives:
- Learn how the routers in AS 1 are handling routes from AS 3.
- Use the MED attribute to influence the paths selected by AS 1.
- Verify that the configuration change is having the desired effect.
The topology graphic shows how the network is configured for BGP connectivity.
Following is a list of the Cisco IOS commands that are available to use in this lab:

Configuration Commands:
- bgp always-compare-med
- enable
- configure terminal
- exit
- ip as-path access-list access-list-number {permit|deny} as-regular-expression
- neighbor ip-address route-map
- router bgp autonomous-system number
- route-map
- match as-path
- set metric
- no synchronization
- clear ip bgp *
- help (?)

Show Commands:
- show running-config
- show ip bgp
- show ip bgp neighbors
- show ip bgp summary
- show ip route
- show ip route bgp
- show ip interface brief
- show version
1. AS 1 chooses all the paths through AS 4 for all destinations in AS 3.
2. R4 is configured to advertise a better metric than R6 to AS 1 for AS 3 destinations.
3. R3 and R7 are configured with the BGP command bgp always-compare-med so that the MEDs coming from AS 4 and AS 2 for AS 3 paths will be compared.
4. Verify the BGP configuration information and routes on R3 and R7.
- Interior Gateway Protocol (IGP): Network layer reachability information (NLRI) is interior to the originating autonomous system (AS). This normally happens when the Border Gateway Protocol (BGP) network command is used to inject the route. This is indicated with an "i" in the BGP table.
- Exterior Gateway Protocol (EGP): NLRI is learned via EGP. This is indicated with an "e" in the BGP table.
- INCOMPLETE: NLRI is unknown or learned via some other means. This usually occurs when we redistribute a route, such as a static route, into BGP; the origin of the route will then be incomplete. This is indicated with a "?" in the BGP table.
RTA#
router bgp 100
 neighbor 190.10.50.1 remote-as 100
 neighbor 170.10.20.2 remote-as 300
 network 150.10.0.0
 redistribute static
ip route 190.10.0.0 255.255.0.0 null0

RTB#
router bgp 100
 neighbor 150.10.30.1 remote-as 100
 network 190.10.50.0

RTE#
router bgp 300
 neighbor 170.10.20.1 remote-as 100
 network 170.10.0.0
- RTA will reach 170.10.0.0 via: 300 i (which means the next AS path is 300 and the origin of the route is IGP).
- RTA will reach 190.10.50.0 via: i (which means the entry is in the same AS and the origin is IGP).
- RTE will reach 150.10.0.0 via: 100 i (the next AS is 100 and the origin is IGP).
- RTE will also reach 190.10.0.0 via: 100 ? (the next AS is 100 and the origin is incomplete "?", coming from a static route).
This Configuration Lab covers the origin attribute. The origin attribute provides information about the origin of the route. This attribute is checked as Step 5 of the BGP Best Path Algorithm. The origin of a route can be one of three values: IGP, EGP, and Incomplete.
- IGP: The route is interior to the originating AS. This value is set when the network network-number BGP router configuration command is used to inject the route into Border Gateway Protocol (BGP). The Interior Gateway Protocol (IGP) origin type is represented by the letter i in the output of the show ip bgp EXEC command.
- EGP: This value is set when a route is learned via an Exterior Gateway Protocol (EGP). The EGP origin type is represented by the letter e in the output of the show ip bgp EXEC command.
- Incomplete: The origin of the route is unknown or learned in some other way. An origin of Incomplete occurs when a route is redistributed into BGP. The Incomplete origin type is represented by the ? symbol in the output of the show ip bgp EXEC command.
The path with the lowest origin type is preferred: IGP is lower than EGP, and EGP is lower than Incomplete. Refer to the lab topology diagram below. The objectives of this lab are as follows:
- On R5, verify the origin attribute for the routes from AS 4.
- Change the origin attribute for these routes from IGP (i) to Incomplete (?).
The topology graphic shows how the network is configured for BGP connectivity.
Following is a list of the Cisco IOS commands that are available to use in this lab:
Configuration Commands:
- enable
- configure terminal
- exit
- ip as-path access-list access-list-number {permit|deny} as-regular-expression
- network network-number route
- router bgp autonomous-system number
- route-map
- match as-path
- set origin incomplete
- clear ip bgp *
- help (?)

Show Commands:
- show running-config
- show ip route bgp
- show ip bgp
- show ip bgp neighbors
- show ip bgp summary
- show ip interface brief
- show version
This section covers the community attribute. The community attribute is a way to group destinations in a certain community and apply routing decisions (accept, prefer, redistribute, and so on) according to those communities.
BGP communities allow routers to filter incoming or outgoing BGP routes. The community attribute is a transitive, optional attribute in the range 0 to 4,294,967,200. We can use route maps to set the community attributes. The route map set command has the following syntax: set community community-number [additive] A few predefined well known communities (community-number) include the following:
- no-export: Do not advertise to external Border Gateway Protocol (eBGP) peers.
- no-advertise: Do not advertise this route to any peer.
- internet: Advertise this route to the Internet community; any router belongs to it.
Examples of route maps where community is set follow:

route-map communitymap
 match ip address 1
 set community no-advertise

or

route-map setcommunity
 match as-path 1
 set community 200 additive

With the keyword additive, community 200 is added to the communities that already exist in the community attribute of the route. If the keyword is not used, 200 replaces any currently existing community. Even if we set the community attribute, this attribute is not sent to neighbors by default. In order to send the attribute to your neighbor, you have to use the following:

neighbor {ip-address|peer-group-name} send-community

Here's an example:

RTA#
router bgp 100
 neighbor 3.3.3.3 remote-as 300
 neighbor 3.3.3.3 send-community
 neighbor 3.3.3.3 route-map setcommunity out
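The difference between replacing and adding a community, and the role of send-community, can be modeled in a few lines. This is a minimal Python sketch under the assumption that a route's communities behave like a set; the function names and the sample community 100 are hypothetical.

```python
def set_community(existing, value, additive=False):
    """Model 'set community value [additive]' on a route's community set."""
    if additive:
        return frozenset(existing) | {value}  # keep what is there, add one more
    return frozenset({value})                 # replace everything


def sent_to_neighbor(communities, send_community=False):
    """Communities are only included in updates when send-community is set."""
    return frozenset(communities) if send_community else frozenset()


route = frozenset({"100"})  # hypothetical community already on the route

replaced = set_community(route, "200")
added = set_community(route, "200", additive=True)

print(sorted(replaced))
print(sorted(added))
print(sorted(sent_to_neighbor(added)))
print(sorted(sent_to_neighbor(added, send_community=True)))
```

The third call shows why the neighbor send-community line matters: without it the attribute is set locally but never reaches the peer.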
This lab covers the community attribute. The community attribute provides a way of grouping destinations (called communities) to which routing decisions (such as acceptance, preference, and redistribution) can be applied. Route maps are used to set the community attribute. The range of this transitive attribute is 0 to 4,294,967,200. There are three predefined communities:
- no-export - Do not advertise this route to eBGP peers.
- no-advertise - Do not advertise this route to any peer.
- internet - Advertise this route to the Internet community; all routers in the network belong to it.
Refer to the lab topology diagram below. In the lab exercise, R7, in AS 1, wants to set a policy such that R6, its AS 2 neighbor, will not advertise its network 131.108.0.0 to any of its peers. This lab has the following objective:
The topology graphic shows how the network is configured for BGP connectivity.
Following is a list of the Cisco IOS commands that are available to use in this lab:

Configuration Commands:
- enable
- configure terminal
- exit
- access-list access-list-number permit network-number [mask network-mask]
- neighbor ip-address route-map
- neighbor ip-address send-community
- router bgp autonomous-system number
- route-map
- match ip address
- set community no-advertise
- clear ip bgp ip-address
This section covers the AS-PATH attribute. Whenever a route update passes through an autonomous system (AS), the AS number is prepended to that update.
The AS-PATH attribute is the list of AS numbers that a route has traversed in order to reach a destination. An AS-SET is an unordered mathematical set {} of all the ASs that have been traversed. An example of AS-SET is given later.
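The way the AS-PATH list grows can be sketched as follows. This is a minimal Python illustration rather than a protocol implementation: the function names are hypothetical, the AS numbers 200, 300, and 100 are chosen for illustration, and the loop check reflects the general BGP rule that a router rejects an update whose path already contains its own AS.

```python
def advertise_to_ebgp_peer(as_path, local_as):
    """Each AS puts its own number at the front of the path when the update leaves it."""
    return (local_as,) + tuple(as_path)


def accepts(as_path, local_as):
    """An update that already carries the local AS would form a loop."""
    return local_as not in as_path


path = ()                                 # route originated inside AS200
path = advertise_to_ebgp_peer(path, 200)  # leaves AS200
path = advertise_to_ebgp_peer(path, 300)  # crosses AS300

print(path)                # path as seen by a router in AS100
print(accepts(path, 100))  # AS100 is not in the path yet
print(accepts(path, 200))  # AS200 would see its own number
```

The most recently traversed AS is on the left and the originating AS on the right, which is why a route from AS200 that crosses AS300 is seen as (300, 200).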
In the above illustration, network 190.10.0.0 is advertised by Router B (RTB) in AS200. When that route traverses AS300, RTC prepends its own AS number to it. So when 190.10.0.0 reaches RTA, it has two AS numbers attached to it: first 200, then 300. As far as RTA is concerned, the path to reach 190.10.0.0 is (300,200). The same applies for 170.10.0.0 and 180.10.0.0. RTB has to take path (300,100); that is, traverse AS300 and then AS100 in order to reach 170.10.0.0. RTC has to traverse path (200) in order to reach 190.10.0.0 and path (100) in order to reach 170.10.0.0.

AS attribute lab commands
Following is a list of the Cisco IOS commands that are available to use in this lab:

Configuration Commands:
- enable
- configure terminal
- exit
- set as-path
- ip as-path access-list access-list-number {permit|deny} as-regular-expression
- neighbor ip-address route-map
- router bgp autonomous-system number
- route-map
- match as-path
- set as-path
- trace
- clear ip bgp ip-address
The BGP nexthop attribute is the next-hop IP address that is going to be used to reach a certain destination.
For external BGP (eBGP), the next hop is always the IP address of the neighbor specified in the neighbor command. In the above example, Router C (RTC) will advertise 170.10.0.0 to RTA with a next hop of 170.10.20.2 and RTA will advertise 150.10.0.0 to RTC with a next hop of 170.10.20.1. For internal BGP (iBGP), the protocol states that the next hop advertised by eBGP should be carried into iBGP. Because of that rule, RTA will advertise 170.10.0.0 to its iBGP peer RTB with a next hop of 170.10.20.2. So according to RTB, the next hop to reach 170.10.0.0 is 170.10.20.2 and not 150.10.30.1. You should ensure that RTB can reach 170.10.20.2 via an Interior Gateway Protocol (IGP); otherwise RTB will drop packets destined to 170.10.0.0 because the next-hop address would be inaccessible. For example, if RTB is running IGRP you could also run Interior Gateway Routing Protocol (IGRP) on RTA network 170.10.0.0. You would want to make IGRP passive on the link to RTC so BGP is only exchanged
RTA#
router bgp 100
 neighbor 170.10.20.2 remote-as 300
 neighbor 150.10.50.1 remote-as 100
 network 150.10.0.0

RTB#
router bgp 100
 neighbor 150.10.30.1 remote-as 100

RTC#
router bgp 300
 neighbor 170.10.20.1 remote-as 100
 network 170.10.0.0

*RTC will advertise 170.10.0.0 to RTA with a nexthop = 170.10.20.2.
*RTA will advertise 170.10.0.0 to RTB with a nexthop = 170.10.20.2. (The external NextHop via eBGP is sent via iBGP.)

Special care should be taken when dealing with multiaccess and NBMA networks, as described in the following sections.
The following example shows how the nexthop will behave on a multiaccess network such as Ethernet.
Assume that RTC and RTD in autonomous system 300 (AS300) are running Open Shortest Path First (OSPF). RTC is running BGP with RTA. RTC can reach network 180.20.0.0 via 170.10.20.3. When RTC sends a BGP update to RTA regarding 180.20.0.0, it will use 170.10.20.3 as the next hop and not its own IP address (170.10.20.2). This is because the network between RTA, RTC, and RTD is a multiaccess network, and it makes more sense for RTA to use RTD as a next hop to reach 180.20.0.0 rather than making an extra hop via RTC.

*RTC will advertise 180.20.0.0 to RTA with a NextHop 170.10.20.3.

If the common medium between RTA, RTC, and RTD is not a broadcast multiaccess network such as Ethernet but a nonbroadcast multiaccess (NBMA) network, further complications will occur.
If the common medium shown as the "cloud" above is Frame Relay or any other NBMA cloud, then exactly the same behavior will occur as if we were connected via Ethernet. RTC will advertise 180.20.0.0 to RTA with a next hop of 170.10.20.3. The problem is that RTA does not have a direct PVC to RTD, and cannot reach the next hop. In this case routing will fail.
In order to remedy this situation a command called "next-hop-self" is created.
The next-hop-self command exists to handle next-hop situations such as the one in the previous example. The syntax is:

neighbor {ip-address|peer-group-name} next-hop-self

The neighbor next-hop-self command forces BGP to advertise the router's own IP address as the next hop rather than letting the protocol choose the next hop. In the previous example, the following configuration solves our problem:

RTC#
router bgp 300
 neighbor 170.10.20.1 remote-as 100
 neighbor 170.10.20.1 next-hop-self

RTC advertises 180.20.0.0 with a nexthop = 170.10.20.2
One of the main enhancements of Border Gateway Protocol 4 (BGP4) over BGP3 is CIDR. CIDR, or supernetting, is a new way of looking at IP addresses. There is no notion of classes anymore (class A, B or C). For example, network 192.213.0.0, which used to be an illegal Class B network, is now a legal supernet represented by 192.213.0.0/16, where the 16 is the number of bits in the subnet mask counting from the far left of the IP address. This is similar to 192.213.0.0 255.255.0.0.
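The equivalence between the prefix-length notation and the dotted mask can be checked with Python's standard ipaddress module. This is only an illustration of the notation, using the 192.213.0.0 example from the text; the variable names are arbitrary.

```python
import ipaddress

# The same supernet written with a prefix length and with a dotted mask.
prefix_form = ipaddress.ip_network("192.213.0.0/16")
mask_form = ipaddress.ip_network("192.213.0.0/255.255.0.0")

print(prefix_form == mask_form)  # both notations describe one network
print(prefix_form.netmask)       # dotted mask for the /16
print(prefix_form.prefixlen)     # number of leading one-bits in the mask

# Under classful rules 192.x.x.x addresses were Class C networks (/24),
# so the /16 covers this many former Class C networks:
print(len(list(prefix_form.subnets(new_prefix=24))))
```

The last line shows why the text calls 192.213.0.0/16 a supernet: it groups 256 consecutive /24 networks under one prefix.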
Aggregates are used to minimize the size of routing tables. Aggregation is the process of combining the characteristics of several different routes in such a way that a single route can be advertised. In the example below, Router B (RTB) is generating network 160.10.0.0. We will configure RTC to propagate a supernet of that route 160.0.0.0 to RTA.
 network 160.10.0.0

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 network 170.10.0.0
 aggregate-address 160.0.0.0 255.0.0.0
There is a wide range of aggregate commands. It is important to understand how each one works in order to have the desired aggregation behavior. The first command is the one used in the previous example:

aggregate-address address mask

This will advertise the prefix route, and all of the more specific routes. The command aggregate-address 160.0.0.0 255.0.0.0 propagates an additional network 160.0.0.0 but does not prevent 160.10.0.0 from being also propagated to RTA. The outcome of this is that both networks 160.0.0.0 and 160.10.0.0 are propagated to RTA. This is what we mean by advertising the prefix and the more specific route.

You cannot aggregate an address if you do not have a more specific route of that address in the BGP routing table. For example, RTB cannot generate an aggregate for 160.0.0.0 if it does not have a more specific entry of 160.0.0.0 in its BGP table. The more specific route could have been injected into the BGP table via incoming updates from other autonomous systems (ASs), from redistributing an Interior Gateway Protocol (IGP) or static route into BGP, or via the network command (network 160.10.0.0).

If we would like RTC to propagate network 160.0.0.0 only and NOT the more specific route, then we would have to use the following:

aggregate-address address mask summary-only

This advertises the prefix only; all the more specific routes are suppressed. The command aggregate-address 160.0.0.0 255.0.0.0 summary-only propagates network 160.0.0.0 and suppresses the more specific route 160.10.0.0. Please note that if we are aggregating a network that is injected into our BGP via the network statement (example: network 160.10.0.0 on RTB), then the network entry is always injected into BGP updates even though we are using the aggregate-address address mask summary-only command. The upcoming CIDR example discusses this situation.

aggregate-address address mask as-set

This command advertises the aggregate entry in the BGP routing table along with the as-set information. An as-set is a mathematical set of ASs. This is discussed further in the section, CIDR Example 2, below.

If you want to suppress more specific routes when doing the aggregation, you can define a route map and apply it to the aggregates. This allows you to be selective about which more specific routes to suppress.

aggregate-address address mask suppress-map map-name

This command advertises the prefix route and the more specific routes, but it suppresses advertisement according to a route map. In the previous diagram, to aggregate 160.0.0.0 and suppress the more specific route 160.20.0.0 and allow 160.10.0.0 to be propagated, you can use the following route map:

route-map CHECK permit 10
 match ip address 1
access-list 1 permit 160.20.0.0 0.0.255.255
access-list 1 deny 0.0.0.0 255.255.255.255

By definition of suppress-map, any routes permitted by the access list are suppressed from the updates. Then apply the route map to the aggregate statement:

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 network 170.10.0.0
 aggregate-address 160.0.0.0 255.0.0.0 suppress-map CHECK

Another variation follows:

aggregate-address address mask attribute-map map-name

This allows you to set the attributes (e.g. origin) when aggregates are sent out. The following route map, when applied to the aggregate-address address mask attribute-map map-name command, sets the origin of the aggregates to IGP.

route-map SETORIGIN
 set origin igp
aggregate-address 160.0.0.0 255.0.0.0 attribute-map SETORIGIN
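The three aggregation variants described above (plain, summary-only, and suppress-map) differ only in which more specific routes are still advertised. The following Python sketch models that selection with the standard ipaddress module. It is a simplification with hypothetical function and parameter names, and it does not model the special case of routes injected locally with the network statement.

```python
import ipaddress


def advertised(bgp_table, aggregate, summary_only=False, suppress=()):
    """Return the prefixes sent to a neighbor, as a sorted list of strings.

    bgp_table: prefixes currently in the BGP table.
    suppress:  prefixes that a suppress-map access list would permit.
    """
    agg = ipaddress.ip_network(aggregate)
    routes = [ipaddress.ip_network(p) for p in bgp_table]
    specifics = [r for r in routes if r != agg and r.subnet_of(agg)]
    if not specifics:
        # No more specific route in the table: the aggregate is not generated.
        return [str(r) for r in sorted(routes)]

    suppressed = [ipaddress.ip_network(s) for s in suppress]
    sent = {agg}
    for r in routes:
        hidden = r in specifics and (
            summary_only or any(r.subnet_of(s) for s in suppressed)
        )
        if not hidden:
            sent.add(r)
    return [str(r) for r in sorted(sent)]


table = ["160.10.0.0/16", "160.20.0.0/16", "170.10.0.0/16"]
print(advertised(table, "160.0.0.0/8"))
print(advertised(table, "160.0.0.0/8", summary_only=True))
print(advertised(table, "160.0.0.0/8", suppress=["160.20.0.0/16"]))
```

The first call sends the aggregate plus both more specific routes, the second sends only the aggregate and the unrelated 170.10.0.0/16, and the third hides 160.20.0.0/16 while still sending 160.10.0.0/16.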
CIDR Example 1
In this example, RTB will advertise the prefix 160.0.0.0 and suppress all the more specific routes. The problem here is that network 160.10.0.0 is local to AS200, meaning AS200 is the originator of 160.10.0.0. You cannot have RTB generate a prefix for 160.0.0.0 without generating an entry for 160.10.0.0, even if you use the aggregate-address address mask summary-only command, because RTB is the originator of 160.10.0.0. There are two solutions to this problem.
The first solution is to use a static route and redistribute it into BGP. The outcome is that RTB will advertise the aggregate with an origin of incomplete (?).

RTB#
router bgp 200
 neighbor 3.3.3.1 remote-as 300
 redistribute static
 !-- This generates an update for 160.0.0.0
 !-- with the origin path as *incomplete*
ip route 160.0.0.0 255.0.0.0 null0

In the second solution, in addition to the static route we add an entry for the network command. This has the same effect except that the origin of the update is set to IGP.

RTB#
router bgp 200
 network 160.0.0.0 mask 255.0.0.0
 !-- This marks the update with origin IGP
 neighbor 3.3.3.1 remote-as 300
 redistribute static
ip route 160.0.0.0 255.0.0.0 null0
CIDR Example 2 (as-set) You can use the as-set statement in aggregation to reduce the size of the path information by listing the AS number only once, regardless of how many times it may have appeared in multiple paths that were aggregated. The aggregate-address as-set command is used in situations where aggregation of information causes loss of information regarding the path attribute. In the following example, RTC is getting updates about 160.20.0.0 from RTA and updates about 160.10.0.0 from RTB. Suppose RTC wants to aggregate network 160.0.0.0/8 and send it to RTD. RTD would not know what the origin of that route is. By adding the aggregate as-set statement, you force RTC to generate path information in the form of a set. All the path information is included in that set, regardless of which path came first.
RTB#
router bgp 200
 network 160.10.0.0
 neighbor 3.3.3.1 remote-as 300

RTA#
router bgp 100
 network 160.20.0.0
 neighbor 2.2.2.1 remote-as 300
Case 1: RTC does not have an as-set statement. RTC will send an update 160.0.0.0/8 to RTD with path information (300) as if the route originated from AS300.

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 neighbor 4.4.4.4 remote-as 400
 aggregate-address 160.0.0.0 255.0.0.0 summary-only
 !-- this causes RTC to send RTD updates about 160.0.0.0/8 with no indication
 !-- that 160.0.0.0 is actually coming from two different autonomous
 !-- systems; this may create loops if RTD has an entry back into AS100.

Case 2: RTC has an as-set statement.

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 neighbor 4.4.4.4 remote-as 400
 aggregate-address 160.0.0.0 255.0.0.0 summary-only
 aggregate-address 160.0.0.0 255.0.0.0 as-set
 !-- causes RTC to send RTD updates about 160.0.0.0/8 with an
 !-- indication that 160.0.0.0 belongs to a set {100 200}.
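The two cases can be compared with a small model of the path information that RTC attaches to the aggregate. This is a hedged Python sketch, not router code: the function names are hypothetical, the aggregate's path is represented as an ordered sequence plus a set, and the loop check assumes the usual rule that an AS rejects a path containing its own number.

```python
def aggregate_path(local_as, contributing_paths, as_set=False):
    """Return (sequence, set) path information for an aggregate.

    Without as_set the aggregate looks as if it originated in local_as.
    With as_set every AS from the contributing paths is listed once,
    regardless of which path it came from.
    """
    members = frozenset()
    if as_set:
        members = frozenset(asn for path in contributing_paths for asn in path)
    return (local_as,), members


def would_accept(path_info, receiving_as):
    """A receiver rejects the update if its own AS appears anywhere in it."""
    sequence, members = path_info
    return receiving_as not in sequence and receiving_as not in members


# RTC (AS300) aggregates 160.10.0.0 from AS200 and 160.20.0.0 from AS100.
case1 = aggregate_path(300, [(200,), (100,)])
case2 = aggregate_path(300, [(200,), (100,)], as_set=True)

print(case1, would_accept(case1, 100))
print(case2, would_accept(case2, 100))
```

In Case 1 a router in AS100 has no way to recognize that the aggregate contains its own routes, which is the loop risk mentioned in the configuration comments; in Case 2 the set {100 200} lets it detect and reject the update.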
This lab covers aggregate addresses and route aggregation. Aggregation is the process of combining several different routes in such a way that a single route can be advertised. In so doing, it will reduce the size of routing tables. Refer to the lab topology diagram below.
Examine the BGP information for specific networks on R8. Use the aggregate-address command with the suppress-map keyword to suppress the route 132.108.50.0/24, yet still advertise 132.108.10.0/24 along with the aggregate route.
Remember that when using suppress-map, anything the access-list permits gets suppressed and anything the access-list denies gets unsuppressed.
The topology graphic shows how the network is configured for BGP connectivity.
Configuration Commands:
enable
configure terminal
exit
access-list access-list-number {permit|deny}
aggregate-address
neighbor
router bgp autonomous-system number
route-map map-tag [permit|deny] [sequence-number]
match ip address {access-list-number|name}
clear ip bgp *
help (?)
Sending and receiving Border Gateway Protocol (BGP4) updates can be controlled by using a number of different filtering methods. BGP updates can be filtered based on route information, on path information, or on communities. All methods will achieve the same results; choosing one over the other depends on the specific network configuration. This section, BGP Filtering Methods, covers route filtering, path filtering, AS-path regular expressions, and community filtering.
To restrict the routing information that the router learns or advertises, you can filter BGP based on routing updates to or from a particular neighbor. In order to achieve this, an access list is defined and applied to the updates to or from a neighbor. Use the following command in the router configuration mode:

neighbor {ip-address|peer-group-name} distribute-list access-list-number {in|out}

In the following example, Router B (RTB) is originating network 160.10.0.0 and sending it to RTC. If RTC wanted to stop those updates from propagating to autonomous system 100 (AS100), we would have to define an access list to filter those updates and apply it when talking to RTA:

RTC#
router bgp 300
 network 170.10.0.0
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 distribute-list 1 out

access-list 1 deny 160.10.0.0 0.0.255.255
access-list 1 permit 0.0.0.0 255.255.255.255
!-- filter out all routing updates about 160.10.x.x

Using access lists is a bit tricky when you are dealing with supernets, which might cause some conflicts. Assume in the above example that RTB has different subnets of 160.10.X.X and our goal is to filter updates and advertise only 160.0.0.0/8 (this notation means that we are using 8 bits of subnet mask starting from the far left of the IP address; this is equivalent to 160.0.0.0 255.0.0.0). The command access-list 1 permit 160.0.0.0 0.255.255.255 permits 160.0.0.0/8, 160.0.0.0/9, and so on. To restrict the update to only 160.0.0.0/8, use an extended access list of the following format:

access-list 101 permit ip 160.0.0.0 0.255.255.255 255.0.0.0 0.0.0.0

This list permits 160.0.0.0/8 only. Another type of filtering is path filtering, which is described in the next section.
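The supernet problem above can be checked with a small Python sketch. It assumes the usual wildcard-mask rule (a 0 bit in the wildcard means "this bit must match") and models the extended list as matching the prefix with the first address/wildcard pair and the netmask with the second pair. The helper names are hypothetical.

```python
import ipaddress


def _bits(dotted):
    """Convert a dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def wildcard_match(value, base, wildcard):
    """Compare value and base only on the bits where the wildcard is 0."""
    care = ~_bits(wildcard) & 0xFFFFFFFF
    return (_bits(value) & care) == (_bits(base) & care)


def standard_acl_permits(prefix, base, wildcard):
    # A standard list looks at the network address only, not the mask.
    return wildcard_match(prefix, base, wildcard)


def extended_acl_permits(prefix, mask, base, wildcard, mask_base, mask_wildcard):
    # The second address/wildcard pair is applied to the netmask.
    return (wildcard_match(prefix, base, wildcard)
            and wildcard_match(mask, mask_base, mask_wildcard))


for mask in ("255.0.0.0", "255.128.0.0"):  # 160.0.0.0/8 and 160.0.0.0/9
    std = standard_acl_permits("160.0.0.0", "160.0.0.0", "0.255.255.255")
    ext = extended_acl_permits("160.0.0.0", mask, "160.0.0.0",
                               "0.255.255.255", "255.0.0.0", "0.0.0.0")
    print(mask, "standard:", std, "extended:", ext)
```

The standard list cannot tell the /8 from the /9 because it never inspects the mask, while the extended form pins the mask to exactly 255.0.0.0.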
You can specify an access list on both incoming and outgoing updates based on the BGP AS path information. In the above figure, you can block updates about 160.10.0.0 from going to AS100 by defining an access list on RTC that prevents any updates that have originated from AS200 from being sent to AS100. To do this, use the following statements:

ip as-path access-list access-list-number {permit|deny} as-regular-expression
neighbor {ip-address|peer-group-name} filter-list access-list-number {in|out}

The following example stops RTC from sending RTA updates about 160.10.0.0:

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 filter-list 1 out
 !-- "1" refers to the access list below

ip as-path access-list 1 deny ^200$
ip as-path access-list 1 permit .*

In the above example, access list 1 states: Deny any updates with path information that starts with 200 (^) and ends with 200 ($). The ^200$ is called a regular expression, with ^ meaning "starts with" and $ meaning "ends with." Because RTB sends updates about 160.10.0.0 with path information starting with 200 and ending with 200, this update matches the access list and will be denied. The .* is another regular expression, with the period meaning "any character" and the * meaning "the repetition of that character." So .* matches any path information, which is needed to permit all other updates to be sent.

What would happen if instead of using ^200$, you used ^200? If you have an AS400 (see figure above), updates originated by AS400 will have path information of the form (200, 400), with 200 being first and 400 being last. Those updates will match the access list ^200 because they start with 200. Hence, they will also be prevented from being sent to RTA.

A good way to check whether we have implemented the correct regular expression is to use the show ip bgp regexp regular-expression command. This shows all the paths that have matched the configured regular expression.
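The first-match behavior of the filter list can be tried out offline. The sketch below is an approximation in Python: it assumes the AS path is written as a space-separated string and relies on the fact that these particular patterns mean the same thing in Python's `re` module. The names `AS_PATH_ACL_1` and `filter_list_permits` are hypothetical.

```python
import re

# Entries of "ip as-path access-list 1", in configured order.
AS_PATH_ACL_1 = [
    ("deny", r"^200$"),
    ("permit", r".*"),
]


def filter_list_permits(acl, as_path):
    """Return True if the first matching entry is a permit.

    as_path is the path as a space-separated string, e.g. "200 400".
    A path that matches no entry is denied (implicit deny).
    """
    for action, pattern in acl:
        if re.search(pattern, as_path):
            return action == "permit"
    return False


print(filter_list_permits(AS_PATH_ACL_1, "200"))      # originated by AS200
print(filter_list_permits(AS_PATH_ACL_1, "200 400"))  # originated by AS400

# The looser pattern ^200 also catches paths that merely start with 200.
loose_acl = [("deny", r"^200"), ("permit", r".*")]
print(filter_list_permits(loose_acl, "200 400"))
```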
The next section explains what is involved in creating a regular expression.
A regular expression is a pattern to match against an input string. By building a regular expression, you specify a string that input must match. In the case of BGP, you are specifying a string consisting of path information that an input should match. The previous example specified the string ^200$ and required the path information carried in updates to match it in order to make a decision. A regular expression is composed of the following:
Range
A range is a sequence of characters contained within left and right square brackets (example: [abcd]).

Atom
An atom is a single character, such as the following:
 . (matches any single character)
 ^ (matches the beginning of the input string)
 $ (matches the end of the input string)
 \character (matches the character)
 _ (matches a comma (,), left brace ({), right brace (}), the beginning of the input string, the end of the input string, or a space)

Piece
A piece is an atom followed by one of the following symbols:
 * (matches 0 or more sequences of the atom)
 + (matches one or more sequences of the atom)
 ? (matches the atom or the null string)

Branch
A branch is zero or more concatenated pieces.

Examples of regular expressions follow:
 a* (any occurrence of the letter "a," including none)
 a+ (at least one occurrence of the letter "a" should be present)
 ab?a (matches "aa" or "aba")
 _100_ (via AS100)
 ^100$ (origin AS100 only)
 ^100 .* (coming from AS100)
 ^$ (originated from this AS)
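The examples above can be tested away from a router with a rough translation into Python's `re` syntax. The only atom that needs rewriting is the underscore, which has no Python equivalent; the helper names below are hypothetical, and the AS path is assumed to be a space-separated string.

```python
import re


def cisco_to_python(pattern):
    """Translate the "_" atom; the other atoms used here behave the same."""
    # "_" matches a comma, a brace, a space, or either end of the string.
    return pattern.replace("_", "(?:^|$|[ ,{}])")


def path_matches(pattern, as_path):
    """True if the pattern matches anywhere in the AS-path string."""
    return re.search(cisco_to_python(pattern), as_path) is not None


examples = [
    ("_100_", "200 100 300"),   # via AS100
    ("_100_", "200 1000 300"),  # AS1000 is not AS100
    ("^100$", "100"),           # origin AS100 only
    ("^100 .*", "100 200"),     # coming from AS100
    ("^$", ""),                 # empty path: originated from this AS
]
for pattern, as_path in examples:
    print(repr(pattern), repr(as_path), path_matches(pattern, as_path))
```

The second example shows why the underscore matters: without the delimiters, a bare 100 would also match inside 1000.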
In addition to route filtering and as-path filtering, another method is community filtering. Community filtering has been discussed previously; following are a few examples of how to use it.
In the network illustrated above, RTB is to set the community attribute to the BGP routes it is advertising such that RTC would not propagate these routes to its external peers. The no-export community attribute is used:
RTB#
router bgp 200
 network 160.10.0.0
 neighbor 3.3.3.1 remote-as 300
 neighbor 3.3.3.1 send-community
 neighbor 3.3.3.1 route-map setcommunity out

route-map setcommunity
 match ip address 1
 set community no-export

access-list 1 permit 0.0.0.0 255.255.255.255

Note that we have used the route-map and set community commands in order to set the community to no-export. Note also that we had to use the neighbor send-community command in order to send this attribute to RTC. When RTC gets the updates with the attribute no-export, it will not propagate them to its external peer RTA.

In the example below, RTB has set the community attribute to 100 200 additive. The value 100 200 will be added to any existing community value before being sent to RTC.

RTB#
router bgp 200
 network 160.10.0.0
 neighbor 3.3.3.1 remote-as 300
 neighbor 3.3.3.1 send-community
 neighbor 3.3.3.1 route-map setcommunity out

route-map setcommunity
 match ip address 2
 set community 100 200 additive

access-list 2 permit 0.0.0.0 255.255.255.255

A community list is a group of communities that we use in a match clause of a route map, which allows us to do filtering or set attributes based on different lists of community numbers. The community list command syntax follows:

ip community-list community-list-number {permit|deny} community-number
For example, you can define the following route map, "match-on-community":

route-map match-on-community
 match community 10
 !-- 10 is the community-list number
 set weight 20

ip community-list 10 permit 200 300
!-- 200 300 is the community number

We can use the above in order to filter or set certain parameters like weight and metric based on the community value in certain updates. In the second example above, RTB was sending updates to RTC with a community of 100 200. If RTC wants to set the weight based on those values, you could do the following:

RTC#
router bgp 300
 neighbor 3.3.3.3 remote-as 200
 neighbor 3.3.3.3 route-map check-community in

route-map check-community permit 10
 match community 1
 set weight 20

route-map check-community permit 20
 match community 2 exact
 set weight 10

route-map check-community permit 30
 match community 3

ip community-list 1 permit 100
ip community-list 2 permit 200
ip community-list 3 permit internet

In the above example, any route that has 100 in its community attribute will match list 1 and will have the weight set to 20. Any route that has only 200 as community will match list 2 and will have the weight set to 10. The keyword "exact" states that the community should consist of 200 only and nothing else. The last community list is here to make sure that other updates are not dropped. Remember that anything that does not match is dropped by default. The keyword "internet" means "all routes," because all routes are members of the Internet community.
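The way the check-community route map walks its sequences can be sketched as follows. This is a simplified model with hypothetical names: communities are plain integers in a Python set, the route map is a list of tuples, and a weight of `None` means the sequence permits the route without changing it.

```python
INTERNET = "internet"  # stands for the well-known "internet" community

# ip community-list 1 / 2 / 3 from the RTC configuration above.
COMMUNITY_LISTS = {1: {100}, 2: {200}, 3: INTERNET}

# (sequence, community-list number, exact match?, weight to set or None)
ROUTE_MAP = [
    (10, 1, False, 20),
    (20, 2, True, 10),
    (30, 3, False, None),
]


def list_matches(list_number, communities, exact):
    wanted = COMMUNITY_LISTS[list_number]
    if wanted == INTERNET:
        return True                    # every route is a member
    if exact:
        return communities == wanted   # nothing else may be present
    return wanted <= communities       # all listed values are present


def check_community(communities, current_weight=0):
    """Return the resulting weight, or None if the route is dropped."""
    for _sequence, list_number, exact, weight in ROUTE_MAP:
        if list_matches(list_number, communities, exact):
            return current_weight if weight is None else weight
    return None  # no sequence matched: dropped by default


print(check_community({100, 200}))  # matches sequence 10
print(check_community({200}))       # matches sequence 20 (exact)
print(check_community({200, 300}))  # falls through to sequence 30
```

Removing sequence 30 from `ROUTE_MAP` makes the last call return `None`, which mirrors the "anything that does not match is dropped" rule.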
This Border Gateway Protocol (BGP) Configuration Lab introduces AS-PATH filtering and prefix filtering to control the sending and receiving of BGP information.
AS-PATH filtering uses an access list to filter BGP updates based on the value of the AS-PATH attribute. Prefix filtering is used to restrict the routing information that a router learns or advertises. This filter can be applied to routing updates or to a
Configuration Commands:
enable
configure terminal
exit
neighbor ip-address filter-list filter-list-number
neighbor ip-address distribute-list distribute-list-number in
router bgp autonomous-system number
clear ip bgp ip-address soft in
The neighbor command can be used in conjunction with route maps to perform either filtering or parameter setting on incoming and outgoing updates. Route maps associated with the neighbor statement have no effect on incoming updates when matching based on the IP address:

neighbor ip-address route-map route-map-name {in|out}
Assume in the above diagram that we want Router C (RTC) to learn from autonomous system 200 (AS200) about networks that are local to AS200 and nothing else. Also, we want to set the weight on the accepted routes to 20. We can achieve this with a combination of neighbor and as-path access lists.

RTC#
router bgp 300
 network 170.10.0.0
 neighbor 3.3.3.3 remote-as 200
 neighbor 3.3.3.3 route-map stamp in

route-map stamp
 match as-path 1
 set weight 20

ip as-path access-list 1 permit ^200$

Any updates that originate from AS200 will be permitted. Any other updates will be dropped. The next example does the following:
- Updates originating from AS200 will be accepted with weight 20.
- Updates originating from AS400 will be dropped because they do not match either ip as-path access list.
- Updates from AS100 will have a weight of 10 because they will match ip as-path access-list 2.
RTC#
router bgp 300
 network 170.10.0.0
 neighbor 3.3.3.3 remote-as 200
 neighbor 3.3.3.3 route-map stamp in

route-map stamp permit 10
 match as-path 1
 set weight 20

route-map stamp permit 20
 match as-path 2
 set weight 10

ip as-path access-list 1 permit ^200$
ip as-path access-list 2 permit ^200 600 .*
Use of set as-path prepend Command with Route Maps In some situations, we are forced to manipulate the path information in order to manipulate the BGP decision process. The command that is used with a route map follows: set as-path prepend as_number as_number Suppose in the diagram below that RTC is advertising its own network 170.10.0.0 to two different ASs: AS100 and AS200. When the information is propagated to AS600, the routers in AS600 will have network reachability information about 170.10.0.0 via two different routes, the first route is via AS100 with path (100, 300) and the second one is via AS400 with path (400, 200, 300). Assuming that all other attributes are the same, AS600 will pick the shortest path and will choose the route via AS100.
AS600 will be getting all its traffic via AS100. If we want to influence this decision from the AS300 end we can make the PATH through AS100 look like it is longer than the PATH going through AS400. We can do this by prepending AS numbers to the existing path info advertised to AS100. A common practice is to repeat our own AS number using the following:
RTC#
router bgp 300
 network 170.10.0.0
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-map SETPATH out

route-map SETPATH
 set as-path prepend 300 300

Because of the above configuration, AS600 will receive updates about 170.10.0.0 via AS100 with path information of (100, 300, 300, 300), which is longer than the path (400, 200, 300) received from AS400.
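The effect of prepending on the path-length comparison can be reproduced with a short Python sketch. It models only the AS path length step of the decision process and assumes all other attributes are equal, as the text does; the function name `advertise` is hypothetical.

```python
def advertise(as_path, local_as, prepend_count=0):
    """AS path as sent to an eBGP neighbor.

    The local AS is always added once; prepend_count adds extra copies,
    as "set as-path prepend" does in an outbound route map.
    """
    return [local_as] * (1 + prepend_count) + list(as_path)


# 170.10.0.0 originates in AS300 and reaches AS600 by two routes.
via_100 = advertise(advertise([], 300, prepend_count=2), 100)
via_400 = advertise(advertise(advertise([], 300), 200), 400)

# With everything else equal, the shorter AS path wins.
preferred = min((via_100, via_400), key=len)
print(via_100, via_400, preferred)
```

Setting `prepend_count=0` in the first call gives the path (100, 300) again, and AS600 goes back to preferring the route via AS100.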
Consider the above diagram: RTA and RTC are running external BGP (eBGP), and RTB and RTC are running eBGP. RTA and RTB are running some kind of Interior Gateway Protocol (IGP) (for example, Routing Information Protocol [RIP] or Interior Gateway Routing Protocol [IGRP]). By definition, eBGP updates have a distance of 20, which is lower than the IGP distances. Default distance is 120 for RIP, 100 for IGRP, 90 for Enhanced Interior Gateway Routing Protocol (EIGRP), and 110 for Open Shortest Path First (OSPF). RTA will receive updates about 160.10.0.0 via two routing protocols: eBGP, with a distance of 20, and the IGP, with a distance greater than 20.
RTA will pick eBGP via RTC because of the lower distance. By default, BGP has the following distances:

- External distance: 20
- Internal distance: 200
- Local distance: 200
But these could be changed by the distance command:

distance bgp external-distance internal-distance local-distance

If we want RTA to learn about 160.10.0.0 via RTB (IGP), then we have two options:

- Change eBGP's external distance or IGP's distance, which is not recommended.
- Use BGP backdoor.
BGP backdoor makes the IGP route the preferred route. Use the network address backdoor command. The configured network is the network that we would like to reach via IGP. For BGP this network will be treated as a locally assigned network except it will not be advertised in BGP updates.

RTA#
router eigrp 10
 network 160.10.0.0

router bgp 100
 neighbor 2.2.2.1 remote-as 300
 network 160.10.0.0 backdoor

Network 160.10.0.0 is treated as a local entry, but is not advertised as a normal network entry. RTA learns 160.10.0.0 from RTB via EIGRP with distance 90, and also learns it from RTC via eBGP with distance 20. Normally eBGP is preferred, but because of the backdoor statement, EIGRP is preferred.
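The outcome can be illustrated with a small Python sketch of the distance comparison. It assumes that a backdoor network is evaluated with the BGP local distance (200 by default) instead of the external distance of 20; the table and the function `best_source` are hypothetical names used only for this illustration.

```python
# Default administrative distances quoted in this section.
DEFAULT_DISTANCE = {
    "ebgp": 20, "eigrp": 90, "igrp": 100,
    "ospf": 110, "rip": 120, "ibgp": 200,
}
LOCAL_BGP_DISTANCE = 200  # assumed distance applied to a backdoor network


def best_source(candidates, backdoor=False):
    """Pick the protocol with the lowest distance for one prefix."""
    def distance(protocol):
        if backdoor and protocol == "ebgp":
            return LOCAL_BGP_DISTANCE
        return DEFAULT_DISTANCE[protocol]
    return min(candidates, key=distance)


# RTA hears 160.10.0.0 from RTC (eBGP) and from RTB (EIGRP).
print(best_source(["ebgp", "eigrp"]))
print(best_source(["ebgp", "eigrp"], backdoor=True))
```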
A BGP peer group is a group of BGP neighbors with the same update policies. Update policies are usually set by route maps, distribute lists, filter lists, and so on. Instead of defining the same policies for each separate neighbor, we define a peer group name and we assign these policies to the peer group. Members of the peer group inherit all the configuration options of the peer group. Members can also be configured to override these options if these options do not affect outbound updates; you can only override options set on the inbound. To define a peer group, use the following command: neighbor peer-group-name peer-group In the following example, we will see how peer groups are applied to internal and external BGP neighbors.
RTC#
router bgp 300
 neighbor internalmap peer-group
 neighbor internalmap remote-as 300
 neighbor internalmap route-map SETMETRIC out
 neighbor internalmap filter-list 1 out
 neighbor internalmap filter-list 2 in
 neighbor 5.5.5.2 peer-group internalmap
 neighbor 5.6.6.2 peer-group internalmap
 neighbor 3.3.3.2 peer-group internalmap
 neighbor 3.3.3.2 filter-list 3 in

In the above configuration, a peer group named "internalmap" is defined along with some policies for that group: a route map SETMETRIC to set the metric and two different filter lists, 1 and 2. The peer group is applied to all internal neighbors: RTF, RTG, and RTE. A separate filter-list 3 is defined for neighbor RTE, and will override filter-list 2 inside the peer group. The only options that can be overridden are options that affect inbound updates.

Now, look at how peer groups are used with external neighbors. In the same diagram we will configure RTC with a peer group externalmap and we will apply it to external neighbors.

RTC#
router bgp 300
 neighbor externalmap peer-group
 neighbor externalmap route-map SETMETRIC out
 neighbor externalmap filter-list 1 out
 neighbor externalmap filter-list 2 in
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 peer-group externalmap
 neighbor 4.4.4.2 remote-as 600
 neighbor 4.4.4.2 peer-group externalmap
 neighbor 1.1.1.2 remote-as 200
 neighbor 1.1.1.2 peer-group externalmap
 neighbor 1.1.1.2 filter-list 3 in

In the above configuration, we have defined the remote-as statements outside the peer group because we have to define different external ASs. We also overrode the inbound policy for neighbor 1.1.1.2 by assigning filter-list 3.
This section covers Border Gateway Protocol (BGP4) confederation. Confederation and the following subject, route reflectors, are designed for Internet service providers (ISPs) who would like to further control the explosion of internal BGP (iBGP) peering inside their autonomous systems (ASs).
BGP confederation is implemented in order to reduce the internal BGP (iBGP) mesh inside an AS. The trick is to divide an AS into multiple ASs and assign the whole group to a single confederation. Each AS by itself will have iBGP fully meshed within it and will have connections to other ASs inside the confederation. Even though these ASs will have external BGP (eBGP) peers to ASs within the confederation, they exchange routing as if they were using iBGP; next-hop, metric, and local preference information are preserved. To the outside world, the confederation (the group of ASs) will look like a single AS.
To configure a BGP confederation, use the following:

bgp confederation identifier autonomous-system

The confederation identifier will be the AS number of the confederation group. The group of ASs will look to the outside world as one AS, with the AS number being the confederation identifier. Peering within the confederation between multiple ASs is done via the following command:

bgp confederation peers autonomous-system [autonomous-system]

The following is an example of confederation:
Let us assume that you have an AS500 consisting of nine BGP speakers (other non-BGP speakers exist also, but we are interested only in the BGP speakers that have eBGP connections to other ASs). If you want to make a full iBGP mesh inside AS500, then you would need nine peer connections for each router: eight iBGP peers and one eBGP peer to external ASs. By using confederation, we can divide AS500 into multiple ASs: AS50, AS60, and AS70. We give the AS a confederation identifier of 500. The outside world will see only one AS500. For each of AS50, AS60, and AS70, we define a full mesh of iBGP peers and we define the list of confederation peers using the bgp confederation peers command. Let's look at a sample configuration of routers RTC, RTD, and RTA. Note that RTA has no knowledge of ASs 50, 60, or 70. RTA has knowledge only of AS500.

RTC#
router bgp 50
 bgp confederation identifier 500
 bgp confederation peers 60 70
 neighbor 128.213.10.1 remote-as 50 (iBGP connection within AS50)
 neighbor 128.213.20.1 remote-as 50 (iBGP connection within AS50)
 neighbor 129.210.11.1 remote-as 60 (BGP connection with confederation peer 60)
 neighbor 135.212.14.1 remote-as 70 (BGP connection with confederation peer 70)
 neighbor 5.5.5.5 remote-as 100 (eBGP connection to external AS100)
RTD#
router bgp 60
 bgp confederation identifier 500
 bgp confederation peers 50 70
 neighbor 129.210.30.2 remote-as 60 (iBGP connection within AS60)
 neighbor 128.213.30.1 remote-as 50 (BGP connection with confederation peer 50)
 neighbor 135.212.14.1 remote-as 70 (BGP connection with confederation peer 70)
 neighbor 6.6.6.6 remote-as 600 (eBGP connection to external AS600)

RTA#
router bgp 100
 neighbor 5.5.5.4 remote-as 500 (eBGP connection to confederation 500)
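The saving in iBGP sessions can be estimated with a quick calculation. The sketch below uses the nine speakers from the text; the split into three sub-ASs of three routers each is an assumption made only for the arithmetic, since the actual membership depends on the diagram, and the function names are hypothetical.

```python
def full_mesh_sessions(routers):
    """Number of sessions in a full mesh of the given number of routers."""
    return routers * (routers - 1) // 2


def confederation_ibgp_sessions(sub_as_sizes):
    """iBGP sessions when each sub-AS is fully meshed on its own."""
    return sum(full_mesh_sessions(size) for size in sub_as_sizes)


speakers = 9
print(full_mesh_sessions(speakers))            # sessions in one flat AS500
print(speakers - 1)                            # iBGP peers per router
print(confederation_ibgp_sessions([3, 3, 3]))  # assumed 3 + 3 + 3 split
```

The sessions between sub-ASs, such as RTC to RTD, come on top of the intra-sub-AS figure and depend on how the sub-ASs are interconnected.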
As demonstrated in the iBGP section, a Border Gateway Protocol (BGP) speaker will not advertise a route learned via another iBGP speaker to a third iBGP speaker. By relaxing this restriction a bit and by providing additional control, we can allow a router to advertise (reflect) iBGP-learned routes to other iBGP speakers. This will reduce the number of iBGP peers within an AS.
In normal cases, a full iBGP mesh should be maintained between Router A (RTA), RTB, and RTC within AS100. By utilizing the route reflector (RR) concept, RTC could be elected as a RR and have a partial iBGP peering with RTA and RTB. Peering between RTA and RTB is not needed because RTC will be a route reflector for the updates coming from RTA and RTB.

neighbor route-reflector-client

The router with the above command would be the RR and the neighbors pointed at would be the clients of that RR. In our example, RTC would be configured with the neighbor route-reflector-client command pointing at the RTA and RTB IP addresses. The combination of the RR and its clients is called a cluster. RTA, RTB, and RTC above would form a cluster with a single RR within AS100. Other iBGP peers of the RR that are not clients are called nonclients. An AS can have more than one RR; a RR would treat other RRs just like any other iBGP speaker. Other RRs could belong to the same cluster (client group) or to other clusters. In a simple configuration, the AS could be divided into multiple clusters; each RR will be configured with other RRs as nonclient peers in a fully meshed topology. Clients should not peer with iBGP speakers outside their cluster.
Consider the above diagram. RTA, RTB, and RTC form a single cluster with RTC being the RR. According to RTC, RTA and RTB are clients and anything else is a nonclient. Remember that clients of an RR are pointed at using the neighbor route-reflector-client command. RTD is the RR for its clients RTE and RTF; RTG is a RR in a third cluster. Note that RTD, RTC, and RTG are fully meshed but routers within a cluster are not. When a route is received by a RR, it will do the following, depending on the peer type:
- Route from a nonclient peer: reflect to all the client peers within the cluster.
- Route from a client peer: reflect to all the client and nonclient peers.
- Route from an external BGP (eBGP) peer: send the update to all client and nonclient peers.
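These three rules translate directly into a small lookup. The Python sketch below models RTC's view of its iBGP peers as described in this section (RTA and RTB as clients, RTD and RTG as nonclients); the dictionary, the function name, and the peer name RTX are hypothetical, and the sender is excluded from the result because an update is not sent back to the peer it came from.

```python
# RTC's iBGP peers and their roles.
PEERS = {"RTA": "client", "RTB": "client", "RTD": "nonclient", "RTG": "nonclient"}


def reflect_to(sender, sender_type, peers=PEERS):
    """Return the iBGP peers that should receive a route, by sender type."""
    if sender_type == "nonclient":
        allowed = {"client"}
    elif sender_type in ("client", "ebgp"):
        allowed = {"client", "nonclient"}
    else:
        raise ValueError(f"unknown peer type: {sender_type}")
    return sorted(name for name, kind in peers.items()
                  if kind in allowed and name != sender)


print(reflect_to("RTD", "nonclient"))  # only the clients
print(reflect_to("RTA", "client"))     # other client plus nonclients
print(reflect_to("RTX", "ebgp"))       # RTX: hypothetical eBGP peer
```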
The following is the relevant BGP configuration of routers RTC, RTD, and RTB:

RTC#
router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-reflector-client
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 route-reflector-client
 neighbor 7.7.7.7 remote-as 100
 neighbor 4.4.4.4 remote-as 100
 neighbor 8.8.8.8 remote-as 200

RTB#
router bgp 100
 neighbor 3.3.3.3 remote-as 100
 neighbor 12.12.12.12 remote-as 300

RTD#
router bgp 100
 neighbor 6.6.6.6 remote-as 100
 neighbor 6.6.6.6 route-reflector-client
 neighbor 5.5.5.5 remote-as 100
 neighbor 5.5.5.5 route-reflector-client
 neighbor 7.7.7.7 remote-as 100
 neighbor 3.3.3.3 remote-as 100

As the iBGP-learned routes are reflected, it is possible to have the routing information loop. The RR scheme has a few methods to avoid this:

1. Originator ID: This is an optional, nontransitive BGP attribute that is four bytes long and is created by a RR. This attribute will carry the router ID (RID) of the originator of the route in the local AS. Thus, if the routing information comes back to the originator because of poor configuration, it will be ignored.
2. Cluster list: This is discussed in the next section.
Usually, a cluster of clients will have a single RR. In this case, the cluster will be identified by the router ID of the RR. In order to increase redundancy and avoid single points of failure, a cluster might have more than one RR. All RRs in the same cluster need to be configured with a 4-byte cluster ID so that a RR can recognize updates from RRs in the same cluster. A cluster list is a sequence of cluster IDs that the route has passed. When a RR reflects a route from its clients to nonclients outside the cluster, it will append the local cluster ID to the cluster list. If this update has an empty cluster list, the RR will create one. Using this attribute, a RR can identify if the routing information is looped back to the same cluster because of poor configuration. If the local cluster ID is found in the cluster list, the advertisement will be ignored.

In the above diagram, RTD, RTE, RTF, and RTH belong to one cluster, with both RTD and RTH being RRs for the same cluster. Note the redundancy in that RTH has a fully meshed peering with all the RRs. In case RTD goes down, RTH will take its place. The following are the configurations of RTH, RTD, RTF, and RTC:

RTH#
router bgp 100
 neighbor 4.4.4.4 remote-as 100
 neighbor 5.5.5.5 remote-as 100
 neighbor 5.5.5.5 route-reflector-client
 neighbor 6.6.6.6 remote-as 100
 neighbor 6.6.6.6 route-reflector-client
 neighbor 7.7.7.7 remote-as 100
 neighbor 3.3.3.3 remote-as 100
 neighbor 9.9.9.9 remote-as 300
 bgp cluster-id 10

RTD#
router bgp 100
 neighbor 10.10.10.10 remote-as 100
 neighbor 5.5.5.5 remote-as 100
 neighbor 5.5.5.5 route-reflector-client
 neighbor 6.6.6.6 remote-as 100
 neighbor 6.6.6.6 route-reflector-client
 neighbor 7.7.7.7 remote-as 100
 neighbor 3.3.3.3 remote-as 100
 neighbor 11.11.11.11 remote-as 400
 bgp cluster-id 10

RTF#
router bgp 100
 neighbor 10.10.10.10 remote-as 100
 neighbor 4.4.4.4 remote-as 100
 neighbor 13.13.13.13 remote-as 500

RTC#
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 route-reflector-client
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-reflector-client
 neighbor 4.4.4.4 remote-as 100
 neighbor 7.7.7.7 remote-as 100
 neighbor 10.10.10.10 remote-as 100
 neighbor 8.8.8.8 remote-as 200

There was no need for the cluster command for RTC because only one RR exists in that cluster.

An important thing to note is that peer groups were not used in the above configuration. If the clients inside a cluster do not have direct iBGP peers among one another and they exchange updates through the RR, peer groups should not be used. If peer groups were to be configured, then a potential withdrawal to the source of a route on the RR would be sent to all clients inside the cluster and could cause problems. The router subcommand bgp client-to-client reflection is enabled by default on the RR. If BGP client-to-client reflection were turned off on the RR and redundant BGP peering were made between the clients, then using peer groups would be acceptable.
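The cluster-list check described in this section can be modeled in a few lines of Python. This is a simplified sketch with hypothetical names: an update is a dictionary, the cluster ID is appended as the text describes, and a return value of `None` stands for "advertisement ignored".

```python
def reflect(update, local_cluster_id):
    """Reflect an update, or return None if it already passed this cluster."""
    cluster_list = update.get("cluster_list", [])
    if local_cluster_id in cluster_list:
        return None  # looped back to the same cluster: ignore it
    reflected = dict(update)
    # Creates the list if the update had none, otherwise extends it.
    reflected["cluster_list"] = cluster_list + [local_cluster_id]
    return reflected


update = {"prefix": "160.10.0.0"}
from_rtd = reflect(update, 10)     # RTD, cluster ID 10
print(from_rtd)
print(reflect(from_rtd, 10))       # RTH shares cluster ID 10
print(reflect(from_rtd, 20))       # a RR in another cluster (ID 20 assumed)
```

Because RTD and RTH are configured with the same cluster ID, an update that one of them has already reflected is recognized and dropped by the other.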
It is normal in an AS to have BGP speakers that do not understand the concept of route reflectors. We will call these routers conventional BGP speakers. The RR scheme will allow such conventional BGP speakers to coexist. These routers could be either members of a client group or a nonclient group. This would allow easy and gradual migration from the current iBGP model to the route reflector model. One could start creating clusters by configuring a single router as RR and making other RRs and their clients normal iBGP peers. Then more clusters could be created gradually.
In the above diagram, RTD, RTE, and RTF have the concept of route reflection. RTC, RTA, and RTB are what we call conventional routers and cannot be configured as RRs. A normal iBGP mesh could be built between these routers and RTD. Later on, when we are ready to upgrade, RTC could be made a RR with clients RTA and RTB. Clients do not have to understand the route reflection scheme; it is only the RRs that would have to be upgraded. The following is the configuration of RTD and RTC:

RTD#
router bgp 100
 neighbor 6.6.6.6 remote-as 100
 neighbor 6.6.6.6 route-reflector-client
 neighbor 5.5.5.5 remote-as 100
 neighbor 5.5.5.5 route-reflector-client
 neighbor 3.3.3.3 remote-as 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 13.13.13.13 remote-as 300

RTC#
router bgp 100
 neighbor 4.4.4.4 remote-as 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 14.14.14.14 remote-as 400

When we are ready to upgrade RTC and make it a RR, we would remove the iBGP full mesh and have RTA and RTB become clients of RTC.
We have mentioned so far two attributes that are used to prevent potential information looping: originator ID and cluster list. Another means of controlling loops is to put more restrictions on the set clause of outbound route maps. The set clause for outbound route maps does not affect routes reflected to iBGP peers. More restrictions are also put on next-hop-self, which is a per-neighbor configuration option. When used on RRs, next-hop-self will affect only the next hop of eBGP-learned routes because the next hop of reflected routes should not be changed.
This section covers route-flap dampening. Route-flap dampening is a Border Gateway Protocol (BGP4) feature designed to minimize the propagation of flapping routes across an internetwork. A route is considered to be flapping when it is repeatedly available, then unavailable, then available, then unavailable, and so on.
Route dampening (introduced in Cisco IOS Version 11.0) is a mechanism to minimize the instability caused by route flapping and oscillation over the network. To accomplish this, criteria are defined to identify poorly behaved routes. A route that is flapping gets a penalty of 1000 for each flap. The penalty is stored in the BGP routing table; as soon as the cumulative penalty reaches a predefined "suppress limit," the advertisement of the route will be suppressed. The penalty will be exponentially decayed based on a preconfigured "half-life time." After the penalty decreases below a predefined "reuse limit," the route advertisement will be unsuppressed; that is, the route is added back to the BGP table and once again used for forwarding. Routes external to an autonomous system (AS) that are learned via internal BGP (iBGP) will not be dampened. This is to avoid the iBGP peers having a higher penalty for routes external to the AS. The penalty will be decayed at a granularity of 5 seconds and the routes will be unsuppressed at a granularity of 10 seconds. The dampening information is kept until the penalty becomes less than half of the "reuse limit"; at that point the information is purged from the router.
Initially, dampening will be off by default. This might change if there is a need to have this feature enabled by default. The following are the commands used to control route dampening:
- bgp dampening (turns on dampening)
- no bgp dampening (turns off dampening)
- bgp dampening half-life-time (changes the half-life time)
The command that sets all parameters at the same time follows:

bgp dampening half-life-time reuse-value suppress-value max-suppress-time

- half-life-time: range is 1-45 minutes; current default is 15 minutes
- reuse-value: range is 1-20000; default is 750
- suppress-value: range is 1-20000; default is 2000
- max-suppress-time: maximum duration a route can be suppressed; range is 1-255; default is four times the half-life time
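The interaction of these parameters can be explored with a short Python sketch of the exponential decay. It treats the decay as continuous, which is an idealization of the 5-second granularity described earlier, and uses the default values quoted above. The function names are hypothetical.

```python
import math

PENALTY_PER_FLAP = 1000
HALF_LIFE_MIN = 15      # default half-life time, in minutes
REUSE_LIMIT = 750       # default reuse value
SUPPRESS_LIMIT = 2000   # default suppress value


def decayed(penalty, minutes, half_life=HALF_LIFE_MIN):
    """Penalty left after the given time: it halves every half-life."""
    return penalty * 2 ** (-minutes / half_life)


def minutes_until_reuse(penalty, reuse=REUSE_LIMIT, half_life=HALF_LIFE_MIN):
    """Time until the penalty decays down to the reuse limit."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


# One flap, observed 2 minutes 3 seconds later.
print(round(decayed(PENALTY_PER_FLAP, 123 / 60)))
# A suppressed route carrying a penalty of 2615.
print(round(minutes_until_reuse(2615), 1))
```

With the defaults, a single flap decays to roughly 910 after two minutes, and a penalty of 2615 needs about 27 minutes to fall back to the reuse limit of 750.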
RTB#
hostname RTB

interface Serial0
 ip address 203.250.15.2 255.255.255.252

interface Serial1
 ip address 192.208.10.6 255.255.255.252

router bgp 100
 bgp dampening
 network 203.250.15.0
 neighbor 192.208.10.5 remote-as 300

RTD#
hostname RTD

interface Loopback0
 ip address 192.208.10.174 255.255.255.192

interface Serial0/0
 ip address 192.208.10.5 255.255.255.252

router bgp 300
 network 192.208.10.0
 neighbor 192.208.10.6 remote-as 100

Router B (RTB) is configured for route dampening with default parameters. Assuming the external BGP (eBGP) link to RTD is stable, the RTB BGP table would look like the following:

RTB# show ip bgp
BGP table version is 24, local router ID is 203.250.15.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop         Metric LocPrf Weight Path
*> 192.208.10.0     192.208.10.5          0             0 300 i
*> 203.250.15.0     0.0.0.0               0         32768 i
In order to simulate a route flap, use clear ip bgp 192.208.10.6 on RTD. The RTB BGP table will look like the following:

    RTB# show ip bgp
    BGP table version is 24, local router ID is 203.250.15.2
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop          Metric LocPrf Weight Path
    h  192.208.10.0     192.208.10.5           0             0 300 i
    *> 203.250.15.0     0.0.0.0                0         32768 i
The BGP entry for 192.208.10.0 has been put in a history state, which means that the router does not have a best path to the route, but information about the route flapping still exists.

    RTB# show ip bgp 192.208.10.0
    BGP routing table entry for 192.208.10.0 255.255.255.0, version 25
    Paths: (1 available, no best path)
      300 (history entry)
        192.208.10.5 from 192.208.10.5 (192.208.10.174)
          Origin IGP, metric 0, external
          Dampinfo: penalty 910, flapped 1 times in 0:02:03

The route has been given a penalty for flapping but the penalty is still below the suppress limit. The route is not yet suppressed. If the route flaps a few more times, you will see the following:

    RTB# show ip bgp
    BGP table version is 32, local router ID is 203.250.15.2
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop          Metric LocPrf Weight Path
    *d 192.208.10.0     192.208.10.5           0             0 300 i
    *> 203.250.15.0     0.0.0.0                0         32768 i
    RTB# show ip bgp 192.208.10.0
    BGP routing table entry for 192.208.10.0 255.255.255.0, version 32
    Paths: (1 available, no best path)
      300, (suppressed due to dampening)
        192.208.10.5 from 192.208.10.5 (192.208.10.174)
          Origin IGP, metric 0, valid, external
          Dampinfo: penalty 2615, flapped 3 times in 0:05:18 , reuse in 0:27:00

The route has been dampened. The route will be reused when the penalty reaches the "reuse value," in our case 750 (default). The dampening information will be purged when the penalty becomes less than half of the reuse limit, in our case 750/2 = 375.

The following are the commands used to show and clear flap statistics information:
- show ip bgp flap-statistics (displays flap statistics for all the paths)
- show ip bgp flap-statistics regexp regexp (displays flap statistics for all paths that match the regular expression)
- show ip bgp flap-statistics filter-list list (displays flap statistics for all paths that pass the filter)
- show ip bgp flap-statistics address mask (displays flap statistics for a single entry)
- show ip bgp flap-statistics address mask longer-prefixes (displays flap statistics for more specific entries)
- show ip bgp neighbors [dampened-routes] | [flap-statistics] (displays flap statistics for all paths from a neighbor)
- clear ip bgp flap-statistics (clears flap statistics for all routes)
- clear ip bgp flap-statistics regexp regexp (clears flap statistics for all the paths that match the regular expression)
- clear ip bgp flap-statistics filter-list list (clears flap statistics for all the paths that pass the filter)
- clear ip bgp flap-statistics address mask (clears flap statistics for a single entry)
- clear ip bgp address flap-statistics (clears flap statistics for all paths from a neighbor)
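As a worked check on the dampened route shown earlier (penalty 2615, default half-life of 15 minutes, reuse limit 750), the time until reuse can be estimated by inverting the exponential decay. The helper name below is hypothetical, and the sketch ignores the router's 5-second and 10-second granularities, so it only approximates what a router would report.

```python
import math


def minutes_until(penalty, target, half_life_min=15):
    """Minutes of decay needed for penalty to fall to target (0 if already there)."""
    if penalty <= target:
        return 0.0
    return half_life_min * math.log2(penalty / target)


reuse_in = minutes_until(2615, 750)       # route becomes usable again
purge_in = minutes_until(2615, 750 / 2)   # dampening information is purged
print(f"reuse in about {reuse_in:.1f} min, purge in about {purge_in:.1f} min")
```

The estimate for reuse comes out at roughly 27 minutes, which agrees with the "reuse in 0:27:00" field in the output above. Purging happens one further half-life later, because the purge threshold is half the reuse limit.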
BGP4 multipath support provides BGP load balancing between multiple external BGP (eBGP) sessions. If there are multiple eBGP sessions between the local autonomous system (AS) and the neighboring AS, multipath support allows BGP to load balance among these sessions.
When a BGP speaker learns two identical eBGP paths for a prefix from a neighboring AS, it chooses the path with the lowest router ID as the best path. This best path is installed in the IP routing table. If BGP multipath support is enabled and the eBGP paths are learned from the same neighboring AS, multiple paths are installed in the IP routing table instead of a single best path. Depending on the switching mode, either per-packet or per-destination load balancing is performed among the multiple paths, which also provides greater overall redundancy. A maximum of six paths is supported. The maximum-paths router configuration command controls the number of paths allowed. By default, BGP installs only one path in the IP routing table.
The Border Gateway Protocol (BGP4) advertises routes from its BGP table to external peers (peers in different autonomous systems [AS]) by default. The use of route maps or other filters allows you to permit or deny the advertisement of selected routes.
Conditional Advertisement Feature
The BGP conditional advertisement feature provides additional control of route advertisement, depending on the existence of other prefixes in the BGP table. Normally, routes are propagated regardless of the existence of a different path. The feature uses the non-exist-map and advertise-map keywords to track routes by route prefix. The non-exist-map identifies a prefix whose presence in the BGP table is checked; the advertise-map identifies a second route that is propagated depending on whether the first is present. If the prefix matched by the non-exist-map is not present in the BGP table, the route specified by the advertise-map is announced. If the prefix is present, the route in the advertise-map is not announced.

This feature can be useful in a multihomed network in which some prefixes are to be advertised to one of the providers only if information from the other provider is missing. This condition would indicate a failure in the peering session, or partial reachability. If the same information is advertised to all providers in a multihomed environment, the information is duplicated in the global BGP table. When the BGP conditional advertisement feature is used, only partial routes are advertised to each provider, and the size of the global BGP table is not increased with redundant information. The administrator can also guarantee the path that inbound traffic will follow because only specific paths are advertised to providers.

Example
Consider a multihomed network that has external BGP (eBGP) connections to two different Internet Service Providers (ISPs). Typically, some prefixes are advertised to ISP 1, while other prefixes are advertised to ISP 2. The conditional advertisement feature allows you to advertise ISP 1 prefixes to ISP 2 if certain information (such as a link to ISP 1) is not present in the BGP table. All traffic will then flow through ISP 2 if the link to ISP 1 fails. The traffic patterns will return to normal when the link to ISP 1 is back online. The conditional BGP announcements are sent in addition to the normal announcements that a BGP router sends to its peers. AS path list information cannot be used for conditional advertisement because the IP routing table does not contain AS path information.
Configuring Conditional Advertisement
To conditionally advertise a set of routes, use the following command in router configuration mode:

    neighbor ip-address advertise-map map1-name non-exist-map map2-name

- The variable ip-address equals the neighbor's IP address.
- The variables map1-name and map2-name equal the names of route maps.
- The route map associated with non-exist-map specifies the prefix that the BGP speaker will keep track of.
- The route map associated with advertise-map specifies the prefix that will be advertised when the prefix in the non-exist-map no longer exists.

The prefix tracked by the BGP speaker must not be present in the BGP table for the conditional advertisement to take place.
Example
In the following example, the router will advertise 152.108.4.0/22 to its neighbor only if 162.108.10.0/24 is not present in the BGP table.

    router bgp 3
     neighbor 162.108.21.8 remote-as 4
     neighbor 162.108.21.8 advertise-map MAP1 non-exist-map MAP2
    !
    route-map MAP1 permit 10
     match ip address 1
    !
    route-map MAP2 permit 10
     match ip address 2
    !
    access-list 1 permit 152.108.4.0 0.0.3.255
    access-list 2 permit 162.108.10.0 0.0.0.255
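The decision made by the configuration above can be sketched in Python. This is a simplified model of the rule only (announce the advertise-map prefixes when no non-exist-map prefix is in the BGP table); the function name and the list-of-prefixes representation are assumptions for illustration, not how IOS stores route maps. The last line also checks that the wildcard mask 0.0.3.255 in access list 1 corresponds to a /22 prefix.

```python
import ipaddress


def prefixes_to_advertise(bgp_table, advertise_map, non_exist_map):
    """Return the advertise-map prefixes to announce, or [] if a tracked prefix exists."""
    table = {ipaddress.ip_network(p) for p in bgp_table}
    tracked = {ipaddress.ip_network(p) for p in non_exist_map}
    if table & tracked:
        return []                 # tracked prefix is present: withhold the routes
    return list(advertise_map)    # tracked prefix is missing: announce the routes


advertise = ["152.108.4.0/22"]    # matched by MAP1 / access-list 1
track = ["162.108.10.0/24"]       # matched by MAP2 / access-list 2

print(prefixes_to_advertise(["162.108.10.0/24"], advertise, track))  # tracked prefix present
print(prefixes_to_advertise([], advertise, track))                   # tracked prefix missing

# A /22 prefix has the host (wildcard) mask used in access-list 1.
print(ipaddress.ip_network("152.108.4.0/22").hostmask)
```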
The Open Shortest Path First (OSPF) routing protocol is based on link-state technology, as opposed to distance-vector protocols such as Interior Gateway Routing Protocol (IGRP) and Routing Information Protocol (RIP). OSPF offers several advantages over distance-vector protocols. It has faster convergence, supports larger internetworks, and is less susceptible to bad routing information. Some of the features of OSPF follow:
- Hierarchical routing
- Classless behavior, allowing support of variable-length subnet masks (VLSMs) and discontiguous networks
- The use of multicast addresses in order to reduce the effect on non-OSPF routing devices
- Authentication for secure routing
OSPF is a routing protocol that calls for the sending of link-state advertisements (LSAs) to all other routers within the same hierarchical area. An area is a group of contiguous networks and attached hosts. OSPF LSAs include information on attached interfaces, metrics used, and other variables. As OSPF routers accumulate information, the routers use the SPF algorithm to calculate the shortest path to each node. This is different from the way distance-vector protocols work. Distance-vector protocols send all or a portion of their routing tables in routing-update messages to their neighbors. Configuring and troubleshooting OSPF networks is more complex than with its distance-vector counterparts.
In outline, OSPF operates as follows:

- Routers running OSPF send OSPF hello packets out all OSPF-enabled interfaces.
- Routers sharing a common data link become OSPF neighbors if their hello packets contain certain information that is mutually agreed upon.
- Neighboring routers may form an OSPF adjacency if certain commonalities exist between the routers exchanging hellos and the network over which the hellos are exchanged. Not all neighboring routers form adjacencies.
- Routers send (flood) LSAs over all adjacencies.
- All routers build identical link-state databases from the LSAs.
- Each router calculates a shortest-path tree from its link-state database and derives its routing table from that tree.
Hellos
OSPF neighbors are identified by their router IDs. A router ID is an IP address by which the router is uniquely identified within the OSPF domain. A Cisco router selects as its router ID the highest IP address on any loopback interface configured on the router. If no loopback interfaces are configured, the router chooses the highest IP address of any of its physical interfaces. Routers that share a common segment may become neighbors on that segment. Neighbors are discovered via the OSPF Hello protocol and are recorded in a neighbor table. The Hello protocol:

- Provides a way to discover OSPF neighbors
- Acts as a keepalive between neighbors
- Ensures bi-directional communication between neighbors
- Is used for designated router (DR) and backup designated router (BDR) election on certain types of networks
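The router ID selection rule described at the start of this section can be sketched as follows. The function name, the dictionary input, and recognizing loopbacks by interface name are assumptions made for the illustration; the addresses are made up.

```python
import ipaddress


def select_router_id(interfaces):
    """interfaces: dict of interface name -> IPv4 address string.

    Loopbacks are recognized by name, a convention assumed for this sketch.
    """
    def highest(addresses):
        # Compare numerically, not as strings ("9.0.0.1" < "10.0.0.1").
        return str(max(ipaddress.IPv4Address(a) for a in addresses))

    loopbacks = [ip for name, ip in interfaces.items()
                 if name.lower().startswith("loopback")]
    if loopbacks:
        return highest(loopbacks)           # loopbacks take precedence
    return highest(interfaces.values())     # otherwise highest physical address


print(select_router_id({"Loopback0": "10.0.0.1", "Serial0": "192.168.1.1"}))
print(select_router_id({"Ethernet0": "10.1.1.1", "Serial0": "192.168.1.1"}))
```

Note that a loopback address wins even when a physical interface has a numerically higher address, as in the first call.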
Hello packets are sent out all OSPF-enabled interfaces. They are sent periodically with a special multicast address as the destination. Routers become neighbors when they see themselves (their own router ID) in their neighbors' hello packets and they agree upon certain parameters included in the hello packets. Neighbor negotiation takes place on the primary IP address only, not over secondary addresses. If secondary addresses are configured on the interface, they are restricted to the same OSPF area as the primary address. Two routers will become neighbors if the following parameters are agreed upon:

- Area ID: The two routers sharing a common network segment must have their interfaces configured to be in the same area.
- Authentication: OSPF allows for configuration of a password for a specified area. Routers that want to become neighbors must exchange the same password over the common segment.
- Hello and dead intervals: The hello interval is the amount of time, in seconds, between hello packets that a router sends out on an OSPF-enabled interface. The dead interval is the amount of time, in seconds, that a router will wait for a hello packet from a neighbor before declaring the neighbor down. These interval times are included in the hello packet and must be agreed upon by neighbors.
- Stub area flag: Two neighboring routers must also agree on the stub area flag in the hello packets in order to become neighbors. (Stub areas are discussed later.)
All of the above parameters are included in hello packets. Also included in hello packets are the following:

- The router ID of the originating router
- The address and mask of the originating interface
- Router priority, which is used for DR election (discussed later)
- The DR and BDR
- Flag bits for option capabilities; one of these is the stub area flag mentioned above
- Router IDs of the originating router's neighbors
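The neighbor conditions described above (matching parameters, plus a router seeing its own router ID in the neighbor's hello) can be modeled with a small Python sketch. The class and field names are hypothetical and cover only the fields needed for the check, not the real packet layout; the interval values are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Hello:
    router_id: str
    area_id: str
    auth: str                 # password, modeled as a plain string
    hello_interval: int       # seconds
    dead_interval: int        # seconds
    stub_flag: bool
    neighbors: list = field(default_factory=list)  # router IDs heard on the segment


def parameters_match(a, b):
    """Area ID, authentication, hello/dead intervals and stub flag must agree."""
    return ((a.area_id, a.auth, a.hello_interval, a.dead_interval, a.stub_flag)
            == (b.area_id, b.auth, b.hello_interval, b.dead_interval, b.stub_flag))


def is_two_way(local, received):
    """True when the parameters agree and the received hello lists the local router."""
    return parameters_match(local, received) and local.router_id in received.neighbors


mine = Hello("1.1.1.1", "0", "secret", 10, 40, False)
theirs = Hello("2.2.2.2", "0", "secret", 10, 40, False, neighbors=["1.1.1.1"])
print(is_two_way(mine, theirs))
```

A mismatch in any one of the compared fields, for example a different dead interval, is enough to prevent the routers from becoming neighbors in this model.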
Network Types
After two-way communication between neighbors is established, OSPF routers move on to the next step, which is building adjacencies. Adjacent routers are routers that go beyond the hello protocol exchange and proceed into the database exchange process. As previously mentioned, not all neighboring routers become adjacent. Whether or not an adjacency is formed depends on the type of network to which the neighboring routers are connected. The types of networks that OSPF defines follow:

- Point-to-point networks
- Broadcast networks
- Non-broadcast multi-access (NBMA) networks
- Point-to-multipoint networks
Point-to-point networks, such as serial lines, connect a single pair of routers. OSPF will always form an adjacency with the neighbor on the other side of a point-to-point interface. There is no concept of DR or BDR on point-to-point networks. OSPF packets on these networks are always sent to the destination address 224.0.0.5, otherwise known as the AllSPFRouters multicast address.
Broadcast networks, such as Ethernet, Token Ring, and Fiber Distributed Data Interface (FDDI), are multi-access, meaning they are able to connect more than two devices; a packet sent by one router will be received by all connected routers. On broadcast networks, OSPF will elect a DR and a BDR. Hello packets on broadcast networks are sent to the destination address 224.0.0.5. All packets originated by the DR and BDR are also sent to this address. All other (non-DR and non-BDR) routers send link-state updates to the address 224.0.0.6, also known as AllDRouters.

NBMA networks, such as Frame Relay, ATM, and X.25, can connect multiple devices, but they have no broadcast capability. (For more information on Frame Relay, please read the Frame Relay document.) A packet sent by a router will not be received by all the other routers attached to the network. Special care should be taken when configuring OSPF over NBMA networks. OSPF considers these media to be just like any other broadcast media such as Ethernet or Token Ring. As a result, extra configuration may be required for NBMA networks. OSPF routers elect a DR and BDR, and all OSPF packets are unicast.

Point-to-multipoint networks are NBMA networks that are treated as a collection of point-to-point links. Routers on these networks do not elect a DR and BDR because the network is seen as point-to-point links. OSPF packets are multicast on these networks.

Designated Router and Backup Designated Router
The DR and BDR are elected on broadcast networks in order to prevent certain problems. First, if every router attached to a broadcast network formed an adjacency with every other router attached to the network, there would be n(n - 1)/2 adjacencies. Second, if a router flooded its LSAs to all of its neighbors and all routers in turn flooded the LSAs to their neighbors, there would be multiple copies of the same LSA on the same network.
The idea behind the DR is that every router attached to the network forms an adjacency with the DR. Only the DR sends LSAs to the rest of the attached network. OSPF also elects a BDR in case the DR fails. This spares the routers from having to elect a new DR and re-form adjacencies with it. Instead, the routers attached to the network form an adjacency with both the DR and BDR. If the DR goes down, the BDR becomes the DR; since the other routers already have a formed adjacency with the BDR, there is little, if any, network unavailability.

DR and BDR election is done via the Hello protocol. Hello packets are exchanged via IP multicast packets on each segment. The router with the highest OSPF priority on the segment becomes the DR. The default priority is one for Cisco router interfaces. This process is repeated for the BDR. If the priorities are the same, the router with the highest router ID becomes the DR. A single DR/BDR pair is elected on each attached segment. A router that is the DR of one segment is not necessarily the DR or BDR of another attached segment.

The OSPF priority of an interface can be set with the interface subcommand:

    ip ospf priority [value]

A priority value of zero indicates that the interface will not be elected as the DR or BDR. Note that once a DR and a BDR have been elected, a new router coming on line with a higher priority will not override the DR and BDR. When the new OSPF router becomes active and discovers its neighbors, it checks for a valid DR and BDR. If the DR and BDR exist, the new router accepts them. Routers that are not the DR or BDR are known as DRothers.
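The election rules above can be sketched in Python for a single segment. This is a simplified model based only on the rules stated in the text (highest priority wins, router ID breaks ties, priority zero is ineligible, an existing DR and BDR are not preempted, and the BDR takes over when the DR disappears). The function name and input format are hypothetical, and the actual protocol procedure has more steps than shown here.

```python
import ipaddress


def elect_dr_bdr(routers, current_dr=None, current_bdr=None):
    """routers: dict of router ID -> interface priority on one segment.

    Returns (dr, bdr). An existing DR/BDR that is still present is kept.
    """
    def rank(rid):
        # Highest priority first; highest router ID breaks ties.
        return (routers[rid], ipaddress.IPv4Address(rid))

    eligible = [rid for rid in routers if routers[rid] > 0]
    dr = current_dr if current_dr in eligible else None
    bdr = current_bdr if current_bdr in eligible else None
    if dr is None and bdr is not None:
        dr, bdr = bdr, None                  # the BDR takes over from a failed DR
    if dr is None:
        dr = max(eligible, key=rank, default=None)
    if bdr is None:
        bdr = max((r for r in eligible if r != dr), key=rank, default=None)
    return dr, bdr


segment = {"10.0.0.1": 1, "10.0.0.2": 1, "10.0.0.3": 1, "10.0.0.4": 0}
print(elect_dr_bdr(segment))                 # equal priorities: highest router IDs win

segment["10.0.0.9"] = 5                      # a higher-priority router comes on line
print(elect_dr_bdr(segment, "10.0.0.3", "10.0.0.2"))   # existing DR/BDR are kept
```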
In the diagram above, the router elected DR for segment 1 is Router_F. The priorities of all the router interfaces are equal (P = 1 on all the interfaces), so the router with the highest router ID (RID) is elected the designated router. Router_F has the highest RID and is, therefore, the DR. On segment 2, Router_C does not have the highest RID, but it is still elected the DR because its OSPF interface priority, which is 2, is higher than all the rest. The diagram below shows the resulting adjacencies that will be formed on segment 1 of the diagram above. Note that the routers that are not the DR form adjacencies only with the DR. In this illustration, the BDR is not shown, but adjacencies would also be formed with the BDR.
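To put numbers on the saving, the short Python sketch below compares the n(n - 1)/2 adjacencies of a full mesh with the count when every DRother is adjacent only to the DR and the BDR (and the DR and BDR are adjacent to each other). The function names are hypothetical.

```python
def full_mesh_adjacencies(n):
    """Every router adjacent to every other router: n(n - 1)/2."""
    return n * (n - 1) // 2


def dr_bdr_adjacencies(n):
    """Each of the n - 2 DRothers has 2 adjacencies, plus 1 between DR and BDR."""
    if n < 2:
        return 0
    return 2 * (n - 2) + 1


for n in (3, 5, 10):
    print(n, full_mesh_adjacencies(n), dr_bdr_adjacencies(n))
```

The full-mesh count grows quadratically with the number of routers on the segment, while the DR/BDR count grows linearly.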
Building Adjacencies
After neighbor discovery takes place and bi-directional communication is established (a router sees its own router ID in a neighbor's hello packet), neighboring routers attempt to synchronize their link-state databases. When database synchronization is successful, the neighbors are fully adjacent. Neighbors on point-to-point and point-to-multipoint networks always become adjacent unless the parameters of the hello packets are not agreed upon. On broadcast networks and NBMA networks, the DR and BDR become adjacent with all neighbors. No adjacencies are formed between the DRothers. The following are the states through which OSPF routers transition neighbors before they are considered fully adjacent:

- Down: This is the initial state of the neighbor, indicating no information has been received from any router on the segment.
- Attempt: On NBMA networks, where neighbors are manually configured, this state indicates that no recent information has been received from the neighbor. An effort is made to contact the neighbor by sending hello packets.
- Init: This state indicates that a hello has been received from a neighbor; however, bi-directional communication is not yet established.
- Two-way: The router has seen itself in the neighbor's hello packets. Bi-directional communication is now established. On broadcast networks, the DR and BDR are elected at the end of this state. When this state ends, a decision is made whether or not to proceed in building an adjacency. The decision is based on whether the neighbor is a DR or BDR or the network link is point-to-point.
- ExStart: The router and its neighbor establish a master/slave relationship and determine the initial sequence number that is going to be used in the exchange of database description packets.
- Exchange: Routers describe their entire link-state database by sending database description packets to neighbors that are in the exchange state.
- Loading: Routers build a link-state request list and a retransmission list. Any information that looks outdated or incomplete is put on the request list. Any update that has not been acknowledged is put on the retransmission list.
- Full: The adjacency is now complete. Adjacent routers have identical link-state databases.

Flooding
The OSPF link-state database consists of all the LSAs the router has received. Each node in the network maintains an identical link-state database. A change in the topology means a change in one or more of the LSAs. Flooding is the process by which these new LSAs are sent throughout the network in order to ensure that the databases in all routers remain identical.

Areas
Because of its complexity with multiple databases and flooding algorithms, OSPF can be memory and processor intensive. The demand for memory and processor utilization grows as the network grows.
OSPF uses areas to reduce the strain on router memory and processor utilization. An area is a logical grouping of routers that breaks the OSPF network into subdomains. Routers must share identical databases only with the routers in their own area, not with the entire network. This reduces the memory demand. The smaller database means fewer LSAs to process, which reduces the demand for processing power. Most flooding is also limited to within an area. Areas are interface specific and are identified with an area ID. The introduction of areas also introduces different types of traffic. Intra-area traffic consists of packets that are contained within an area; inter-area packets travel between routers in different areas. External traffic consists of packets that travel between routers belonging to an OSPF domain and another autonomous system.

Backbone
If more than one area is configured, one of these areas must be defined as area 0. Area 0 is known as the backbone area. All other areas must be connected to area 0, either physically or through a virtual link. Virtual links are explained below. Each area gives routing information to area 0, which in turn disseminates that information to all other connected areas. For this reason, all inter-area traffic must pass through area 0. Non-backbone areas cannot exchange packets directly with one another.

Virtual Links
As mentioned above, all other areas must be connected to the backbone area, area 0. Where a physical connection is not possible, a virtual link can be used. The virtual link provides a link to the backbone through a nonbackbone area. Virtual links are also used to connect two parts of a partitioned backbone through a nonbackbone area.
As shown in the above diagram, virtual links can be established between two area border routers (ABRs) that have a common area, with one ABR connected to the backbone. The transit area is defined as the area between two ends of a virtual link. The transit area must be connected to area 0 to have full routing information and cannot be a stub area. OSPF classifies virtual links as point-to-point networks with no IP subnets associated with them.

Router Types
As mentioned above, areas are interface specific, meaning that a router can have one interface configured in one area and a second interface configured in a second area. Therefore, routers can be categorized in relation to areas. There are three types of OSPF routers.

- Internal routers (IRs): An internal router is a router with all of its interfaces in the same area.
- Area border routers (ABRs): An ABR is a router that has interfaces in multiple areas. An ABR must always have at least one interface in the backbone area.
- Autonomous system boundary routers (ASBRs): ASBRs are routers that act as gateways between OSPF and other routing protocols or other OSPF routing processes. In other words, redistribution takes place on the ASBRs.
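The three definitions above can be expressed as a small Python classifier. The function name, the role labels, and the boolean redistribution flag are assumptions for illustration. Because the categories are not mutually exclusive (a router that redistributes is an ASBR whatever its area membership), the sketch returns a list of roles.

```python
def router_roles(interface_areas, redistributes=False):
    """interface_areas: area IDs of the router's OSPF interfaces (e.g. [0, 1])."""
    areas = set(interface_areas)
    roles = []
    if len(areas) == 1:
        roles.append("IR")      # all interfaces in the same area
    elif len(areas) > 1:
        roles.append("ABR")     # interfaces in multiple areas
    if redistributes:
        roles.append("ASBR")    # gateway to another routing protocol or process
    return roles


def abr_touches_backbone(interface_areas):
    """The text requires an ABR to have at least one interface in area 0."""
    return 0 in set(interface_areas)


print(router_roles([1, 1]))
print(router_roles([0, 1]), abr_touches_backbone([0, 1]))
print(router_roles([2], redistributes=True))
```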
All valid LSAs received by a router are stored in a link-state database. These LSAs describe the topology of an area. Routers use the LSAs to calculate the shortest path tree. The list of LSAs in the database can be viewed with the command show ip ospf database. This list shows only the information in the LSA header, and it contains LSAs from multiple areas if the router is an ABR. More detailed information on each LSA can be viewed with different commands, which will be explained later. An example output of the show ip ospf database command follows:

    Router_B#show ip ospf database

           OSPF Router with ID (170.170.3.2) (Process ID 7)

                    Router Link States (Area 0)

    Link ID         ADV Router      Age   Seq#        Checksum  Link count
    170.170.3.2     170.170.3.2     17    0x80000002  0x8B6     1
    170.170.8.4     170.170.8.4     217   0x80000003  0xAA02    1
    170.170.13.3    170.170.13.3    218   0x80000002  0x5156    1

                    Net Link States (Area 0)

    Link ID         ADV Router      Age   Seq#        Checksum
    170.170.3.3     170.170.13.3    18    0x80000002  0xA0B2

                    Summary Net Link States (Area 0)

    Link ID         ADV Router      Age   Seq#        Checksum
    170.170.7.0     170.170.8.4     240   0x80000001  0x6ED0

                    Summary ASB Link States (Area 0)

    Link ID         ADV Router      Age   Seq#        Checksum
    170.170.11.6    170.170.8.4     129   0x80000001  0xF73C

                    Type-5 AS External Link States

    Link ID         ADV Router      Age   Seq#        Checksum  Tag
    200.200.200.0   170.170.11.6    135   0x80000001  0xE4FA    0
    Router_B#

As can be seen from the database output above, there are different types of LSAs defined by OSPF. Each type describes a different portion of the OSPF network. The table below lists the different LSA types and type codes and how the link state is identified. Following the table is a description of the LSAs.

Different LSA Types

    Type Code  LSA                  Link-State ID
    1          Router LSA           Router ID of the originating router
    2          Network LSA          Interface IP address of the DR
    3          Network summary LSA  Destination network number
    4          ASBR summary LSA     Router ID of the AS boundary router
    5          AS external LSA      External network number
    7          NSSA external LSA    External network number
Router LSAs are generated by every router. The router LSA is a list of links attached to the router, as well as the state of each link and the outgoing OSPF cost associated with it. To view details of the router LSA, use the show ip ospf database router command.
[Figure: Router LSA]

Network LSAs are generated by the DR on a multi-access segment. They are the representation of the multi-access segment and all the routers attached to the segment. Segments that do not have a DR, such as point-to-point links, will not have a network LSA. To view detailed information on the network LSA, use the show ip ospf database network command.
[Figure: Network LSA]

Network summary LSAs are generated by ABRs. This is how network reachability information is advertised between areas. ABRs are responsible for injecting information into the backbone, and the backbone passes the information on to other areas. The show ip ospf database summary command can be used to view detailed information on the summary LSA.
[Figure: Network Summary LSA]
ASBR summary LSAs are also generated by the ABR. This LSA describes the location of an ASBR, not a network. The details can be viewed with the show ip ospf database asbr-summary command.
[Figure: ASBR Summary LSA]

Autonomous system (AS) external LSAs are originated by the ASBRs and describe a network outside of the AS. They can be viewed with the show ip ospf database external command.
[Figure: AS External LSA]
Not-So-Stubby Area (NSSA) external LSAs are originated by the ASBR within the NSSA. These types of LSAs are flooded only throughout the NSSA. These are unlike external LSAs, which are flooded throughout the entire network.
Stub Areas
ASBRs flood external routes throughout the OSPF domain, which can greatly enlarge the link-state database. To limit this, OSPF allows certain areas to be configured as stub areas. Stub areas are areas into which external LSAs are not flooded. Routing from these areas to destinations outside the OSPF domain is done via the default route. The advantage of using stub areas is that the smaller link-state database reduces the requirements for memory. All OSPF routers inside a stub area must be configured as stub routers. Since all interfaces belonging to the area will start exchanging hello packets, the stub flag must be set on each of them in order to successfully form a neighbor relationship.
Also, virtual links cannot be configured within, or transit, a stub area. Examples of stub areas and how to configure them will be shown in the "Configuring OSPF" section.

Totally Stubby Areas
Totally stubby areas are areas into which external LSAs and summary LSAs (inter-area routes) are not flooded. The only routes injected into a totally stubby area are intra-area routes and the default route (0.0.0.0). The default route is the only type 3 (summary) LSA that the ABR will allow into the totally stubby area. An example of totally stubby areas and their configuration is discussed in the "Configuring OSPF" section.

Not-So-Stubby Areas
In some cases, it may be necessary to connect a stub area to an external AS and redistribute the external routes into OSPF. Unfortunately, this means that the stub area router will become an ASBR, meaning the area can no longer be a stub area. NSSAs allow external routes to be advertised into the OSPF AS while retaining the characteristics of a stub area. The ASBR in the NSSA originates type 7 LSAs. These external NSSA LSAs are flooded throughout the NSSA but are blocked at the ABR. The ABR translates them into type 5 LSAs and floods those into the other areas. An example of NSSAs and their configuration is discussed in the "Configuring OSPF" section.

OSPF On-Demand Circuits
OSPF demand circuit is an enhancement to the OSPF protocol that allows efficient operation over on-demand circuits such as ISDN and dial-up lines. Prior to this feature, periodic hellos and LSA updates would be exchanged between routers connected by the on-demand link, even when there were no changes in the hello or LSA information. With this feature, periodic hellos are suppressed and periodic LSA refreshes are not flooded over demand circuits. These packets bring up the link only when they are exchanged for the first time, or when there is a change in the information they contain.
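The stub, totally stubby, and not-so-stubby area descriptions in this section differ mainly in which LSA types may appear inside the area. The Python sketch below encodes only the rules stated in the text for type 3, type 5, and type 7 LSAs; the area-type labels and the function name are hypothetical, and other LSA types are deliberately left unmodeled.

```python
def lsa_allowed_in_area(area_type, lsa_type, is_default_route=False):
    """area_type: 'normal', 'stub', 'totally-stubby' or 'nssa'."""
    if lsa_type == 5:
        # External LSAs are not flooded into any kind of stub area.
        return area_type == "normal"
    if lsa_type == 3:
        # A totally stubby area accepts only the default route as a summary LSA.
        return area_type != "totally-stubby" or is_default_route
    if lsa_type == 7:
        # NSSA external LSAs exist only inside the NSSA; the ABR translates
        # them to type 5 before flooding into other areas.
        return area_type == "nssa"
    raise ValueError("LSA type not modeled in this sketch")


for area in ("normal", "stub", "totally-stubby", "nssa"):
    print(area, [t for t in (3, 5, 7) if lsa_allowed_in_area(area, t)])
```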
Frame Relay
Background
Frame Relay is a high-performance WAN protocol that operates at the physical and data link layers of the OSI reference model. Frame Relay originally was designed for use across Integrated Services Digital Network (ISDN) interfaces. Today, it is used over a variety of other network interfaces as well. This chapter focuses on Frame Relay's specifications and applications in the context of WAN services.

Frame Relay is an example of a packet-switched technology. Packet-switched networks enable end stations to dynamically share the network medium and the available bandwidth. Variable-length packets are used for more efficient and flexible transfers. These packets then are switched between the various network segments until the destination is reached. Statistical multiplexing techniques control network access in a packet-switched network. The advantage of this technique is that it accommodates more flexibility and more efficient use of bandwidth. Most of today's popular LANs, such as Ethernet and Token Ring, are packet-switched networks.

Frame Relay often is described as a streamlined version of X.25, offering fewer of the robust capabilities, such as windowing and retransmission of lost data, that are offered in X.25. This is because Frame Relay typically operates over WAN facilities that offer more reliable connection services than the facilities available during the late 1970s and early 1980s that served as the common platforms for X.25 WANs. As mentioned earlier, Frame Relay is strictly a Layer 2 protocol suite, whereas X.25 provides services at Layer 3 (the network layer) as well. This enables Frame Relay to offer higher performance and greater transmission efficiency than X.25 and makes Frame Relay suitable for current WAN applications, such as LAN interconnection.

Initial proposals for the standardization of Frame Relay were presented to the Consultative Committee on International Telephone and Telegraph (CCITT) in 1984. Due to lack of interoperability and lack of complete standardization, however, Frame Relay did not experience significant deployment during the late 1980s. A major development in Frame Relay's history occurred in 1990 when Cisco Systems, Digital Equipment, Northern Telecom, and StrataCom formed a consortium to focus on Frame Relay technology development. This consortium developed a specification that conformed to the basic Frame Relay protocol that was being discussed in CCITT but extended the protocol with features that provide additional capabilities for complex internetworking environments. These Frame Relay extensions are referred to collectively as the Local Management Interface (LMI). Since the consortium's specification was developed and published, many vendors have announced their support of this extended Frame Relay definition. ANSI and CCITT have subsequently standardized their own variations of the original LMI specification, and these standardized specifications now are more commonly used than the original version. Internationally, Frame Relay was standardized by the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T). In the United States, Frame Relay is an American National Standards Institute (ANSI) standard.

Frame Relay Devices
Devices attached to a Frame Relay WAN fall into two general categories: data terminal equipment (DTE) and data circuit-terminating equipment (DCE).
DTEs generally are considered to be terminating equipment for a specific network and typically are located on the premises of a customer. In fact, they may be owned by the customer. Examples of DTE devices are terminals, personal computers, routers, and bridges. DCEs are carrier-owned internetworking devices that generally reside within carrier-operated WANs. The purpose of DCE equipment is to provide clocking and switching services in a network; DCEs are the devices that actually transmit data through the WAN. In most cases, these are packet switches. The following figure shows the relationship between the two categories of devices.
The connection between a DTE device and a DCE device consists of both a physical-layer component and a link-layer component. The physical component defines the mechanical, electrical, functional, and procedural specifications for the connection between the devices. One of the most commonly used physical-layer interface specifications is the recommended standard (RS)-232 specification. The link-layer component defines the protocol that establishes the connection between the DTE device, such as a router, and the DCE device, such as a switch. This chapter examines a commonly utilized protocol specification used in WAN networking: the Frame Relay protocol.

Frame Relay Virtual Circuits

Frame Relay provides connection-oriented data link layer communication. This means that a defined communication exists between each pair of devices and that these connections are associated with a connection identifier. This service is implemented by using a Frame Relay virtual circuit, which is a logical connection created between two data terminal equipment (DTE) devices across a Frame Relay packet-switched network (PSN).

Virtual circuits provide a bi-directional communications path from one DTE device to another and are uniquely identified by a data-link connection identifier (DLCI). A number of virtual circuits can be multiplexed into a single physical circuit for transmission across the network. This capability often can reduce the equipment and network complexity required to connect multiple DTE devices. A virtual circuit can pass through any number of intermediate DCE devices (switches) located within the Frame Relay PSN.
Frame Relay virtual circuits fall into two categories: switched virtual circuits (SVCs) and permanent virtual circuits (PVCs).

Switched Virtual Circuits (SVCs)

Switched virtual circuits (SVCs) are temporary connections used in situations requiring only sporadic data transfer between DTE devices across the Frame Relay network. A communication session across an SVC consists of four operational states:
• Call Setup: The virtual circuit between two Frame Relay DTE devices is established.
• Data Transfer: Data is transmitted between the DTE devices over the virtual circuit.
• Idle: The connection between DTE devices is still active, but no data is transferred. If an SVC remains in an idle state for a defined period of time, the call can be terminated.
• Call Termination: The virtual circuit between DTE devices is terminated.
After the virtual circuit is terminated, the DTE devices must establish a new SVC if there is additional data to be exchanged. It is expected that SVCs will be established, maintained, and terminated using the same signaling protocols used in ISDN. Few manufacturers of Frame Relay DCE equipment, however, support SVCs. Therefore, their actual deployment is minimal in today's Frame Relay networks.

Permanent Virtual Circuits (PVCs)

Permanent virtual circuits (PVCs) are permanently established connections that are used for frequent and consistent data transfers between DTE devices across the Frame Relay network. Communication across a PVC does not require the call setup and termination states that are used with SVCs. PVCs always operate in one of the following two operational states:
• Data Transfer: Data is transmitted between the DTE devices over the virtual circuit.
• Idle: The connection between DTE devices is active, but no data is transferred. Unlike SVCs, PVCs are never terminated for being in an idle state.
DTE devices can begin transferring data whenever they are ready because the circuit is permanently established.

Data-Link Connection Identifier (DLCI)

Frame Relay virtual circuits are identified by data-link connection identifiers (DLCIs). DLCI values typically are assigned by the Frame Relay service provider (for example, the telephone company). Frame Relay DLCIs have local significance, which means that the values themselves are not unique in the Frame Relay WAN. Two DTE devices connected by a virtual circuit, for example, may use a different DLCI value to refer to the same connection. The following figure illustrates how a single virtual circuit may be assigned a different DLCI value on each end of the connection.
A single Frame Relay virtual circuit can be assigned different DLCIs on each end of a VC.
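The local significance of DLCIs can be illustrated with a small sketch. The port names, the DLCI numbers, and the `switch_frame` helper are hypothetical and are not taken from any vendor's configuration. The only point is that a switch maps an incoming DLCI to an outgoing one, so each end of the same virtual circuit can see a different value.

```python
# Hypothetical cross-connect table for a single Frame Relay switch.
# Key: (incoming port, incoming DLCI); value: (outgoing port, outgoing DLCI).
CROSS_CONNECT = {
    ("port0", 12): ("port1", 62),  # DTE on port0 reaches its peer using DLCI 12
    ("port1", 62): ("port0", 12),  # the peer on port1 uses DLCI 62 for the same VC
}


def switch_frame(in_port, in_dlci):
    """Return the (port, DLCI) a frame leaves on, or None if no VC is defined."""
    return CROSS_CONNECT.get((in_port, in_dlci))


if __name__ == "__main__":
    print(switch_frame("port0", 12))  # same virtual circuit, different DLCI per end
    print(switch_frame("port0", 99))  # no such virtual circuit on this port
```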
Congestion-Control Mechanisms

Frame Relay reduces network overhead by implementing simple congestion-notification mechanisms rather than explicit, per-virtual-circuit flow control. Frame Relay typically is implemented on reliable network media, so data integrity is not sacrificed because flow control can be left to higher-layer protocols. Frame Relay implements two congestion-notification mechanisms:
• Forward-explicit congestion notification (FECN)
• Backward-explicit congestion notification (BECN)
FECN and BECN each are controlled by a single bit contained in the Frame Relay frame header. The Frame Relay frame header also contains a Discard Eligibility (DE) bit, which is used to identify less important traffic that can be dropped during periods of congestion.
The FECN bit is part of the Address field in the Frame Relay frame header. The FECN mechanism is initiated when a DTE device sends Frame Relay frames into the network. If the network is congested, DCE devices (switches) set the value of the frames' FECN bit to 1. When the frames reach the destination DTE device, the Address field (with the FECN bit set) indicates that the frame experienced congestion in the path from source to destination. The DTE device can relay this information to a higher-layer protocol for processing. Depending on the implementation, flow control may be initiated, or the indication may be ignored.

The BECN bit is part of the Address field in the Frame Relay frame header. DCE devices set the value of the BECN bit to 1 in frames traveling in the opposite direction of frames with their FECN bit set. This informs the receiving DTE device that a particular path through the network is congested. The DTE device then can relay this information to a higher-layer protocol for processing. Depending on the implementation, flow control may be initiated, or the indication may be ignored.

Frame Relay Discard Eligibility (DE)

The Discard Eligibility (DE) bit is used to indicate that a frame has lower importance than other frames. The DE bit is part of the Address field in the Frame Relay frame header. DTE devices can set the value of the DE bit of a frame to 1 to indicate that the frame has lower importance than other frames. When the network becomes congested, DCE devices will discard frames with the DE bit set before discarding those that do not. This reduces the likelihood of critical data being dropped by Frame Relay DCE devices during periods of congestion.

Frame Relay Error Checking

Frame Relay uses a common error-checking mechanism known as the cyclic redundancy check (CRC). The CRC compares two calculated values to determine whether errors occurred during the transmission from source to destination.
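As a rough illustration of comparing two calculated values, the sketch below computes a 16-bit CRC at the sender and again at the receiver. It assumes the 16-bit CCITT polynomial in the bit-reflected form used by HDLC-family protocols; the text above does not name a polynomial, so treat that choice, and the function names `fcs16` and `frame_is_intact`, as assumptions made for the example.

```python
def fcs16(data: bytes) -> int:
    """Bitwise CRC-16 using the reflected CCITT polynomial 0x8408 (assumed here)."""
    crc = 0xFFFF                      # register starts with all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):            # process one bit at a time, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408
            else:
                crc >>= 1
    return crc ^ 0xFFFF               # final one's complement


def frame_is_intact(payload: bytes, received_fcs: int) -> bool:
    """Receiver side: recompute the check value and compare it with the one sent."""
    return fcs16(payload) == received_fcs


if __name__ == "__main__":
    sent_fcs = fcs16(b"hello")                  # computed by the source device
    print(frame_is_intact(b"hello", sent_fcs))  # unchanged payload
    print(frame_is_intact(b"hellp", sent_fcs))  # corrupted payload
```

A mismatch only tells the receiver that the frame is damaged. In keeping with the error-checking approach, the frame would simply be dropped and recovery left to a higher layer.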
Frame Relay reduces network overhead by implementing error checking rather than error correction. Frame Relay typically is implemented on reliable network media, so data integrity is not sacrificed because error correction can be left to higher-layer protocols running on top of Frame Relay.

Frame Relay Local Management Interface (LMI)

The Local Management Interface (LMI) is a set of enhancements to the basic Frame Relay specification. The LMI was developed in 1990 by Cisco Systems, StrataCom, Northern Telecom, and Digital Equipment Corporation. It offers a number of features (called extensions) for managing complex internetworks. Key Frame Relay LMI extensions include global addressing, virtual-circuit status messages, and multicasting.

The LMI global addressing extension gives Frame Relay data-link connection identifier (DLCI) values global rather than local significance. DLCI values become DTE addresses that are unique in the Frame Relay WAN. The global addressing extension adds functionality and manageability to Frame Relay internetworks. Individual network interfaces and the end nodes attached to them, for example, can be identified by using standard address-resolution and discovery techniques. In addition, the entire Frame Relay network appears to be a typical LAN to routers on its periphery.

LMI virtual circuit status messages provide communication and synchronization between Frame Relay DTE and DCE devices. These messages are used to periodically report on the status of PVCs, which prevents data from being sent into black holes (that is, over PVCs that no longer exist).

The LMI multicasting extension allows multicast groups to be assigned. Multicasting saves bandwidth by allowing routing updates and address-resolution messages to be sent only to specific groups of routers. The extension also transmits reports on the status of multicast groups in update messages.
Frame Relay Network Implementation

A common private Frame Relay network implementation is to equip a T1 multiplexer with both Frame Relay and non-Frame Relay interfaces. Frame Relay traffic is forwarded out the Frame Relay interface and onto the data network. Non-Frame Relay traffic is forwarded to the appropriate application or service, such as a private branch exchange (PBX) for telephone service or to a video-teleconferencing application.

A typical Frame Relay network consists of a number of DTE devices, such as routers, connected to remote ports on multiplexer equipment via traditional point-to-point services such as T1, fractional T1, or 56 K circuits. An example of a simple Frame Relay network is shown in the following figure.
A simple Frame Relay network connects various devices to different services over a WAN.
The majority of Frame Relay networks deployed today are provisioned by service providers who intend to offer transmission services to customers. This is often referred to as a public Frame Relay service. Frame Relay is implemented in both public carrier-provided networks and in private enterprise networks. The following section examines the two methodologies for deploying Frame Relay.

Public Carrier-Provided Networks

In public carrier-provided Frame Relay networks, the Frame Relay switching equipment is located in the central offices of a telecommunications carrier. Subscribers are charged based on their network use but are relieved from administering and maintaining the Frame Relay network equipment and service. Generally, the DCE equipment also is owned by the telecommunications provider. DTE equipment either will be customer-owned or perhaps owned by the telecommunications provider as a service to the customer. The majority of today's Frame Relay networks are public carrier-provided networks.

Private Enterprise Networks

More frequently, organizations worldwide are deploying private Frame Relay networks. In private Frame Relay networks, the administration and maintenance of the network are the responsibilities of the enterprise (a private company). All the equipment, including the switching equipment, is owned by the customer.

Frame Relay Frame Formats

To understand much of the functionality of Frame Relay, it is helpful to understand the structure of the Frame Relay frame. The first illustration in this section depicts the basic format of the Frame Relay frame, and the next illustrates the LMI version of the Frame Relay frame. Flags indicate the beginning and end of the frame. Three primary components make up the Frame Relay frame: the header and address area, the user-data portion, and the frame-check sequence (FCS).
The address area, which is 2 bytes in length, comprises 10 bits representing the actual circuit identifier and 6 bits of fields related to congestion management. The circuit identifier commonly is referred to as the data-link connection identifier (DLCI). Each of these is discussed in the descriptions that follow.

Standard Frame Relay Frame

Standard Frame Relay frames consist of the fields illustrated below.

Five fields comprise the Frame Relay frame.
The following descriptions summarize the basic Frame Relay frame fields illustrated above.
• Flags: Delimits the beginning and end of the frame. The value of this field is always the same and is represented either as the hexadecimal number 7E or the binary number 01111110.
• Address: Contains the following information:
  o DLCI: The 10-bit DLCI is the essence of the Frame Relay header. This value represents the virtual connection between the DTE device and the switch. Each virtual connection that is multiplexed onto the physical channel will be represented by a unique DLCI. The DLCI values have local significance only, which means that they are unique only to the physical channel on which they reside. Therefore, devices at opposite ends of a connection can use different DLCI values to refer to the same virtual connection.
  o Extended Address (EA): The EA bit indicates whether the byte that carries it is the last byte of the Address field. If the value is 1, then the current byte is determined to be the last DLCI octet. Although current Frame Relay implementations all use a two-octet DLCI, this capability does allow for longer DLCIs to be used in the future. The eighth bit of each byte of the Address field is used to indicate the EA.
  o C/R: The C/R is the bit that follows the most significant DLCI byte in the Address field. The C/R bit is not currently defined.
  o Congestion Control: This consists of the three bits that control the Frame Relay congestion-notification mechanisms. These are the FECN, BECN, and DE bits, which are the last three bits in the Address field.
    Forward-explicit congestion notification (FECN) is a single-bit field that can be set to a value of 1 by a switch to indicate to an end DTE device, such as a router, that congestion was experienced in the direction of the frame transmission from source to destination. The primary benefit of the use of the FECN and BECN fields is the ability of higher-layer protocols to react intelligently to these congestion indicators. Today, DECnet and OSI are the only higher-layer protocols that implement these capabilities.
    Backward-explicit congestion notification (BECN) is a single-bit field that, when set to a value of 1 by a switch, indicates that congestion was experienced in the network in the direction opposite of the frame transmission from source to destination.
    Discard eligibility (DE) is set by the DTE device, such as a router, to indicate that the marked frame is of lesser importance relative to other frames being transmitted. Frames that are marked as "discard eligible" should be discarded before other frames in a congested network. This allows for a fairly basic prioritization mechanism in Frame Relay networks.
• Data: Contains encapsulated upper-layer data. Each frame in this variable-length field includes a user data or payload field that will vary in length up to 16,000 octets. This field serves to transport the higher-layer protocol packet (PDU) through a Frame Relay network.
• Frame Check Sequence: Ensures the integrity of transmitted data. This value is computed by the source device and verified by the receiver to ensure integrity of transmission.
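The Address field described above can be decoded with a few bit operations. The sketch below assumes the common two-octet layout, read most significant bit first: six DLCI bits, C/R, and EA in the first octet, then four DLCI bits, FECN, BECN, DE, and EA in the second. The function names `parse_address` and `build_address` are illustrative only.

```python
def parse_address(addr: bytes) -> dict:
    """Decode a two-octet Frame Relay Address field (layout assumed as noted)."""
    if len(addr) != 2:
        raise ValueError("expected exactly two address octets")
    hi, lo = addr
    return {
        "dlci": ((hi >> 2) << 4) | (lo >> 4),  # 6 high-order bits + 4 low-order bits
        "cr": (hi >> 1) & 1,
        "ea_first": hi & 1,    # 0 means another address octet follows
        "fecn": (lo >> 3) & 1,
        "becn": (lo >> 2) & 1,
        "de": (lo >> 1) & 1,
        "ea_last": lo & 1,     # 1 marks the last address octet
    }


def build_address(dlci: int, fecn: int = 0, becn: int = 0, de: int = 0, cr: int = 0) -> bytes:
    """Encode the same fields back into two octets."""
    if not 0 <= dlci <= 1023:
        raise ValueError("a 10-bit DLCI must be in the range 0..1023")
    hi = ((dlci >> 4) << 2) | (cr << 1)                       # EA = 0 in the first octet
    lo = ((dlci & 0x0F) << 4) | (fecn << 3) | (becn << 2) | (de << 1) | 1
    return bytes([hi, lo])


if __name__ == "__main__":
    congested = build_address(16, fecn=1, de=1)
    print(congested.hex(), parse_address(congested))
```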
LMI Frame Format

Frame Relay frames that conform to the LMI specifications consist of the fields illustrated in the following figure:

Nine fields comprise the Frame Relay frame that conforms to the LMI format.
• Flag: Delimits the beginning and end of the frame.
• LMI DLCI: Identifies the frame as an LMI frame instead of a basic Frame Relay frame. The LMI-specific DLCI value defined in the LMI consortium specification is DLCI = 1023.
• Unnumbered Information Indicator: Sets the poll/final bit to zero.
• Protocol Discriminator: Always contains a value indicating that the frame is an LMI frame.
• Call Reference: Always contains zeros. This field currently is not used for any purpose.
• Message Type: Labels the frame as one of the following message types:
  o Status-inquiry message: Allows a user device to inquire about the status of the network.
  o Status message: Responds to status-inquiry messages. Status messages include keep-alives and PVC status messages.
• Information Elements: Contains a variable number of individual information elements (IEs). IEs consist of the following fields:
  o IE Identifier: Uniquely identifies the IE.
  o IE Length: Indicates the length of the IE.
  o Data: Consists of one or more bytes containing encapsulated upper-layer data.
• Frame Check Sequence (FCS): Ensures the integrity of transmitted data.
Enhanced Interior Gateway Routing Protocol (EIGRP) is an enhanced version of IGRP developed by Cisco; it provides superior convergence properties and operating efficiency, and combines the advantages of link-state protocols with those of distance-vector protocols.

Major Revisions of EIGRP

There are two major revisions of EIGRP, Versions 0 and 1. Cisco IOS versions earlier than 10.3(11), 11.0(8), and 11.1(3) run the earlier version of EIGRP. Some explanations in this module may not apply to the earlier versions. It is highly recommended to use the later version of EIGRP; it includes many performance and stability enhancements.
Early routing protocols were based on distance vectors; they were very simple and easy to implement but had the severe drawbacks of counting to infinity and routing loops. These problems were reduced using techniques such as split horizon, holddowns, and so on. These techniques, however, introduced long convergence times. Routing protocols based on link states have been implemented to address the problem of slow convergence in distance-vector protocols, but they add complexity in configuration and troubleshooting. EIGRP is an advanced distance-vector protocol that scales well, is easy to configure, and provides extremely quick convergence times with minimal network traffic. EIGRP is a classless protocol, meaning that it supports variable-length subnet mask (VLSM) and aggregation. An aggregate is a summarized group of addresses. In addition, EIGRP implements modules for IP, Internetwork Packet Exchange (IPX), and AppleTalk, which are responsible for the protocol-specific routing tasks. This training module includes information about only the IP module. Typically, distance-vector protocols maintain the information about only one path to the destination, the best path. This information consists of total metric (distance) and the next hop (vector) to the destination. For example:
If a router, Router_A, learns about a destination (Network A) from two different routers, it chooses the best path by examining the total metric of each path. After it chooses the best path, it discards any information about the alternate (nonbest) path. In the above example, the best path would be through Router_C (assuming hop count as a metric). All information about the path through Router_B is discarded. If the path between Router_A and Network A is somehow broken, Router_A removes the route from its routing table after a certain amount of time, usually three update periods (in the case of Routing Information Protocol (RIP), this time would be 90 seconds, not including any hold-down timers). After this route is removed from the routing table, Router_A then learns about Network A via Router_B (because Router_B has been sending periodic updates). It could take from 90 to 120 seconds before Router_A installs the new route to Network A through Router_B. EIGRP, on the other hand, builds a topology table from information it learns from each of its neighbors. The information sent by EIGRP is nonperiodic and contains only new information. Using the Diffusing Update Algorithm (DUAL), EIGRP then chooses a best path (successor) and alternate loop-free paths (feasible successors) that allow for fast convergence. This information is kept in a topology table separate from the routing table. Upon losing a route to a destination, EIGRP looks for feasible successors in its topology table. If a feasible successor does exist, EIGRP begins using it immediately. If no feasible successors exist, EIGRP queries its neighbors. For all the above to be accomplished, the components of EIGRP must provide:
• Reliable transport mechanism
• Neighbor discovery/recovery process, which allows EIGRP routers to discover and track other EIGRP-speaking routers that are on directly connected networks; part of this process must be done reliably (guaranteed)
• A way to discover which paths are loop free
• A process to clear bad routes from the topology table of all routers on the network
• A process for querying neighbors to find paths for lost destinations
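The convergence behavior described above (use a feasible successor immediately if one exists, otherwise query the neighbors) can be sketched as a single decision. This is a simplified illustration, not the DUAL state machine; the function name `on_successor_lost` and the tuple layout are hypothetical.

```python
def on_successor_lost(feasible_successors):
    """Decide what to do when the best path to a destination is lost.

    feasible_successors is a list of (next_hop, metric) tuples taken from
    the topology table. The result is ("use", next_hop) when a feasible
    successor can take over at once, or ("query", None) when the router
    has to ask its neighbors for a path.
    """
    if feasible_successors:
        # Promote the feasible successor with the lowest metric.
        next_hop, _metric = min(feasible_successors, key=lambda fs: fs[1])
        return ("use", next_hop)
    return ("query", None)


if __name__ == "__main__":
    print(on_successor_lost([("170.170.1.2", 20307200)]))
    print(on_successor_lost([]))
```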
The DUAL algorithm cannot be effective if messages are not transmitted reliably. Therefore, a reliable transport for ordered delivery and acknowledgment must be part of EIGRP. The Reliable Transport Protocol is a component of EIGRP that guarantees the delivery and order of EIGRP packets. EIGRP updates and hellos are destined to the multicast address 224.0.0.10. Each EIGRP neighbor receiving a multicast reliable packet will unicast an acknowledgment. State variables, such as sequence number and acknowledgment number, are maintained on a per-neighbor basis to ensure ordered delivery. EIGRP uses multiple packet types for reliable transport, all of which are identified by protocol number 88 in the IP header.
• Hello packets are used to discover and recover neighbors. They are multicast and use unreliable delivery (no acknowledgment necessary).
• Acknowledgments are used for reliable delivery and are always unicast.
• Updates are used to convey route information. Updates are transmitted only when there is a change in the topology; they contain only the changed information, and they are sent only to routers that require the information. If only one router requires the update information, the updates are unicast; otherwise the updates are multicast. Updates use reliable delivery.
• Queries and replies are used by DUAL. Queries can be multicast or unicast, and replies are always unicast. Queries and replies use reliable delivery.
Because EIGRP updates are nonperiodic and contain information only on paths that have changed, EIGRP relies on neighbor relationships to reliably propagate routing-table changes throughout the network. When an EIGRP router is initialized, it starts sending hello packets. Hello packets, when used for neighbor discovery, are always sent to the multicast address 224.0.0.10. The hello packet includes the EIGRP K-values (discussed later). Two routers will become neighbors only if the K-values in the hello packets are the same. This requirement enforces consistent metric usage throughout the network. Upon startup, two routers will become EIGRP neighbors when they see each other's hello packets on a common network.

Hello packets are sent out, by default, once every five seconds on high-bandwidth media and every 60 seconds on low-bandwidth media. The rate at which the hello packets are sent is called the hello interval. This interval can be changed on a per-interface basis with the interface subcommand ip hello-interval eigrp. Each hello packet that a router receives includes a hold time, the amount of time for which a router will consider a neighbor up without receiving a hello. Because the hold time is included in the hello packet, it is possible for two routers to become EIGRP neighbors even though the hello and hold timers do not match. The hold time is typically three times the hello interval. The hold time can be changed on a per-interface basis with the interface subcommand ip hold-time eigrp.

Information about each neighbor is maintained in a neighbor table, which can be viewed with the show ip eigrp neighbor command. The following is an example of the neighbor information for Router_B in the network shown above:
Router_B#show ip eigrp neighbor
IP-EIGRP neighbors for process 7
H   Address        Interface   Hold   Uptime   SRTT   RTO   Q Cnt   Seq Num
2   170.170.3.4    Et0         ...    ...      ...    ...   ...     8
1   170.170.3.3    Et0         ...    ...      ...    ...   ...     18
0   170.170.1.1    Se0         ...    ...      ...    ...   ...     17
Router_B#
The following is a description of what is included in the EIGRP neighbor table:

Show IP EIGRP Neighbors Field Descriptions

Field        Description
Process 7    Autonomous system number specified in the IP router configuration command
Address      IP address of the enhanced IGRP peer
Interface    Interface on which the router is receiving hello packets from the peer
Hold Time    Length of time, in seconds, that the router will wait to hear from the peer before declaring it down; if the peer is using the default hold time, this number will be less than 15; if the peer configures a nondefault hold time, it will be reflected here
Uptime       Elapsed time, in hours, minutes, and seconds, since the local router first heard from this neighbor
SRTT         Smooth round-trip time: the number of milliseconds it takes for an IP-enhanced IGRP packet to be sent to this neighbor and for the local router to receive an acknowledgment of that packet
RTO          Retransmission timeout, in milliseconds: the amount of time the router waits before retransmitting a packet from the retransmission queue to a neighbor
Q Count      Number of IP-enhanced IGRP packets (update, query, and reply) that the router is waiting to send
Seq Num      Sequence number of the last update, query, or reply packet that was received from this neighbor
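The hello and hold-time behavior described above can be sketched as a toy neighbor table. It models only two rules from the text: K-values must match before a neighbor is accepted, and the hold time comes from the peer's own hello packet. The class name, the method names, and the use of plain numbers for time are all assumptions made for the example.

```python
class NeighborTable:
    """Toy EIGRP neighbor table keyed by peer address (illustrative only)."""

    def __init__(self, local_k_values):
        self.local_k_values = local_k_values
        self.expires_at = {}  # peer address -> time at which its hold timer runs out

    def receive_hello(self, peer, k_values, hold_time, now):
        """Process a hello packet; return True if the peer is accepted."""
        if k_values != self.local_k_values:
            return False      # K-values differ, so no neighbor relationship forms
        # The hold time is taken from the peer's hello, so the two routers
        # do not need to be configured with the same timers.
        self.expires_at[peer] = now + hold_time
        return True

    def expire(self, now):
        """Remove and return the neighbors whose hold timer has run out."""
        dead = [peer for peer, deadline in self.expires_at.items() if deadline <= now]
        for peer in dead:
            del self.expires_at[peer]
        return dead


if __name__ == "__main__":
    table = NeighborTable((1, 0, 1, 0, 0))
    table.receive_hello("170.170.3.3", (1, 0, 1, 0, 0), hold_time=15, now=0)
    print(table.expire(now=10))  # hold timer still running
    print(table.expire(now=15))  # no hello for 15 seconds, neighbor is dropped
```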
When a router receives the hello packet from a new neighbor, EIGRP attempts to exchange routing updates with the neighbor. The updates contain all routes known by the sending routers and the metrics of those routes. When an EIGRP router receives updates from its neighbors, it builds a second table, the topology table, from which it builds a routing (forwarding) table. The topology table contains information needed to build a set of metrics and next hops to each reachable network, including:
• Lowest bandwidth on the path to the destination
• Total delay
• Path reliability
• Path loading
• Minimum-path maximum transmission unit (MTU)
• Feasible distance
• Reported distance
• Route source
We will see later how we can view the contents of the topology table.
As with most other routing protocols, the best path to a destination is the path with the lowest metric. EIGRP has the ability to use several variables to compute the metric to a destination network. The first five listed above are those variables: bandwidth, delay, reliability, load, and MTU. Only bandwidth and delay are used by default. It is highly recommended that the defaults be maintained, because using other variables can result in unknown problems in your network. The values of bandwidth and delay are determined from the bandwidth and delay values associated with the router interfaces. There are default values, but the values can be changed per interface with the bandwidth and delay interface subcommands.

The formula for computing EIGRP metrics follows:

Metric = {[K1 * Bandwidth + (K2 * Bandwidth)/(256 - Load) + K3 * Delay] * [K5/(Reliability + K4)]} * 256

The default K-values follow: K1 = 1; K2 = 0; K3 = 1; K4 = 0; K5 = 0. When K5 = 0, the [K5/(Reliability + K4)] term is not applied; therefore, the metric formula can be simplified to:

Metric = (Bandwidth + Delay) * 256
In this formula, Bandwidth = 10000000/Minimum bandwidth along the path, and Delay = Sum of delays along the path/10. Therefore, the final metric formula becomes:

Metric = ([10000000/Minimum bandwidth] + Sum of delays/10) * 256

Note: The formula uses the bandwidth in kilobits per second and the delay as configured on the interface, which is in microseconds.

Metric example:
In this example, the total cost (metric) for Router_A to get to Network A through Router_B would be:

Minimum bandwidth = 128 kbps
Total delay = 100 + 100 + 1000 = 1200 (in tens of microseconds)
([10000000/128] + 1200) * 256 = 20307200

The total cost to the same destination through Router_C follows:

Minimum bandwidth = 512 kbps
Total delay = 1000 + 100 + 100 = 1200 (in tens of microseconds)
([10000000/512] + 1200) * 256 = 5307200

A router truncates 10000000/512 to the integer 19531, so it reports 5307136 for this path.

The path through Router_C has the lowest cost. Router_A would, therefore, choose the path through Router_C as the best path and put it in its routing table. This path would then be known as the successor (explained later). In the above topology, the metric of Router_B to Network A would be 307200. Router_C would also have a metric of 307200 to Network A.
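The default-K calculation can be checked with a short function. This is a sketch under stated assumptions: the delay argument is already expressed in tens of microseconds (the interface delay in microseconds divided by 10), and the bandwidth term is truncated to an integer, which is what reproduces the values 20307200 and 5307136 that appear in the show ip eigrp topology output for Router_A. The name `eigrp_metric` is illustrative.

```python
def eigrp_metric(min_bandwidth_kbps: int, total_delay_tens_usec: int) -> int:
    """EIGRP metric with default K-values (K1 = K3 = 1, K2 = K4 = K5 = 0).

    min_bandwidth_kbps: lowest bandwidth along the path, in kbps.
    total_delay_tens_usec: sum of delays along the path, in tens of microseconds.
    """
    bandwidth_term = 10_000_000 // min_bandwidth_kbps  # integer division (assumed)
    return (bandwidth_term + total_delay_tens_usec) * 256


if __name__ == "__main__":
    print(eigrp_metric(128, 1200))   # Router_A to Network A through Router_B
    print(eigrp_metric(512, 1200))   # Router_A to Network A through Router_C
    print(eigrp_metric(10000, 200))  # a 10-Mbps path with 2000 microseconds of delay
```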
Successor, Feasible Distance, Reported Distance, and Feasible Successor

Successor is the best path to a given destination; it is the path that is installed into the routing table.

Feasible distance (FD) is the lowest calculated metric to each destination. In the above metric example, the path through Router_C was the best path because the calculated metric was the lowest through that path. Router_A would have a feasible distance of 5307200 for Network A.

Reported distance (RD) is the metric to a destination as advertised by a neighbor. In the above metric example, the metric that Router_C calculates for Network A is 307200. Router_A would see this metric as a reported distance.

A feasible successor (FS) is a path whose reported distance is less than the feasible distance. This condition, reported distance < feasible distance, is also known as the feasibility condition. A path that satisfies the feasibility condition is considered loop free. A path whose reported distance is larger than the feasible distance could possibly pass back through this router, causing a loop. In the above metric example, the reported distance that Router_A receives from Router_B is 307200 (the same as for the path through Router_C). This value is less than Router_A's feasible distance of 5307200, so the feasibility condition is met, and Router_B will be the feasible successor of Router_A for Network A.
All of the calculated metrics and distances defined above as well as some additional information can be viewed in an EIGRP router by issuing the show ip eigrp topology command. The following is the output when issuing a show ip eigrp topology on Router_A (Network A in this case is 170.170.4.0/24):

Router_A#show ip eigrp topology
IP-EIGRP Topology Table for process 7
Codes: P - Passive, A - Active, U - Update, Q - Query, R - Reply,
       r - Reply status
P 170.170.1.0/24, 1 successors, FD is 20256000
         via Connected, Serial0
P 170.170.2.0/24, 1 successors, FD is 5025536
         via Connected, Serial1
P 170.170.3.0/24, 1 successors, FD is 5281536
         via 170.170.2.3 (5281536/281600), Serial1
         via 170.170.1.2 (20281600/281600), Serial0
P 170.170.4.0/24, 1 successors, FD is 5307136
         via 170.170.2.3 (5307136/307200), Serial1
         via 170.170.1.2 (20307200/307200), Serial0
Router_A#

From the above output, we can see that for network 170.170.4.0, Router_A has a FD of 5307136, which is also the metric of the best route (the route through Serial1). The reported distance of Router_B is 307200. Because 307200 is less than 5307136, the feasibility condition is met and the route through Router_B is a FS. Note that show ip eigrp topology shows successors as well as FSs for each destination.

Let's look at the topology table of another router. If we display the topology table for Router_B, we will see the following:

Router_B#show ip eigrp topology
IP-EIGRP Topology Table for process 7
Codes: P - Passive, A - Active, U - Update, Q - Query, R - Reply,
       r - Reply status
P 170.170.1.0/24, 1 successors, FD is 2169856
         via Connected, Serial0
P 170.170.2.0/24, 1 successors, FD is 5281536
         via 170.170.3.3 (5281536/5255936), Ethernet0
         via 170.170.1.1 (20512000/5255936), Serial0
P 170.170.3.0/24, 1 successors, FD is 281600
         via Connected, Ethernet0
P 170.170.4.0/24, 1 successors, FD is 307200
         via 170.170.3.4 (307200/281600), Ethernet0

Now for network 170.170.4.0, we see a successor via 170.170.3.4 and a FD of 307200.
However, we don't see any FSs. Looking at the topology above, we see that Router_B should also hear about the network 170.170.4.0 from Router_A. This route is not displayed because it is not a FS. We can see all routes for destinations, including those that are not FSs, by using the show ip eigrp topology all-links command. The following is the output for Router_B:

    Router_B#show ip eigrp topology all-links
    IP-EIGRP Topology Table for process 7

    Codes: P - Passive, A - Active, U - Update, Q - Query, R - Reply,
           r - Reply status

    P 170.170.1.0/24, 1 successors, FD is 2169856, serno 15
             via Connected, Serial0
    P 170.170.2.0/24, 1 successors, FD is 5281536, serno 18
             via 170.170.3.3 (5281536/5255936), Ethernet0
             via 170.170.1.1 (20512000/5255936), Serial0
    P 170.170.3.0/24, 1 successors, FD is 281600, serno 1
             via Connected, Ethernet0
             via 170.170.1.1 (20537600/5281536), Serial0
    P 170.170.4.0/24, 1 successors, FD is 307200, serno 11
             via 170.170.3.4 (307200/281600), Ethernet0
             via 170.170.1.1 (20563200/5307136), Serial0
Now we can see that Router_B has learned about network 170.170.4.0 through a second path. But we also see that the reported distance for this second path is 5307136, which is greater than the FD of 307200; therefore, this second path is not a FS.
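The way the topology entries above are read can be sketched in Python. This is only an illustration: the regular expression, the function name, and the role labels are assumptions, and the sketch assumes that the successor is the entry whose computed metric equals the FD, which holds for the outputs shown here.

```python
import re

# Matches lines such as "via 170.170.3.4 (307200/281600), Ethernet0".
# The pair in parentheses is (computed distance / reported distance).
ENTRY = re.compile(r"via (\S+) \((\d+)/(\d+)\), (\S+)")


def classify_paths(feasible_distance, lines):
    """Label each 'via' line as successor, feasible successor, or not feasible."""
    results = []
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # e.g. "via Connected, Serial0" carries no metrics
        neighbor = match.group(1)
        computed = int(match.group(2))
        reported = int(match.group(3))
        if computed == feasible_distance:
            role = "successor"
        elif reported < feasible_distance:
            role = "feasible successor"
        else:
            role = "not feasible"
        results.append((neighbor, role))
    return results


# Entries for 170.170.4.0/24 taken from the outputs above.
router_a = classify_paths(5307136, [
    "via 170.170.2.3 (5307136/307200), Serial1",
    "via 170.170.1.2 (20307200/307200), Serial0",
])
router_b = classify_paths(307200, [
    "via 170.170.3.4 (307200/281600), Ethernet0",
    "via 170.170.1.1 (20563200/5307136), Serial0",
])
print(router_a)
print(router_b)
```

On Router_A the second path is a feasible successor; on Router_B it is not, which is why it appears only with the all-links keyword.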
As mentioned previously, if a router loses its best path (successor) to a given destination, it will check its EIGRP topology table for a FS. If one exists, it becomes the successor and the router can begin using it immediately. What happens if a router does not have a FS? EIGRP then needs a process to clear the bad route from the topology table of all routers in the network and a process to find new paths to the destination. These processes are defined by the DUAL state machine.

An EIGRP route is said to be in a passive state when EIGRP is not performing any DUAL computations for it. Certain events can occur to make EIGRP reassess the FS list for a given destination:

1. When there is a direct change to the topology table, such as changing the state of a directly connected link
2. When an EIGRP update, query, or reply packet is received
The reassessment could result in an existing FS becoming the successor, in which case the FD is updated and updates are sent to all neighbors. The route remains passive during the reassessment. If a FS cannot be found in the topology table, EIGRP begins performing a DUAL computation and the route becomes active.

When a route is active, a router sends queries to all of its neighbors. Each neighbor then performs its own local computation. If the neighbor has at least one FS for the destination in question, it sends a reply, containing its metric to the destination, to the router that originated the query. If the neighbor does not have any FSs, it also generates queries to all of its neighbors, meaning that the route becomes active in the neighboring router as well.

In some instances, it may take a very long time before a querying router receives a reply from one or more of its neighbors. Possible causes include a very large network, low-quality links, and high CPU utilization on a neighbor. If a querying router does not receive all expected replies within a certain amount of time, the route is declared "stuck in active" (SIA). At this point, the router that originated the query will reset the neighbor that hasn't responded to the query. A query example follows:
In the above diagram, Router_A has two possible paths to get to network 170.170.4.0/24. If we calculate the metrics for this network, we find that the best path of Router_A to Network A is through Router_B with a metric of 20307200. Therefore, the FD is 20307200. Router_C is reporting a distance of 20537600, the distance for its best path through Router_D. As far as Router_A is concerned, the distance reported from Router_C is greater than the FD; therefore, the feasibility condition is not met and Router_A does not have any FS.
If the link between Router_A and Router_B fails, as shown in the first diagram, Router_A checks its topology table for a FS. We determined earlier that Router_A does not have a FS, so it queries all of its other EIGRP neighbors, as shown in the second diagram. Router_C then checks its topology table for a valid successor or FS. If either one is found, Router_C sends a reply to Router_A, as shown in the diagram below. The reply includes the metric of Router_C to the network.
Router_A then installs the new route to the network in its topology table and into its routing table.
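The local decision a router makes when it loses its successor can be sketched as follows. This is a deliberately simplified Python illustration, not the DUAL state machine itself: the function name and the returned dictionary are assumptions, and the set of neighbors to query is approximated by the neighbors that still report the destination.

```python
def react_to_successor_loss(feasible_distance, reported_by_neighbor):
    """Decide whether a route stays passive or goes active.

    reported_by_neighbor maps each remaining neighbor to the distance
    it reports for the destination.
    """
    feasible = [
        neighbor
        for neighbor, reported in reported_by_neighbor.items()
        if reported < feasible_distance
    ]
    if feasible:
        # A feasible successor exists, so no query is needed.
        return {"state": "passive", "feasible_successors": feasible, "query": []}
    # No feasible successor: the route goes active and neighbors are queried.
    return {
        "state": "active",
        "feasible_successors": [],
        "query": list(reported_by_neighbor),
    }


# Query example: Router_A's FD is 20307200 and Router_C reports 20537600.
query_example = react_to_successor_loss(20307200, {"Router_C": 20537600})

# Earlier topology: Router_A's FD is 5307136 and Router_B reports 307200.
earlier_example = react_to_successor_loss(5307136, {"Router_B": 307200})

print(query_example)
print(earlier_example)
```

In the first case the route goes active and Router_C is queried; in the second case the route stays passive because Router_B already satisfies the feasibility condition.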
Two types of summarization can be used with EIGRP: autosummarization and manual summarization. By default, EIGRP autosummarizes on major network boundaries when it is first configured. This behavior is similar to other distance-vector protocols, such as RIP and IGRP. An example of autosummarization follows:
In the topology shown in the above diagram, Router_B advertises only 180.180.0.0/16 to Router_D because Router_B is a boundary between two major networks, network 180.180.0.0 and network 170.170.0.0. For this same reason, Router_B advertises only network 170.170.0.0/16 to Router_A. Because Router_B is doing the summarization, it installs a route for the summarized address with a next hop of null0 (see the output below).

    Router_B#sh ip route 170.170.0.0
    Routing entry for 170.170.0.0/16, 3 known subnets
      Attached (1 connections)
      Variably subnetted with 2 masks
      Redistributing via eigrp 7

    D       170.170.0.0/16 is a summary, 00:00:27, Null0
    C       170.170.3.0/24 is directly connected, Ethernet0
    D       170.170.4.0/24 [90/307200] via 170.170.3.4, 00:00:42, Ethernet0
    Router_B#
    Router_B#show ip eigrp topology 170.170.0.0 255.255.0.0
    IP-EIGRP topology entry for 170.170.0.0/16
      State is Passive, Query origin flag is 1, 1 Successor(s), FD is 281600
      Routing Descriptor Blocks:
      0.0.0.0 (Null0), from 0.0.0.0, Send flag is 0x0
          Composite metric is (281600/0), Route is Internal
          Vector metric:
            Minimum bandwidth is 10000 Kbit
            Total delay is 1000 microseconds
            Reliability is 255/255
            Load is 1/255
            Minimum MTU is 1500
            Hop count is 0

To get Router_B to advertise the subnets of networks 170.170.0.0 and 180.180.0.0, we could turn off autosummarization with the EIGRP no auto-summary configuration command. This step is usually desirable in topologies that have discontiguous networks.

If in the above example of autosummarization Router_A were also configured to run EIGRP on its Ethernet link with autosummarization turned off, Router_B would receive the subnet information about 190.190.1.0/24 from Router_A. However, even if Router_B has autosummarization enabled, Router_B would not summarize 190.190.1.0 down to 190.190.0.0/16 when it advertised the network to Router_C because network 190.190.0.0/16 is not directly connected to Router_B.

EIGRP autosummarizes internal networks; it does not autosummarize external networks. An external network is a network that originated in another autonomous system and was redistributed into this EIGRP network (redistribution is discussed in the next section). External networks can be summarized manually with the EIGRP ip summary-address eigrp interface subcommand. An example of manual summarization follows:

In the topology of the previous figure, if the network 190.190.1.0 is not part of the EIGRP process running on Router_A, the only way to get the network advertised is to redistribute it into EIGRP, making it an external route as far as Router_B is concerned.
See below:

    Router_B#sh ip route
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - IS-IS, L1 - ISIS level-1, L2 - ISIS level-2, * - candidate default
           U - per-user static route

    Gateway of last resort is not set

         190.190.0.0/24 is subnetted, 1 subnets
    D EX    190.190.1.0 [170/2560256256] via 180.180.1.1, 00:00:31, Serial0
         180.180.0.0/16 is variably subnetted, 2 subnets, 2 masks
    D       180.180.0.0/16 is a summary, 04:10:02, Null0
    C       180.180.1.0/24 is directly connected, Serial0
         170.170.0.0/16 is variably subnetted, 3 subnets, 2 masks
    D       170.170.0.0/16 is a summary, 04:09:47, Null0
    C       170.170.3.0/24 is directly connected, Ethernet0
    D       170.170.4.0/24 [90/307200] via 170.170.3.4, 04:10:02, Ethernet0
As we can see above, Router_B receives all the subnet information for network 190.190.1.0; the network is not autosummarized. It shows up as an external network denoted by the D EX flag in the above output.
Now if we configure the following command under the serial interface of Router_A:

    ip summary-address eigrp 7 190.190.0.0 255.255.0.0

we will see the following in the routing table of Router_B:

    Router_B#sh ip route
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - ISIS, L1 - ISIS level-1, L2 - ISIS level-2, * - candidate default
           U - per-user static route

    Gateway of last resort is not set

    D    190.190.0.0/16 [90/2560256256] via 180.180.1.1, 00:00:08, Serial0
         180.180.0.0/16 is variably subnetted, 2 subnets, 2 masks
    D       180.180.0.0/16 is a summary, 00:51:59, Null0
    C       180.180.1.0/24 is directly connected, Serial0
         170.170.0.0/16 is variably subnetted, 3 subnets, 2 masks
    D       170.170.0.0/16 is a summary, 00:00:08, Null0
    C       170.170.3.0/24 is directly connected, Ethernet0
    D       170.170.4.0/24 [90/307200] via 170.170.3.4, 00:51:59, Ethernet0
Notice that the network has been summarized and is now seen as an internal EIGRP network. The ip summary-address eigrp command allows us not only to summarize subnets down to the major network, but also to aggregate major networks into a single supernet. For example, networks 200.200.64.0/24 through 200.200.95.0/24 could be aggregated by the single route 200.200.64.0/19.
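An aggregate like this can be checked with Python's standard ipaddress module. The sketch below collapses the thirty-two /24 networks from 200.200.64.0 through 200.200.95.0 into the tightest covering prefix; the variable names are illustrative only.

```python
import ipaddress

# The thirty-two /24 networks 200.200.64.0 through 200.200.95.0.
networks = [ipaddress.ip_network(f"200.200.{third}.0/24") for third in range(64, 96)]

# collapse_addresses merges adjacent networks into the fewest possible prefixes.
aggregate = list(ipaddress.collapse_addresses(networks))
print(aggregate)

# A shorter prefix also covers every network, but it takes in extra address space.
wider = ipaddress.ip_network("200.200.64.0/18")
all_inside_wider = all(net.subnet_of(wider) for net in networks)
extra_addresses = wider.num_addresses - sum(net.num_addresses for net in networks)
print(all_inside_wider, extra_addresses)
```

The tightest aggregate is a /19. A /18 would also match all thirty-two networks, but it additionally covers 200.200.96.0 through 200.200.127.255.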
It is possible for a router to be running multiple routing processes. These processes could be different processes of the same routing protocol or different routing protocols altogether. When the processes are different protocols, the router needs a way to determine which route to install in the routing table. This determination is done with the administrative distance. Administrative distance is a number given to each protocol; the lower the administrative distance, the more believable the protocol is as far as the router is concerned. The administrative distance is a number that is local to the router; it is not included in any advertisements. EIGRP has an administrative distance of 90; external EIGRP has an administrative distance of 170. A list of route sources and their default administrative distances follows:
    Route Source            Default Distance
    Connected Route         0
    Static Route            1
    EIGRP Summary Route     5
    External BGP            20
    EIGRP Internal Route    90
    IGRP                    100
    OSPF                    110
    ISIS                    115
    RIP                     120
    EGP                     140
    EIGRP External Route    170
    Internal BGP            200
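How the administrative distance decides between competing sources can be sketched in Python. The values are the standard Cisco IOS defaults; the dictionary and function names are assumptions made for this illustration.

```python
# Standard Cisco IOS default administrative distances.
DEFAULT_ADMIN_DISTANCE = {
    "Connected Route": 0,
    "Static Route": 1,
    "EIGRP Summary Route": 5,
    "External BGP": 20,
    "EIGRP Internal Route": 90,
    "IGRP": 100,
    "OSPF": 110,
    "ISIS": 115,
    "RIP": 120,
    "EGP": 140,
    "EIGRP External Route": 170,
    "Internal BGP": 200,
}


def preferred_source(sources):
    """Return the route source with the lowest administrative distance."""
    return min(sources, key=lambda source: DEFAULT_ADMIN_DISTANCE[source])


# The same prefix learned from three protocols: internal EIGRP wins.
winner = preferred_source(["OSPF", "EIGRP Internal Route", "RIP"])

# A redistributed (external) EIGRP route loses to OSPF.
winner_external = preferred_source(["EIGRP External Route", "OSPF"])
print(winner, winner_external)
```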
Redistribution is the means of taking routes learned via a given routing protocol and advertising those routes via another routing protocol or a different process of the same protocol. When routes are redistributed into EIGRP, they become EIGRP external routes, as we saw in previous examples. These external routes will have an administrative distance of 170. The redistribute command is used to redistribute routes into EIGRP. If the routes that we want to redistribute into EIGRP are learned via a protocol that does not have EIGRP-compatible metrics (IGRP is the only other protocol that has EIGRP-compatible metrics), we must tell EIGRP what metric to use when it advertises the route. We do this as part of the redistribute command or with the default-metric command. We will discuss redistribution in slightly more detail in one of the configuration labs.
Like other Interior Gateway Protocols (IGPs), EIGRP load balances across equal-cost paths to a given destination. By default, EIGRP installs up to four equal-cost paths into the routing table. The variance command allows EIGRP to load balance over unequal-cost paths. Variance is a multiplier applied to the best metric to a given destination. Any path to the same destination that has a metric less than the best metric multiplied by the variance will be installed in the routing table. For example, we have four paths to a given destination with the following metrics:

    Path 1: 500
    Path 2: 500
    Path 3: 1100
    Path 4: 2000

By default, the router with the above metrics to a given destination installs Paths 1 and 2 in the routing table because they have equal metrics. If variance 3 is configured under router eigrp for this router, then Paths 1, 2, and 3 are all installed in the routing table because 1100 < (3 x 500). However, Path 4 is not installed because 2000 is not less than 3 x 500. The router then divides the traffic among Paths 1, 2, and 3 by dividing the metric of each path into the largest metric among the installed paths and rounding down to the nearest integer. For example:

    Path 1: 1100/500 = 2
    Path 2: 1100/500 = 2
    Path 3: 1100/1100 = 1

Thus we will send two packets via Path 1, two packets via Path 2, and one packet via Path 3. Then the router starts with Path 1 again.
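The variance arithmetic above can be reproduced with a short Python sketch. The function name is an assumption, and the sketch covers only the metric comparison and the traffic-share division described here; a real router also requires an unequal-cost path to be a feasible successor before installing it.

```python
def unequal_cost_shares(metrics, variance=1):
    """Return the traffic share per installed path for a given variance.

    metrics maps a path name to its metric for one destination.
    """
    best = min(metrics.values())
    # Install the best path(s) plus any path whose metric is below variance x best.
    installed = {
        name: metric
        for name, metric in metrics.items()
        if metric == best or metric < best * variance
    }
    largest = max(installed.values())
    # Share = largest installed metric divided by the path metric, rounded down.
    return {name: largest // metric for name, metric in installed.items()}


paths = {"Path 1": 500, "Path 2": 500, "Path 3": 1100, "Path 4": 2000}

default_shares = unequal_cost_shares(paths)
variance_3_shares = unequal_cost_shares(paths, variance=3)
print(default_shares)
print(variance_3_shares)
```

With variance 3 the shares are 2, 2, and 1 for Paths 1 through 3, so out of every five packets two go over each of the two best paths and one goes over Path 3.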
A default route is a route in the routing table that is used as a last resort for a particular destination, meaning that there aren't any more specific routes in the routing table for the destination. A default route can be injected into EIGRP in three different ways. One way is to use the ip default-network command. This command works the same way that it does for IGRP. The second way is to have a static default route and redistribute it into EIGRP. For example (x.x.x.x is the next hop):

    ip route 0.0.0.0 0.0.0.0 x.x.x.x
    router eigrp 10
     redistribute static metric 10000 1 255 1 1500

Finally, a third way is to use manual summarization to generate a default route. For example:

    int s 0
     ip summary-address eigrp 7 0.0.0.0 0.0.0.0
This third method is desirable when we want to limit which neighbors receive the default route. Because the ip summary-address eigrp command is an interface subcommand, we can choose on a per-interface basis where the default route is advertised.