VIRTUALRACK for Network Engineers: October 2011

Monday, October 31, 2011

notes: NBAR

Cisco offers multiple approaches to identify packets to mark. For example, packets can be classified and marked if they match a particular access list or if they come into a router on a particular interface. However, one of the most powerful Cisco IOS tools for performing packet classification is Network-Based Application Recognition (NBAR), which can look beyond Layer 4 information, all the way up to the application layer, where NBAR can recognize such packet attributes as character strings in a URL.

- NBAR is a classification engine that can identify traffic/protocols at an application level.
- NBAR looks into the TCP/UDP payload itself and classifies packets based on content within the payload, such as the transaction identifier, message type, or other similar data.
- NBAR natively supports many predefined application/protocols, which can be seen with "match protocol ?"
- A PDLM (Packet Description Language Module) is a file that extends the set of protocols that NBAR can recognize.
- New PDLMs can be downloaded from Cisco.com and can be loaded from flash memory.
- NBAR protocol discovery can be used to track and provide statistics on which protocols transit an interface.
- Custom NBAR mappings allow well-known protocols to be defined in the network as NBAR protocols with "ip nbar port-map".

- "match protocol http" explained:
> Using NBAR to match HTTP traffic provides three match criteria:
> Domain Hostname - The URL portion between 'http://' and the first slash '/'
> URL-entry - The URL portion after the first slash '/'
> Mime type - The media content of a website.
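
The host and URL split described above can be illustrated with a short Python sketch. This only shows which part of a URL each criterion refers to; it is not how NBAR itself is implemented, and the function name and example URL are hypothetical.

```python
from urllib.parse import urlsplit

def http_match_fields(url):
    """Split a URL into the parts the notes call 'host' and 'URL'.

    host: the portion between 'http://' and the first slash
    url:  the portion after the first slash
    """
    parts = urlsplit(url)
    return {"host": parts.netloc, "url": parts.path[1:]}

print(http_match_fields("http://www.example.com/images/logo.gif"))
# {'host': 'www.example.com', 'url': 'images/logo.gif'}
```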

-----------
COMMANDS
-----------

- Shows the default NBAR port mappings for applications

sh ip nbar port-map

- Shows the version of the PDLMs

sh ip nbar version

- Shows traffic classes and statistics NBAR discovered

sh ip nbar protocol-discovery

- Matches NBAR applications in a class-map

class-map {name}
match protocol {protocol}

- Specifies where to load a new PDLM from

ip nbar pdlm {unc path}

- Maps well-known port/s of a protocol to an NBAR application

ip nbar port-map custom {name} {tcp|udp} {port|range}

- Enables NBAR protocol discovery

interface s0/0
ip nbar protocol-discovery

notes: Cisco Modular QoS CLI (MQC)

- MQC is short for Modular Quality of Service CLI (Command Line Interface).
- MQC provides a framework for multiple QOS methods to be applied in the same direction on the same interface in contrast to legacy QOS mechanisms.

Mechanics of MQC
MQC separates the classification function of a QoS tool from the action (PHB) that the QoS tool wants to perform. To do so, there are three major commands with MQC, with several subordinate commands:
■ The class-map command defines the matching parameters for classifying packets into service classes.
■ The PHB actions (marking, queuing, and so on) are configured under a policy-map command.
■ The policy map is enabled on an interface by using a service-policy command.

Class-maps

- The purpose of class-maps is to classify traffic.
- Class-map names are Case-Sensitive.
- The match sub-commands are used to specify various criteria for classifying packets.
- If a packet matches the specified criteria, that packet is considered a member of the class.
- If a packet does not match the class criteria, it is evaluated against the next class.
- Packets that fail to match any of the class-maps are classified as members of the default traffic class.
- If more than one "match" criterion exists in the class-map, an evaluation instruction should be specified.
- The instruction could be one of the following: ('match-all' is the default)
- match-any - The traffic being evaluated by the class-map must match at least one of the "match" statements.
- match-all - The traffic being evaluated by the class-map must match ALL of the "match" statements.
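
The difference between the two instructions can be sketched in Python. This is a toy model under stated assumptions: each "match" statement is represented as a hypothetical predicate over a packet dictionary, and the field names are made up for illustration.

```python
def in_class(packet, matches, mode="match-all"):
    """Return True if the packet is a member of the class.

    matches: list of predicates, one per "match" statement (hypothetical model)
    mode:    "match-all" (the default) or "match-any"
    """
    results = [match(packet) for match in matches]
    if mode == "match-all":
        return all(results)
    if mode == "match-any":
        return any(results)
    raise ValueError(f"unknown mode: {mode}")

# Two illustrative criteria: UDP traffic, and DSCP EF (46).
is_udp = lambda p: p["proto"] == "udp"
is_ef = lambda p: p["dscp"] == 46

packet = {"proto": "udp", "dscp": 0}
print(in_class(packet, [is_udp, is_ef], "match-all"))  # False
print(in_class(packet, [is_udp, is_ef], "match-any"))  # True
```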

Policy-maps

- Are used to configure the QOS features that should be associated with the traffic that has been classified with class-maps.
- Policy-map names are Case-Sensitive.
- Multiple class-maps can be referenced; they are evaluated sequentially, top-down.

MQC Class-Default
- MQC always has a default class created named 'class-default'.
- Any traffic not matched by a higher class will belong to the class class-default.
- If no other class-maps were defined in a policy-map, ALL traffic will belong to the class class-default.
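
The top-down evaluation with the class-default fallback can be sketched as follows. This is a simplified model, not IOS behavior in full detail; the class names and packet fields are hypothetical.

```python
def select_class(packet, policy):
    """Return the name of the first class whose predicate matches.

    policy: ordered list of (class_name, predicate) pairs, evaluated top-down.
    Traffic that matches no class falls into 'class-default'.
    """
    for name, predicate in policy:
        if predicate(packet):
            return name
    return "class-default"

policy = [
    ("VOIP", lambda p: p["dscp"] == 46),
    ("WEB", lambda p: p["dst_port"] in (80, 443)),
]

print(select_class({"dscp": 46, "dst_port": 80}, policy))  # VOIP (first match wins)
print(select_class({"dscp": 0, "dst_port": 22}, policy))   # class-default
print(select_class({"dscp": 0, "dst_port": 22}, []))       # class-default
```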

Steps to configure MQC policies:
1. Define traffic classifications using class-maps.
2. Create the policy-map, and apply the QOS features to the individual class-maps.
3. Apply the policy-map to an interface inbound or outbound.

- MQC Classification options
> Access-lists
> DSCP
> IP Precedence
> NBAR (see the NBAR notes above)
> Packet Length
> FR-DE
> Interface
> QOS-group

- MQC Marking options
> Atm-clp
> Cos
> Discard-class
> Dscp
> Fr-de

- Matching VOIP traffic can be done in two ways:
> Matching UDP/RTP headers and RTP port numbers:

class-map VOIP
match ip rtp 16384 16383

> Using NBAR (matches RTP voice payload type values 0-23):

class-map VOIP
match protocol rtp audio

-----------
COMMANDS
-----------

- Shows the configured class-map/s

sh class-map [name]

- Shows the configured policy-map/s

sh run policy-map [name]

- Shows the policy-map info and counters associated with the interface

sh policy-map interface {int}

- Creates a class-map for classification, (default = match-all)

class-map [match-all | match-any] {name}

- Specifies the various match criteria

match {options}

- Creates a policy-map

policy-map {name}

- References previously created class-maps

class {name | class-default}

- Specifies a specific QOS feature for the class

{bandwidth | priority | shape | police}

- References nested policy-maps

service-policy {nested-policy}

- Applies a policy-map to an interface

interface s0/0
service-policy {input | output} {policy-name}

Saturday, October 29, 2011

notes: QoS Packet Headers

ToS byte - 1-byte field. The ToS byte was intended to be used as a field to mark a packet for treatment with QoS.

IP Precedence – 3 bit field in ToS Byte of the IP header originally used to classify & prioritize types of traffic.

DiffServ – maintains interoperability with non-DiffServ-compliant devices (those that use IP Precedence)

- standardized a redefinition of the ToS byte.

The ToS byte itself was renamed the Differentiated Services (DS) field, and IPP was replaced with a 6-bit field (high-order bits 0–5) called the Differentiated Services Code Point (DSCP) field. Later, RFC 3168 defined the low-order 2 bits of the DS field for use with the QoS Explicit Congestion Notification (ECN) feature.
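
As a small illustration of the bit layout just described, the following Python sketch splits a ToS/DS byte into its IP Precedence, DSCP, and ECN views. The function name is hypothetical; the bit positions follow the description above.

```python
def split_ds_byte(tos):
    """Split the 1-byte ToS/DS field into its sub-fields.

    ip_precedence: the 3 high-order bits (legacy IPP view)
    dscp:          the 6 high-order bits (DiffServ view)
    ecn:           the 2 low-order bits (RFC 3168)
    """
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS/DS field is a single byte")
    return {"ip_precedence": tos >> 5, "dscp": tos >> 2, "ecn": tos & 0b11}

# 0xB8 = 1011 1000: DSCP 46 (EF), which a precedence-only device reads as IPP 5.
print(split_ds_byte(0xB8))
# {'ip_precedence': 5, 'dscp': 46, 'ecn': 0}
```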

DSCP Settings and Terminology

1. Expedited Forwarding (EF) - The EF RFC defines a DSCP of decimal 46, with the name Expedited Forwarding (EF). According to that RFC, packets marked as EF should be given queuing preference so that they experience minimal latency, but the packets should be policed to prevent them from taking over a link and preventing any other types of traffic from exiting an interface during periods when this high-priority traffic reaches or exceeds the interface bandwidth. These suggested settings, and the associated QoS behavior recommended when using each setting, are called Per-Hop Behaviors (PHBs) by DiffServ. (The particular example listed in this paragraph is called the Expedited Forwarding PHB.)
- The EF traffic class is given strict priority queueing above all other traffic classes.
- The design aim of EF is to provide a low loss, low latency, low jitter, end-to-end expedited service through the network.
- The EF traffic class is suitable for voice, video and other real-time services.
2. Class Selector PHB - The Class Selector (CS) PHBs provide backward compatibility with IPP. A classification and marking (C&M) feature can set a CS DSCP value, and if another router or switch just looks at the IPP field, the value will make sense from an IPP perspective.
- Each IP precedence value gets mapped to a DiffServ value known as Class-Selector code-points.
- The CS code-points are in the form 'xxx000'.
- The first three bits 'xxx' are the IP precedence bits for backwards compatibility, while the last 3 bits are set to zero.
- If a packet is received from a non-DiffServ aware router that used IP precedence markings, the DiffServ router can still understand the encoding as a Class-Selector code-point.

3. Assured Forwarding
- The AF behaviour allows the operator to provide assurance of delivery as long as the traffic does not exceed the subscribed rate.
- Traffic that exceeds the subscription rate faces a higher probability of being dropped during times of congestion.
- The DiffServ architecture defines 4 separate classes in the AF PHB (Per Hop Behaviour).
- Within each class (1 to 4), packets are given a drop precedence (1 to 3) (low=1, medium=2 or high=3).
- The 1st three bits of the six-bit DSCP field define the class, the next two bits define the drop-probability, and the last bit is reserved (= zero).
- AF is presented in the format AFxy, where ‘x’ represents the AF-class (HIGHER class value is more PREFERRED) and ‘y’ represents the drop-probability (HIGHER value is more likely to be DROPPED).

- AF23, for example, denotes class 2 and a high drop preference of 3.
- If AF23 was competing with AF21, AF23 will be dropped before AF21, since they are in the same class and AF23 has the higher drop value.
- But if AF33 and AF21 were competing, AF33 is in a more important class, therefore AF21 will be dropped first.
- A nice formula to work out the decimal value of the AF bits, will be 8x+2y. Example AF31 = (8*3) + (2*1), thus AF31 = 26.
- Alternatively, if the predefined DiffServ values are not used, any of the 64 DSCP values (0-63) can be used, by configuring just that decimal value. (The higher the decimal value the more preferred)
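
The Class Selector form 'xxx000' and the AF formula 8x+2y can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical and the arithmetic follows the bit layouts described above.

```python
def cs_dscp(precedence):
    """Class Selector code-point: IP precedence bits followed by 000."""
    if precedence not in range(8):
        raise ValueError("IP precedence is 0-7")
    return precedence << 3

def af_dscp(af_class, drop):
    """AFxy code-point: 3 class bits, 2 drop bits, 1 zero bit (8x + 2y)."""
    if af_class not in range(1, 5) or drop not in range(1, 4):
        raise ValueError("AF class is 1-4 and drop precedence is 1-3")
    return 8 * af_class + 2 * drop

print(cs_dscp(5), format(cs_dscp(5), "06b"))        # 40 101000
print(af_dscp(3, 1), format(af_dscp(3, 1), "06b"))  # 26 011010
print(af_dscp(2, 3))                                # 22
```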

Ethernet LAN Class of Service
Ethernet supports a 3-bit QoS marking field, but the field only exists when the Ethernet header includes either an 802.1Q or ISL trunking header. IEEE 802.1Q defines its QoS field as the 3 most-significant bits of the 2-byte Tag Control field, calling the field the user-priority bits. ISL defines the 3 least-significant bits from the 1-byte User field, calling this field the Class of Service (CoS).
Generally speaking, most people (and most IOS commands) refer to these fields as CoS, regardless of the type of trunking. Figure 12-2 shows the general location of the CoS field inside ISL and 802.1P headers.
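
The bit positions described above can be sketched in Python. The CoS positions come from the text; the 12-bit VLAN ID in the low-order bits of the 802.1Q Tag Control field is an addition from the 802.1Q tag layout, and the function names and sample values are hypothetical.

```python
def dot1q_fields(tag_control):
    """Extract CoS and VLAN ID from the 2-byte 802.1Q Tag Control field.

    cos:  the 3 most-significant bits (user-priority bits)
    vlan: the 12 least-significant bits (not covered in the notes above)
    """
    if not 0 <= tag_control <= 0xFFFF:
        raise ValueError("Tag Control field is 2 bytes")
    return {"cos": tag_control >> 13, "vlan": tag_control & 0x0FFF}

def isl_cos(user_field):
    """Extract CoS from the 3 least-significant bits of the 1-byte ISL User field."""
    if not 0 <= user_field <= 0xFF:
        raise ValueError("User field is 1 byte")
    return user_field & 0b111

print(dot1q_fields(0xA064))  # {'cos': 5, 'vlan': 100}
print(isl_cos(0b00000101))   # 5
```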

notes: QoS Overview

Voice, video, and data travel side by side over today’s converged networks. Some of these traffic types (for example, VoIP) need better treatment (that is, higher priority) than other types of traffic (for example, FTP). Fortunately, Cisco offers a suite of QoS tools for providing special treatment for special traffic.
In the absence of QoS, traffic might suffer from one or more of the following symptoms:

1. Delay (latency): Excessive time required for a packet to traverse the network
2. Delay variation (jitter): The uneven arrival of packets, which in the case of VoIP can be interpreted by the listener as dropped voice packets
3. Packet loss: Dropping packets, especially problematic for User Datagram Protocol (UDP) traffic (for example, VoIP), which does not retransmit dropped packets.

- The TX-Ring/Hardware queue is always FIFO. It can be seen with the "sh controllers" command.
- QOS affects how traffic is processed in the output queue/software queue before the hardware queue.
- Queueing is always applied outbound to the interface.
- Shaping is always applied outbound to the interface.
- Policing can be applied inbound or outbound to the interface.
- The default input hold-queue limit is 75 packets. 10 packets for async interfaces.
- The default output hold-queue limit is 40 packets. 10 packets for async interfaces.
- A length of 1000 will normally resolve problems caused by input queue drops of TCP ACKs, but will introduce bigger delay.

commands:

- Shows the TX queue length for an interface

sh controllers Se0/0 | i tx_limit

- Changes the ToS marking (default IP precedence 6) for Telnet sessions originated by the local router

ip telnet tos {tos-value} 

- Changes the TX queue length for an interface (tx-ring-limit)
- Sets the length of time used for load calculations (load-interval)
- Limits the size of the IP queue on an interface (hold-queue)

interface S0/0

tx-ring-limit {number}

load-interval {seconds}

hold-queue {length} {in|out}

notes: EIGRP Error Messages

Some EIGRP error messages that occur in the log have mystified many network administrators. This section discusses some of the most common EIGRP errors that appear and the meanings behind these EIGRP error messages:

DUAL-3-SIA—

This message means that the primary route is gone and no feasible successor is available. The router has sent out the queries to its neighbor and has not heard the reply from a particular neighbor for more than three minutes. The route state is now stuck in active state. A more detailed discussion about this error is in the "Troubleshooting EIGRP Neighbor Relationships" section.

Neighbor not on common subnet—

This message means that the router has heard a hello packet from a neighbor that is not on the same subnet as the router. A more detailed discussion about this error also can be found in the "Troubleshooting EIGRP Neighbor Relationships" section.

DUAL-3-BADCOUNT—

Badcount means that EIGRP believes that it knows of more routes for a given network than actually exist. It's typically (not always) seen in conjunction with DUAL-3-SIAs, but it is not believed to cause any problems by itself.

Unequal, <route>, dndb=<metric>, query=<metric>—

This message is informational only. It says that the metric the router had at the time of the query does not match the metric that it had when it received the reply.

DUAL-3-INTERNAL: IP-EIGRP Internal Error—

This message indicates that there is an EIGRP internal error. However, the router is coded to fully recover from this internal error. The EIGRP internal error is caused by a software problem and should not affect the operation of the router. The plan of action is to report this error to the TAC and have the experts decode the traceback message. Have them identify the bug number and upgrade Cisco IOS Software accordingly.

IP-EIGRP: Callback: callbackup_routes—

At some point, EIGRP attempted to install routes to the destinations and failed, most commonly because of the existence of a route with a better administrative distance. When this occurs, EIGRP registers its route as a backup route. When the better route disappears from the routing table, EIGRP is called back through callbackup_routes so that it can attempt to reinstall the routes that it is holding in the topology table.

Error EIGRP: DDB not configured on interface—

This means that when the router's interface receives an EIGRP hello packet and the router goes to associate the packet with a DDB (DUAL descriptor block) for that interface, it does not find one that matches. In other words, the router is receiving a hello packet on an interface that doesn't have EIGRP configured.

Poison squashed—

The router threads a topology table entry as a poison in reply to an update (the router set up for poison reverse). While the router is building the packet that contains the poison reverse, the router realizes that it doesn't need to send it. For example, if the router receives a query for that route from the neighbor, it is currently threaded to poison.

notes: Troubleshooting EIGRP Route Flapping

This section discusses how to troubleshoot consistent EIGRP route flapping. The most important tool for troubleshooting this problem is the show ip eigrp event command. This command reveals which neighbor is updating and the metric with which it's updating.

debugging and verification

show ip route

Router B#debug ip routing 1
access-list 1 permit 192.168.1.0 0.0.0.255
access-list 1 deny any
show ip eigrp event

Solution:

Investigate the cause of the route flapping. It can be a Layer 1 or Layer 2 issue, or a loop issue.

notes: Troubleshooting EIGRP Route Summarization

Summarization is extremely important in a well-designed EIGRP network. Summarization is one of the few weapons to prevent stuck in active problems. Most summarization problems are the result of a misconfiguration of the router.

1. EIGRP Summarization Route Problem—Cause: Subnetworks of Summary Route Don't Exist in Routing Table.

verification:

show run
show ip route

Solution:

The solution to this problem is to configure an interface that falls in the summary range. You can configure a loopback interface with an address within the summary range to generate the configured summary route.

2. EIGRP Summarization Route Problem—Cause: Too Much Summarization
Another EIGRP summarization route problem stems from when the summary route covers more subnetworks than exist.

debugs and verification:

show ip route x.x.x.x

solution

This problem is more of a design issue. The main issue is that Router B's summary route is too broad and includes nonexistent subnets. Also, Router A is sending a more general summary route (default route) to Router B. The solution is to have Router B send out only the summary route that covers the 172.16.1.0 through 172.16.15.0 networks.
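
To see how much address space a summary covers beyond the subnets that exist, a short Python sketch with the standard ipaddress module can help. It assumes the existing subnets are the /24s 172.16.1.0 through 172.16.15.0 named above; the function names and the broad /16 used for comparison are hypothetical.

```python
import ipaddress

def smallest_summary(networks):
    """Return the smallest single prefix that covers every given network."""
    nets = [ipaddress.ip_network(n) for n in networks]
    summary = nets[0]
    # Widen the prefix one bit at a time until all networks fit inside it.
    while not all(n.subnet_of(summary) for n in nets):
        summary = summary.supernet()
    return summary

def nonexistent_24s(summary, existing):
    """Count the /24s inside the summary that are not in the existing list."""
    return 2 ** (24 - ipaddress.ip_network(summary).prefixlen) - len(existing)

existing = [f"172.16.{i}.0/24" for i in range(1, 16)]

print(smallest_summary(existing))                   # 172.16.0.0/20
print(nonexistent_24s("172.16.0.0/20", existing))   # 1
print(nonexistent_24s("172.16.0.0/16", existing))   # 241
```

The tighter /20 still covers one /24 that does not exist (172.16.0.0/24), but far fewer than a /16 would.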

notes: Troubleshooting EIGRP Redistribution Problems

In many instances, a problem occurs when redistributing from another routing protocol into EIGRP.

Scenario 1: The router is the border router between three routing protocols: RIP, OSPF, and EIGRP.

Router A wants to redistribute all the routes in the RIP domain into the EIGRP domain. The problem is that the network 150.150.0.0/16 is not getting redistributed into the EIGRP domain.

debugs and verification:

show ip eigrp topology 150.150.0.0 255.255.0.0

show ip route 150.150.0.0 255.255.0.0 - shows that the route was learned via OSPF. Because OSPF has a lower administrative distance than RIP, the OSPF route was installed.

solution:

To resolve this problem, you must make Router A install the RIP route instead of the OSPF route. One way to do this is to configure a distribute list under OSPF to not install the 150.150.0.0/16 route.

Scenario 2: Redistributing other protocols into EIGRP without the default-metric command.

Although the redistribute ospf command is configured under EIGRP, there is no configuration of the default-metric command. When redistributing between different routing protocols, the default-metric command must be configured. When one routing protocol is being redistributed into another, the router doesn't have a way to translate the routing metric from one routing protocol into another. The default-metric command is used so that the network administrator can manually initialize the routing metric during route redistribution.

solution:
The fix for this problem: Configure a default metric under EIGRP in Router B.

notes: Troubleshooting EIGRP Route Installation

1. EIGRP Is Not Installing Routes—Cause: Auto or Manual Summarization

debugs and verification

show run | section router eigrp
show ip route x.x.x.x

solution:

The solution to this problem, based on this cause, is more of a design issue. Two places in the network must not send the same summary routes to one another. Configure the no auto-summary command.

2. EIGRP Is Not Installing Routes—Cause: Higher Administrative Distance

debugs and verification

show ip eigrp topology 150.150.0.0 255.255.0.0

solution:

To fix this problem, you must change the administrative distance of the routing protocols so that external EIGRP routes are preferred. To do so, use the distance command to manipulate the administrative distance of a routing protocol.

distance 180 192.168.2.2 0.0.0.0

The distance command sets the RIP administrative distance to 180 for any updates coming from 192.168.2.2. This allows the external EIGRP routes (administrative distance of 170) coming from Router C to be preferred over RIP routes.
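
The effect of raising the RIP distance can be sketched in Python. This is a simplified model of route-source selection by administrative distance only; the table holds the well-known IOS defaults, and the function and key names are hypothetical.

```python
# Well-known default administrative distances (lower is preferred).
DEFAULT_AD = {
    "connected": 0,
    "static": 1,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
    "eigrp-external": 170,
}

def preferred_source(candidates, overrides=None):
    """Pick the route source with the lowest administrative distance.

    overrides: per-source distances that model a 'distance' command.
    """
    ad = {**DEFAULT_AD, **(overrides or {})}
    return min(candidates, key=lambda source: ad[source])

# With defaults, RIP (120) beats external EIGRP (170).
print(preferred_source(["rip", "eigrp-external"]))                # rip
# With RIP raised to 180, external EIGRP (170) wins.
print(preferred_source(["rip", "eigrp-external"], {"rip": 180}))  # eigrp-external
# OSPF (110) beats RIP (120), as in the redistribution scenario.
print(preferred_source(["rip", "ospf"]))                          # ospf
```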

3. EIGRP Is Not Installing Routes—Cause: Duplicate Router IDs

Many times, EIGRP will not install routes because of a duplicate router ID problem. EIGRP does not use router ID as extensively as OSPF. EIGRP uses the notion of router ID only on external routes to prevent loops. EIGRP chooses the router ID based on the highest IP address of the loopback interfaces on the router. If the router doesn't have any loopback interfaces, the highest active IP address of all the interfaces is chosen as the router ID for EIGRP.
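
The selection rule in the paragraph above can be sketched in Python. This is a simplified model: it assumes only interfaces that are up are considered, and the interface names and addresses are hypothetical.

```python
import ipaddress

def eigrp_router_id(interfaces):
    """Choose a router ID from (name, ip, is_up) tuples.

    Prefer the highest loopback address; otherwise fall back to the
    highest address on any active interface. Returns None if nothing is up.
    """
    active = [(name, ip) for name, ip, is_up in interfaces if is_up]
    loopbacks = [ip for name, ip in active if name.lower().startswith("loopback")]
    candidates = loopbacks or [ip for _, ip in active]
    if not candidates:
        return None
    # Compare as addresses, not strings ("9.0.0.1" > "10.0.0.1" as text).
    return str(max(ipaddress.ip_address(ip) for ip in candidates))

with_loopback = [
    ("Loopback0", "10.1.1.1", True),
    ("Ethernet0", "192.168.1.1", True),
]
without_loopback = [
    ("Ethernet0", "192.168.1.1", True),
    ("Serial0", "172.16.2.1", True),
]

print(eigrp_router_id(with_loopback))     # 10.1.1.1
print(eigrp_router_id(without_loopback))  # 192.168.1.1
```

Two routers that end up with the same value from this rule would reject each other's external routes, which is the duplicate router ID problem.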

debugs and verification:

show run | section router eigrp
debug ip eigrp
show ip eigrp topology 150.150.0.0 255.255.0.0

solution:
The solution to the duplicate router ID problem is to change the IP address of the loopback interface of Router X or to change the IP address of Ethernet 0 in Router A. The rule of thumb: Never configure the same IP address on two places in the network.

Monday, October 24, 2011

notes: Troubleshooting EIGRP Route Advertisement

1. EIGRP Is Not Advertising Routes to Its Neighbors—Cause: Distribute List

debugs and verification

show run | section router eigrp

debug ip eigrp

router eigrp 1 

network 192.168.3.0 

network 10.0.0.0 

distribute-list 1 out

!

 access-list 1 permit 192.168.3.160 0.0.0.15

Solution: Ensure that the distribute list allows the range of networks that need to be advertised.
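
To check which networks an entry such as access-list 1 permit 192.168.3.160 0.0.0.15 allows, the wildcard comparison can be sketched in Python. The function name is hypothetical; a wildcard bit of 1 means "ignore this bit".

```python
import ipaddress

def wildcard_match(address, base, wildcard):
    """Return True if address matches base under an ACL wildcard mask."""
    addr = int(ipaddress.IPv4Address(address))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF  # bits that must match
    return (addr & care) == (ref & care)

acl = ("192.168.3.160", "0.0.0.15")

print(wildcard_match("192.168.3.160", *acl))  # True
print(wildcard_match("192.168.3.175", *acl))  # True  (last address in range)
print(wildcard_match("192.168.3.176", *acl))  # False
print(wildcard_match("10.1.1.0", *acl))       # False (hits the implicit deny)
```

With only this permit entry, everything else falls through to the implicit deny at the end of the access list, so a network such as 10.0.0.0 is not advertised.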

2. EIGRP Is Not Advertising Routes to Its Neighbors—Cause: Discontiguous Networks

Another cause of EIGRP not advertising a network could be manual summarization configured on the interface or auto-summarization across a major network boundary.

debugs and verification:

show run | section router eigrp

debug ip eigrp

Solution:

a. One is to configure the command no auto-summary under router eigrp. This command tells EIGRP not to autosummarize to major network boundaries.
b. The other is to change the IP addresses of the serial interfaces on each side of the link to addresses that differ from the existing ones.

3. EIGRP Is Not Advertising Routes to Neighbors—Cause: Split-Horizon Issues

EIGRP has its own split-horizon command. This command is configured under the interface.

debugs and verification:

show run | section router eigrp

debug ip eigrp

Solution:

a. disabling Split horizon

interface serial 0 

ip address 172.16.2.1 255.255.255.0 

no ip split-horizon eigrp 1

b. Another fix for the split-horizon problem is to configure subinterfaces on the hub router and assign different IP address subnets for each subinterface.

4. EIGRP advertises unexpected routes to its neighbors.

Router B# interface ethernet 0 

ip address 192.168.130.1 255.255.255.0 

interface serial 0 

ip address 10.1.1.2 255.255.255.0 

router eigrp 1

network 192.168.130.0

network 10.0.0.0

ip route 192.168.1.0 255.255.255.0 ethernet 0

ip route 192.168.2.0 255.255.255.0 ethernet 0

ip route 192.168.3.0 255.255.255.0 ethernet 0

ip route 192.168.4.0 255.255.255.0 ethernet 0

.

.

.

ip route 192.168.127.0 255.255.255.0 ethernet 0

Even without the redistribute static command under the router eigrp command in Router B, Router B automatically redistributes all 127 configured static routes to Router A. This can cause unnecessary routes to be advertised inadvertently throughout the entire network. The cause of the problem is that the static routes are configured with the outbound interface. In this case, the router thinks that all the static routes are directly connected to the Ethernet 0 interface. This Ethernet interface is also covered under the router EIGRP process by the network 192.168.130.0 command. Because Ethernet 0 is considered to run EIGRP, all the networks connected to it by a static route also are considered to belong to the EIGRP process. The router then advertises all these static routes even though redistribute static is not configured.

solution:

a. to configure a distribute list that prevents the router from advertising all those static routes

router eigrp 1       

network 192.168.130.0 

network 10.0.0.0 

distribute-list 1 out   

!

 access-list 1 deny 192.168.0.0 0.0.127.255   

access-list 1 permit any

b. or to change the static routes to reference the next-hop IP addresses instead of an interface. This way, the router will not advertise all these static routes and flood the entire network with unnecessary routes.

ip route 192.168.1.0 255.255.255.0 192.168.130.2 

ip route 192.168.2.0 255.255.255.0 192.168.130.2 

ip route 192.168.3.0 255.255.255.0 192.168.130.2 

ip route 192.168.4.0 255.255.255.0 192.168.130.2 

.

.

.

ip route 192.168.127.0 255.255.255.0 192.168.130.2

5. EIGRP Is Advertising Routes with Unexpected Metric

EIGRP might advertise an unexpected metric to its neighbors. The EIGRP metric is the basis of route selection done by EIGRP, which selects the route with the lowest EIGRP metric to the destination network. An unexpected EIGRP metric being sent or received on the router might alter route selection to the destination network. The end result might be suboptimal routing.

debugs and verification:

show ip route x.x.x.x

show ip eigrp topology x.x.x.x y.y.y.y

Solution: Ensure that no offset list is configured that changes the metric on the outgoing interface.

Tuesday, October 18, 2011

notes: Troubleshooting EIGRP Neighbor Relationships

1. EIGRP Neighbor Problem—Cause: Unidirectional Link

A one-way neighbor relationship usually is caused by a unidirectional connection between the neighbors. The cause for unidirectional connection is usually a Layer 2 problem.

verification:

show ip eigrp neighbor

RtrB#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H   Address      Interface  Hold   Uptime    SRTT  RTO   Q    Seq
                            (sec)            (ms)        Cnt  Num
1   10.88.18.2   S0         14     00:00:15  0     5000  4    0

The fact that the SRTT timer is 0 indicates that no acknowledgment packets are being received. The Q count is not decrementing, which indicates that the router is trying to send EIGRP packets but no acknowledgment is being received. RTR B will retry 16 times to resend the packet; eventually, RTR B will reset the neighbor relationship with the log indicating RETRY LIMIT EXCEEDED, and the process starts again. Also, keep in mind that the 16 retransmissions of the same packet are sent using unicast, not multicast. Therefore, the RETRY LIMIT EXCEEDED message indicates a problem with transmitting unicast packets over the link, and this is most likely a Layer 1 or Layer 2 problem.

Solution:

RtrB#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H   Address      Interface  Hold   Uptime    SRTT  RTO   Q    Seq
                            (sec)            (ms)        Cnt  Num
1   10.88.18.2   S0         14     01:26:30  149   894   0    291

Notice that the Q count column is 0 and that the SRTT and RTO have valid values now.

2. EIGRP Neighbor Problem—Cause: Uncommon Subnet

Many times, EIGRP won't establish neighbor relationships because the neighbors are not in the same subnet. Usually, the cause of this problem is router misconfiguration. When EIGRP has problems establishing neighbor relationships because of an uncommon subnet, the following error message appears:

IP-EIGRP: Neighbor ip address not on common subnet for interface

Three possible causes:

a. The IP address has been misconfigured on interfaces.

Solution: configure the correct IP address on interface.

b. The primary and secondary IP addresses of the neighboring interface don't match.

EIGRP sources the hello packet from the primary address of the interface. If the primary network address on one router is used as a secondary network address on the second router, and vice versa, no neighbor relationship will be formed and the routers will complain about the neighbor not being on a common subnet.

Solution: Ensure that the primary and secondary IP addresses match on both sides of the link.

c. A switch or hub between the EIGRP neighbors is misconfigured or is leaking multicast packets to other ports.

If a single LAN hub connects the EIGRP neighbors for different LAN segments, the hub passes broadcast and multicast packets to other ports between two logical LAN segments. So, the multicast EIGRP hello from LAN segment 1 will be seen on the neighbor located in LAN segment 2 if a single hub connects all the LAN devices on different LAN segments. The solution is to break up the broadcast domain by using a separate hub for each LAN segment or simply configuring no eigrp log-neighbor-warnings under EIGRP configuration to stop seeing the error message.

3. EIGRP Neighbor Problem—Cause: Mismatched Masks

Solution:

The solution for this problem: Configure the right subnet mask on Router.

4. EIGRP Neighbor Problem—Cause: Mismatched K Values

For EIGRP to establish its neighbors, the K constant values used to manipulate the EIGRP metric must be the same.

Troubleshooting this problem requires careful scrutiny of the router's configuration. The solution for this problem is to change all the K values to be the same on all the neighboring routers.

5. EIGRP Neighbor Problem—Cause: Mismatched AS Number

EIGRP won't form any neighbor relationships with neighbors in different autonomous systems.

Solution: Ensure that the AS numbers are the same on both neighbors.

6. EIGRP Neighbor Problem—Cause: Stuck in Active

Sometimes, EIGRP resets the neighbor relationship because of a "stuck in active" condition. The error message is

%DUAL-3-SIA: Route network mask stuck-in-active state in IP-EIGRP AS. Cleaning up

Reviewing the EIGRP DUAL Process

To resolve an EIGRP stuck in active error, you need to understand the DUAL process in EIGRP. Refer to Chapter 6 for thorough coverage of the DUAL process, although it is reviewed here as well.

EIGRP is an advanced distance-vector protocol; it doesn't have LSA flooding, like OSPF, or a link-state protocol to tell the protocol the overall view of the network. EIGRP relies only on its neighbors for information on network reachability and availability. EIGRP keeps a list of backup routes called feasible successors. When the primary route is not available, EIGRP immediately uses the feasible successor as the backup route. This shortens convergence time. Now, if the primary route is gone and no feasible successor is available, the route is in active state. The only way for EIGRP to converge quickly is to query its neighbors about the unavailable route. If the neighbor doesn't know the status of the route, the neighbor asks its neighbors, and so on, until the edge of the network is reached. The query stops if one of the following occurs:

- All queries are answered from all the neighbors.

- The end of network is reached.

- The lost route is unknown to the neighbors.

The problem is that, if there are no query boundaries, EIGRP potentially can ask every router in the network for a lost route. When EIGRP first queries its neighbor, a stuck in active timer starts. By default, the timer is three minutes. If, in three minutes, EIGRP doesn't receive the query response from all its neighbors, EIGRP declares that the route is stuck in active state and resets the neighbor that has not responded to the query.

Determining Active/Stuck in Active Routes with show ip eigrp topology active

You must answer two questions to troubleshoot the EIGRP stuck in active problem:

Why is the route active?

Why is the route stuck?

Determining why the route is active is not a difficult task. Sometimes, the route that constantly is going active could be due to a flapping link. Or, if the route is a host route (/32 route), it's possible that it is from a dial-in connection that gets disconnected. However, trying to determine why the active route becomes stuck is a much harder task, and more important to learn. Usually, an active route gets stuck for one of the following reasons:

- Bad or congested links

- Low router resources, such as low memory or high CPU on the router

- Long query range

- Excessive redundancy

By default, the stuck in active timer is only three minutes. In other words, if the EIGRP neighbor doesn't hear a reply for the query in three minutes, neighbors are reset. This adds difficulty in troubleshooting EIGRP stuck in active because every time an active route is stuck, you have only three minutes to track down the active route query path and hopefully find the cause.

The tool that you need to troubleshoot the EIGRP stuck in active error is the show ip eigrp topology active command. This command shows what routes are currently active, how long the routes have been active, and which neighbors have and have not replied to the query. From the output, you can determine which neighbors have not replied to the query, and you can track the query path and find out the status of the query by hopping to the neighbors that have not replied.

Methodology for Troubleshooting the Stuck in Active Problem

The methods for troubleshooting an EIGRP stuck in active problem and the show ip eigrp topology active command are useful only when the problem is happening. When the stuck in active event is over and the network stabilizes, it is extremely difficult, if not impossible, to backtrack the problem and find out the cause.

Solution:

The ultimate solution for preventing the EIGRP stuck in active problem is to manually summarize the routes whenever possible and to have a hierarchical network design. The more networks EIGRP summarizes, the less work EIGRP has to do when a major convergence takes place. Summarization reduces the number of queries being sent out and ultimately reduces the occurrence of EIGRP stuck in active errors.
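
The effect of a query boundary can be sketched in Python as a flood over a neighbor graph. This is a simplified model, not EIGRP itself: the router names, the chain topology, and the boundaries parameter are hypothetical, and replies, feasible successors, and timers are left out. A boundary router stands for a router that only holds a summary, so it answers the query at once and does not pass it on.

```python
from collections import deque

def query_scope(neighbors, origin, boundaries=frozenset()):
    """Return the set of routers that receive a query for a lost route.

    Simplified model: every router forwards the query to all of its
    neighbors, except routers in `boundaries`, which reply immediately
    and do not propagate the query any further.
    """
    queried = set()
    seen = {origin}
    pending = deque([origin])
    while pending:
        router = pending.popleft()
        if router in boundaries:
            continue  # replies at once; the query stops here
        for peer in neighbors[router]:
            if peer not in seen:
                seen.add(peer)
                queried.add(peer)
                pending.append(peer)
    return queried

# Hypothetical chain of routers: A - B - C - D - E
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

print(sorted(query_scope(chain, "A")))                    # no boundary
print(sorted(query_scope(chain, "A", boundaries={"C"})))  # summary known at C
```

Without a boundary the query reaches every router in the chain; with a boundary at C, only B and C are asked, which is the reduction in query range that summarization aims for.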

notes: Troubleshooting BGP Filtering

1. Problem: Standard Access List Fails to Capture Subnets

debugs and verification:

R1# router bgp 1    

neighbor 131.108.1.2 remote-as 2    

neighbor 131.108.1.2 distribute-list 1 in 

!

access-list 1 permit 13.13.0.0 0.0.255.255

distribute-list 1 means that any BGP updates that come from 131.108.1.2 will be examined by access list 1.

Access list 1 has a permit statement for 13.13.0.0 with an exact match of the first two octets (13.13); it doesn't care about the last two octets (0.0).

A standard access list does not check the mask, so the show ip bgp command output shows subnets of 13.13.0.0 with a variety of mask lengths.

Solution:

- use extended access-list.

access-list 101 permit ip 13.13.0.0 0.0.255.255 255.255.0.0 0.0.0.0

The extended access list has two parts:

The network part: 13.13.0.0 0.0.255.255, which allows 13.13.x.x, where x can be any number between 0 and 255.
The mask part: 255.255.0.0 0.0.0.0. With all 0s in the wildcard, the mask can only be 255.255.0.0, meaning /16.
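
The difference between the two access lists can be illustrated with a small Python sketch of wildcard matching. The function names are made up for this note, and the 13.13.1.0/24 route is a hypothetical subnet used only to show the filtering; the sketch models how the notes describe the match, not IOS internals.

```python
import ipaddress

def _to_int(dotted):
    return int(ipaddress.IPv4Address(dotted))

def wildcard_match(value, pattern, wildcard):
    """True if value equals pattern on every bit where the wildcard bit is 0."""
    care = ~_to_int(wildcard) & 0xFFFFFFFF
    return (_to_int(value) & care) == (_to_int(pattern) & care)

def standard_acl_permits(network, acl_net, acl_wild):
    # A standard access list checks the network address only.
    return wildcard_match(network, acl_net, acl_wild)

def extended_acl_permits(network, mask, acl_net, acl_wild, acl_mask, acl_mask_wild):
    # In a BGP distribute list, the first address/wildcard pair is checked
    # against the network and the second pair against the subnet mask.
    return (wildcard_match(network, acl_net, acl_wild)
            and wildcard_match(mask, acl_mask, acl_mask_wild))

# 13.13.0.0/16 is the wanted route; 13.13.1.0/24 is a hypothetical subnet.
routes = [("13.13.0.0", "255.255.0.0"), ("13.13.1.0", "255.255.255.0")]
for net, mask in routes:
    std = standard_acl_permits(net, "13.13.0.0", "0.0.255.255")
    ext = extended_acl_permits(net, mask, "13.13.0.0", "0.0.255.255",
                               "255.255.0.0", "0.0.0.0")
    print(net, mask, "standard:", std, "extended:", ext)
```

The standard list permits both routes, while the extended list permits only the /16.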

2. Problem: Extended Access List Fails to Capture the Correct Masked Route

To reduce the size of Internet BGP routing tables, BGP operators are expected to advertise aggregated prefixes and suppress subnetted IP blocks. To achieve this, almost all ISPs expect their peering ISPs and customers to advertise aggregated blocks of, say, /21 (255.255.248.0) and will refuse to accept any prefix with a mask longer than /21. Proper BGP filtering must be in place at peering points so that prefixes with masks longer than /21 (/22, /23, and so on) are filtered out and only prefixes with masks of /21 or shorter are accepted.

verification:

show ip bgp

Solution:

The two solutions are as follows:

a. Use an extended access list.

An extended access list that would permit any IP network whose mask is /21 or lower (20, 19, and so on) is configured as follows:

access-list 101 permit ip 0.0.0.0 255.255.255.255 255.255.248.0 255.255.248.0

0.0.0.0 255.255.255.255 means any IP network.

255.255.248.0 255.255.248.0 means that a mask of this prefix can be only /21 or lower (/20, /19, and so on). Cisco IOS Software has an implicit deny at the end of each access list, so all prefixes whose masks are greater than 21 are denied.

router bgp 109 

neighbor 131.108.1.2 remote-as 110 

neighbor 131.108.1.2 distribute-list 101 out

b. Use a prefix list.

Apart from distribute lists, prefix lists can be used to achieve the same goal.

You can apply the following prefix list to R1 and R2 in a similar fashion as a distribute list with both the neighbor statement and with a route map:

ip prefix-list FILTERING seq 5 permit 0.0.0.0/0 le 21

0.0.0.0/0 matches any prefix, and le 21 means that the mask length of a matching prefix can be anywhere from 0 up to and including 21 (le means less than or equal). All longer-masked prefixes (/22, /25, /26, and so on) will be denied because of the implicit deny at the end of each Cisco IOS Software filter.
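
As a rough illustration of how one prefix-list line is evaluated, the following Python sketch models the ge/le length check. The function name and the sample routes are hypothetical; only the 0.0.0.0/0 le 21 entry comes from the notes.

```python
import ipaddress

def prefix_list_permits(route, entry, ge=None, le=None):
    """Model a single 'permit entry [ge x] [le y]' prefix-list line."""
    route = ipaddress.ip_network(route)
    entry = ipaddress.ip_network(entry)
    # The route's network bits must fall inside the entry's prefix.
    if not route.subnet_of(entry):
        return False
    if ge is None and le is None:
        return route.prefixlen == entry.prefixlen  # exact length only
    low = ge if ge is not None else entry.prefixlen
    high = le if le is not None else 32
    return low <= route.prefixlen <= high

# Hypothetical routes checked against "permit 0.0.0.0/0 le 21".
for candidate in ["10.0.0.0/8", "131.108.8.0/21", "131.108.1.0/24", "131.108.1.128/25"]:
    print(candidate, prefix_list_permits(candidate, "0.0.0.0/0", le=21))
```

The /8 and /21 routes are permitted; the /24 and /25 fall through to the implicit deny.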

The distribute list and prefix list take effect when updates come from a neighbor. If BGP updates already have been received, applying the distribute list or prefix list will have no effect. To receive updates from neighbors, routers must restart the BGP session by using the commands clear ip bgp neighbor or clear ip bgp neighbor soft in, if soft reconfiguration is enabled. Refer to the Cisco IOS Software manual for more details on this command. A recent feature of Cisco IOS Software called route refresh automatically requests fresher updates from a neighbor when any policy, such as a distribute list or a prefix list, gets applied. This feature does not require clearing of the current BGP session.

3. Problem: AS_PATH Filtering Using Regular Expressions

All BGP updates that contain an announcement of IP prefixes have an AS_PATH field that lists all the autonomous systems that this update has traversed. BGP operators use filtering against this AS_PATH field to allow or deny IP prefixes and also to apply BGP policy based on AS_PATH filtering. This method offers greater flexibility in applying just a single line of filtering and not listing all IP prefixes, as in the case of distribute lists or prefix lists.
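
The idea of matching many prefixes with one pattern can be tried out in Python. Note that this sketch uses Python regular expression syntax, not the Cisco IOS regular expression syntax used in as-path access lists, and the three AS paths are made up for illustration.

```python
import re

# AS_PATH values as space-separated AS numbers, nearest AS first (hypothetical).
paths = {
    "P1": "110 300 400",
    "P2": "200 110",
    "P3": "200 1100 500",
}

ORIGINATED_IN_110 = re.compile(r"(^| )110$")      # 110 is the last (origin) AS
RECEIVED_FROM_110 = re.compile(r"^110( |$)")      # 110 is the first (neighbor) AS
TRAVERSES_110 = re.compile(r"(^| )110( |$)")      # 110 appears anywhere as a whole AS

def matching(pattern):
    """Names of the paths whose AS_PATH matches the given pattern."""
    return sorted(name for name, path in paths.items() if pattern.search(path))

print("originated in 110:", matching(ORIGINATED_IN_110))
print("received from 110:", matching(RECEIVED_FROM_110))
print("traverses 110:", matching(TRAVERSES_110))
```

The delimiters around 110 matter: without them, the third path would match because AS 1100 contains the digits 110.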

notes: Troubleshooting BGP Best-Path Calculation Issues

1. Problem: Path with Lowest RID Is Not Chosen as Best

This is the scenario in which two or more paths from EBGP neighbors have identical BGP attributes and BGP best-path selection is done based on the RID. The BGP best-path selection rule states that, in case all other attributes are identical, the path with the lowest RID should be selected as best. In this case, the path with the highest RID is selected as best.

In Cisco IOS Software, if BGP selects a best path based on the RID and a new path comes in with a lower RID, with all other attributes being equal, the previously selected best path will not be toggled and will remain unchanged. This is done intentionally in Cisco IOS Software to maintain stability in BGP paths because newly selected paths must be advertised to all BGP neighbors, and the previous one must be withdrawn. To avoid this churn, BGP in Cisco IOS Software does not select a new best path if the previous path selected was done based on RID.

debugs and verification:

show ip bgp a.b.c.d - the result shows that the path with the highest RID is the best route.

Solution:

bgp bestpath compare-routerid

This command makes BGP compare the RIDs of all the paths and pick the path with the lowest RID as the best in the BGP best-path calculation. The configuration change takes effect when the BGP scanner runs. (It runs every minute in Cisco IOS Software.)

2. Problem: Lowest MED Not Selected as Best Path

One BGP rule that must be kept in mind is the rule of MED comparison. By default, Cisco IOS Software will not compare the MEDs if two paths came from different autonomous systems.

debugs and verification:

show ip bgp a.b.c.d

example scenario:

The output in 15-95 shows that R1 has three paths in this order:

Path 1: This path is from R5 (RID 5.5.5.5), with a MED of 30.

Path 2: This path is from R4 (RID 4.4.4.4), with a MED of 40.

Path 3: This path is from R3 (RID 3.3.3.3) with a MED of 50.

If the best-path selection algorithm described in Chapter 14 were run, the following would be the selection process:

- Path 1 is compared with Path 2. All BGP attributes are the same except for the MED. However, these two paths came from different autonomous systems—110 and 111, respectively—so the MED will not be the tiebreaker and will be ignored. The tiebreaker will be the RID. Based on the RID, path 2 has a lower RID (4.4.4.4) than path 1 (5.5.5.5). Therefore, path 2 is the winner.

- The winner of Step 1, path 2, is compared with path 3. Again, the MED will be ignored because of a different AS_PATH. The lower RID of path 3 (3.3.3.3) wins against path 2's RID (4.4.4.4).

- Path 3 is selected as best even though it has the highest MED of the three paths (MED 50).

Solution:

bgp always-compare-med

The best path is the one that has the lowest MED. As stated earlier, choosing the path with the lowest MED could be crucial if links between autonomous systems are of different bandwidth and a path from a higher-bandwidth neighbor is sending a lower MED.
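
The walk-through above can be replayed with a short Python sketch. It is a deliberately reduced model: only MED and RID are compared, since the scenario states that all other attributes are equal. The neighbor AS for path 3 is not given in these notes, so AS 112 is an assumption, as are the class and function names.

```python
from dataclasses import dataclass
import ipaddress

@dataclass(frozen=True)
class Path:
    name: str
    neighbor_as: int
    med: int
    router_id: str

def better(a, b, always_compare_med=False):
    """Return the preferred of two paths, using MED and then RID only."""
    med_comparable = always_compare_med or a.neighbor_as == b.neighbor_as
    if med_comparable and a.med != b.med:
        return a if a.med < b.med else b
    rid_a = int(ipaddress.IPv4Address(a.router_id))
    rid_b = int(ipaddress.IPv4Address(b.router_id))
    return a if rid_a < rid_b else b

def best_path(paths, always_compare_med=False):
    """Compare paths pairwise in order, carrying the winner forward."""
    best = paths[0]
    for candidate in paths[1:]:
        best = better(best, candidate, always_compare_med)
    return best

paths = [
    Path("Path 1", 110, 30, "5.5.5.5"),
    Path("Path 2", 111, 40, "4.4.4.4"),
    Path("Path 3", 112, 50, "3.3.3.3"),  # AS 112 is assumed for illustration
]

print(best_path(paths).name)                           # default behavior
print(best_path(paths, always_compare_med=True).name)  # with always-compare-med
```

With the default behavior the sketch picks Path 3 on RID; with MED always compared it picks Path 1, the path with MED 30.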

In addition, one important design recommendation is that the command bgp always-compare-med should be enabled on all the routers in an AS running BGP; otherwise, packet forwarding loops might occur. For example, Router A running this command might point its best path to Router B, whereas Router B without this command might point the best path back to Router A, resulting in a routing loop.

notes: Troubleshooting Inbound IP Traffic Flow Issues Because of BGP Policies

1. Problem: Multiple Connections Exist to an AS, but All the Traffic Comes in Through One BGP Neighbor, X, in the same AS—Cause: Either BGP Neighbor at X Has a BGP Policy Configured to Make Itself Preferred over the Other Peering Points, or the Networks Are Advertised to Attract Traffic from Only X.

debugs and verification:

There might be multiple reasons for this behavior, but two of the most common scenarios are as follows:

Case1: Upstream AS has the BGP policy configured so that all updates from your AS at location X get the LOCAL_PREFERENCE higher than at all other neighbors with your AS. This results in making X the preferred exit point from upstream AS 110 to your AS for some local subnets.

show ip bgp a.b.c.d

Case2: Your AS is influencing traffic by advertising different MED values for the prefixes of some of your local subnets at different locations.

Solution

Return traffic influence can be intentional, as in Case 2, or it might happen without your involvement, as in Case 1.

In Case 1, in which upstream AS changed its BGP policy by altering the LOCAL_PREFERENCE, BGP does not offer any commands for your AS to influence the upstream AS policy. Each AS can force its own policy, and the outside AS cannot change that. The solution for the Case 1 problem lies with the local AS administrator requesting AS 110 to remove any policy that affects your local AS.

In Case 2, your AS announced a MED and upstream AS was not configured to change LOCAL_PREFERENCE (as in Case 1).

If the MED announcement is not producing the desired behavior for your local AS inbound traffic management, these MEDs should be removed, and the normal BGP policies of upstream AS should decide on the best entry into AS 109.

In larger BGP networks with numerous exit points and multiple BGP AS connections, traffic balance could become a challenge. Therefore, careful BGP policies and peering agreements must be created between BGP speakers, and traffic flow must be carefully observed.

2. Problem: Multiple Connections Exist to Several BGP Neighbors, but Most of the Traffic from Internet to 100.100.100.0/24 Always Comes in Through One BGP Neighbor from AS 110—Cause: Route Advertisements for 100.100.100.0/24 in AS 109 Attract Internet Traffic Through That BGP Neighbor in AS 110

Topology:

When a BGP prefix is observed from a global Internet point-of-view, few BGP attributes stay intact from the originator of that prefix. For example, AS_PATH, ORIGIN_CODE and AGGREGATOR are the most common BGP attributes that get carried no matter how many autonomous systems a BGP update crosses. The most popular attributes, LOCAL_PREFERENCE and MED, do not cross an AS boundary. Therefore, they do not play any role in influencing return traffic from sources multiple autonomous systems away.

The most common BGP attributes that get used in the BGP best-path algorithm are LOCAL_PREFERENCE, AS_PATH and MED. Out of these, AS_PATH is the only attribute that stays intact from the originator of the prefix to any Internet BGP speaker.

Solution:

a. AS 109 advertises network 100.100.100.0/24 with a much longer AS_PATH list to all BGP neighbors except AS 110. If autonomous systems 110, 112, and 113 do not make any additional changes in the BGP policy, autonomous systems 112 and 113 always go through AS 110 to reach 100.100.100.0/24.

This results in all traffic to network 100.100.100.0/24 traversing AS 110 to enter AS 109; the links between AS 109 and AS 111 remain for redundancy.

b. AS 109 advertises 100.100.100.0/24 only to AS 110, not to BGP neighbor AS 111. Therefore, traffic from the Internet sees only one path to reach 100.100.100.0/24—through AS 110 to AS 109. However, this case loses redundancy if AS 109 loses its BGP session with AS 110.

notes: Troubleshooting Load-Balancing Scenarios in Small BGP Networks

1. Problem: Load Balancing and Managing Outbound Traffic from a Single Router When Dual Homed to Same ISP—Cause: BGP Installs Only One Best Path in the Routing Table

In multihomed scenarios, a common concern that enterprise network operators face is improperly utilizing the external links going to the ISP. Typically, enterprise customers dual-home to either the same or different ISPs to load-share outgoing and incoming traffic.

debugs and verification:

show ip bgp a.b.c.d

show ip route a.b.c.d

Solution:

Cisco IOS Software allows, by configuration, the installation of more than one route for the same prefix. This does come with a tight check: multiple paths that are candidates to go in the routing table must have exactly the same BGP attributes except for the router ID (RID). If two or more paths have identical attributes except for the RID, they can go in the routing table and load sharing can be achieved for traffic going to that prefix.

maximum-paths n

The maximum-paths n command allows up to n equal BGP paths to be installed in the routing table. Cisco IOS Software allows a maximum of six equal paths. In the show ip bgp output, only one path has "best" in its output, but both have "multipath" and thus both will be installed in the routing table.

2. Problem: Load Balancing and Managing Outbound Traffic in an IBGP Network—Cause: By Default, IBGP in Cisco IOS Software Allows Only a Single Path to Get Installed in the Routing Table Even Though Multiple Equal BGP Paths Exist

If multiple paths are received from different IBGP neighbors for the same prefix, only one best path will be selected and installed in the routing table. This results in other alternate paths being unused.

debugs and verification:

show ip bgp a.b.c.d

show ip route a.b.c.d

Solution:

maximum-paths ibgp n

For maximum-paths ibgp to work, the following conditions must be met:

In both paths, all BGP attributes (LOCAL_PREF, MED, ORIGIN, and the entire AS_PATH) must be identical.

Both paths must be learned through IBGP.

Both paths must be synchronized.

Both paths must have a reachable next hop.

Both paths must have an EQUAL IGP cost to the next hop.
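
The conditions listed above can be written down as a simple Python check. This is only a checklist in code form, not how IOS implements multipath; the BgpPath class, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass(frozen=True)
class BgpPath:
    local_pref: int
    med: int
    origin: str
    as_path: Tuple[int, ...]
    learned_via_ibgp: bool
    synchronized: bool
    igp_cost_to_next_hop: Optional[int]  # None means the next hop is unreachable

def ibgp_multipath_eligible(a, b):
    """True if both paths meet every condition from the list above."""
    same_attributes = ((a.local_pref, a.med, a.origin, a.as_path)
                       == (b.local_pref, b.med, b.origin, b.as_path))
    both_ibgp = a.learned_via_ibgp and b.learned_via_ibgp
    both_synchronized = a.synchronized and b.synchronized
    both_reachable = (a.igp_cost_to_next_hop is not None
                      and b.igp_cost_to_next_hop is not None)
    equal_cost = a.igp_cost_to_next_hop == b.igp_cost_to_next_hop
    return (same_attributes and both_ibgp and both_synchronized
            and both_reachable and equal_cost)

# Hypothetical paths: path_b is identical, path_c has a higher IGP cost.
path_a = BgpPath(100, 0, "i", (110, 200), True, True, 20)
path_b = replace(path_a)
path_c = replace(path_a, igp_cost_to_next_hop=30)

print(ibgp_multipath_eligible(path_a, path_b))
print(ibgp_multipath_eligible(path_a, path_c))
```

The unequal IGP cost alone is enough to disqualify the second pair, which is a common reason why only one IBGP path is installed.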

notes: Troubleshooting Outbound IP Traffic Flow Issues Because of BGP Policies

1. Problem: Multiple Exit Points Exist but Traffic Goes Out Through One or Few Exit Routers—Cause: BGP Policy Definition Causes Traffic to Exit from One Place

Solution:

The BGP attribute LOCAL_PREFERENCE is commonly used to predictably control the traffic leaving the local AS, by using a route map that matches against the network prefix or the AS path.

- With the size of the BGP routing table today, it is difficult to manage traffic on a prefix-by-prefix basis.
- BGP attribute manipulation based on AS_PATH is a fairly common practice among savvy BGP operators because wildcard operations allow covering a larger number of prefixes to be checked in fewer lines of configuration.

2. Problem: Traffic Takes a Different Interface from What Shows in Routing Table—Cause: Next Hop of the Route Is Reachable Through Another Path

debugs and verification:

show ip bgp a.b.c.d

show ip route next-hop ip

traceroute

Solution:

A router might provide a route to a BGP neighbor but might never be in the forwarding path to reach that route. This is because packets are forwarded to the next-hop address of the actual route, which might not be the same router that gave the route in the first place.

3. Problem: Multiple BGP Connections to the Same BGP Neighbor AS, but Traffic Goes Out Through Only One Connection—Cause: BGP Neighbor Is Influencing Outbound Traffic by Sending MED or Prepended AS_PATH.

Typically, BGP networks are multihomed to different ISPs or the same ISP to provide redundancy or to load-share traffic. In some scenarios, the BGP network might be dual homed to the same ISP and might be running BGP with that ISP. Instead of load sharing traffic to the ISP over multiple connections, traffic might exit only from one connection.

Solution:

This problem can be solved in a number of ways.

a. Request upstream AS to send the proper MED for each prefix.

MED exchange with an EBGP peer is a tricky and bilateral game. Typically, BGP carriers accept MEDs only on a mutual basis in a process in which both carriers accept each other's MED. Accepting MED means that BGP carriers carry each other's traffic through the backbone and try to route the traffic in an optimal fashion.

b. Don't accept MED from upstream AS

Request upstream AS either to not send the MED or to manually set the MED to 0 at peering points X, Y, and Z for all prefixes from upstream AS 110. This results in your local AS picking the closest exit point, X, Y, or Z, for prefixes P1, P2, and P3 through the lowest IGP (OSPF, IS-IS, and so on) cost to reach these exit points. Manually setting the MED to 0 can be done through a route map.

route-map influencing_traffic permit 10
set metric 0
!
R1# router BGP 109
neighbor 4.4.4.4 remote-as 110
neighbor 4.4.4.4 route-map influencing_traffic in

This route map should be applied at all EBGP connections between your AS and upstream AS.

c. Manually change LOCAL_PREFERENCE for P1, P2, and P3 at all the exit points, X, Y, and Z.

To use this solution, local AS must know which exit point is closer to which prefix.

4. Problem: Asymmetrical Routing Occurs and Causes a Problem Especially When NAT and Time-Sensitive Applications Are Used—Cause: Outbound and Inbound Advertisement

Asymmetric routing means that packets flowing to a given destination don't use the same exit point as the packets coming back from that same destination. This is not a problem in itself, but it can cause some issues when Network Address Translation (NAT) or a time-sensitive application is involved.

debugs and verification:

- traceroute

Solution:

The asymmetrical routing issue is a fairly difficult problem to tackle and sometimes is unavoidable. Asymmetrical routing might be an issue in cases of NAT when only one device maintains the NAT table; therefore, packets must come in and out of the same device. Time-sensitive applications also might face problems when the exit path offers good throughput but the entry path is sluggish, making the overall round-trip time (RTT) bad.

Example topology:

viable solutions:

1. In the BGP configuration of AS 109, only R1 advertises 131.108.1.0/24 to R3 in AS 110. AS 110 will have only one way to reach 131.108.1.0/24, and that is through the R3–R1 link, ensuring symmetrical routing.

2. Both R1 and R2 are running EBGP with R3 and R4, respectively. From R1, advertise 131.108.1.0/24 to R3 with a MED of 1; from R2, advertise 131.108.1.0/24 to R4 with a MED of 20. AS 110 will have two advertisements, but the path from the lower MED (R1) will win and, in case the R1–R3 BGP connection fails, the path from R2 to R4 will be used. The use of the MED is discussed in detail in previous sections.

3. Using the as-path prepend option in Cisco IOS Software, R2 advertises 131.108.1.0/24 with the AS_PATH list containing AS 109 several times.

router bgp 109

network 131.108.1.0 mask 255.255.255.0

neighbor 4.4.4.4 remote-as 110

neighbor 4.4.4.4 route-map SYMMETRICAL out

!

route-map SYMMETRICAL permit 10

match ip address 1

set as-path prepend 109 109 109

route-map SYMMETRICAL permit 20

!

access-list 1 permit 131.108.1.0

In short, proper BGP announcements must be made at exit points and routes must be learned at the right place of the AS. Smaller enterprise networks can achieve this rather easily with the prepended AS path solution, but larger enterprise and ISP networks face a bigger challenge to ensure symmetrical routing. This is because ISPs have a larger number of prefixes to advertise, a larger number of exit points, and a larger number of BGP peering relationships. Unless symmetrical routing is a must, as it is in the case of NAT, most networks today run fine with asymmetrical routing.
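
The effect of the prepend in solution 3 can be sketched in Python as a comparison of AS_PATH lengths as seen by AS 110. The sketch assumes that AS 110 applies no policy of its own and that all attributes checked before AS_PATH length are equal; the function names and dictionary layout are made up for illustration.

```python
def advertise(as_path, prepend=()):
    """Build the AS_PATH that the neighbor receives, with optional prepending."""
    return list(prepend) + list(as_path)

def shortest_as_path(advertisements):
    """Pick the advertisement with the fewest AS numbers in its AS_PATH."""
    return min(advertisements, key=lambda adv: len(adv["as_path"]))

# 131.108.1.0/24 as received by AS 110 over each link.
via_r1 = {"link": "R1-R3", "as_path": advertise([109])}
via_r2 = {"link": "R2-R4", "as_path": advertise([109], prepend=[109, 109, 109])}

best = shortest_as_path([via_r1, via_r2])
print(best["link"], best["as_path"])

# If the R1-R3 session fails, the prepended path is the only one left.
backup = shortest_as_path([via_r2])
print(backup["link"], backup["as_path"])
```

Return traffic prefers the R1–R3 link while both sessions are up, and the prepended path through R2–R4 remains available as a backup.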

Sunday, October 16, 2011

notes: Troubleshooting BGP Route-Reflection Issues

Route reflectors (RR), discussed in RFCs 1966 and 2796, are used to avoid IBGP full mesh in an AS, as required by RFC 1771. Route reflection ensures that all IBGP speakers in an AS receive BGP updates from all parts of the network without having to run IBGP between all the routers in the network. Route reflection reduces the number of required IBGP connections and also offers faster convergence in an IBGP network when compared with a full-mesh IBGP network.
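
The reduction in IBGP connections is easy to quantify. The following Python sketch compares a full mesh with the simplest route-reflector design, assuming a single RR with every other router as its client; the function names are hypothetical, and real designs usually add redundant RRs.

```python
def full_mesh_sessions(routers):
    """IBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2

def single_rr_sessions(routers):
    """IBGP sessions needed with one RR and all other routers as clients."""
    return routers - 1

for n in (5, 10, 50):
    print(n, "routers:", full_mesh_sessions(n), "full mesh,",
          single_rr_sessions(n), "with one RR")
```

The full-mesh count grows with the square of the number of routers, while the route-reflector count grows linearly.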

Route-reflector clients (RRCs) typically peer IBGP with one or more RRs, and they can have EBGP connections unconditionally. Logical BGP connections between RR and RRC typically follow the physical connection topology. These are some of the common rules that help BGP operators troubleshoot BGP route-reflector issues.

1. Problem: Configuration Mistakes—Cause: Failed to Configure IBGP Neighbor as a Route-Reflector Client

The neighbor IP address must be the same in the route-reflector-client statement as in the remote-as configuration. The Cisco IOS Software BGP parser detects the misconfigured RRC IP address if BGP does not have an IBGP neighbor configured with this address.

Solution:

A BGP operator accidentally might configure a different IP address in the RRC than is configured in the neighbor statement where the remote AS is configured. If this problem is detected, the IP address must be corrected.

2. Problem: Route-Reflector Client Stores an Extra BGP Update—Cause: Client-to-Client Reflection

Debug and Verification:

show ip bgp a.b.c.d

Solution:

Turning off client-to-client reflection solves this problem. This problem arises only when an RRC peers IBGP with another RRC. When an RRC peers only with the RR, BGP does not run into this issue.

3. Problem: Convergence Time Improvement for RR and Clients—Cause: Use of Peer Groups

When an RR is serving many clients, any update that it receives from IBGP/EBGP peers must be generated and propagated as separate updates for each RRC. If the number of BGP updates and RRCs is large, this process could become CPU-intensive for the RR. This results in slower propagation of BGP updates and hence slower convergence in the network overall. A peer group clubs BGP neighbors into one group. Any common update that needs to go to all members of the peer group is processed only once, and all members receive a copy of that processed update. A router that has a peer group does not process the update separately for each member of the group, resulting in huge CPU processing savings. Overall convergence of the network improves greatly.

Solution:

When peering to several neighbors, use the Cisco IOS Software BGP peer group feature to avoid the processing duplication required to generate the same update to every neighbor. In peer groups, BGP neighbors (in this case, all RRCs) are listed as members of a peer group that share the same outbound policy. The RR computes an update for the first member of the peer group and simply replicates the same update to all members. This greatly reduces the number of CPU cycles that the RR has to spend to compute an update for each RRC. In addition, using peer groups speeds up the process of propagating BGP updates to RRCs; therefore, RRCs converge faster in case of any churn. Peer groups can be used in normal IBGP and EBGP scenarios to get this benefit, with the condition that all peer-group members are configured with the same outbound policy.

4. Problem: Loss of Redundancy Between Route Reflectors and Route-Reflector Client—Cause: Cluster List Check in RR Drops Redundant Route from Other RR

A cluster is made up of an RR and its clients. A cluster can have one or more RRs and is identified by a cluster ID, which by default is the router ID of the RR. Because each RR has a unique router ID, each cluster has only one RR by default. Network operators must manually configure identical cluster IDs on two or more RRs to put them in the same cluster. When a BGP update traverses from an RR to other neighbors, the RR adds its cluster ID to a list called the cluster list, which contains all cluster IDs that the BGP update has traversed. The cluster list is analogous to the AS_PATH list, which contains the autonomous systems that an update has traversed. Just as in AS_PATH loop detection, in which updates are dropped if the AS_PATH contains the local AS, an update is dropped if its cluster list contains the local cluster ID.
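
The cluster list check can be sketched in a few lines of Python. The update is modeled as a dictionary, and the cluster IDs 1.1.1.1 and 2.2.2.2 are hypothetical values; the sketch only shows the loop check, not the rest of route reflection.

```python
def reflect(update, cluster_id):
    """An RR reflecting an update adds its own cluster ID to the cluster list."""
    return {**update, "cluster_list": [cluster_id] + update["cluster_list"]}

def accept(update, local_cluster_id):
    """An RR drops an update whose cluster list already holds its own cluster ID."""
    return local_cluster_id not in update["cluster_list"]

route = {"prefix": "100.100.100.0/24", "cluster_list": []}
from_rr1 = reflect(route, "1.1.1.1")

# RR2 manually configured with the same cluster ID as RR1.
same_cluster = accept(from_rr1, "1.1.1.1")
# RR2 left at its default, with its own router ID as the cluster ID.
separate_clusters = accept(from_rr1, "2.2.2.2")

print("same cluster ID, update accepted:", same_cluster)
print("separate cluster IDs, update accepted:", separate_clusters)
```

When both RRs share a cluster ID, the second RR discards the copy reflected by the first, which is how the redundant route is lost.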

debug and verification:

show ip bgp a.b.c.d

debug ip bgp update

Solution:

The loss of redundancy is the result of the cluster list check.

It is recommended that in cases similar to those depicted in Figure 15-33, RRs should not be put in the same cluster. The cluster ID will be picked as the router ID (RID) of each RR and is guaranteed to be unique because all RIDs are unique in any network.

notes: Troubleshooting BGP Route Not Installing in Routing Table

If the BGP process fails to create an IP routing table entry, all traffic destined for the IP subnets missing from the routing table will be dropped. This is a generic behavior of hop-by-hop IP packet forwarding done by routers.

1. IBGP-Learned Route Not Getting Installed in IP Routing Table—Cause: IBGP Routes Are Not Synchronized

IBGP will not install or propagate a route to other BGP speakers unless IBGP-learned routes are synchronized. Synchronization means that for an IBGP-learned route, there must exist an identical route in the IP routing table provided by an IGP (OSPF, IS-IS, and so on).

debugs and verification:

show ip bgp a.b.c.d

Solution:

- Synchronize all BGP routes.

 R1# router ospf 1

 redistribute static subnets

 network 131.108.1.0 0.0.0.255 area 0

R1# router bgp 109

 network 100.100.100.0 mask 255.255.255.0

 neighbor 131.108.10.2 remote-as 109

neighbor 131.108.10.2 update-source Loopback0

ip route 100.100.100.0 255.255.255.0 Null0

- Turning off synchronization

This method is widely used in almost all BGP networks

no synchronization 

2. IBGP-Learned Route Not Getting Installed in IP Routing Table—Cause: IBGP Next Hop Not Reachable

The cause of this problem is most common in IBGP-learned routes where BGP next-hop address should have been learned through an Interior Gateway Protocol (IGP). Failure to reach the next hop is an IGP problem, and BGP is merely a victim. With BGP, when IP prefixes are advertised to an IBGP neighbor, the NEXT-HOP attribute of the prefix does not change. The IBGP receiver must have an IP route to reach this next hop.

Debugs and Verification:

show ip route a.b.c.d

show ip bgp a.b.c.d - the next hop is shown as inaccessible

Solution:

BGP requires the next hop of any BGP route to resolve to a physical interface. This might or might not require multiple recursive lookups in the IP routing table. Two common solutions exist for addressing this problem:

a. Announce the EBGP next hop through an IGP using a static route or redistribution.

b. Change the next hop to an internal peering address.

This solution is more widely used and is the preferred method of announcing the next hop to IBGP peer.

3. Problem: EBGP-Learned Route Not Getting Installed in IP Routing Table

3a. EBGP-Learned Route Not Getting Installed in IP Routing Table—Cause: BGP Routes Are Dampened.

Dampening is the way to minimize instability in a local BGP network caused by unstable BGP routes from EBGP neighbors. RFC 2439, "BGP Route Flap Damping," describes in detail how dampening works. In short, dampening is the way to assign a penalty for a flapping BGP route. A withdrawal of a prefix is considered a flap. A penalty of 1000 is assigned for each flap; if the flap penalty reaches the suppress limit because of continued flaps (default 2000), the BGP path is suppressed and is taken out of the routing table. This penalty is decayed exponentially based on the half-life time (default 15 minutes). When the penalty reaches the reuse value (default 750), the path is unsuppressed and is installed in the routing table and advertised to other BGP neighbors. Any dampened path can be suppressed only until the max suppress time (default 60 minutes). Dampening is applied only to EBGP neighbors, not to IBGP neighbors.

router bgp 1009
bgp dampening half-life-time reuse suppress maximum-suppress-time

half-life-time: Range is 1 to 45 minutes. Current default is 15 minutes.

reuse: Range is 1 to 20,000. Default is 750.

suppress: Range is 1 to 20,000. Default is 2000.

maximum-suppress-time: Maximum duration that a route can be suppressed. Range is 1 to 255 minutes. Default is four times half-life-time.
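
The interaction of these defaults can be worked through with a small Python sketch of exponential decay. It assumes a smooth, continuous decay; a real router may apply the decay at discrete intervals, so actual times can differ slightly. The function names are hypothetical.

```python
import math

def decayed_penalty(penalty, minutes, half_life=15.0):
    """Penalty remaining after the given time, halving every half_life minutes."""
    return penalty * 0.5 ** (minutes / half_life)

def minutes_until_reuse(penalty, reuse=750.0, half_life=15.0):
    """Time for the penalty to decay down to the reuse limit."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)

# Two flaps in quick succession: 2 x 1000 reaches the suppress limit of 2000.
print(decayed_penalty(2000, 15))             # penalty after one half-life
print(round(minutes_until_reuse(2000), 1))   # minutes until the path is reused
```

Under these assumptions, a path suppressed at a penalty of 2000 decays to the reuse limit of 750 well before the default maximum suppress time of 60 minutes, provided it does not flap again.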

debug and verifications:

R1#debug ip bgp dampening 1

R1#debug ip bgp updates 1

access-list 1 permit 100.100.100.0 0.0.0.0

Solution:

1. Wait for the penalty to go below the reuse limit (750).

2. Remove dampening altogether from the BGP configuration.

3. Clear the flap statistics.

clear ip bgp dampening a.b.c.d

3b. EBGP-Learned Route Not Getting Installed in IP Routing Table—Cause: BGP Next Hop Not Reachable in Case of Multihop EBGP

In a multihop EBGP session, EBGP speakers are not directly connected. Peering between loopback addresses of adjacent routers also is considered multihop.

This problem of an EBGP multihop route not getting installed in an IP routing table is identical to the IBGP next hop issue; however, most of the commonly seen problems occur when the router fails to resolve the next-hop address to an interface.

In this problem, the multihop EBGP next hop is reachable through a BGP route whose next hop is again the original multihop BGP next hop. For example, to reach prefix A, the next hop is prefix B; to reach prefix B, the next hop is again B. This is considered a recursion problem in which a router cannot resolve to an interface to reach the next hop B.

show ip route a.b.c.d

show ip bgp a.b.c.d

Solution:

The solution to this problem based on this cause is to simply have a more specific route for the next-hop address. In the case of EBGP, this is commonly done by having a static route for the multihop EBGP peering address.

This instance is observed in the case of multihop EBGP sessions when the next-hop address is not directly connected and the IP routing table must have an explicit route to the next-hop address.
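
The recursion problem and its fix can be illustrated with a Python sketch of recursive next-hop resolution. The routing table is modeled as a dictionary, and the peering address 4.4.4.4, the 4.4.4.0/24 route, and the Serial0 interface are hypothetical values chosen for this example.

```python
import ipaddress

def lookup(table, address):
    """Longest-prefix match; returns the matching route entry or None."""
    addr = ipaddress.ip_address(address)
    candidates = [(ipaddress.ip_network(prefix), route)
                  for prefix, route in table.items()
                  if addr in ipaddress.ip_network(prefix)]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0].prefixlen)[1]

def resolve(table, address, max_depth=8):
    """Follow next hops until an interface is reached; None if unresolvable."""
    visited = set()
    for _ in range(max_depth):
        route = lookup(table, address)
        if route is None:
            return None
        if "interface" in route:
            return route["interface"]
        if route["next_hop"] in visited:
            return None  # the next hop resolves through itself
        visited.add(route["next_hop"])
        address = route["next_hop"]
    return None

# Broken case: the route covering the peer address is itself a BGP route
# whose next hop is the peer address.
broken = {
    "100.100.100.0/24": {"next_hop": "4.4.4.4"},
    "4.4.4.0/24": {"next_hop": "4.4.4.4"},
}
# Fixed case: a more specific static route for the peering address.
fixed = dict(broken)
fixed["4.4.4.4/32"] = {"interface": "Serial0"}

print(resolve(broken, "100.100.100.1"))
print(resolve(fixed, "100.100.100.1"))
```

In the broken table the lookup never reaches an interface; once the more specific route for the peering address exists, the same destination resolves to an outgoing interface.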

4. EBGP-Learned Route Not Getting Installed in the Routing Table—Cause: Multiexit Discriminator (MED) Value Is Infinite

In Cisco IOS Software, if a multiexit discriminator (MED) is set to infinity (4294967295), the router will not install this route in the routing table.

The infinite metric sometimes is used in route servers, which provide a mirror view of the Internet BGP table. Setting the metric to infinity prohibits such routes from going in the IP routing table, so no IP traffic will use those routes. This case is discussed here just to show a corner case of a BGP path not getting installed in the routing table. Such a configuration is not seen in real BGP networks.