Tuesday, October 18, 2011

notes: Troubleshooting EIGRP Neighbor Relationships

1.  EIGRP Neighbor Problem—Cause: Unidirectional Link

A one-way neighbor relationship is usually caused by a unidirectional link between the neighbors, and a unidirectional link usually points to a Layer 2 problem.

Verification: check the neighbor table with show ip eigrp neighbors.

RtrB#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H   Address      Interface  Hold   Uptime    SRTT  RTO   Q    Seq
                            (sec)            (ms)        Cnt  Num
1   10.88.18.2   S0         14     00:00:15  0     5000  4    0

An SRTT of 0 indicates that no acknowledgment packets are being received. The Q count is not decrementing, which means that the router is trying to send EIGRP packets but is getting no acknowledgment. Router B retransmits the packet 16 times; it then resets the neighbor relationship, logs RETRY LIMIT EXCEEDED, and the process starts again. Keep in mind that the 16 retransmissions of the same packet are sent as unicast, not multicast. The RETRY LIMIT EXCEEDED message therefore indicates a problem with transmitting unicast packets over the link, which is most likely a Layer 1 or Layer 2 problem.

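Solution: Fix the underlying Layer 1 or Layer 2 problem on the link. Once unicast packets are acknowledged again, the neighbor table looks like this: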

RtrB#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H   Address      Interface  Hold   Uptime    SRTT  RTO   Q    Seq
                            (sec)            (ms)        Cnt  Num
1   10.88.18.2   S0         14     01:26:30  149   894   0    291


Notice that the Q count column is 0 and that the SRTT and RTO have valid values now.
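The difference between the two outputs above can be checked mechanically. The following is a minimal Python sketch, not a Cisco tool: the Neighbor class and both function names are made up for illustration, and it assumes each data row has exactly the nine whitespace-separated columns shown in these notes.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """One data row of 'show ip eigrp neighbors' (field names are illustrative)."""
    handle: int
    address: str
    interface: str
    hold_sec: int
    uptime: str
    srtt_ms: int
    rto_ms: int
    q_count: int
    seq_num: int


def parse_neighbor_row(row: str) -> Neighbor:
    """Split one data row into named fields; assumes exactly nine columns."""
    h, addr, intf, hold, uptime, srtt, rto, q_cnt, seq = row.split()
    return Neighbor(int(h), addr, intf, int(hold), uptime,
                    int(srtt), int(rto), int(q_cnt), int(seq))


def looks_unidirectional(n: Neighbor) -> bool:
    """SRTT of 0 with packets still queued: sent but never acknowledged."""
    return n.srtt_ms == 0 and n.q_count > 0


# The two rows are copied from the outputs in these notes.
broken = parse_neighbor_row("1 10.88.18.2    S0    14   00:00:15  0    5000  4    0")
fixed = parse_neighbor_row("1 10.88.18.2    S0    14   01:26:30 149   894  0  291")
print(looks_unidirectional(broken))  # True
print(looks_unidirectional(fixed))   # False
```

A single sample can mislead, because a neighbor that has just come up may briefly show similar values. In practice the row would be sampled a few times to confirm that the Q count is not draining.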


2.  EIGRP Neighbor Problem—Cause: Uncommon Subnet

Many times, EIGRP won't establish neighbor relationships because the neighbors are not in the same subnet. Usually, the cause of this problem is router misconfiguration. When EIGRP has problems establishing neighbor relationships because of an uncommon subnet, the following error message appears:

IP-EIGRP: Neighbor ip address not on common subnet for interface

There are three possible causes:


a.  The IP address has been misconfigured on interfaces.

Solution: Configure the correct IP address on the interface.

b. The primary and secondary IP addresses of the neighboring interface don't match.

EIGRP sources the hello packet from the primary address of the interface. If the primary network address on one router is used as a secondary network address on the second router, and vice versa, no neighbor relationship will be formed and the routers will complain about the neighbor not being on a common subnet.


Solution: Ensure that the primary addresses on both sides of the link are in the same subnet, and likewise the secondary addresses.

c.  A switch or hub between the EIGRP neighbors is misconfigured or is leaking multicast packets to other ports.

If a single LAN hub connects EIGRP neighbors on different LAN segments, the hub passes broadcast and multicast packets between the two logical segments. The multicast EIGRP hello from LAN segment 1 is therefore seen by the neighbor in LAN segment 2. The solution is to break up the broadcast domain by using a separate hub for each LAN segment, or simply to configure no eigrp log-neighbor-warnings under the EIGRP configuration to stop seeing the error message.
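All three causes come down to the same test: the source address of a received hello does not fall inside the receiving interface's network. Here is a minimal Python sketch of that test using the standard ipaddress module. The function name and the addresses are hypothetical, and the check compares against the primary network only, as these notes describe it.

```python
import ipaddress


def on_common_subnet(local_primary: str, hello_source: str) -> bool:
    """True if the hello's source address lies inside the local primary network."""
    local_if = ipaddress.ip_interface(local_primary)
    return ipaddress.ip_address(hello_source) in local_if.network


# Hypothetical addressing: this router's primary address is 10.1.1.1/24.
# A neighbor sourcing hellos from 10.1.1.2 is accepted; a neighbor whose
# primary address is 10.1.2.2 (for example, primary and secondary swapped,
# or a hello leaked from another segment) is not.
print(on_common_subnet("10.1.1.1/24", "10.1.1.2"))  # True
print(on_common_subnet("10.1.1.1/24", "10.1.2.2"))  # False
```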


3. EIGRP Neighbor Problem—Cause: Mismatched Masks

Solution: Configure the correct subnet mask on the misconfigured router.
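A mask mismatch is easy to miss by eye because the addresses themselves look fine. As a rough sketch (the function name and addresses are assumptions for illustration), the two interface configurations can be compared by the network each one derives:

```python
import ipaddress


def same_network(side_a: str, side_b: str) -> bool:
    """True if both interface configurations yield the same network and prefix length."""
    net_a = ipaddress.ip_interface(side_a).network
    net_b = ipaddress.ip_interface(side_b).network
    return net_a == net_b


# Hypothetical link: the second case has a /25 configured on one side by mistake.
print(same_network("10.1.1.1/24", "10.1.1.2/24"))  # True
print(same_network("10.1.1.1/24", "10.1.1.2/25"))  # False
```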

4.  EIGRP Neighbor Problem—Cause: Mismatched K Values

For EIGRP to establish a neighbor relationship, the K constants used to compute the EIGRP metric must be the same on both neighbors.

Troubleshooting this problem requires careful scrutiny of the routers' configurations. The solution is to set the K values to be the same on all the neighboring routers.
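When comparing configurations by hand, it helps to know exactly which constant differs. The sketch below is illustrative only: KValues and mismatched_k are made-up names, and the defaults shown (K1 = K3 = 1, the rest 0) are the standard EIGRP defaults.

```python
from typing import List, NamedTuple


class KValues(NamedTuple):
    """EIGRP metric constants; defaults are K1 = K3 = 1, all others 0."""
    k1: int = 1
    k2: int = 0
    k3: int = 1
    k4: int = 0
    k5: int = 0


def mismatched_k(local: KValues, remote: KValues) -> List[str]:
    """Return the names of the K constants that differ between two routers."""
    return [name for name, a, b in zip(KValues._fields, local, remote) if a != b]


# Hypothetical case: the remote router has K2 enabled, the local one uses defaults.
print(mismatched_k(KValues(), KValues()))      # []
print(mismatched_k(KValues(), KValues(k2=1)))  # ['k2']
```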


5.  EIGRP Neighbor Problem—Cause: Mismatched AS Number

EIGRP won't form any neighbor relationships with neighbors in different autonomous systems.



Solution: Ensure that the AS number is the same on both neighbors.

6.  EIGRP Neighbor Problem—Cause: Stuck in Active


Sometimes, EIGRP resets the neighbor relationship because of a "stuck in active" condition. The error message is:

%DUAL-3-SIA: Route network mask stuck-in-active state in IP-EIGRP AS. Cleaning up 
 

Reviewing the EIGRP DUAL Process

To resolve an EIGRP stuck in active error, you need to understand the DUAL process in EIGRP. Refer to Chapter 6 for thorough coverage of the DUAL process, although it is reviewed here as well.

EIGRP is an advanced distance-vector protocol. Unlike OSPF and other link-state protocols, it has no LSA flooding to give each router an overall view of the network; EIGRP relies only on its neighbors for information on network reachability and availability. EIGRP keeps a list of backup routes called feasible successors. When the primary route becomes unavailable, EIGRP immediately uses a feasible successor as the backup route, which shortens convergence time.

If the primary route is gone and no feasible successor is available, the route goes into the active state. The only way for EIGRP to converge quickly is then to query its neighbors about the unavailable route. If a neighbor doesn't know the status of the route, it asks its own neighbors, and so on, until the edge of the network is reached. The query stops if one of the following occurs:

- All queries are answered from all the neighbors.

- The end of network is reached.

- The lost route is unknown to the neighbors.

The problem is that, if there are no query boundaries, EIGRP potentially can ask every router in the network for a lost route. When EIGRP first queries its neighbor, a stuck in active timer starts. By default, the timer is three minutes. If, in three minutes, EIGRP doesn't receive the query response from all its neighbors, EIGRP declares that the route is stuck in active state and resets the neighbor that has not responded to the query.
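The effect of a query boundary can be illustrated with a toy model. This is a simplified Python sketch, not an implementation of DUAL. The topology, router names, and function name are all hypothetical. It assumes a router that knows the lost route passes the query on to its own neighbors, while a router that does not know the route replies at once and the query stops there.

```python
from collections import deque
from typing import Dict, List, Set


def queried_routers(topology: Dict[str, List[str]],
                    origin: str,
                    knows_route: Set[str]) -> List[str]:
    """Return the routers that receive a query, in breadth-first order."""
    seen = {origin}
    order: List[str] = []
    pending = deque([origin])
    while pending:
        router = pending.popleft()
        for neighbor in topology[router]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            order.append(neighbor)
            # Only routers that know the lost route propagate the query further.
            if neighbor in knows_route:
                pending.append(neighbor)
    return order


# Hypothetical chain of five routers: A - B - C - D - E. A loses the route.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}

# No query boundary: every router knows the specific route.
print(queried_routers(chain, "A", {"B", "C", "D", "E"}))  # ['B', 'C', 'D', 'E']

# Boundary at C: C has never learned the specific route, so the query stops there.
print(queried_routers(chain, "A", {"B"}))                 # ['B', 'C']
```

Every router in the returned list must answer before the three-minute timer expires, so a shorter list means fewer places where the reply can be delayed.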
 
Determining Active/Stuck in Active Routes with show ip eigrp topology active

You must answer two questions to troubleshoot the EIGRP stuck in active problem:

- Why is the route active?

- Why is the route stuck?

Determining why the route is active is not difficult. A route that constantly goes active may be on a flapping link. Or, if the route is a host route (/32), it may come from a dial-in connection that gets disconnected. Determining why the active route becomes stuck is a much harder task, and a more important one to learn. Usually, an active route gets stuck for one of the following reasons:

- Bad or congested links

- Low router resources, such as low memory or high CPU on the router

- Long query range

- Excessive redundancy

By default, the stuck in active timer is only three minutes. In other words, if the router doesn't receive a reply to its query from a neighbor within three minutes, that neighbor is reset. This makes stuck in active problems difficult to troubleshoot: every time an active route is stuck, you have only three minutes to track down the query path and, with luck, find the cause.

The tool that you need to troubleshoot the EIGRP stuck in active error is the show ip eigrp topology active command. This command shows what routes are currently active, how long the routes have been active, and which neighbors have and have not replied to the query. From the output, you can determine which neighbors have not replied to the query, and you can track the query path and find out the status of the query by hopping to the neighbors that have not replied.
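The hop-by-hop tracking described above amounts to following a chain of outstanding replies. The Python sketch below assumes you have already noted, for each router, which neighbors show as not having replied in show ip eigrp topology active. The dictionary, the router names, and the function name are hypothetical, and the sketch does not parse real command output.

```python
from typing import Dict, List


def trace_query_path(pending_replies: Dict[str, List[str]], start: str) -> List[str]:
    """Follow the first non-replying neighbor at each hop until none remain."""
    path = [start]
    visited = {start}
    current = start
    while pending_replies.get(current):
        next_hop = pending_replies[current][0]
        if next_hop in visited:
            break  # guard against walking in a circle
        path.append(next_hop)
        visited.add(next_hop)
        current = next_hop
    return path


# Hypothetical observations: RtrA waits on RtrB, RtrB waits on RtrC,
# and RtrC is not waiting on anyone.
observed = {"RtrA": ["RtrB"], "RtrB": ["RtrC"], "RtrC": []}
print(trace_query_path(observed, "RtrA"))  # ['RtrA', 'RtrB', 'RtrC']
```

The last router in the path is the one to examine for congested links, low memory, or high CPU. If a router is waiting on several neighbors, each branch has to be followed separately; this sketch follows only the first.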

Methodology for Troubleshooting the Stuck in Active Problem

The methods for troubleshooting an EIGRP stuck in active problem and the show ip eigrp topology active command are useful only when the problem is happening. When the stuck in active event is over and the network stabilizes, it is extremely difficult, if not impossible, to backtrack the problem and find out the cause.

Solution:

The ultimate solution for preventing the EIGRP stuck in active problem is to manually summarize routes wherever possible and to have a hierarchical network design. The more networks EIGRP summarizes, the less work it has to do when a major convergence takes place. Summarization reduces the number of queries being sent out and ultimately reduces the occurrence of EIGRP stuck in active errors.
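Finding the summary that covers a block of subnets is simple arithmetic, and Python's standard ipaddress module can do it. The prefixes below are hypothetical, and the sketch only computes the summary; it does not configure anything on a router.

```python
import ipaddress
from typing import List


def summarize(prefixes: List[str]) -> List[str]:
    """Collapse contiguous prefixes into the smallest covering set of networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Four contiguous /24 networks behind one distribution router collapse to one /22.
print(summarize(["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]))
# ['10.1.0.0/22']
```

Advertising the single /22 means routers beyond the summarization point never learn the individual /24 routes, so a query for one of them is answered there immediately.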
