Part 2: Troubleshooting OSPF neighbor formation, NBMA routing issues found


To quickly summarize where I left off: I decided that the Area 0 loopback on R4 with a virtual-link to it was not going to work, because R4 itself was the ABR. To rectify this, I am first going to add Area 0 on R3, create a virtual-link to some random area, and see if that lets the loopback and virtual-link networks come into play without killing all RIP routes to the NBMA spokes R2 and R3.

So I added lo33, 172.12.33.0/24 to R3 and added it to the OSPF network in Area 0, and made lo44 on R4 Area 51.

One thing I wanted to note right away: on R3 I have to use R4’s loopback of 172.12.44.4 as the virtual-link RID (because it’s the highest loopback on R4). However, I had added the 172.12.33.0/24 loopback to R3 while I was configuring all this, and when I used that address as R3’s RID on R4 it did not work. NO BUENO.

This is because R3 had already chosen its RID when the OSPF process started, and unless the router is power cycled or “clear ip ospf proc” is run, it is going to retain that original RID of 3.3.3.3 (the highest loopback at the time). THIS IS A VERY IMPORTANT NOTE: A RID DOES NOT CHANGE ONCE IT HAS BEEN CHOSEN, UNLESS THE DEVICE IS POWER CYCLED OR YOU “CLEAR IP OSPF PROC”. The DR / BDR election is sticky in a similar way, so if you want to switch the DR / BDR around in a network you will need to drop adjacencies.
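As a rough illustration of why the new loopback did not matter, here is a minimal Python sketch of the selection order as I understand it: a configured router-id wins, otherwise the highest loopback address, otherwise the highest address on an active interface. The function name `select_router_id` and its arguments are made up for this example, and it only models the choice made when the process starts, not IOS itself.

```python
import ipaddress


def select_router_id(configured=None, loopbacks=(), interfaces=()):
    """Pick a router ID: configured value, else highest loopback, else highest interface."""
    if configured:
        return configured
    for pool in (loopbacks, interfaces):
        if pool:
            # Compare as IPv4 addresses, not as strings.
            return str(max(ipaddress.IPv4Address(a) for a in pool))
    raise ValueError("no IPv4 address available for a router ID")


# R3 when its OSPF process first started: only Loopback3 existed.
rid_at_start = select_router_id(
    loopbacks=["3.3.3.3"],
    interfaces=["172.12.34.3", "172.12.123.3"],
)

# What a fresh selection would pick once Loopback33 has been added.
rid_if_reselected = select_router_id(
    loopbacks=["3.3.3.3", "172.12.33.3"],
    interfaces=["172.12.34.3", "172.12.123.3"],
)

print(rid_at_start)       # 3.3.3.3
print(rid_if_reselected)  # 172.12.33.3
```

The point of the two calls is that the answer only changes if the selection is actually run again, which is why the virtual-link statement on R4 has to match whatever RID R3 is currently using.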

Anyways, so it went like this:

R3(config-router)#area 34 virtual-link 172.12.44.4
R3(config-router)#
ASR#4
[Resuming connection 4 to r4 ... ]

R4(config-router)#area 34 virtual-link 172.12.33.3
R4(config-router)#

And waited, nothing happening, so then I jumped back on R3:

R3(config-router)#do sh ip proto
Routing Protocol is "ospf 1"
  Outgoing update filter list for all interfaces is not set
  Incoming update filter list for all interfaces is not set
  Router ID 3.3.3.3
  It is an area border router
  Number of areas in this router is 2. 2 normal 0 stub 0 nssa
  Maximum path: 4
  Routing for Networks:
    172.12.33.0 0.0.0.255 area 0
    172.12.34.0 0.0.0.255 area 34
 Reference bandwidth unit is 100 mbps
  Routing Information Sources:
    Gateway         Distance      Last Update
  Distance: (default is 110)

What a great command for information: I get my RID, the networks being routed in OSPF, confirmation that it’s an ABR, and the area count line shows there are no stub or NSSA areas. So I adjust on R4:

R4(config-router)#no area 34 virtual-link 172.12.33.3
R4(config-router)#area 34 virtual-link 3.3.3.3
R4(config-router)#
*Jan 10 23:44:26.835: %OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on OSPF_VL1 from LOADING to FULL, Loading Done
R4(config-router)#

It came up in just a couple of seconds. Just as an example, I am going to cause some UTTER CHAOS by doing a “clear ip ospf proc” on R3 to prove the DR/BDR point. With equal priorities the election is decided by the highest RID, and R4’s 172.12.44.4 beat R3’s 3.3.3.3, so R4 is the DR and R3 is the BDR.

Before I do “clear ip ospf proc” on R3 to demonstrate changing roles, I wanted to show the output of the current shtick:

R4(config-router)#do show ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
3.3.3.3           0   FULL/  -           -        172.12.34.3     OSPF_VL1
3.3.3.3           1   FULL/BDR        00:00:38    172.12.34.3     FastEthernet0/1
R4(config-router)#
ASR#3
[Resuming connection 3 to r3 ... ]

*Mar  1 15:30:33.595: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.44.4 on OSPF_VL0 from LOADING to FULL, Loading Done

R3(config-router)#do show ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.12.44.4       0   FULL/  -           -        172.12.34.4     OSPF_VL0
172.12.44.4       1   FULL/DR         00:00:31    172.12.34.4     FastEthernet0/1

  • The virtual-link shows a dash for its DR/BDR state, just like point-to-point OSPF neighbors. No DR or BDR is elected on a point-to-point link because there are only two routers on it, and a virtual-link is treated the same way since it is a logical link between exactly two routers. Also notice the dead timer is a dash, which as far as I can tell is because IOS suppresses hellos on an established virtual-link

Also a very good command: it shows not only the remote RID, but under Address the actual IP addresses the neighbors learned each other from. So on to the chaos, clearing the OSPF process on R3 and reviewing the output:

R3#clear ip ospf proc
Reset ALL OSPF processes? [no]: yes
R3#
*Mar  1 15:52:21.623: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.44.4 on OSPF_VL0 from FULL to DOWN, Neighbor Down: Interface down or detached
*Mar  1 15:52:21.659: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.44.4 on FastEthernet0/1 from FULL to DOWN, Neighbor Down: Interface down or detached
*Mar  1 15:52:21.747: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.44.4 on FastEthernet0/1 from LOADING to FULL, Loading Done
R3#
*Mar  1 15:52:36.752: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.44.4 on OSPF_VL0 from LOADING to FULL, Loading Done
R3#
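The recovery times can be pulled straight out of the timestamps in the log above. This is just a quick Python sketch of the arithmetic; the helper names are hypothetical and the timestamps are copied from the ADJCHG messages.

```python
from datetime import datetime


def ts(stamp):
    """Parse the hh:mm:ss.ms part of an IOS log timestamp."""
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def seconds_between(down, up):
    """Seconds from the FULL-to-DOWN message to the LOADING-to-FULL message."""
    return (ts(up) - ts(down)).total_seconds()


# FastEthernet0/1 adjacency: down at 15:52:21.659, FULL again at 15:52:21.747
fa_outage = seconds_between("15:52:21.659", "15:52:21.747")

# OSPF_VL0 adjacency: down at 15:52:21.623, FULL again at 15:52:36.752
vl_outage = seconds_between("15:52:21.623", "15:52:36.752")

print(f"FastEthernet0/1: {fa_outage:.3f} s")  # FastEthernet0/1: 0.088 s
print(f"OSPF_VL0: {vl_outage:.3f} s")         # OSPF_VL0: 15.129 s
```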

As can be seen from the timestamps (hh:mm:ss.ms), the FastEthernet adjacency came back in under a second, but the virtual-link, even over FastEthernet, took about 15 seconds. Now for the output of this OSPF election:

R3#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.12.44.4       0   FULL/  -           -        172.12.34.4     OSPF_VL0
172.12.44.4       1   FULL/DR         00:00:36    172.12.34.4     FastEthernet0/1
R3#
ASR#4
[Resuming connection 4 to r4 ... ]

R4(config-router)#do sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
3.3.3.3           0   FULL/  -           -        172.12.34.3     OSPF_VL1
3.3.3.3           1   FULL/BDR        00:00:32    172.12.34.3     FastEthernet0/1
R4(config-router)#

That was honestly not expected, so let me run a “clear ip ospf proc” on R4 itself:

R4#clear ip ospf proc
Reset ALL OSPF processes? [no]: yes
R4#
*Jan 11 00:14:49.583: %OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on OSPF_VL1 from FULL to DOWN, Neighbor Down: Interface down or detached
*Jan 11 00:14:49.619: %OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on FastEthernet0/1 from FULL to DOWN, Neighbor Down: Interface down or detached
R4#
*Jan 11 00:14:55.235: %OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on FastEthernet0/1 from LOADING to FULL, Loading Done
R4#
*Jan 11 00:15:14.583: %OSPF-5-ADJCHG: Process 1, Nbr 3.3.3.3 on OSPF_VL1 from LOADING to FULL, Loading Done
R4#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
3.3.3.3           0   FULL/  -        00:00:06    172.12.34.3     OSPF_VL1
3.3.3.3           1   FULL/DR         00:00:36    172.12.34.3     FastEthernet0/1 <W, T, F.
R4#

I did a “sh ip proto” on R3 and it confirms 3.3.3.3 is still the RID, even though I did not hard code it with “router-id x” in router configuration, and now it is also the DR? Looking back, the DR part at least makes sense: the election is not preemptive, so when R4 dropped off the segment the BDR (R3) was promoted, and R4 came back as the BDR. The RID sticking around is what I need to understand, so I am going to clear the OSPF process on R3 once more while running a “debug ip ospf pack” on R4 to see what I find.

I will spare you the output, but sure enough OSPF on R4 still shows R3 reporting 3.3.3.3 as its RID, so things are about to get drastic. I am going to do a “wr mem” on both devices and reboot them to see if this is a bug or just odd behavior:

R4#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.12.33.3       1   FULL/BDR        00:00:33    172.12.34.3     FastEthernet0/1
R4#
ASR#3
[Resuming connection 3 to r3 ... ]

R3#sh ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.12.44.4       1   FULL/DR         00:00:31    172.12.34.4     FastEthernet0/1
R3#

And sure enough, in this situation both routers needed a complete reload at the same time to produce the expected results, those being:

  • R4 is still the DR, because its loopback of 172.12.44.4 is still higher than R3’s 172.12.33.3
  • There is no longer a virtual-link to speak of, because R3’s RID changed to 172.12.33.3 and R4’s virtual-link statement still points at 3.3.3.3
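The DR/BDR results seen across the process clears and the reload are consistent with a non-preemptive election. Below is a toy Python model of that behavior, a sketch under simplifying assumptions (no wait timer, no hello exchange, equal priority of 1 on both routers); the `Segment` class and `clear_process` helper are invented for illustration and are not how IOS implements it.

```python
import ipaddress


class Segment:
    """Toy model of DR/BDR roles on one broadcast segment (not real OSPF)."""

    def __init__(self):
        self.routers = {}  # router ID -> priority
        self.dr = None
        self.bdr = None

    def _best(self, taken):
        # Highest priority wins, highest router ID breaks the tie.
        eligible = [(prio, int(ipaddress.IPv4Address(rid)), rid)
                    for rid, prio in self.routers.items()
                    if prio > 0 and rid != taken]
        return max(eligible)[2] if eligible else None

    def _fill_vacancies(self):
        # Only empty roles are filled; a sitting DR or BDR is never preempted.
        if self.dr is None:
            self.dr = self._best(taken=self.bdr)
        if self.bdr is None:
            self.bdr = self._best(taken=self.dr)

    def boot_together(self, routers):
        self.routers.update(routers)
        self._fill_vacancies()

    def join(self, rid, priority=1):
        self.routers[rid] = priority
        self._fill_vacancies()

    def leave(self, rid):
        self.routers.pop(rid, None)
        if self.dr == rid:
            self.dr, self.bdr = self.bdr, None  # the BDR is promoted
        elif self.bdr == rid:
            self.bdr = None
        self._fill_vacancies()


def clear_process(segment, rid):
    """Model 'clear ip ospf proc': the router drops off and rejoins."""
    segment.leave(rid)
    segment.join(rid)


seg = Segment()
seg.boot_together({"172.12.44.4": 1, "3.3.3.3": 1})
first = (seg.dr, seg.bdr)            # R4 is DR, R3 is BDR

clear_process(seg, "3.3.3.3")
after_r3_clear = (seg.dr, seg.bdr)   # unchanged, R3 rejoins as BDR

clear_process(seg, "172.12.44.4")
after_r4_clear = (seg.dr, seg.bdr)   # R3 was promoted, R4 rejoins as BDR

reloaded = Segment()
reloaded.boot_together({"172.12.44.4": 1, "172.12.33.3": 1})
after_reload = (reloaded.dr, reloaded.bdr)  # fresh election, R4 wins on RID
```

In this model the only way to get a "clean" election is for every router on the segment to start from scratch together, which lines up with what the simultaneous reload produced.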

So now that things are as they should be, the change is made to bring up this virtual-link, and then I want to see if we are still losing RIP routes on R2 from re-adding the virtual-link:

R4(config-router)#no area 34 virtual-link 3.3.3.3
R4(config-router)#area 34 virtual-link 172.12.33.3
R4(config-router)#
*Jan 11 00:43:33.515: %OSPF-5-ADJCHG: Process 1, Nbr 172.12.33.3 on OSPF_VL2 from LOADING to FULL, Loading Done
R4(config-router)#

Now that we have dissected OSPF on this link / virtual-link: in the last lab the virtual-link was the bane of RIP, and with it configured R2 had no routes in its table except directly connected routes. Here is the output from R2 “sh ip route” after that whole fiasco:

R2#sh ip route
(Route codes redacted)

Gateway of last resort is not set

     1.0.0.0/32 is subnetted, 1 subnets
R       1.1.1.1 [120/1] via 172.12.123.1, 00:00:08, Serial0/0
     2.0.0.0/32 is subnetted, 1 subnets
C       2.2.2.2 is directly connected, Loopback2
     3.0.0.0/32 is subnetted, 1 subnets
R       3.3.3.3 [120/2] via 172.12.123.3, 00:00:08, Serial0/0
     4.0.0.0/32 is subnetted, 1 subnets
R       4.4.4.4 [120/3] via 172.12.123.3, 00:00:08, Serial0/0
     172.12.0.0/16 is variably subnetted, 5 subnets, 2 masks
R       172.12.33.0/24 [120/2] via 172.12.123.3, 00:00:09, Serial0/0
R       172.12.34.0/24 [120/2] via 172.12.123.3, 00:00:09, Serial0/0
R       172.12.44.4/32 [120/3] via 172.12.123.3, 00:00:09, Serial0/0
R       172.12.15.0/24 [120/1] via 172.12.123.1, 00:00:09, Serial0/0
C       172.12.123.0/24 is directly connected, Serial0/0
R2#

Now THAT is what we want to see. The metric even carries the true hop count for the redistributed routes: both routes on R4 should be 3 hops, and the directly connected OSPF routes on R3 should be 2 hops through the Hub.

Wooooo! So, we have routes, but can R2 ping those routes and get Layer 3 connectivity?

R2#ping 4.4.4.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
R2#

We are not getting a response from R4 because the RIP routes are not being redistributed back into OSPF, so R4 has no return route. We cannot make Area 34 a Total Stub to hand R4 a default route, because a stub area cannot carry a virtual-link, and then how would we ever get to Area 51? To get those R4 routes redistributed, the virtual-link has got to stay, for now.

To overcome this, I’ll simply add a default route on R4 pointed at R3, essentially what a Total Stub would have done. Let’s see if it solves our issue so we may move on:

R4(config)#ip route 0.0.0.0 0.0.0.0 172.12.34.3
R4(config)#
ASR#2
[Resuming connection 2 to r2 ... ]

R2#ping 4.4.4.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
R2#

I was so sure that would work I didn’t even think twice about it. First I’ll look at the route table on R4 to see what’s up:

R4(config)#do sh ip route
(Route codes redacted)

Gateway of last resort is 172.12.34.3 to network 0.0.0.0

S*    0.0.0.0/0 [1/0] via 172.12.34.3
      4.0.0.0/32 is subnetted, 1 subnets
C        4.4.4.4 is directly connected, Loopback4
      172.12.0.0/16 is variably subnetted, 5 subnets, 2 masks
O        172.12.33.3/32 [110/2] via 172.12.34.3, 00:19:09, FastEthernet0/1
C        172.12.34.0/24 is directly connected, FastEthernet0/1
L        172.12.34.4/32 is directly connected, FastEthernet0/1
C        172.12.44.0/24 is directly connected, Loopback44
L        172.12.44.4/32 is directly connected, Loopback44
R4(config)#
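To double-check how R4 would treat reply traffic headed back to R2, the table above can be run through a longest-prefix-match lookup. This is only a Python sketch: the `R4_ROUTES` dictionary is hand-copied from the output above, and the `lookup` helper is a hypothetical name, not anything the router exposes.

```python
import ipaddress

# Prefix -> next hop / exit, hand-copied from R4's "sh ip route" output.
R4_ROUTES = {
    "0.0.0.0/0": "via 172.12.34.3 (static default)",
    "4.4.4.4/32": "Loopback4",
    "172.12.33.3/32": "via 172.12.34.3 (OSPF)",
    "172.12.34.0/24": "FastEthernet0/1",
    "172.12.34.4/32": "FastEthernet0/1 (local)",
    "172.12.44.0/24": "Loopback44",
    "172.12.44.4/32": "Loopback44 (local)",
}


def lookup(dest, routes):
    """Return (prefix, next hop) for the most specific matching route, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [ipaddress.ip_network(p) for p in routes
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), routes[str(best)]


# Echo replies to R2's serial address only match the default route.
print(lookup("172.12.123.2", R4_ROUTES))
# R3's Loopback33 is the one remote prefix R4 knows specifically.
print(lookup("172.12.33.3", R4_ROUTES))
```

So as far as R4's own table goes, anything toward R2 is handed to 172.12.34.3, and the question becomes what happens after that hop.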

So that piece is there, saying ‘send all other traffic to my neighbor’s Fa0/1 interface’. Next up is R3, to see if it is choking on me:

R3#sh ip route
(Route codes redacted)

Gateway of last resort is not set

     1.0.0.0/32 is subnetted, 1 subnets
R       1.1.1.1 [120/1] via 172.12.123.1, 00:00:01, Serial0/2
     2.0.0.0/32 is subnetted, 1 subnets
R       2.2.2.2 [120/2] via 172.12.123.2, 00:00:01, Serial0/2
     3.0.0.0/32 is subnetted, 1 subnets
C       3.3.3.3 is directly connected, Loopback3
     4.0.0.0/32 is subnetted, 1 subnets
O IA    4.4.4.4 [110/2] via 172.12.34.4, 00:21:25, FastEthernet0/1
     172.12.0.0/16 is variably subnetted, 5 subnets, 2 masks
C       172.12.33.0/24 is directly connected, Loopback33
C       172.12.34.0/24 is directly connected, FastEthernet0/1
O IA    172.12.44.4/32 [110/2] via 172.12.34.4, 00:21:26, FastEthernet0/1
R       172.12.15.0/24 [120/1] via 172.12.123.1, 00:00:02, Serial0/2
C       172.12.123.0/24 is directly connected, Serial0/2
R3#

So now things are getting very odd, because R4 throws all traffic to this route table, that obviously has a valid route to R2 via the Hub. So I am going to test connectivity from R3 around the network, I have a feeling this is a RIP behavior, but we shall see:

R3#ping 172.12.123.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.123.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 64/66/69 ms
R3#ping 172.12.123.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.123.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
R3#

So R1 doesn’t feel like passing the traffic, and I would get a U.U.U response if it was missing the route, so let’s see what its holdup is:

R1#sh ip route
(Route codes redacted)

Gateway of last resort is not set

     1.0.0.0/32 is subnetted, 1 subnets
C       1.1.1.1 is directly connected, Loopback1
     2.0.0.0/32 is subnetted, 1 subnets
R       2.2.2.2 [120/1] via 172.12.123.2, 00:00:14, Serial0/0
     3.0.0.0/32 is subnetted, 1 subnets
R       3.3.3.3 [120/1] via 172.12.123.3, 00:00:03, Serial0/0
     4.0.0.0/32 is subnetted, 1 subnets
R       4.4.4.4 [120/2] via 172.12.123.3, 00:00:03, Serial0/0
     5.0.0.0/24 is subnetted, 1 subnets
D       5.5.5.0 [90/156160] via 172.12.15.5, 02:22:18, FastEthernet0/1
     172.12.0.0/16 is variably subnetted, 5 subnets, 2 masks
R       172.12.33.0/24 [120/1] via 172.12.123.3, 00:00:04, Serial0/0
R       172.12.34.0/24 [120/1] via 172.12.123.3, 00:00:04, Serial0/0
R       172.12.44.4/32 [120/2] via 172.12.123.3, 00:00:06, Serial0/0
C       172.12.15.0/24 is directly connected, FastEthernet0/1
C       172.12.123.0/24 is directly connected, Serial0/0
R1#ping 172.12.123.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.123.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 64/64/68 ms
R1#ping 2.2.2.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 64/65/68 ms
R1#ping 172.12.34.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.34.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 64/65/68 ms
R1#ping 4.4.4.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 64/66/69 ms
R1#

So the Hub itself has no connectivity issues either. My suspicion is some inherent rule saying ‘I will not send traffic to a destination back out the same interface I learned about it on’, S0/0 in this case, although strictly speaking split horizon only governs routing updates, not forwarded traffic. ***An interesting note to keep in mind is that it is still sending routing updates, but will not pass pings back out.*** To see if the debugs can tell me something, I am going to start a continuous ping from R3 to R2 and run “debug ip packet” on R1 to see what message it gives me:

R1#debug ip packet
IP packet debugging is on
R1#
*Mar  1 16:13:19.908: IP: s=172.12.15.1 (local), d=224.0.0.10 (FastEthernet0/1), len 60, sending broad/multicast
R1#
*Mar  1 16:13:23.370: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
*Mar  1 16:13:23.646: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
R1#
*Mar  1 16:13:24.764: IP: s=172.12.15.1 (local), d=224.0.0.10 (FastEthernet0/1), len 60, sending broad/multicast
R1#
*Mar  1 16:13:27.160: IP: s=172.12.123.2 (Serial0/0), d=224.0.0.9, len 212, rcvd 2
*Mar  1 16:13:27.821: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
R1#
*Mar  1 16:13:28.542: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
*Mar  1 16:13:29.331: IP: s=172.12.15.1 (local), d=224.0.0.10 (FastEthernet0/1), len 60, sending broad/multicast
R1#
*Mar  1 16:13:32.264: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
*Mar  1 16:13:32.837: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
R1#
*Mar  1 16:13:33.907: IP: s=172.12.15.1 (local), d=224.0.0.10 (FastEthernet0/1), len 60, sending broad/multicast
R1#
*Mar  1 16:13:37.192: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
*Mar  1 16:13:37.333: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
R1#
*Mar  1 16:13:38.695: IP: s=172.12.15.1 (local), d=224.0.0.10 (FastEthernet0/1), len 60, sending broad/multicast
*Mar  1 16:13:39.616: IP: s=172.12.123.3 (Serial0/0), d=224.0.0.9, len 212, rcvd 2
R1#
*Mar  1 16:13:41.227: IP: s=172.12.15.1 (local), d=224.0.0.9 (FastEthernet0/1), len 192, sending broad/multicast
*Mar  1 16:13:41.251: IP: s=1.1.1.1 (local), d=224.0.0.9 (Loopback1), len 192, sending broad/multicast
*Mar  1 16:13:41.251: IP: s=1.1.1.1 (Loopback1), d=224.0.0.9, len 192, rcvd 2
*Mar  1 16:13:41.551: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2
*Mar  1 16:13:42.016: IP: s=172.12.15.5 (FastEthernet0/1), d=224.0.0.10, len 60, rcvd 2

The only sign of R3 traffic is a routing update 20 seconds after the debug was started, and an ICMP packet coming from it should definitely be making noise on that debug (assuming the transit traffic is process switched, since that is all “debug ip packet” displays). The traffic must be getting dropped on R3 somehow:

Sending 5, 100-byte ICMP Echos to 172.12.123.2, timeout is 2 seconds:

*Mar  1 17:24:41.851: IP: tableid=0, s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), routed via FIB
*Mar  1 17:24:41.851: IP: s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), len 100, sending
*Mar  1 17:24:42.620: IP: s=172.12.34.3 (local), d=224.0.0.5 (FastEthernet0/1), len 80, sending broad/multicast.
*Mar  1 17:24:43.855: IP: tableid=0, s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), routed via FIB
*Mar  1 17:24:43.855: IP: s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), len 100, sending.
*Mar  1 17:24:45.858: IP: tableid=0, s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), routed via FIB
*Mar  1 17:24:45.858: IP: s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), len 100, sending.
*Mar  1 17:24:47.857: IP: tableid=0, s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), routed via FIB
*Mar  1 17:24:47.857: IP: s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), len 100, sending
*Mar  1 17:24:48.109: IP: s=172.12.34.4 (FastEthernet0/1), d=224.0.0.5, len 80, rcvd 0.
*Mar  1 17:24:49.860: IP: tableid=0, s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), routed via FIB
*Mar  1 17:24:49.860: IP: s=172.12.123.3 (local), d=172.12.123.2 (Serial0/2), len 100, sending.
Success rate is 0 percent (0/5)
R3#

That is the “debug ip packet” output on R3 when pinging R2. It just shows “routed via FIB” and “sending”, with nothing specific about dropped packets. ALSO, I noticed I did not do a lot of connectivity tests during the previous redistribution labs, so I am considering those kind of invalid at this point.

I don’t know that I will get past that rule over the NBMA. A training video on RIP redistribution that I quickly skimmed, to see if there was something beyond “no ip split” and “no auto”, uses different / simpler topologies. I am going to stop the lab here. I was at least able to get the routes to propagate around. My next lab will be a final for this topology, to see if I can redistribute EIGRP and actually get Layer 3 connectivity, or at least get the routes to propagate around the network. Until next time 🙂
