[Image: Stare_Compare.PNG]

I left myself one device to help get my network back up and running: the Frame-Relay Switch and its running configuration. I can look at the 3 interfaces above and do the “Stare and Compare” against R1 / R2 / R3 to get them reconfigured for labbing TSHOOT concepts going forward. But that is far from the only tool available!

A quick review of Network Data to be collected as part of ongoing maintenance

To recap, these are some of the types of information that are important to both troubleshooting and maintaining a network, and that assist in reactive troubleshooting of a problem:

  • Troubleshooting information collected – As the name implies, this is information collected while troubleshooting (documented fixes, updated running configs, change logs), whether that troubleshooting was proactive or reactive
  • Baseline information collected – Information collected when the network is running normally, so it can be compared against current network data later to see if the difference reveals the source of an issue. This includes packet captures (Wireshark) / CPU utilization (Cisco IOS) / etc
  • Network Event information – Archived information about network events, gathered using tools like a syslog server, a NetFlow collector, or just doing a “sh log” on the device and pasting the output into a centralized knowledge base for other network admins

Another important piece is having a procedure for how documentation is kept and how it is retrieved, and there are a couple of examples of this:

  • Trouble ticketing system – This can serve the role for both Network Event info AND Troubleshooting collected info, as it keeps the customer's problem report with its timestamp, and usually time entries with the fixes tried along the way to problem resolution (so it is important to keep these tickets very well documented!)
  • Wiki – Kind of a generic term for a centralized “Knowledge Base” where procedures can be posted, such as SharePoint, Talisma (3rd party KB), or even this type of blog

I have referred to my own troubleshooting in this blog / on my lab for my job, and it has saved my butt more than once when I couldn’t quite recall a command, so creating a blog like this while you study is a great idea, and not just for exam studies!

Keeping this information centralized and accessible to anyone who needs it is very important, and this blog is quite literally a great example of maintaining a good knowledge base to refer back to for troubleshooting (not to kiss my own butt) 🙂

Troubleshooting Tools found in Cisco IOS / accessible from CLI!

There are various tools available from the command line of Cisco devices that give really good insight for real time troubleshooting, and that also help with storing backups of device configurations at scheduled times and collecting data samples (SNMP / NetFlow).

“show (something)”

This is an obvious tool, but it needs to be mentioned because it is technically a “troubleshooting tool” available on IOS / the CLI. These are what I call the “verification” commands, and their output provides protocol specific information:

FRSW#sh ntp status
FRSW#

In this example, I didn’t realize I had never set the clock on my Frame Relay Switch. That is going to cause all sorts of issues with time sensitive protocols and with event logging, as events will be logged using the default time, which is a ways back in time:

FRSW#sh clock
*00:36:15.675 UTC Mon Mar 1 1993
FRSW#

So if a problem was reported and this Frame Switch was a point of interest in the troubleshooting path, I would need to figure out how the actual time of the problem report correlates to this switch's clock and event logging.

However, I probably won’t have to worry about the time being off if I don’t even have “buffer logging” enabled on the Frame Switch:
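To make that correlation concrete, here is a minimal Python sketch of the arithmetic. The device reading comes from the “sh clock” output above (rounded to the second), but the real wall-clock time and the logged event time are made-up values for illustration, and the whole thing assumes the device has not reloaded between the event and the “sh clock” reading:

```python
from datetime import datetime, timedelta


def clock_offset(device_now: datetime, real_now: datetime) -> timedelta:
    """How much must be added to a device timestamp to get real time."""
    return real_now - device_now


def to_real_time(device_timestamp: datetime, offset: timedelta) -> datetime:
    """Translate a timestamp from the device's clock into real time."""
    return device_timestamp + offset


# Device clock reading taken from the "sh clock" output above.
device_now = datetime(1993, 3, 1, 0, 36, 15)
# Hypothetical: the real time (UTC) at the moment "sh clock" was run.
real_now = datetime(2015, 6, 1, 14, 0, 0)

offset = clock_offset(device_now, real_now)

# Hypothetical: an event the device stamped at 00:30:00 on its own clock.
logged_at = datetime(1993, 3, 1, 0, 30, 0)
real_event_time = to_real_time(logged_at, offset)
print(real_event_time)
```

With these example values the event happened 6 minutes and 15 seconds before the “sh clock” reading, so it lands at 13:53:45 real time.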

FRSW#sh log
Syslog logging: enabled (0 messages dropped, 2 messages rate-limited, 0 flushes, 0 overruns, xml disabled)
Console logging: level debugging, 31 messages logged, xml disabled
Monitor logging: level debugging, 0 messages logged, xml disabled
Buffer logging: disabled, xml disabled
Logging Exception size (4096 bytes)
Count and timestamp logging messages: disabled
Trap logging: level informational, 35 message lines logged
FRSW#

So I will both turn on buffered logging and correct the time on this device, and both of those are considered troubleshooting / maintenance tools native to Cisco IOS as well!
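The “Buffer logging: disabled” line is easy to miss by eye, so if you are pasting “sh log” output into a knowledge base anyway, a small script can flag it. This is only a sketch in Python that assumes the output looks like the lines above; the function name and the regular expression are my own, and other IOS versions may format these lines differently:

```python
import re

# Lines copied from the "sh log" output above.
SHOW_LOG = """\
Syslog logging: enabled (0 messages dropped, 2 messages rate-limited, 0 flushes, 0 overruns, xml disabled)
Console logging: level debugging, 31 messages logged, xml disabled
Monitor logging: level debugging, 0 messages logged, xml disabled
Buffer logging: disabled, xml disabled
Trap logging: level informational, 35 message lines logged
"""

# Matches e.g. "Console logging: level debugging" or "Buffer logging: disabled".
LINE = re.compile(
    r"^\s*(Console|Monitor|Buffer|Trap) logging:\s*(?:level (\w+)|(disabled))"
)


def logging_levels(show_log_output: str) -> dict:
    """Map each logging destination to its level name, or 'disabled'."""
    levels = {}
    for line in show_log_output.splitlines():
        match = LINE.match(line)
        if match:
            levels[match.group(1).lower()] = match.group(2) or match.group(3)
    return levels


levels = logging_levels(SHOW_LOG)
if levels.get("buffer") == "disabled":
    print("Buffer logging is disabled on this device")
```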

Enabling buffer logging / differences in event levels logged

There are 8 event levels (0 through 7). In the output above, the “Console logging” and “Monitor logging” lines show the default level of “debugging”, which logs every level, including any “debug” output captured on the device while a debug is running.

You will probably want to adjust this to a level that meets your troubleshooting needs, taking into consideration the amount of information you want captured, as the buffer can be flooded out extremely quickly when the lower priority events are logged:

Levels of logging explained

  • 0 – Emergencies – System unusable
  • 1 – Alerts – Immediate action needed
  • 2 – Critical – Critical condition
  • 3 – Errors – Error condition
  • 4 – Warnings – Warning condition
  • 5 – Notifications – Normal but significant condition
  • 6 – Informational – Informational message only
  • 7 – Debugging – Appears during debugging only

When you set a certain level, it will capture events at that level AND every more severe level above it in the list (the lower numbers), so for example if I want to see only messages from network problems, I would use “errors” (level 3) to see Errors and levels 2 through 0 as well.

Let's take a look at the options available in IOS when configuring buffer logging:
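The rule boils down to “a message is kept if its number is less than or equal to the configured number”. Here is a minimal Python sketch of that comparison, using the level names from the list above (the function names are just mine for illustration):

```python
# Level names and numbers from the list above.
SEVERITY = {
    "emergencies": 0,
    "alerts": 1,
    "critical": 2,
    "errors": 3,
    "warnings": 4,
    "notifications": 5,
    "informational": 6,
    "debugging": 7,
}


def is_logged(message_level: str, configured_level: str) -> bool:
    """A message is kept when it is at least as severe as the configured level."""
    return SEVERITY[message_level] <= SEVERITY[configured_level]


def levels_captured(configured_level: str) -> list:
    """List every level name captured at the configured level, most severe first."""
    ordered = sorted(SEVERITY, key=SEVERITY.get)
    return [name for name in ordered if is_logged(name, configured_level)]


print(levels_captured("errors"))
# ['emergencies', 'alerts', 'critical', 'errors']
```

So a device set to “warnings” would keep a “critical” message but drop a “debugging” one.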

Enabling buffer logging

FRSW#conf t
Enter configuration commands, one per line. End with CNTL/Z.
FRSW(config)#logging ?
Hostname or A.B.C.D IP address of the logging host
buffered Set buffered logging parameters
cns-events Set CNS Event logging level
console Set console logging parameters
count Count every log message and timestamp last occurance
exception Limit size of exception flush output
facility Facility parameter for syslog messages
history Configure syslog history table
host Set syslog server IP address and parameters
monitor Set terminal line (monitor) logging parameters
on Enable logging to all supported destinations
origin-id Add origin ID to syslog messages
rate-limit Set messages per second limit
reload Set reload logging level
source-interface Specify interface for source address in logging
transactions
trap Set syslog server logging level

FRSW(config)#logging buffered ?
<0-7> Logging severity level
<4096-2147483647> Logging buffer size
alerts Immediate action needed (severity=1)
critical Critical conditions (severity=2)
debugging Debugging messages (severity=7)
emergencies System is unusable (severity=0)
errors Error conditions (severity=3)
informational Informational messages (severity=6)
notifications Normal but significant conditions (severity=5)
warnings Warning conditions (severity=4)
xml Enable logging in XML to XML logging buffer
<cr>

FRSW(config)#

A few things to note here:

  • “logging …” commands are issued at the global config prompt
  • “logging monitor” sets the level of messages sent to terminal lines (Telnet / SSH sessions) rather than the console. Note that if you are using Telnet / SSH you still will not see these messages in your session until you issue “terminal monitor” at the privileged EXEC prompt!
  • Buffer logging can be configured by level number, by level name, or just enabled in general with “logging buffered”, which uses the defaults (a 4096 byte buffer and the Debugging level on this device)
  • The buffer size and logging level can be set at the same time, as shown here:

FRSW(config)#logging buffered 32768 ?
<0-7> Logging severity level
alerts Immediate action needed (severity=1)
critical Critical conditions (severity=2)
debugging Debugging messages (severity=7)
emergencies System is unusable (severity=0)
errors Error conditions (severity=3)
informational Informational messages (severity=6)
notifications Normal but significant conditions (severity=5)
warnings Warning conditions (severity=4)
<cr>

FRSW(config)#logging buffered 32768 warnings
FRSW(config)#

So my buffer is now 32768 bytes in size and logging Warning level events, which can be confirmed using the “show” command of “sh log” on the CLI:

FRSW(config)#do sh log
Syslog logging: enabled (0 messages dropped, 2 messages rate-limited, 0 flushes, 0 overruns, xml disabled)
Console logging: level debugging, 32 messages logged, xml disabled
Monitor logging: level debugging, 0 messages logged, xml disabled
Buffer logging: level warnings, 0 messages logged, xml disabled
Logging Exception size (4096 bytes)
Count and timestamp logging messages: disabled
Trap logging: level informational, 36 message lines logged
--More--

Note that the other logging destinations didn’t change, but they can be configured using similar syntax. Now that logging is enabled, I will want to correct the time!

Configuring NTP time / timezone / daylight savings

Generally you will want to point your device towards a Stratum 1 device out on the internet, or preferably a pool of servers in case a single server becomes unavailable. For real world purposes I generally see “#pool.ntp.org prefer” set as the network time source, depending on the IOS running on the device.

NTP always hands out time in UTC no matter where the servers are located, so if you want the clock and log timestamps shown in local time, you will want to hard code the “Timezone” and “Daylight Savings” time options on the device.

Configuring multiple NTP servers (with a preferred server)

FRSW(config)#ntp server ?
Hostname or A.B.C.D IP address of peer
vrf VPN Routing/Forwarding Information

FRSW(config)#ntp server 172.168.123.1 ?
key Configure peer authentication key
prefer Prefer this peer when possible
source Interface for source address
version Configure NTP version
<cr>

FRSW(config)#ntp server 172.168.123.1
FRSW(config)#ntp server 172.168.123.2 prefer
FRSW(config)#ntp server 172.168.123.3
FRSW(config)#

My lab doesn’t connect to the internet, so I just pointed this Frame Relay Switch at R1 / R2 / R3 for redundancy, and it will use R2 as its time source when possible given the “prefer” keyword.

“clock timezone” to set the network's timezone

FRSW(config)#clock ?
summer-time Configure summer (daylight savings) time
timezone Configure time zone

FRSW(config)#clock timezone ?
WORD name of time zone

FRSW(config)#clock timezone CST ?
<-23 - 23> Hours offset from UTC

FRSW(config)#clock timezone CST -6 ?
<0-59> Minutes offset from UTC
<cr>

FRSW(config)#clock timezone CST -6
FRSW(config)#

For all intents and purposes, UTC is the same as GMT. The difference is that UTC is a time “standard” while GMT is considered an actual “Timezone”, though both sit at the same zero offset (GMT and all other time zones are defined as offsets from the UTC standard).

“clock summer-time” to set the network's observed Daylight Savings time

FRSW(config)#clock summer-time ?
WORD name of time zone in summer

FRSW(config)#clock summer-time CDT ?
date Configure absolute summer time
recurring Configure recurring summer time

FRSW(config)#clock summer-time CDT recurring ?
<1-4> Week number to start
first First week of the month
last Last week of the month
<cr>

FRSW(config)#clock summer-time CDT recurring 2 ?
DAY Weekday to start

FRSW(config)#clock summer-time CDT recurring 2 Sunday ?
MONTH Month to start

FRSW(config)#clock summer-time CDT recurring 2 Sunday March ?
hh:mm Time to start (hh:mm)

FRSW(config)#clock summer-time CDT recurring 2 Sunday March 02:00 ?
<1-4> Week number to end
first First week of the month
last Last week of the month

FRSW(config)#clock summer-time CDT recurring 2 Sunday March 02:00 1 ?
DAY Weekday to end

FRSW(config)#clock summer-time CDT recurring 2 Sunday March 02:00 1 Sunday ?
MONTH Month to end

FRSW(config)#$r-time CDT recurring 2 Sunday March 02:00 1 Sunday November ?
hh:mm Time to end (hh:mm)

FRSW(config)#$ recurring 2 Sunday March 02:00 1 Sunday November 02:00 ?
<1-1440> Offset to add in minutes
<cr>

FRSW(config)#$ recurring 2 Sunday March 02:00 1 Sunday November 02:00

With that, “Warning” level logging messages are now being saved to the buffer, NTP is set as correctly as we can get it on this lab, and the timezone / Daylight Savings settings will ensure we have correct timestamps to base our logging events off of!
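If you want to sanity check which calendar dates the recurring rule above (“2 Sunday March” through “1 Sunday November”) works out to in a given year, here is a minimal Python sketch. It assumes the week number simply means the Nth occurrence of that weekday in the month, and the helper names are mine:

```python
from datetime import date, timedelta

SUNDAY = 6  # Python's date.weekday() counts Monday as 0 and Sunday as 6


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the date of the nth (1-based) given weekday in a month."""
    first = date(year, month, 1)
    days_ahead = (weekday - first.weekday()) % 7
    return first + timedelta(days=days_ahead + 7 * (n - 1))


def summer_time_window(year: int) -> tuple:
    """Start and end dates for '2 Sunday March' to '1 Sunday November'."""
    start = nth_weekday(year, 3, SUNDAY, 2)
    end = nth_weekday(year, 11, SUNDAY, 1)
    return start, end


start, end = summer_time_window(2024)
print(start, end)
# 2024-03-10 2024-11-03
```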

Using Debugging as an IOS troubleshooting tool

Debugging is especially useful when troubleshooting protocols, none of which I have running on this network device outside of frame-relay at the moment. However, while learning ROUTE / SWITCH exam topics and on the job, we’ve all probably had exposure to debugging a protocol that is suspected of being the network problem.

A sample of debugging on this Frame Switch is just “debug all”, which gives a dire warning, and this command should never be run on a production network device:

FRSW#debug all

This may severely impact network performance. Continue? (yes/[no]): yes
All possible debugging has been turned on
FRSW#
*Mar 1 01:37:57.211: Serial1/0: FR invalid/unexpected pak received on DLCI 560
FRSW#
*Mar 1 01:38:02.079: crm_send_periodic_update
*Mar 1 01:38:02.299: Serial1/0: FR invalid/unexpected pak received on DLCI 560
*Mar 1 01:38:02.631: NTP: xmit packet to 172.168.123.1:

Then of course use “undebug all” or “u all” to turn it back off. If you are using Telnet or SSH to remotely connect to the device, you will need to issue “terminal monitor” at the privileged EXEC prompt to see the output in your remote session (“logging monitor” at Global Config level only controls which level is sent to those sessions)!

Also, though it is outside the scope of the exam, in real world debugging practices you will in most cases want to use an ACL on a production network to narrow the debugging down to two host IP addresses and keep the output limited.

This will limit the possibility of pegging the CPU with output from “debug ip ospf” when there are 20 OSPF neighbors sending updates / traffic so fast that it could actually kick you off your Telnet / SSH session and keep you off it (and possibly overload the device, causing it to lock up and require a hard reboot to clear).

So for production network devices, always think carefully about the output that could be generated by a debug before issuing it on the device.

Many other CLI tools are available and will be continued in the next post!

I think this is a good place to stop, to keep the relatively well known information contained in one post. In the next post I will review more advanced troubleshooting tools such as FTP / SNMP / NetFlow / many more!