To begin, I’d like to start with a comparison between the Frame-Relay CIR (Committed Information Rate) provided by the FR provider and IP SLA.
The FR provider guarantees at least a certain uptime and bandwidth availability but makes no claims about over-utilization or perfect uptime, and having those minimums in place helps you plan your WAN usage and priorities.
IP SLA is based on that same concept of minimum guaranteed performance, but instead of an agreement between the company and the FR provider, it is generally between internal clients of the company and the network team, with those clients asking for priority for certain protocols.
Speaking of protocols, let’s take a look at the beginning configuration of “ip sla” and what kinds of protocols we can check that minimum performance for:
R1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R1(config)#ip sla ?
<1-2147483647> Entry Number
apm IP SLAs Application Performance Monitoring Configuration
group Group Configuration or Group Scheduling
key-chain Use MD5 authentication for IP SLAs Control Messages
logging Enable Syslog
low-memory Configure Low Water Memory Mark
reaction-configuration IP SLAs Reaction-Configuration
reaction-trigger IP SLAs Trigger Assignment
reset IP SLAs Reset
responder Enable IP SLAs Responder
restart Restart An Active Entry
schedule IP SLAs Entry Scheduling
slm Service Level Management
R1(config)#ip sla 5
R1(config-ip-sla)#
We will just go with an entry number for now, but I wanted to show the modifiers for “ip sla …” in case we run into them later. As you can see, entering the entry number drops me into “ip-sla” config mode.
So from that mode, let’s see what’s available for ip sla:
R1(config-ip-sla)#?
IP SLAs entry configuration commands:
dhcp DHCP Operation
dlsw DLSW Operation
dns DNS Query Operation
exit Exit Operation Configuration
frame-relay Frame-relay Operation
ftp FTP Operation
http HTTP Operation
icmp-echo ICMP Echo Operation
icmp-jitter ICMP Jitter Operation
path-echo Path Discovered ICMP Echo Operation
path-jitter Path Discovered ICMP Jitter Operation
slm SLM Operation
tcp-connect TCP Connect Operation
udp-echo UDP Echo Operation
udp-jitter UDP Jitter Operation
voip Voice Over IP Operation
R1(config-ip-sla)#
As can be seen, there is an operation for just about any protocol expected to come into or go out of your network, though what I myself generally see is “ip sla tracking”, which will be covered shortly in another post.
Now, several of these operations (udp-jitter, for example) require both a source and a responder, while a simple icmp-echo only needs a target that answers pings. Where a responder is involved, the source starts things off by sending control packets to the responder via UDP port 1967 in an attempt to create a control connection, similar to FTP. This connection is not the actual SLA test / operation, but an agreement between sender and responder on the rules of communication (much like a VPN tunnel build).
The information sent in this initial communication is the port number to listen on for the test and the time limits for the testing. Should the responder agree, it sends back a message indicating the decision and starts listening on that port.
If it does not agree, it responds with that, and that is the end of that.
Once agreed, we go from controlling to probing: the source sends test packets to the responder to see whether they are indeed echoed back, as well as how long the process takes. The responder adds timestamps to those test packets as they’re received and again as they’re sent back, so once they return, the sender knows the processing time of the responder as well as the overall round-trip time.
Comparing timestamps taken on two different devices (for one-way delay, say) assumes that both have their clocks synchronized using NTP, of course.
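To make the timestamp arithmetic a bit more concrete, here is a minimal Python sketch of how a sender could separate responder processing time from the round trip. It assumes the responder stamps each packet on arrival and again on departure; the class and function names are purely illustrative and are not part of IOS.

```python
from dataclasses import dataclass


@dataclass
class ProbeTimestamps:
    """Four timestamps (in milliseconds) for one probe packet. Hypothetical layout."""
    sent_by_source: float         # T1, taken on the source's clock
    received_by_responder: float  # T2, taken on the responder's clock
    sent_by_responder: float      # T3, taken on the responder's clock
    received_by_source: float     # T4, taken on the source's clock


def responder_processing_ms(ts: ProbeTimestamps) -> float:
    # T3 - T2: both stamps come from the responder, so its clock offset cancels out.
    return ts.sent_by_responder - ts.received_by_responder


def round_trip_ms(ts: ProbeTimestamps) -> float:
    # T4 - T1: both stamps come from the source, so its clock offset cancels out.
    return ts.received_by_source - ts.sent_by_source


def network_round_trip_ms(ts: ProbeTimestamps) -> float:
    # Round trip with the time spent inside the responder removed.
    return round_trip_ms(ts) - responder_processing_ms(ts)
```

Note that each subtraction only ever compares stamps from the same device, which is why the two clocks in this sketch do not need to agree with each other.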
So let’s continue the configuration of IP SLA on R1:
R1(config-ip-sla)#icmp-echo ?
Hostname or A.B.C.D Destination IP address or hostname, broadcast disallowed
R1(config-ip-sla)#icmp-echo 172.12.123.3 ?
source-interface Source Interface (ingress icmp packet interface)
source-ip Source Address
<cr>
R1(config-ip-sla)#icmp-echo 172.12.123.3
R1(config-ip-sla-echo)#
I’ve decided to go with “icmp-echo” down to my spoke at 172.12.123.3. Note the “broadcast disallowed” when specifying the host, as well as the two other options available, where you can set a different source interface or IP if needed for any reason (perhaps a logical interface such as a loopback, which won’t go down unless it is administratively shut down or the router itself goes down).
However, I just took the <cr> way out, which dropped me into configuration mode for ip-sla-echo, so let’s look at the commands we have here:
R1(config-ip-sla-echo)#?
IP SLAs echo Configuration Commands:
default Set a command to its defaults
exit Exit operation configuration
frequency Frequency of an operation
history History and Distribution Data
no Negate a command or set its defaults
owner Owner of Entry
request-data-size Request data size
tag User defined tag
threshold Operation threshold in milliseconds
timeout Timeout of an operation
tos Type Of Service
verify-data Verify data
vrf Configure IP SLAs for a VPN Routing/Forwarding instance
R1(config-ip-sla-echo)#frequency ?
<1-604800> Frequency in seconds (default 60)
R1(config-ip-sla-echo)#frequency 60
R1(config-ip-sla-echo)#
It is always good to use IOS help to see whether a value is in milliseconds, seconds, hours, etc. I set this to its default frequency of 60 seconds, so as of right now this SLA is configured and ready for use. It just needs to be “scheduled” to run!
Here is how to schedule an SLA from global configuration mode:
R1(config)#ip sla schedule 5 ?
ageout How long to keep this Entry when inactive
life Length of time to execute in seconds
recurring Probe to be scheduled automatically every day
start-time When to start this entry
<cr>
R1(config)#ip sla schedule 5 life ?
<0-2147483647> Life seconds (default 3600)
forever continue running forever
R1(config)#ip sla schedule 5 life forever ?
ageout How long to keep this Entry when inactive
recurring Probe to be scheduled automatically every day
start-time When to start this entry
<cr>
R1(config)#ip sla schedule 5 life forever
R1(config)#
One thing not to trip up on is typing “ip sla 5 schedule”, or you will get this:
R1(config)#ip sla 5 schedule ?
% Unrecognized command
R1(config)#ip sla 5 schedule
So remember, when scheduling, you first state that you are scheduling (“ip sla schedule”) and then call out the IP SLA entry number after it.
The schedule output above is pretty self-explanatory and I mostly included it for thoroughness, but note that I chose to set the “life” of this schedule to “forever”.
The only thing left from the above output that we STILL need is a start time!
R1(config)#ip sla schedule 5 start-time ?
after Start after a certain amount of time from now
hh:mm Start time (hh:mm)
hh:mm:ss Start time (hh:mm:ss)
now Start now
pending Start pending
R1(config)#ip sla schedule 5 start-time now
R1(config)#
So you can set it to start at a specific time in hours, minutes and seconds, after a certain amount of time that you set, or right now. “pending” is also an option, but it is used for much more complex SLA configurations that involve events to trigger the SLA, so we will stick to the “now” option.
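Since the keyword order is the easy part to get wrong, here is a small Python sketch that assembles a schedule line with “schedule” ahead of the entry number and “life” and “start-time” on one line (the help output above lists start-time as a valid follow-on after “life forever”). The function name and defaults are my own, the ranges are the ones IOS help printed, and the start-time value is passed through without validation.

```python
MAX_VALUE = 2147483647  # upper bound shown by the IOS help output


def schedule_command(entry, life="forever", start_time="now"):
    """Return one "ip sla schedule" line for the given entry (illustrative helper)."""
    if not 1 <= entry <= MAX_VALUE:
        raise ValueError("entry number must be between 1 and 2147483647")
    if life != "forever":
        life = int(life)
        if not 0 <= life <= MAX_VALUE:
            raise ValueError("life must be 0-2147483647 seconds or 'forever'")
    # "schedule" comes before the entry number, never after it.
    return f"ip sla schedule {entry} life {life} start-time {start_time}"


print(schedule_command(5))  # ip sla schedule 5 life forever start-time now
```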
To verify an SLA configuration, the command is “sh ip sla config”:
R1#sh ip sla config
IP SLAs Infrastructure Engine-II
Entry number: 5
Owner:
Tag:
Type of operation to perform: icmp-echo
Target address/Source address: 172.12.123.3/0.0.0.0
Operation timeout (milliseconds): 5000
Type Of Service parameters: 0x0
Vrf Name:
Request size (ARR data portion): 28
Verify data: No
Schedule:
Operation frequency (seconds): 60 (not considered if randomly scheduled)
Next Scheduled Start Time: Start Time already passed
Group Scheduled : FALSE
Randomly Scheduled : FALSE
Life (seconds): Forever
Entry Ageout (seconds): never
Recurring (Starting Everyday): FALSE
Status of entry (SNMP RowStatus): Active
Threshold (milliseconds): 5000
Distribution Statistics:
Number of statistic hours kept: 2
Number of statistic distribution buckets kept: 1
Statistic distribution interval (milliseconds): 20
History Statistics:
Number of history Lives kept: 0
Number of history Buckets kept: 15
History Filter Type: None
Enhanced History:
R1#
So here we see just about all the SLA information you’d ever need: the entry number, the type of operation, the destination and source (source is all 0’s because we didn’t change it during config), the frequency (60 seconds), the next scheduled start time (“already passed” means it is currently running), and Life (seconds): Forever!
So now that we have verified the configuration is correct, let’s verify the operation and see how that is going:
R1#sh ip sla statistics
IPSLAs operation id: 5
Latest RTT: 43 milliseconds
Latest operation start time: 12:42:24.904 UTC Mon Apr 17 2017
Latest operation return code: OK
Number of successes: 11
Number of failures: 0
Operation time to live: Forever
R1#
So our success vs. failure ratio is exactly what we want to see, and we also see the latest operation “start time” as well as the operation id (the entry number).
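If you collect this output with a script rather than by eye, the counters are easy to pull out. Below is a rough Python sketch that parses the three fields discussed above from text shaped like the output shown and computes the success ratio. The function names are made up for illustration, and the field labels are assumed to match this output exactly, which may not hold on every IOS version.

```python
import re

SAMPLE = """\
IPSLAs operation id: 5
Latest RTT: 43 milliseconds
Latest operation return code: OK
Number of successes: 11
Number of failures: 0
Operation time to live: Forever
"""


def parse_sla_statistics(output):
    """Extract return code and counters from statistics text (illustrative parser)."""
    patterns = {
        "return_code": r"Latest operation return code:\s*(\S+)",
        "successes": r"Number of successes:\s*(\d+)",
        "failures": r"Number of failures:\s*(\d+)",
    }
    parsed = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, output)
        if match is None:
            raise ValueError(f"field not found in output: {name}")
        parsed[name] = match.group(1)
    parsed["successes"] = int(parsed["successes"])
    parsed["failures"] = int(parsed["failures"])
    return parsed


def success_ratio(stats):
    """Fraction of operations that succeeded, or 0.0 if none have run yet."""
    total = stats["successes"] + stats["failures"]
    return stats["successes"] / total if total else 0.0


stats = parse_sla_statistics(SAMPLE)
print(stats["return_code"], success_ratio(stats))
```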
Now let’s say we wanted to change something with this SLA:
R1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R1(config)#ip sla 5
Entry already running and cannot be modified
(only can delete (no) and start over)
(check to see if the probe has finished exiting)
R1(config)#
We get a message from the router telling us to back off, because you cannot modify an SLA entry that is already running. The message even tells you that you can only delete it with “no” and start over (and to check whether the probe has finished exiting).
So you do have to delete it and start over by creating a new ip sla entry.
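Because every change means delete and recreate, it can help to keep the whole sequence in one place. Here is a short Python sketch that generates the command lines for rebuilding an icmp-echo entry, using only commands that appear earlier in this post. The function name is hypothetical, and the life and start-time values are fixed to the ones I used above.

```python
def rebuild_icmp_echo(entry, target, frequency=60):
    """Return the config lines to delete and recreate an icmp-echo entry (illustrative)."""
    return [
        f"no ip sla {entry}",          # a running entry cannot be edited, only removed
        f"ip sla {entry}",             # recreate the entry
        f"icmp-echo {target}",         # same operation type as before
        f"frequency {frequency}",      # seconds between operations
        "exit",                        # leave operation configuration mode
        f"ip sla schedule {entry} life forever start-time now",
    ]


for line in rebuild_icmp_echo(5, "172.12.123.3", frequency=30):
    print(line)
```

Keep in mind that removing the entry also throws away its success and failure counters, so the statistics start again from zero.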
Now to see what failure looks like, you can find me on LinkedIn, or we can go over to R3 and shut down that interface to see if R1 starts spitting out some error messages:
R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#int s0/2
R3(config-if)#shut
R3(config-if)#
Apr 17 12:50:31.176: %OSPF-5-ADJCHG: Process 1, Nbr 1.1.1.1 on Serial0/2 from FULL to DOWN, Neighbor Down: Interface down or detached
R3(config-if)#
ASR#1
[Resuming connection 1 to r1 … ]
R1(config)#
I left it cooking for a minute or two and saw no console messages, so I issued a “sh ip sla stat” to see if it was showing a pile-up of errors:
R1#sh ip sla stat
Round Trip Time (RTT) for Index 5
Latest RTT: NoConnection/Busy/Timeout
Latest operation start time: 12:52:24.905 UTC Mon Apr 17 2017
Latest operation return code: Timeout
Number of successes: 19
Number of failures: 2
Operation time to live: Forever
R1#
Sure enough it is. So you won’t get a console message telling you there is an issue; you will need to use “sh ip sla stat” to find errors (or they may also end up in “sh log” if logging is configured properly).
Checking again a couple of minutes later:
R1#sh ip sla stat
Round Trip Time (RTT) for Index 5
Latest RTT: 44 milliseconds
Latest operation start time: 12:54:24.905 UTC Mon Apr 17 2017
Latest operation return code: OK
Number of successes: 20
Number of failures: 3
Operation time to live: Forever
R1#
So while I was over on R3 bringing the interface back up we got another failure, but then the successes began to increment again, so we are good to go there!
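Since the router stays quiet on the console, the practical way to spot trouble is to compare the counters between two checks. Here is a tiny Python sketch of that comparison using the numbers from the last two outputs. The function names and the dictionary layout are my own invention, and it assumes the entry was not deleted and recreated between checks (which would reset the counters).

```python
def counter_delta(previous, current):
    """How many successes and failures were added between two snapshots."""
    return {
        "new_successes": current["successes"] - previous["successes"],
        "new_failures": current["failures"] - previous["failures"],
    }


def has_new_failures(previous, current):
    """True if the failure counter moved since the previous snapshot."""
    return counter_delta(previous, current)["new_failures"] > 0


earlier = {"successes": 19, "failures": 2}  # snapshot while R3's interface was down
later = {"successes": 20, "failures": 3}    # snapshot after it came back up
print(counter_delta(earlier, later))
```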
That concludes this IP SLA overview. Next up is Tracking with SLA, and I cannot wait! 🙂