CCIE or Null!

My journey to CCIE!


Network Management a forgotten art? Part 2!



A while back I wrote a quick article about network management and the tools you can rely on. As you can imagine, network management is quite a broad topic, and definitely one I can talk about until I turn blue. So I figured I would throw out some other interesting considerations when dealing with network monitoring.

  1. The location of the NMS or more particularly the device that performs the device polling. When you get into larger networks and more robust network monitoring applications (Nagios & SolarWinds for example) you can expand into a distributed model. That distributed model sounds awesome at first but there could be some interesting caveats with this approach.
    1. The location of that distributed poller (did I say that twice?). Why is the location important, though? Well, any health statistics gathered by the poller will be reported in the NMS from the perspective of the poller. At that point you must ask yourself whether that information is valuable, and whether it provides the level of monitoring you are expecting to capture.
    2. Running a network monitoring system in a distributed model creates additional overhead and requires more upkeep simply to monitor and maintain the NMS itself. On top of verifying the health of the network/systems/services in your environment, you also need to account for the health of the distributed pollers.
  2. Are you putting an additional load on the monitored devices? Believe it or not, monitoring a device via SNMP can actually create issues. A device might be susceptible to a software caveat/bug when certain processes are monitored or polled. When you poll a device, it must process and respond to a number of different SNMP Get requests, and that work is handled from the process/control-plane perspective. You definitely want to make sure the device's other, more important functions and processes are not hindered while it handles SNMP.
  3. Consider the additional load of monitored services. In my first post I mentioned monitoring tools such as Syslog, NetFlow, and IP SLAs. Depending on the environment, it may be worth considering the amount of traffic those tools can generate. Sure, from a dozen devices it might not add up to much, but what if you had hundreds of devices? At that point, syslog messages from a few hundred devices might become a bit overwhelming. In some cases management-related traffic might need to be marked down in QoS policies to keep it from affecting the production traffic on your network.
  4. Polling intervals. This is an easy one to overlook but an extremely important factor to take into account. It may take some time to find that sweet spot, but:
    1. Poll a device too aggressively and you risk creating false alarms.
    2. Poll a device too infrequently and you risk missing important events in the network and/or network outages.
    3. A ‘very aggressive’ polling interval may also crash the NMS application itself. Remember, at the end of the day these are simple applications running some type of database on the back end, and they have limitations.
  5. Understand the data that the NMS is presenting to you. Anyone can open a web browser and look at a nice fancy graph, but what does that fancy graph really tell us? Let’s say your NMS calculates latency and response time. How does it do that? Is it polling a particular OID, or is it simply pinging the device and looking at the response time, and if so, how often does it issue that ping? The same can be asked of an interface utilization graph: is it simply grabbing the rxload/txload statistics from the specific interfaces at a set point, or are other tools like NetFlow taken into account? The more you know about how your NMS works, the easier and quicker it will be for you to interpret, analyze, and diagnose issues presented to you via your NMS.
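
To make the last point concrete, here is one common way a utilization figure can be derived: take the change in an interface octet counter between two polls and divide by what the link could have carried in that time. This is only a minimal sketch, not how any particular NMS does it. The function name `utilization_percent` and its parameters are hypothetical, and it assumes a 32-bit counter (such as `ifInOctets`) that wraps at most once between polls.

```python
def utilization_percent(octets_start, octets_end, interval_s, if_speed_bps,
                        counter_bits=32):
    """Estimate average link utilization between two counter samples.

    octets_start / octets_end: octet counter values from two successive polls.
    interval_s: seconds between the two polls (the polling interval).
    if_speed_bps: interface speed in bits per second.
    counter_bits: counter width; assumed 32 here, so one wrap is tolerated.
    """
    if interval_s <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and interface speed must be positive")
    modulus = 2 ** counter_bits
    # Modular subtraction gives the right delta even if the counter wrapped once.
    delta_octets = (octets_end - octets_start) % modulus
    bits_transferred = delta_octets * 8
    return 100.0 * bits_transferred / (interval_s * if_speed_bps)


# 37,500,000 octets in a 300-second poll on a 10 Mb/s link
print(utilization_percent(0, 37_500_000, 300, 10_000_000))  # 10.0
```

Note that the result is an average over the whole polling interval. A short burst that saturates the link for ten seconds barely moves a five-minute average, which is exactly why it pays to know what your graph is really measuring.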

Written by Stephen J. Occhiogrosso

June 19, 2014 at 7:51 AM

Who’s congesting my network?


I figured I would write a post concerning some features built into most Cisco routers nowadays that can be lifesavers in identifying network congestion and who/what is causing it.

The first feature I want to mention is NetFlow. This nifty little feature identifies network traffic by protocol and shows how much throughput each protocol is using, giving you a clear view of the traffic traveling your network. You configure it on a per-interface basis, then specify the address you want the NetFlow information sent to and the port you want it sent to. In this case the port is 2055, the default used by the SolarWinds NetFlow analyzer (a free tool).
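
To give a feel for what arrives at that address and port, here is a minimal sketch of the receiving side. It assumes the router exports NetFlow version 5 over UDP, and it only decodes the 24-byte packet header, not the flow records themselves. The names `parse_v5_header` and `listen` are hypothetical and are not part of any real analyzer.

```python
import socket
import struct

# NetFlow v5 packet header: version, count, sysUptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes total).
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_header(datagram):
    """Decode the header of a NetFlow v5 export datagram into a dict."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram too short for a NetFlow v5 header")
    (version, count, sys_uptime_ms, unix_secs, _unix_nsecs,
     flow_sequence, _engine_type, _engine_id, _sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version {version})")
    return {
        "version": version,
        "count": count,                  # number of flow records in this datagram
        "sys_uptime_ms": sys_uptime_ms,  # exporter uptime in milliseconds
        "unix_secs": unix_secs,          # exporter clock at export time
        "flow_sequence": flow_sequence,  # running total of flows exported so far
    }


def listen(bind_addr="0.0.0.0", port=2055):
    """Print one summary line per export datagram; runs until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        while True:
            data, (source_ip, _source_port) = sock.recvfrom(65535)
            try:
                header = parse_v5_header(data)
            except ValueError:
                continue  # ignore anything that is not NetFlow v5
            print(f"{source_ip}: {header['count']} flow records, "
                  f"sequence {header['flow_sequence']}")
```

Running `listen()` on the host the router exports to would print one line per datagram, which is a quick way to confirm that exports are actually arriving before you point a full analyzer at them.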

You can issue the sh ip cache flow command to see the output. While this output can be daunting at first, it is actually fairly simple to understand once you realize what each column signifies. A nice shortcut for analyzing NetFlow is to find a free tool that will do it for you.

There is more information displayed, but from this point it looks almost identical to the output of the sh ip flow top-talkers command discussed below. The important thing here is the breakdown of the major protocols.

The next really cool feature is called top talkers. After you configure it, you can quickly see which end devices on your network are taking up the most bandwidth.

The configuration is as follows:

A fairly straightforward configuration: first you enable top talkers and then configure the parameters you want. You can set top-talkers to sort by the number of bytes from each end device or by the number of packets. You can also configure the number of devices you want to see, anywhere from 1 to 200. I usually prefer to simply see the top 10 devices (well, 8 in this case).

You view the top talkers with the sh ip flow top-talkers command:

As you can see, the output is placed nicely in a few columns, identifying the source interface and IP address, the destination interface and IP address, the protocol number (Pr column), the source and destination ports (keep in mind these are in hex format and need to be converted to decimal), and lastly the number of bytes transferred in this case.
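
Since the port columns are printed in hex, a quick conversion helps when you are matching them against well-known port numbers. A minimal sketch follows; the helper name `hex_port_to_decimal` is hypothetical, and the sample values are illustrative rather than taken from real router output.

```python
def hex_port_to_decimal(hex_port):
    """Convert a hex port string from the flow output (e.g. '01BB') to decimal."""
    return int(hex_port, 16)


# Illustrative values only
for hex_port in ("0050", "01BB", "0035"):
    print(hex_port, "->", hex_port_to_decimal(hex_port))
# 0050 -> 80
# 01BB -> 443
# 0035 -> 53
```

So a flow whose destination port reads 01BB is HTTPS traffic on port 443, not something exotic.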

So whether someone has introduced a new program, or a user decides to try and download the entire internet, you should be able to easily identify it. Those two built-in features alone can help you troubleshoot any congestion your network experiences with your Cisco devices.

Written by Stephen J. Occhiogrosso

January 13, 2011 at 1:13 PM
