I know I’ve been quiet on the blog this month, but that’s not because I’ve forgotten about this site! For the month of September I’m spending some extra time over at Thwack, the SolarWinds community where everyone likes to get together and talk about network management. I’ve talked about Network Management quite a bit on this blog in broad terms, and during September at Thwack I get to focus on VoIP Network Management.
That announcement aside, the Thwack community is great! I’ve been a member of the community for many years (unfortunately under a few different accounts until I finally settled on a single one). Especially if you are a SolarWinds user, here are a few highlights of the community:
- Content Exchange – There is a good chance that if you are looking for a particular report or custom poller, someone else has already thought of it too, and there’s also a chance it’s available in the Content Exchange, ready to be downloaded and imported into your SolarWinds platform.
- Feature Request – Got an idea that will make a SolarWinds product easier or better? Enter it in this section; the more users that agree with your idea, the better the chance it gets implemented.
- Feel free to ‘+1’ my request for better & more streamlined monitoring of F5 Virtual Servers :-)
- Great discussions & news updates from SolarWinds – There are many forums on Thwack that focus on different areas of network and systems management, with many great discussions going on. SolarWinds is also great at keeping their users in the know regarding updates and possible upcoming features.
So far my posts have been about the following:
- How do you prepare for a VoIP deployment? What items are on your VoIP readiness checklist?
- I also took some extra time to consolidate many of the great recommendations into a single document found here.
- What tools do you use to troubleshoot VoIP related issues?
Come join the discussions!
VN-Tag was typically a technology only seen in the Data Center (when using Nexus 2000 Series FEXs), however that has started to change. If you check out Cisco’s new 6800 Series Catalyst switches you’ll see they are now pushing a new ‘Instant Access’ model. This model allows us to deploy ‘dumb’ switches that are centrally managed by the main 6800 chassis, essentially making those switches act as external linecards. Now, in the Data Center this is nothing new; it sounds just like the Nexus 2000 Series Fabric Extenders (FEXs), right? Well, deep down they are even more similar than you think. VN-Tag is the technology used between a Nexus 2k FEX and its upstream parent switch, and that same VN-Tag is also utilized by the Cisco 6800 Series with its downstream external switches.
The VN-Tag technology itself simply adds an additional header to the frame as it traverses between the ‘Instant Access’ switch (or FEX) and its parent switch, where all the switching occurs. It is important to call out that the ‘Instant Access’ switches do NOT do any local switching (once again, just like a Nexus 2k FEX); all traffic is forwarded up to the 6500 or 6800 chassis, where each packet gets switched or routed accordingly.
Let’s take a quick look at the VN-Tag information itself:
Surprisingly, it’s nothing more than an additional 6 bytes. The fields are as follows:
- EtherType [16-bits] = 0x8926
- Destination Bit [1-bit] – Indicates which direction the frame is flowing.
- Pointer bit [1-bit] – Set when the destination VIF field points to a list of downstream ports rather than a single port (used for multicast/broadcast frames).
- Destination VIF – [14-bits] – Identifies the destination port.
- Looped [1-bit] – Marks a frame that is being forwarded back down the link it arrived on, meant to identify multicast frames so they are not delivered back to the vNIC they originated from.
- Reserved [2-bits] – For future use.
- Source VIF [12-bits] – The vif_id of the source downstream port.
- Version [2-bits] – Set to 0
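To make that layout a little more concrete, here is a minimal Python sketch that pulls the direction/pointer bits and the two VIFs out of a captured tag. The function name and exact bit positions are my own illustration based on the field list above, not production parsing code:

```python
import struct

VNTAG_ETHERTYPE = 0x8926

def parse_vntag(tag: bytes) -> dict:
    """Split a 6-byte VN-Tag into its fields.

    Bit positions follow the field list above: the direction (d) and
    pointer (p) bits lead the first 16-bit word with the destination
    VIF below them, and the source VIF sits in the low 12 bits of the
    second word.
    """
    ethertype, word1, word2 = struct.unpack("!HHH", tag)
    if ethertype != VNTAG_ETHERTYPE:
        raise ValueError("not a VN-Tag")
    return {
        "d": (word1 >> 15) & 0x1,   # direction bit
        "p": (word1 >> 14) & 0x1,   # pointer bit
        "dst_vif": word1 & 0x3FFF,  # 14-bit destination VIF
        "src_vif": word2 & 0x0FFF,  # 12-bit source VIF
    }

# A frame heading downstream to VIF 42, sourced from VIF 7
tag = struct.pack("!HHH", VNTAG_ETHERTYPE, (1 << 15) | 42, 7)
print(parse_vntag(tag))  # {'d': 1, 'p': 0, 'dst_vif': 42, 'src_vif': 7}
```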
Expanding VN-Tag to products outside of the Nexus line is definitely a great move, and it has great use-cases in many large enterprises that want a more consolidated approach to management; after all, this is an easy way to cut down on your Layer-2 spanning-tree topology. You can also cut down on configuration/implementation time, since the technology allows you to pre-provision the ports on an Instant Access switch before it is actually connected to the chassis.
P.S. - This was a blog post I had started middle/late last year when the 6800 & Instant Access were first announced. I finally got around to finishing it; better late than never, right?
FC vs FCoE infrastructure is a common debate when designing the network infrastructure of a new Data Center, or a new part of one; after all, the advantages of running a converged storage & IP network are hard to turn down. Many of us are probably already aware of why FCoE is a strong option versus the traditional dedicated FC design, but I wanted to point out one interesting fact that makes FCoE more efficient, and that fact stems from the Ethernet transport itself.
When FC frames are sent over a Fibre Channel network they are placed onto the physical medium with 8b/10b encoding, which carries a 25% overhead to ensure the signal arrives successfully and intact, with no corruption. (Essentially, for every 8 bits of data, 2 extra bits are used to ensure the signal is still intact; think of this as a CRC or FCS for the electrical signal.)
When we consider Ethernet, it is sent over the wire with 64b/66b encoding, only about a 3% overhead as the information hits the physical medium. So while FCoE may encapsulate the already large FC frame with additional headers & trailers, from a physical-layer perspective FCoE is a much more efficient means of transmission. (For every 64 bits sent to the wire, 2 bits are used for this ‘error checking’, a much better ratio than 8b/10b.)
The 8b/10b encoding is used for 1, 2, 4, & 8 Gb Fibre Channel; the newer 10+ Gb Fibre Channel technologies rely on 64b/66b encoding, which may tip the scales back. However, the converged infrastructure & cabling still make FCoE the better option for most environments from a cost and management perspective.
This is also an aspect that differentiates 100Mbps, 1Gbps, and 10Gbps Ethernet:
- 100Mb Ethernet – 4b/5b encoding
- 1Gb Ethernet – 8b/10b encoding
- 10Gb Ethernet – 64b/66b encoding
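The overhead figures above are easy to sanity-check with a little arithmetic. A quick Python sketch (function names are mine):

```python
def encoding_overhead(data_bits: int, line_bits: int) -> float:
    """Extra line bits as a percentage of the data bits carried."""
    return (line_bits - data_bits) / data_bits * 100

def line_efficiency(data_bits: int, line_bits: int) -> float:
    """Share of the raw line rate that carries actual data."""
    return data_bits / line_bits * 100

# 8b/10b (1/2/4/8G FC, 1G Ethernet): 2 extra bits per 8 data bits
print(encoding_overhead(8, 10))   # 25.0 -> the 25% figure above
# 64b/66b (10G Ethernet, 10G+ FC): 2 extra bits per 64 data bits
print(encoding_overhead(64, 66))  # 3.125 -> the ~3% figure above

# Effective data rate of an 8G FC link vs a 10G FCoE link (Gb/s)
print(8 * line_efficiency(8, 10) / 100)     # 6.4
print(10 * line_efficiency(64, 66) / 100)   # ~9.7
```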
How’s that for a topic out of left field, never before seen on my blog: VMware! Well, I got wind that VMworld sessions are available to watch for free, so I just wanted to spread the word. All you need to do is create an account; you can find the sessions here. I’ll say it’s nice to see vendors continue this trend; Cisco is already doing it with their CiscoLive365 site.
Looks like the Cisco Certification team has been busy lately: earlier this year the CCNP: Security track got an update, and an update to CCNP: Route/Switch was just announced. Before you get too worried, if you are currently studying for the current exams you have until January 2015 before they are retired, so you still have plenty of time to study.
To highlight a few of the changes:
Route v2 300-101:
- Many more IPv6-related topics
- The introduction of DMVPN
- CEF Concepts
- Various security technologies
Switch v2 300-115
- Stackwise technologies
- Removal of VoIP, Video, & Wireless topics
- L2 Security technologies
TShoot v2 300-135
- Mixture of the new Routev2 & Switchv2 Technologies.
Looks like the newer CCNP: Route/Switch objectives are really going to focus on routing & switching technologies and less on other networking technologies. These new objectives also line up closer with the new CCIEv5.
Now, I took the older CCNP exams (the old ONT, ISCW, BCMSN, & BSCI), but it is interesting to see how these exams grow and evolve over time. I will definitely say I am surprised to see the removal of Wireless, Voice, & especially QoS from the CCNP: R/S exams. While I understand that CCNP: R/S should focus on, well, Routing & Switching, I also think it is important for engineers to know these other technologies, especially QoS.
What do you guys think?
Reloading a router can sometimes feel like an eternity; usually when you issue a reload you can step away, get a cup of coffee, sit back down, and the router should be just about ready. For quite a while now (since before 12.4, which let’s admit is a long time ago) we’ve had the ability to ‘warm’ boot Cisco routers and cut the reload time in half! The first question we’ll probably ask is: what exactly does the router do differently to decrease the time it takes to reload? Well, when a router boots, what is one of the first things you see it doing? Usually you see it reading the flash card, grabbing the IOS image and decompressing it into RAM. When you enable the warm reload feature, the router skips these steps because it keeps that information in a reserved portion of memory.
This was the best graphic I could find that provides a visual explanation of the process, a PowerPoint from Cisco in 2004:
What’s even better about this feature is how easy it is to configure; it is enabled with a single command:
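Per the Cisco IOS documentation, the feature is turned on in global configuration; the count and uptime arguments are optional, and the exact syntax may vary by release:

```
Router(config)# warm-reboot count 5 uptime 10
```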
There are a few caveats you want to keep in mind when configuring this command:
- As mentioned, after the initial configuration of warm-reboot you need to initiate one more reload of the router before the feature actually becomes active.
- The count is the number of times you can perform a warm reload before you must perform another cold boot.
- uptime is the amount of time the router must be online in between warm reloads; this means you can’t just sit back and continuously warm reload a router as soon as it becomes available.
- When you want to perform a warm reload you must specify the keyword warm after the reload command. (see example below)
You can verify the configuration with show warm-reboot:
Performing a warm reload:
This was the initial reload:
This was a warm reload:
Now, what I think is the best part of this feature is that we can use warm reloads to stage IOS upgrades, because let’s face it, how often do we decide to just randomly reboot Cisco routers! This process works in a similar fashion: the router will actually load & decompress the new image prior to going down, saving time.
This is done with the following command:
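Per the IOS documentation, the staging variant simply adds the file keyword to the warm reload (the image name below is just a placeholder):

```
Router# reload warm file flash:c1841-advipservicesk9-mz.bin
```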
Unfortunately I was unable to successfully test this; looks like my trusty old 1841 just couldn’t do it with IOS v15.
Now, if only we could get this feature in NX-OS. When you reboot a Nexus you can forget about the coffee: hop in the car, get a pour-over from Starbucks, come back, and you might still be waiting.
A while back I wrote a quick article about network management and the tools you can rely on. As you can imagine, network management is quite a broad topic and definitely one I can talk about until I turn blue. So I figured I would throw out some other interesting considerations when dealing with Network Monitoring.
- The location of the NMS or more particularly the device that performs the device polling. When you get into larger networks and more robust network monitoring applications (Nagios & SolarWinds for example) you can expand into a distributed model. That distributed model sounds awesome at first but there could be some interesting caveats with this approach.
- The location of that distributed poller (did I say that twice?). Why is the location important, though? Well, any health statistics gathered by the poller will be reported in the NMS from the perspective of that poller. At that point you must ask yourself: is that information valuable, and does it provide the level of monitoring you are expecting to capture?
- Running a networking monitoring system in a distributed model creates additional overhead and requires more upkeep to simply monitor and maintain the NMS itself. On top of verifying the health of the network/systems/services in your environment you also need to account for the health of the distributed pollers as well.
- Are you putting an additional load on the monitored devices? Believe it or not, monitoring a device via SNMP can actually create issues. A device might be susceptible to a software caveat/bug when certain processes are monitored or polled. When you poll a device, it must process and respond to a number of different SNMP Get requests, so you want to make sure its more important functions/processes are not hindered while it handles that SNMP work from the process/control-plane perspective.
- Consider the additional load of monitored services. In my first post I mentioned monitoring tools such as Syslog, NetFlow & IP SLA. Depending on the environment it may be worth considering the amount of traffic those tools can generate: sure, from a dozen devices it might not add up to much, but syslog messages from a few hundred devices can become a bit overwhelming. In some cases management-related traffic might need to be marked down in QoS policies to avoid it affecting the production traffic of your network.
- Polling intervals: this is an easy one to overlook but an extremely important factor to take into account. It may take some time to find the sweet spot, but:
- Poll a device too aggressively and you risk creating false alarms.
- Poll a device too infrequently and you risk missing important events in the network and/or network outages.
- A ‘very aggressive’ polling interval may also crash the NMS application itself, remember at the end of the day these are simple applications running some type of database on the back-end with limitations.
- Understand the data that the NMS is presenting to you. Anyone can open a web browser and look at a nice fancy graph, but what does that fancy graph really tell us? Let’s say your NMS calculates latency and response time; how does it do that? Is it polling a particular OID, or is it simply pinging the device and looking at the response time, and if so, how often does it issue that ping? The same can be asked of an interface utilization graph: is it simply grabbing the RxLoad/TxLoad statistics from the specific interface at set intervals, or are other tools like NetFlow taken into account? The more you know about how your NMS works, the easier and quicker it will be for you to interpret, analyze, and diagnose the issues presented to you via your NMS.
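As an illustration of that last point, here is roughly the math an NMS does when it builds an interface utilization graph from two SNMP counter samples. This is a hypothetical sketch; the function name and the numbers are mine:

```python
def utilization_pct(octets_t1: int, octets_t2: int, seconds: float,
                    speed_bps: int, counter_max: int = 2**64) -> float:
    """Interface utilization between two SNMP octet-counter samples.

    Defaults to a 64-bit counter (ifHCInOctets); a legacy 32-bit
    ifInOctets counter would use counter_max=2**32. The modulo handles
    a single counter wrap between samples.
    """
    delta = (octets_t2 - octets_t1) % counter_max
    return delta * 8 / seconds / speed_bps * 100

# Two samples taken 300 seconds apart on a 1 Gbps interface:
# 7.5 GB moved in 5 minutes -> 200 Mbps average -> 20% utilization
print(utilization_pct(1_000_000_000, 8_500_000_000, 300, 1_000_000_000))
```

Notice the answer is an average over the whole polling interval: a 30-second burst at line rate inside that 5-minute window would barely move the graph, which is exactly why it pays to know how often your NMS samples.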