
10/22/2009

Good article: Storms RIP the Net

This is an informative account by Laura Chappell of investigating and repairing a traffic problem that was crippling a network. Nothing could stay connected long enough to do a "normal" packet capture.
She had them set up a quick packet capture outside the GUI, so they could get on and grab the capture before being bumped off.

tshark -c 100 -w gen1.pcap


The -c parameter indicates the number of packets to capture. The -w parameter is used to define the name of the trace file to create.

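In the 100 captured packets, the IP Identification field was the same in every packet. A sender normally changes that value from one packet to the next, so identical values meant the same packet was going around and around: a looping condition rather than some kind of denial of service from a single host.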
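A rough sketch of how you could check this from the command line instead of eyeballing the trace. This is not from the article. The helper name count_ip_ids is made up for illustration; it just tallies whatever values it gets on stdin, one per line. The tshark line in the comment uses the standard -r (read file), -T fields and -e (field name) options with the ip.id field, and assumes the gen1.pcap file from the capture above.

```shell
# Hypothetical helper: count how often each IP Identification value appears.
# Reads one value per line on stdin, prints "count value", most frequent first.
count_ip_ids() {
    awk 'NF { seen[$1]++ } END { for (id in seen) print seen[id], id }' | sort -rn
}

# Assumed usage against the trace file from the article (needs tshark installed):
#   tshark -r gen1.pcap -T fields -e ip.id | count_ip_ids
```

If one value accounts for nearly all of the packets, you are probably looking at the same packet looping. A healthy capture should show many different values with low counts.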

A switch loop is easy to create and often hard to troubleshoot, unless you are looking for this exact condition. And the opportunity to create a loop is often handed to the masses by the proliferation of workgroup switches, bought to avoid spending a couple hundred bucks on having another jack installed. ("Gee, here's an end of a cable coming out of a big tangle under my desk. It must need plugged in...")

Separating broadcast domains into several VLANs, like one per floor or some other logical separation, can limit the scope of a problem caused by a switch loop. At least only one VLAN will be down, and you have a narrower search area for the loop: check the log on one or two switches instead of 20-30.
