Case of the Named Pipes

Case of the Named Pipes

Problem I have come to expect vague error messages that seemingly blame the network. This one is no different. Server Error in '/' Application. The network path was not found Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. Exception Details: System.ComponentModel.Win32Exception: The network path was not found Source Error: An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below. (The stack trace is long and doesn't actually provide root cause either, so I omitted it to keep this post brief.) In this case, a web server was providing the error above after receiving a user request and attempting to connect to a backend database. Unless there are other dependencies, the problem is between the web server and database. Both ping tests and telnet tests to...
Read More

Case of the Rogue Server

Several hundred users lost network connectivity. They went down randomly, one by one, and over a short period of time. Some users had intermittent connectivity. All of the network devices were online and functional.  Users were roaming the halls and getting bored. This called for a packet capture, but with clients offline it had to be done on a network switch. In this instance, the capture was performed at the distribution switch on the layer 3 VLAN. It revealed clients frantically trying to connect but being rejected, dropped, and ignored. With everything down, something had to be done. The L3 VLAN was rebuilt and port security was removed. Nothing worked. An analyst at one point decided to clear the arp table. It helped momentarily, and then things fell back into disarray. That was the first real clue, though.   Viewing the ARP table showed the same MAC for several IPs including the client switch IPs. This MAC belonged to a device not managed by...
Read More

Case Study: Capture at Both Ends

While the general rule of thumb is to capture at the client, or at least start there, sometimes it's necessary to take captures at both ends of a connection. The client perspective will allow you to view the problem as it is seen from the client. The server perspective might show the same thing. Or, in some cases like this one, it will provide the reason for the problem. The problem was that a webpage wouldn't load. There were various errors and no real indication of the problem. While SSL was suspected, there was no proof. So, we started with a capture at the client. From the client, the capture reveals it sent a TLSv1.2 Client Hello as expected. However, it then abruptly ends with a FIN with no server hello. Versions and ciphers were compared in the settings, but everything matched. More data was required. Another capture point was setup on the server. Immediately, the problem was revealed. The server received a...
Read More
Capturing Light

Capturing Light

How do you go about catching the one of the fastest things known to man (light) at a specific point in time with pinpoint accuracy over and over again? With a little patience and your network card, of course! This post is an introduction to the process of capturing network traffic (aka "sniffing" or "tracing"). With most of my blog being dedicated to network performance analysis, a post like this is foundational, and will help you understand the basics moving forward if you are new to "sniffing the wire". Networks range in size from small home networks with 1 PC to global corporations with many users and data centers, to the largest network (or collection of networks depending upon your definition) of them all, the Internet. While size, speed, design, and costs may vary, they all use the same signals to communicate. These signals may take the form of electricity (ethernet cables), light (fiber), and radio waves (wireless) across varying mediums. These...
Read More
Classic Performance Example

Classic Performance Example

*Disclaimer: all captures in this post were anonymized using TraceWrangler. I was recently asked to help with a performance issue. I was informed a transfer was going to take weeks instead of a couple days as expected. The transfer rate was getting 80Mbps throughput max on a 10Gbps connection. So, I setup captures at both ends and got to work. This is just a quick summary of that work with the classic tell-tale signs of a performance problem. The first thing I noticed were 30 zero window segments in a matter of seconds in the "Expert Information" window. One or two might be tolerable under normal circumstances, but 30 is something of interest. Small TCP window sizes and zero window packets generally mean there is a problem with one of the end devices. This grabbed my attention, so I moved back to the packet list. When looking at the Zero Window packets I noticed delays anywhere from 100 to 300 ms before the...
Read More