Case of an FQDN Issue

The phrase, "I can't access my shared drive" was intermittent, but becoming common for a remote location connected via an MPLS circuit. Without hesitation the finger was pointed at the network and my phone rang. People connect to shared drives everyday, but it is one of those things they take for granted. Behind the scenes there are many layers of technology, protocols, and devices working together to make those connections happen. I can’t count the number of ways to hinder performance of a network share or prevent it from working altogether. *SPOILER* If you’re like me and you like to know the big picture first, the problem in this case was DNS. Read on for the details, or just know a great test is to place a ‘.’ at the end of your DNS path. To kick things off I asked for a screen shot of the error. This is what I received: Obviously, this was not very helpful (are error messages ever?)....
Read More

Problems After Updates?

One of the most common problems any IT admin faces is a software update. While software updates are generally considered a good thing, because they patch security flaws, fix bugs, try to improve performance, and more, they are also a common source of problems. Every admin knows to be ready for calls after a scheduled maintenance window. This issue was no different.   A ticket came in stating users could not access a web app through the backend system after an upgrade to Java 1.7. The server, java, and app logs all looked ok and appeared to be running properly. Also, interestingly, the web app worked when accessed directly from a web browser. This sounded like a perfect opportunity for a quick packet capture and analysis. Here is what was produced:     *Note: In order to maintain the SSL session info I could not anonymize this, so I'm just using a screen shot instead of sharing the capture on CloudShark.   I've done quite a few...
Read More
Case of the Named Pipes

Case of the Named Pipes

Problem I have come to expect vague error messages that seemingly blame the network. This one is no different. Server Error in '/' Application. The network path was not found Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. Exception Details: System.ComponentModel.Win32Exception: The network path was not found Source Error: An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below. (The stack trace is long and doesn't actually provide root cause either, so I omitted it to keep this post brief.)   In this case, a web server was providing the error above after receiving a user request and attempting to connect to a backend database. Unless there are other dependencies, the problem is between the web server and database. Both ping tests and telnet tests to...
Read More
Case of the Tired Firewall

Case of the Tired Firewall

Have you ever had a nightmare where you are being chased and you can't just seem to run away fast enough? No? Well, maybe you've tried running through snow up to your knees or swimming while wearing jeans. All of those examples point to situations that feel like something isn't quite right. Cases where there could be better performance if only something was changed or improved. Sometimes this same thing happens to network devices. In this example, it's the case of the tired firewall. Problem It was a typical day in the office with users milling about drinking coffee, others happily working, and some furiously pecking at their keyboards with deadlines looming. For a small segment of users, though, things weren't normal. They were staring at their monitors with looks of puzzlement and confusion. Instead of working on their latest application update they were running tests with varying and unusual results. They found several symptoms, but were unable to pinpoint the root...
Read More

Case of the Rogue Server

Several hundred users lost network connectivity. They went down randomly, one by one, and over a short period of time. Some users had intermittent connectivity. All of the network devices were online and functional.  Users were roaming the halls and getting bored. This called for a packet capture, but with clients offline it had to be done on a network switch. In this instance, the capture was performed at the distribution switch on the layer 3 VLAN. It revealed clients frantically trying to connect but being rejected, dropped, and ignored. With everything down, something had to be done. The L3 VLAN was rebuilt and port security was removed. Nothing worked. An analyst at one point decided to clear the arp table. It helped momentarily, and then things fell back into disarray. That was the first real clue, though.   Viewing the ARP table showed the same MAC for several IPs including the client switch IPs. This MAC belonged to a device not managed by...
Read More

Case Study: Capture at Both Ends

While the general rule of thumb is to capture at the client, or at least start there, sometimes it's necessary to take captures at both ends of a connection. The client perspective will allow you to view the problem as it is seen from the client. The server perspective might show the same thing. Or, in some cases like this one, it will provide the reason for the problem. The problem was that a webpage wouldn't load. There were various errors and no real indication of the problem. While SSL was suspected, there was no proof. So, we started with a capture at the client. From the client, the capture reveals it sent a TLSv1.2 Client Hello as expected. However, it then abruptly ends with a FIN with no server hello. Versions and ciphers were compared in the settings, but everything matched. More data was required. Another capture point was setup on the server. Immediately, the problem was revealed. The server received a...
Read More