Just because you can do something doesn’t always mean you should. One example is using large HTTP headers. While the HTTP specification itself doesn’t set a size limit, most web servers have default limits around 8 KB. Other devices in the path, such as firewalls/WAFs, proxies, and load balancers, have similar limits.
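To get a rough feel for how a request compares with a limit like this, the header block can be measured before it is sent. The Python sketch below is purely illustrative: the function names, the `X-Test` header, and the 8 KB default are assumptions, and real devices differ on whether a limit applies to a single header line or to the whole header block.

```python
from typing import Mapping

# Assumed default for illustration; check each device's own documentation.
DEFAULT_LIMIT_BYTES = 8 * 1024


def header_block_size(headers: Mapping[str, str]) -> int:
    """Return the bytes the headers occupy on the wire in HTTP/1.1.

    Each field is serialized as 'Name: value' followed by CRLF.
    """
    return sum(
        len(f"{name}: {value}\r\n".encode("latin-1"))
        for name, value in headers.items()
    )


def exceeds_limit(headers: Mapping[str, str],
                  limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if the serialized headers are larger than limit_bytes."""
    return header_block_size(headers) > limit_bytes


if __name__ == "__main__":
    # Hypothetical header padded with 46,000 letter "B"s.
    test_headers = {"X-Test": "B" * 46_000}
    print(header_block_size(test_headers), exceeds_limit(test_headers))
```

A check like this only estimates the request side. It says nothing about which device in the path enforces which limit.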
In this case, the application testers were receiving connection reset errors, yet their application and web server logs did not show any problems.
The first question asked was, “If the web server isn’t sending the reset error, what is?” We found there were several devices in the path, including a domain firewall and a load balancer. The firewall admin saw two-way traffic hitting an accept rule and passing through. That left the load balancer. The load balancer admin confirmed via a packet capture that it was, in fact, sending a reset near the end of the TCP stream.
Why would the load balancer send a reset?
A load balancer does exactly that: it balances the request load. So why would it care about a request near the end of a stream and reset it? To answer that question, here is an anonymized version of the packet capture.
Thread shown from CloudShark. You can see about 46 thousand letter “B”s in the HTTP header (though it looks a bit different here due to the process of anonymizing it).
This is a bit different from what I typically see, so it caught my attention. A quick search of the load balancer’s documentation showed that it has a system-wide default header limit of 4 KB. The users and this test were sending 46+ KB headers!
The solution seemed simple: raise the header limit. Fortunately, there was a QA instance that could be adjusted and tested without any outages. The header limit was raised to 50 KB for testing (above the 46 KB test header). Surprisingly, the reset continued.
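A test like this can be repeated with a short script instead of the full application. The sketch below is a rough Python approximation, not the tool the testers used: the `X-Padding` header name, the function names, and the host are hypothetical, and it speaks plain HTTP only, so a TLS endpoint would need the socket wrapped first.

```python
import socket


def build_request(host: str, path: str = "/", header_bytes: int = 46_000,
                  header_name: str = "X-Padding") -> bytes:
    """Build a raw HTTP/1.1 GET with one padding header of header_bytes 'B's."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"{header_name}: {'B' * header_bytes}",
        "Connection: close",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(host: str, port: int = 80, header_bytes: int = 46_000,
          timeout: float = 5.0) -> str:
    """Send the oversized request and report what came back.

    Returns 'reset', 'timeout', 'closed', or the HTTP status line.
    """
    request = build_request(host, header_bytes=header_bytes)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            reply = sock.recv(4096)
    except (ConnectionResetError, BrokenPipeError):
        return "reset"
    except socket.timeout:
        return "timeout"
    if not reply:
        return "closed"
    return reply.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `probe("qa.example.test", header_bytes=2_000)` and then again with `header_bytes=46_000` (the host name is a placeholder) gives a quick before-and-after comparison when a limit is changed. A result of `"reset"` only says that something in the path reset the connection. A packet capture is still needed to tell which device sent it.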
The Solution Take 2
After returning to the load balancer documentation, we also discovered a default packet count limit of 16 packets, but this couldn’t be adjusted. After a little head scratching, we decided to disable and re-enable the WAF services on the load balancer. Apparently, the WAF didn’t cooperate after the header limit was changed and needed a little kick. The resets stopped and the application functioned properly.
Here is the link to the stream in CloudShark with my comments.
There are good reasons for header limits, from performance issues to security concerns. Limits such as these should be raised with caution. In this case, the app owner is working to change the way requests are submitted in order to reduce the header size in the near future.
Traces Sanitized by TraceWrangler v0.3.6 build 583