This first post will give you all the inputs needed to understand how the HTTP protocol works and how performance can be terrible due to poor application coding, configuration issues on the client or server side, and/or latency on network links. We will go through different steps in order to understand the complete stack and the options available to developers, architects and the operational teams in charge of running the service.
Bear in mind that we are going to look here at optimizing the performance of applications that are laggy because the HTTP protocol is chatty and degrades performance on low speed or high latency links. If your application itself is performing like crap (e.g. poor database queries lasting hours), this topic won't help make it better :-)
0. Understanding HTTP:
Without going into the details, HTTP is a standard protocol that was defined to standardize exchanges on the web. It is pretty straightforward and independent of the web server and client you are using. It always starts very simply: the client queries the server to download an HTML file. In the URL, you will see all sorts of extensions on the file which is loaded (aspx, php, html, jar...), but these are simply scripts or compiled code that produce HTML content.
Once the client has downloaded the HTML file, that file contains a list of tags which define the web page structure but also all the media to load. As your browser parses the HTML file, it triggers more requests to the web server, this time to download media (javascript files, css style sheets, images, video resources, sounds...).
So to summarize, this is pretty simple... You always start with one file which defines the page structure and all the associated media. When your browser receives the HTML file, it parses it and starts loading all the resources defined in it...
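To make this flow concrete, here is a sketch of that very first exchange as it looks on the wire (the host name, file name and body are made up for illustration): one GET request for the HTML file, one response carrying it back.

```
GET /index.html HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Content-Type: text/html

<html> ... page structure and references to media ... </html>
```

Every single resource on the page (each image, script, style sheet) costs one such request/response pair, which is exactly why latency matters so much in the rest of this post.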
1. A simple test case:
The easiest way to illustrate and understand the basic behaviour of a web client communicating with its server is to build a very basic web page and load it. In order to emphasize the issues, I created a 1.5 MB PNG image and duplicated it 20 times (a0 to a9 and b0 to b9) so that the page loads slowly enough to illustrate the breakdown of the activity.
Let's start with the code, which is very basic. As you can see below, this HTML file simply lists the 20 PNG images:
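A minimal sketch of what such a page looks like (the title and exact markup are illustrative; only the 20 img tags matter, with file names following the naming described above):

```html
<html>
  <head>
    <title>test</title>
  </head>
  <body>
    <!-- 20 copies of the same 1.5 MB PNG image,
         named a0.png to a9.png and b0.png to b9.png -->
    <img src="a0.png"/>
    <img src="a1.png"/>
    <!-- ... a2.png through a9.png, then b0.png through b8.png ... -->
    <img src="b9.png"/>
  </body>
</html>
```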
And here is an overview of the page as shown in the browser.
To take very basic measurements from Firefox and Internet Explorer, I used a freeware tool: HttpWatch (be aware you will need another tool for similar measurements from Chrome). This tool simply displays graphically the loading of a web page by your browser. When loading the page above, you get the image below:
Now, what do we see here?
- The very first item loaded from the site is test.html (as shown in the first line above). Loading this page is very quick since the HTML file is very small (724 bytes). Note that nothing else happens while test.html is being loaded. This is simply because the browser needs to load the HTML file and parse it to know which other files it has to fetch from the web server (the HTML file defines the page structure in addition to all the resources to load, such as javascript files, css style sheets, images...).
- The next part is the interesting one: you would expect that, as soon as the HTML file is parsed by your browser, all images would be loaded together. This is not the case. Worse, it is never the case. Indeed, a set of restrictions on servers and browsers limits the maximum number of parallel transfers between a client and a server. I used Firefox 3 for the test above, which has a limit of 6 connections per server (controlled by the network.http.max-persistent-connections-per-server setting, as explained here).
Now, why do these limitations exist? This is more of a "gentlemen's agreement" on the Internet. Indeed, if users completely removed this limit, they would very quickly hit the maximum number of connections on the web servers and penalize the service for other users. Be aware that increasing this value above 10 connections per server risks getting your IP blacklisted... so make sure you play with this in private. Finally, RFC 2616, which defines the HTTP 1.1 standard, initially stated that only 2 persistent connections per server should be allowed... but this was in the old days of the internet, when pages were lightweight and didn't embed so much media...
If we go a little further in the analysis, we can note that the screenshot above shows an HTTP 200 return code for each GET request, which corresponds in HTTP to an "OK" (meaning you successfully executed the GET request and downloaded the file from the server). Note that if we reload the page, we get another HTTP code:

This time, you can notice the loading time is way faster. This is simply because your browser has a local cache of the files. Still, you will notice the browser queries the server to check whether each locally cached file still matches the file stored on the web server; the server answers these checks with a 304 "Not Modified" code instead of resending the content. The impact is not noticeable here because the web server is local, but imagine a web server at 200 ms round-trip latency from you (e.g. a server in Asia contacted from a client in Europe): since you can only run 6 queries in parallel, you need 4 rounds of queries (6 + 6 + 6 + 2) to revalidate all 20 files, which adds 800 milliseconds to the loading time of your page... If this is not clear at this stage, don't worry, we will look at it in detail later in this post.
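To make this revalidation concrete, here is a sketch of one of those cache-check exchanges (the host name and dates are made up): the browser sends the date of its cached copy, and the server answers with a tiny 304 response instead of resending the 1.5 MB image.

```
GET /a0.png HTTP/1.1
Host: www.example.com
If-Modified-Since: Mon, 05 Mar 2012 10:00:00 GMT

HTTP/1.1 304 Not Modified
```

The response is only a few bytes, but each of these exchanges still costs one full round trip to the server, which is where the 800 milliseconds above come from.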
2. Fixing the performance issue through client configuration:
There are several places you can work on in order to solve the issue. The ones presented in this section are configuration changes only. Note that while this will enhance performance, it can be combined with the solutions proposed later in this document to really boost the end user experience...
The first obvious place to act is the browser, by increasing its maximum number of connections per server:
- In Firefox, type about:config in the URL bar and change the value of the network.http.max-persistent-connections-per-server setting

- In Internet Explorer, the value is changed in the Operating System registry. Click Start => Run => Regedit, navigate to the key HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings and change the values MaxConnectionsPer1_0Server and MaxConnectionsPerServer (a sample .reg file is shown after this list).
- In Chrome, this setting seems to be hard-coded and tied to the user profile... meaning fine tuning for a single end user (hence a single profile) will be tricky...
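For the Internet Explorer change mentioned in the list above, the two registry values can also be deployed through a small .reg file. A minimal sketch, with both limits set to 10 (the threshold mentioned earlier):

```
Windows Registry Editor Version 5.00

; Raise Internet Explorer's parallel connection limits to 10 (0x0000000a):
; one value applies to HTTP/1.1 servers, the other to older HTTP/1.0 servers
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]
"MaxConnectionsPerServer"=dword:0000000a
"MaxConnectionsPer1_0Server"=dword:0000000a
```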
You can find the default maximum connections per server of each browser on the Browserscope web site.
3. Fixing the performance issue through server / application configuration:
If client configuration is not sufficient, you can start working on the server or application side. The server allows the same kind of tuning as the client: the maximum number of TCP connections it will accept overall, but also the maximum number of connections per client, as shown in the sketch below.
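As an illustration, here is a sketch of the relevant knobs in an Apache 2.2 configuration. The values are examples only, and note that the per-client cap is not in the Apache core: it requires a third-party module such as mod_limitipconn.

```
# httpd.conf -- Apache 2.2, prefork MPM (values are examples, not advice)

# Overall cap on simultaneous TCP connections served
MaxClients 256

# Allow persistent connections and control how long they are kept open
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

# Per-client cap: requires the third-party mod_limitipconn module
<IfModule mod_limitipconn.c>
    <Location />
        MaxConnPerIP 10
    </Location>
</IfModule>
```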
The next changes are directly in the application stack, where several things can be done to improve the end user experience. Here are a few easy ways of improving your solution (this part will be detailed later).
Adding expiration headers to your media:
This solution is pretty straightforward: you simply replace every call to a resource with a small piece of code that informs the client of the expiration date of your resource. This can be done in several ways. The first is to build a small script page that adds the expiry header information before transferring the resource to the client.
a) Hard-coding expiration
A simple example would be that, instead of referring to your images directly in your HTML files, you call a home-made script that informs the client of a default expiration date:
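A minimal sketch of what such a script could look like in PHP (the name getimage.php, the query parameter and the one-week lifetime are all illustrative choices):

```php
<?php
// getimage.php -- serves an image while adding an explicit expiration date
$file = isset($_GET['file']) ? basename($_GET['file']) : '';
$path = __DIR__ . '/' . $file;              // basename() blocks path traversal

if ($file === '' || !is_file($path)) {
    header('HTTP/1.1 404 Not Found');
    exit;
}

$lifetime = 7 * 24 * 3600;                  // one week, in seconds
header('Content-Type: image/png');
header('Content-Length: ' . filesize($path));
// Tell the browser it can reuse its cached copy until this date
// without even asking the server
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $lifetime) . ' GMT');
header('Cache-Control: max-age=' . $lifetime);
readfile($path);
```

In the HTML page, a reference like img src="a0.png" then becomes something like img src="getimage.php?file=a0.png".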
b) Global configuration of expiration via .htaccess
Rather than handling every image individually, an easy solution is to set expiration directly in your .htaccess file in order to have general control over file expiration:
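A sketch of such an .htaccess file, assuming mod_expires is enabled on the server (the lifetimes are illustrative):

```
# .htaccess -- requires mod_expires to be enabled on the server
ExpiresActive On

# One default lifetime for every file served from this directory
ExpiresDefault "access plus 1 week"

# Images can be kept longer than the default
ExpiresByType image/png "access plus 1 month"
```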
c) Global configuration of expiration via the web server
Several web servers will let you configure media expiration directly in their configuration files. The most common web server being Apache, let's look at how to do the change there: a little module named mod_expires gives you control over the expiration headers of the media in your web pages. Here again, the configuration allows you to set expiration based on media type:
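A sketch of such a per-media-type configuration in the main Apache configuration file (the lifetimes are illustrative):

```
<IfModule mod_expires.c>
    ExpiresActive On

    # Images rarely change: keep them for a month
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"

    # Style sheets and scripts change more often: one week
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType text/javascript "access plus 1 week"
</IfModule>
```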
Now, this is usually not sufficient... so the last section of this article will show information that will make a huge difference...
4. Leveraging other technologies to change user experience:
Before looking into WAN acceleration devices, let's just compare two solutions:
- The first solution is our initial use case: I connected to a central SharePoint web site located in Europe, from my work computer, while travelling in Brazil. You will see below the large blue sections, which represent the download of media through a WAN link.
- The second attempt was made from the same site (and connection) in Brazil, but this time I no longer used my local browser; instead, I used a browser hosted on a Citrix platform in Europe, right beside the SharePoint farm I wanted to connect to. Here, transfer times are reduced to almost nothing and the web experience is completely changed.
So what is the main difference here? First, the page loading time: with my local browser, loading took 14 seconds. When going through Citrix, this dropped to 5 seconds. Indeed, from my browser, due to the chatty HTTP protocol and the latency combined with the limit on persistent connections to the web server, I spent most of my time on the red and blue parts, which correspond to waiting for the server response (250 ms round-trip latency) and waiting for the data to come through the pipe (data transfer).
In the second case, with Citrix, waiting for the server and data transfers are minimal since all data is exchanged between two servers in the same datacenter. The only transfer to my local PC in Brazil is the Citrix display data, which, as you will see later, can also be optimized / cached with specific WAN acceleration appliances.