HTTP persistent connections to a server

My work and readings have led me through a lot of things related to the simultaneity of connections from a browser (client) to a web server and the HTTP context in which all that happens. It is still a bit difficult for me to find the time to put all of it in order, but below are a few good reads on the topic.

Respect for HTTP limits

First of all, you should know that the HTTP 1.1 RFC (RFC 2616) says (among many other things) that "a single-user client SHOULD NOT maintain more than 2 connections with any server or proxy". This explains a lot about why some websites can be very slow while others can be very fast. It doesn't explain, though, why a browser queues requests to a server one by one when loading a full page, but I can imagine this is a corollary of that part of the RFC. In any case, what it means is that, if you load a page from a web server and all the CSS, JS and images on this web server are located under the same domain name, you will only ever be able to download 2 resources at a time. When one resource has downloaded completely, your browser can initiate the call to another resource. This sometimes makes websites appear "bit by bit" in an upsetting manner.

Browsers used to respect that recommendation, but it tended to limit the viewing speed of websites, so browser developers started to allow for more. Today, for example, you can load the "about:config" page in Firefox and look for the network.http.max-persistent-connections-per-server variable. You will see it is set to 6 (which is now the default in Firefox). You can pretty much set it to 20 if you like, but beware that this is not necessarily the best setting (it might make it difficult for your browser to manage that many downloads at a time). So, in conclusion, your browser is limited to 6 downloads at a time by default (sometimes 4, sometimes 8 depending on the browser), although the standard says it should be limited to 2.
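If you would rather set this from a file than click through about:config, Firefox also reads overrides from a user.js file in your profile directory. Here is a minimal sketch; the file name and preference name are real, but the value of 10 is just an arbitrary example for experimentation:

    // user.js — place this file in your Firefox profile directory
    // Override the per-server persistent connection limit (default: 6).
    user_pref("network.http.max-persistent-connections-per-server", 10);

Firefox re-applies user.js at every startup, which makes this convenient for repeatable testing.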

Multi-CDN-ification

And this is why major websites have started multiplying the number of CDN domains they use: if the browser is limited to 6 downloads at a time *per domain*, why not multiply the number of domains? Well, it's true: by increasing the number of domains (and these can also be subdomains), you increase the total number of files a single user can download in parallel, thus potentially increasing the download speed for your website. What's more, if you spread these domains across several web servers, you also spread the bandwidth load, and speed is further improved. If you abuse this system, though, you might get the reverse effect: the time it takes to resolve the IP addresses of all these domains might increase the load time for your site.
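To make this concrete, here is a sketch of what such domain sharding can look like in a page's HTML. The static1/static2 subdomains are hypothetical names of my own; the point is simply that the browser counts its connection limit per hostname, so assets on each subdomain get their own pool of parallel downloads:

    <!-- assets spread across two (hypothetical) sharded asset subdomains -->
    <link rel="stylesheet" href="https://static1.example.com/css/main.css">
    <script src="https://static1.example.com/js/app.js"></script>
    <img src="https://static2.example.com/img/header.png" alt="header">
    <img src="https://static2.example.com/img/gallery-01.jpg" alt="photo">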

Optimizing CDNs

This is why some techniques are available to cover that last problem (which isn't to say that you should use 20 different domain names for your site), whereby you can ask the browser to "prefetch" the translation of domain names to IP addresses before those domain names have even been requested. You do that by adding prefetch instructions to your HTML header section (a sketch follows at the end of this section), and this is apparently implemented by most browsers. You can also hire a specialized service that makes CDN resolution super-fast, which can reportedly save up to 500ms on connections to your website.

Manually multi-threading PHP at the user level (sort of)

Something that is not directly an extension of the previous topics: a web server will only execute one PHP process on one single core of a multi-core processor, so if you want to load a very heavy script two times in parallel, you have to take into account that your browser will try to limit the number of concurrent connections to a server and actually queue them one by one, which means it will be impossible for you to execute simultaneous requests and put the 16 cores of your super Xeon XYZ processor to work. To "hack" around this, you can start several browsers at the same time. This works, but there are only so many different browsers you can install on one computer. An extension of this trick is to start individual Firefox sessions by launching Firefox (on the command line or in your init script) this way: firefox --no-remote -ProfileManager. This way, you can manage 4 different Firefox sessions and even overload your server, if your computer can handle that many simultaneous instances of Firefox.
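To illustrate that multi-session trick, here is a small shell sketch. It assumes you have already created the extra profiles through the ProfileManager (bench1, bench2, etc. are hypothetical profile names of my own):

    # Launch independent Firefox sessions, each with its own profile,
    # so each one maintains its own pool of connections to the server.
    firefox --no-remote -P bench1 &
    firefox --no-remote -P bench2 &
    firefox --no-remote -P bench3 &
    firefox --no-remote -P bench4 &

The --no-remote flag is what allows several instances to run side by side instead of the new command being swallowed by the already-running Firefox.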
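Finally, coming back to the DNS prefetch instructions promised above, here is what they can look like in the HTML header. The rel="dns-prefetch" hint is standard and widely supported; the domains are the same hypothetical asset subdomains as before:

    <head>
      <!-- resolve the asset domains early, before any resource on them is requested -->
      <link rel="dns-prefetch" href="//static1.example.com">
      <link rel="dns-prefetch" href="//static2.example.com">
    </head>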

References

This is all for today. I'll leave you with a few references that have helped me work out these details.