Rob Hartill. Internet Movie Database. Apr 1998
Surfing the Internet, you may sometimes
be forgiven for thinking that you've wandered off onto some
mud-infested cattle trail; things go that slow. As you wander
the net you may not see anyone else, but
you're sharing congested highways with millions of
other people from every corner of the world. The key thing
to remember here is that you're sharing resources... limited
resources.
Sooner or later, when enough
people use the same resources, those resources will inevitably
become saturated. Once they are saturated, everyone ends up waiting
their turn to use the scarce resource, and as web surfers we observe
this saturation in the form of delays reaching and downloading web pages.
The Internet is prone to slowdowns caused by bottlenecks on
key network routes. However, for web surfers this is not the only cause
of delays. In this age of instant information the slightest delay
is sometimes noticeable. We can become impatient with a 30-second
download delay from a web server even though it might be saving
us a 3-hour trip to the mall or library. That 30-second delay may well
be caused by the sheer complexity of creating the page requested.
Maybe you've asked for a database to be searched or perhaps the
server you're talking to is very popular and is overworked. Delays
are almost inevitable.
Strangely, some argue that delays are for other
people, and that if you use their products you can reap the benefit of
state-of-the-art technology that's able to side-step delays. We're
talking web-accelerators.
Since vendors vary in what they consider to be
web-accelerators, we need to define them for the context of this
article. Here we consider web-accelerators to be the web browsers
and web-browser plugins which use a technique called prefetching
to download web pages before they are needed.
If we were to believe the marketing hype of the web-accelerator
creators and vendors, we'd be looking at the Net equivalent of
the 'science' of alchemy. They'd have us believe that the web is
only slow because we're not using it quickly enough. Let's explain.
Web-accelerators use prefetching. Basically that means that
when you visit a web site and download a page, instead of your browser
and network connection sitting there idle, they get put to some 'good use'.
While you read the page you last downloaded, the web-accelerator will
look at that page and find all the links it has to other pages.
(Often, but not always, the web-accelerator will only look for links that
point to the same web server.) When it finds a link or collection of links,
the accelerator starts to download each of them. Each downloaded page is
squirreled away onto your hard disk just in case you need it later.
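
The link-gathering step described above can be sketched roughly as follows. This is a minimal illustration, not the code of any real product: the names `LinkCollector` and `prefetch_candidates`, the sample page and the `movies.example` host are all made up for the example, and the sketch only builds the list of pages a prefetcher would go on to request; it downloads nothing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def prefetch_candidates(page_url, html, same_server_only=True):
    """Return the absolute URLs a prefetcher would queue for this page."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    candidates = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        if same_server_only and urlparse(absolute).netloc != page_host:
            continue  # skip links that point at other servers
        if absolute not in candidates:
            candidates.append(absolute)
    return candidates


# Hypothetical page with three links, one of them off-site.
SAMPLE_URL = "http://movies.example/title/1"
SAMPLE_HTML = """
<a href="/name/7">An actor</a>
<a href="/title/2">A sequel</a>
<a href="http://other.example/review">A review</a>
"""

queue = prefetch_candidates(SAMPLE_URL, SAMPLE_HTML)
print(queue)
```

Every URL that lands in `queue` becomes a request to the server, whether or not the reader ever follows that link.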
Can you see the
catch yet?
If you haven't worked out what the catch is yet,
think back to earlier paragraphs of this document that explain why the
web is seen to be slow in the first place...
- the electronic highways are congested.
- web servers may not have the capacity to serve any quicker.
- web-accelerators download pages that may not be needed or looked at.
... and the answer from the alchemists in the web-accelerator business
is to push more traffic onto the networks and more work onto the web servers.
Now you see the catch. Good.
To illustrate the problem, consider the average web surfer visiting our website
for a 30 minute browse. Let's assume the visitor is able to find the content
she came looking for and reads it. Based on traffic at the Internet
Movie Database (IMDb) we might expect this user to download between
10 and 30 pages (a maximum of one page downloaded and read per minute,
on average, isn't an unreasonable assumption). On the IMDb, each page
contains an average of more than 100 links to other pages, so the
10-30 pages shown to the user
will contain some 1000-3000+ links. We have observed on many occasions
web-accelerators requesting all the links on the currently viewed page,
so our average visitor is now requesting 1000-3000+ pages in 30 minutes
with her web-accelerator; that's one page every 1.8 to 0.6 seconds. In
reality we've seen web-accelerators go much faster than this and do
so for hours at a time.
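
The arithmetic behind those figures can be checked with a few lines. This is only a sketch of the estimate above: it takes "more than 100 links" as exactly 100 and assumes the accelerator requests every link once, so the real numbers would be somewhat higher. The names `SESSION_SECONDS`, `LINKS_PER_PAGE` and `prefetch_load` are introduced here for illustration.

```python
SESSION_SECONDS = 30 * 60   # a 30 minute browse
LINKS_PER_PAGE = 100        # "more than 100 links" per page, taken as 100


def prefetch_load(pages_read):
    """Return (pages requested, seconds between requests) for one session."""
    requested = pages_read * LINKS_PER_PAGE
    return requested, SESSION_SECONDS / requested


for pages_read in (10, 30):
    requested, interval = prefetch_load(pages_read)
    print(f"{pages_read} pages read -> {requested} pages requested, "
          f"one every {interval:.1f} s")
```

In other words, a reader who would have asked for 10 to 30 pages ends up asking for roughly a hundred times as many.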
If all of this wasn't bad enough, there's more. The ingenious reader may
well be thinking all we need to do is refuse to serve web-accelerators and
they'll become extinct, or at least they'll not be a bother to web server
administrators wise to them. If only that were possible. Well, to a degree it
is, since some web-accelerators identify themselves to servers, and
servers are entitled to refuse service as they see fit. The problem with this
antidote to prefetching is that there is a growing number of web-accelerators
which hide behind anonymity, or to be more precise, they hide behind the 'good name'
of others. Web-accelerators often masquerade as Mozilla (Netscape browsers) or
MSIE (Microsoft). Some are probably written by people too clueless to realise they
are supposed to identify their HTTP agents properly while others are perhaps happy
to shift the blame for web-accelerator abuses onto the browser makers.
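
A server-side refusal of this kind, and the way masquerading defeats it, might look something like the sketch below. The product tokens `exampleaccel` and `sampleprefetch` are invented for the example, as is the function name `should_refuse`; the second test string is merely an illustrative browser-style User-Agent value of the sort an accelerator could borrow.

```python
# Hypothetical product tokens; real accelerators use their own names.
BLOCKED_AGENT_TOKENS = ("exampleaccel", "sampleprefetch")


def should_refuse(user_agent):
    """Return True if the User-Agent header names a blocked accelerator."""
    agent = (user_agent or "").lower()
    return any(token in agent for token in BLOCKED_AGENT_TOKENS)


# An accelerator that identifies itself honestly can be turned away...
honest = "ExampleAccel/1.0"
# ...but one that borrows a browser's identity looks like any other visitor.
masquerading = "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"

print(should_refuse(honest))
print(should_refuse(masquerading))
```

The check can only act on what the agent chooses to say about itself, which is exactly why anonymous accelerators slip through.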
Readers familiar with web server administration may relate web-accelerator problems
to those of robots and crawlers. For almost as long as there have been
web servers there have been robots/crawlers. Very early on it became apparent to many
that unless these HTTP agents followed common-sense guidelines and obeyed some ground
rules laid down by server administrators, the robots would be more of a nuisance than
a service. Unlike web-accelerators, robots do provide a service, that of indexing
sites so that search engines can refer more people to relevant services. No such rules
or guidelines exist for web-accelerators, although it would be simple and a step in the
right direction if they followed the same set of rules as robots.
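
As a rough idea of what following the robots' rules could involve, the sketch below assumes that "the same set of rules" means the robots.txt exclusion convention, which is this example's reading rather than something the article spells out. It uses Python's standard `urllib.robotparser` module; the robots.txt content, the agent name `ExampleAccel`, the `movies.example` host and the helper `may_prefetch` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a server administrator might publish.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cgi-bin/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def may_prefetch(agent_name, url):
    """Return True only if the site's rules allow this agent to fetch url."""
    return rules.can_fetch(agent_name, url)


print(may_prefetch("ExampleAccel", "http://movies.example/title/1"))
print(may_prefetch("ExampleAccel", "http://movies.example/search/titles"))
```

An accelerator that ran a check like this before each prefetch would at least stay out of the areas an administrator has marked as off limits, such as expensive database searches.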
Web-accelerators are a nuisance; they are counterproductive, and they are a danger to some
web servers and to the infrastructure of the web itself. The more people use these products,
the slower the web will become, and web surfers will perceive a need for more of these
snake oil products.
Web-accelerators slow down the web!
© 1998 Internet Movie Database