Several weeks ago I wondered why the “raw” access counter for my music quote list was going through the roof (and the usage statistics were showing unusual changes) without a similar increase in “real” visitor numbers, but I did not pursue the matter any further at the time. Earlier this month, a post about spam protection on Holy Shmoly! (whose feed is also included in the WordPress dashboard) mentioned additional traffic caused by the AVG LinkScanner, an add-on to AVG’s virus scanner that loads all links on search engine result pages in advance and checks them for malicious code and scripts. AVG Watch has all the background information on that.
At first this doesn’t sound like too bad an idea. However, it generates lots of traffic for pages that nobody actually reads; it would really be sufficient to check a page when the user accesses it. This additional traffic, and the distortion of statistics it caused, is what drove many webmasters crazy: these statistics have financial consequences for professional sites, and statistics analysts could, at best, react to the effects only with some delay.
Now, I have unlimited traffic included in my little shared hosting package (though I don’t need to test how unlimited that actually is), the server didn’t seem to crash under the load, and I certainly wasn’t going crazy. But I did want to take a closer look and, of course, see whether it really was the AVG LinkScanner that caused the increase in accesses I had noticed. So I grabbed my log files and analyzed them for the four different referrers characteristic of AVG. The following chart shows the result for the aforementioned music quote list, which is the post that gets by far the largest share of search engine referrals, especially for English sentences (click for large version with longer period):
■ Accesses according to WordPress.com stats (visitors with JavaScript)*
■ All other normal accesses (search engine robots, visitors without JavaScript)*
■ ■ ■ ■ Various LinkScanner referrers
■ Redirections using .htaccess
* both without my own accesses
Meaning of the labelled days:
1: Public release of the new AVG version with LinkScanner on April 23.
2: Holy Shmoly! reports, and I add the .htaccess redirections.
3: Small change in redirection, thus letting through a few again.
So we see: the LinkScanner caused up to 1000 additional accesses per day for this post, up to seven times the real visitor numbers. By the way, each time it read the page itself (causing PHP and database accesses) and all JavaScript files linked in it (and often a particular GIF image, for whatever reason).
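For anyone who wants to run a similar check on their own logs, here is a minimal sketch of the kind of analysis described above. It is not the script I used. It assumes the server writes the Apache “combined” log format, and the entries in `SUSPECT_REFERRERS` are hypothetical placeholders, not the real LinkScanner referrers; the function name `count_hits_per_day` is made up for this example as well.

```python
import re
from collections import Counter, defaultdict

# Matches the date, request line, status, size, referrer and user agent
# of one line in the Apache "combined" log format (an assumption about
# the server configuration).
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\] '
    r'"(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical placeholders: replace with the referrer strings you
# actually find in your own logs.
SUSPECT_REFERRERS = ["scanner-a.example", "scanner-b.example"]


def count_hits_per_day(lines, path, markers=SUSPECT_REFERRERS):
    """Count accesses to `path` per day, grouped by matching referrer marker.

    Accesses whose referrer contains none of the markers are counted
    under the label "other". Lines that do not parse are skipped.
    """
    per_day = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue
        parts = match.group("request").split()
        if len(parts) < 2 or parts[1] != path:
            continue
        referrer = match.group("referrer")
        label = next((m for m in markers if m in referrer), "other")
        per_day[match.group("day")][label] += 1
    return per_day
```

Calling `count_hits_per_day(open("access.log"), "/some-post/")` (both names are placeholders) yields one counter per day, which can then be plotted as a stacked chart like the one above. Note that the “other” bucket still mixes robots and real visitors; separating those needs a second source such as the JavaScript-based stats.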
In the meantime, AVG has changed LinkScanner’s behavior, and the “false traffic” has decreased greatly again.
But it actually had a positive side effect: you could get a rough impression of how often your site shows up on search result pages without users clicking through…