
## Neck-and-neck race?

Can you still call it a neck-and-neck race when the main objective is to fall behind as little as possible? Wouldn’t “butt-to-butt race” be more appropriate?

That is exactly the situation when you look at the membership numbers of Germany’s political parties: the SPD dropped from its peak of one million in 1976 to 531,737 at the end of May and 529,994 at the end of June, and the CDU fell from 750,000 after the Wall came down to 531,299 at the end of May. Now the CDU gloats about having more members than the SPD (and wants to publish exact figures on Monday). Update: It’s 530,755.

And some news sites call that a neck-and-neck race…

## Percentage calculation problems or The shrunken bodies

Statista is always worth a look if you don’t hate statistics (and speak German). Today’s statistic of the day covers the question “How tall are you?” (» filtered by sex), which was put to 22,358 adult Germans.

The unfiltered overview shows 2.9% for the really tall ones (a group I belong to, thanks to my 190 cm), with rounded values shown on top of the bars for clarity:

(190 cm and taller: 2.9%)

You can also enter a number to compare to – and the result is:

Your reply: 190.0 cm
98.0% are smaller than 190 cm.
2.0% are like you taller than 190 cm.

Oops, did 0.9% of the people suddenly shrink? How else could this result be explained? And why the wording “are like you taller than 190 cm”?

If I enter 189 cm as a test, I get: “96.8% are smaller than you. 3.2% are taller than you.” So nobody is 189 cm tall? Are 0.3% exactly 189 cm tall and 2.9% taller, or 1.2% exactly 189 cm and 2.0% taller? For 154 cm, the numbers “2.2%/97.8%” are reported, which basically match the bar graph, but here, too, the wording is “smaller” and “taller” with no mention of people who are exactly 154 cm tall.
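The competing readings come down to simple subtraction. Here is a minimal sketch that uses only the percentages quoted above. The function name `share_between` is made up for illustration, and the sketch assumes that the tool's "taller than you" figure actually includes people of exactly the entered height (which is what the questions above suspect).

```python
def share_between(at_least_lower: float, at_least_upper: float) -> float:
    """Share of people at or above the lower height but below the upper one,
    rounded to one decimal place like the published figures."""
    return round(at_least_lower - at_least_upper, 1)

TOOL_189 = 3.2   # comparison tool, input 189 cm: "3.2% are taller than you"
TOOL_190 = 2.0   # comparison tool, input 190 cm: "2.0% are ... taller than 190 cm"
CHART_190 = 2.9  # bar chart: "190 cm and taller"

# Reading 1: the bar chart is right about "190 cm and taller".
exactly_189_if_chart_right = share_between(TOOL_189, CHART_190)

# Reading 2: the tool's 2.0% is the true share for "190 cm and taller".
exactly_189_if_tool_right = share_between(TOOL_189, TOOL_190)

# The gap between chart and tool at 190 cm: the people who "shrank".
missing_at_190 = share_between(CHART_190, TOOL_190)

print(exactly_189_if_chart_right, exactly_189_if_tool_right, missing_at_190)
```

Neither reading can be confirmed from the published numbers alone; the sketch only shows where the 0.3%, 1.2% and 0.9% come from.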

Well, apparently there’s room for improvement… but the title still says “BETA”. Let’s see if the error report that I sent them (they have a special link for that) will have any effect.

## False traffic

Several weeks ago I wondered why the “raw” access counter for my music quote list was going through the roof (and why the usage statistics showed unusual changes) without similar increases in “real” visitor numbers, but I did not pursue the matter any further at the time. Earlier this month, a post about spam protection on Holy Shmoly! (whose feed is also included in the WordPress dashboard) mentioned additional traffic caused by the AVG LinkScanner, an add-on to AVG’s virus scanner that preloads all links on search engine result pages and checks them for malicious code and scripts. AVG Watch has all the background information on that.

At first, this doesn’t sound like too bad an idea. However, it generates lots of traffic without anyone actually reading the fetched pages; it would really be sufficient to check a page when it is accessed. This additional traffic and the distortion of statistics it caused were what drove many webmasters crazy: these statistics have financial consequences for professional sites, and statistics analysts could, at best, react to the effects only with some delay.

Now, my little shared hosting package basically includes unlimited traffic (though I don’t need to test how unlimited that actually is), the server didn’t seem to crash under the load, and I certainly wasn’t going crazy. Still, I wanted to take a closer look and, of course, see whether it really was this AVG LinkScanner that caused the increase in accesses I had noticed. So I grabbed my logfiles and analyzed them for the four different referrers characteristic of AVG. The following chart shows the result for the aforementioned music quote list, which is the post that gets by far the (relative) majority of search engine referrals, especially for English sentences:

Accesses according to WordPress.com stats (visitors with JavaScript)*
All other normal accesses (search engine robots, visitors without JavaScript)*
Various LinkScanner referrers
Redirections using .htaccess
* both without my own accesses

Meaning of the labelled days:
1: Public release of the new AVG version with LinkScanner on April 23.
2: Holy Shmoly! reports, and I add the .htaccess redirections.
3: Small change in redirection, thus letting through a few again.

So we see: The LinkScanner caused up to 1000 additional accesses per day for this post, up to 7 times the real visitor numbers. By the way, each time it read the page itself (PHP and database accesses) and all JavaScript files linked in it (and often a particular GIF image, for whatever reason).
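A per-day count like this could be reproduced roughly as follows. This is only a sketch, not the script actually used here: it assumes the server writes the Apache "combined" log format, the strings in `SCANNER_MARKERS` are hypothetical placeholders for whatever really identifies the scanner, the function name `count_daily_accesses` is made up, and the sample log lines are invented. It checks both the referrer and the user-agent field, since either could carry the identifying string.

```python
import re
from collections import Counter

# Hypothetical placeholders: replace with the strings that actually
# identify LinkScanner requests in your own logs.
SCANNER_MARKERS = ("scanner-marker-1", "scanner-marker-2")

# Combined log format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def count_daily_accesses(lines, path, markers=SCANNER_MARKERS):
    """Return two Counters keyed by day: scanner hits and all other hits
    for the given request path."""
    scanner, other = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        parts = match.group("request").split()
        if len(parts) < 2 or parts[1] != path:
            continue  # a different page, or a malformed request line
        haystack = match.group("referrer") + " " + match.group("agent")
        bucket = scanner if any(m in haystack for m in markers) else other
        bucket[match.group("day")] += 1
    return scanner, other

# Invented sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [23/Apr/2008:10:00:00 +0200] "GET /quotes/ HTTP/1.1" 200 5120 "scanner-marker-1" "Mozilla/4.0"',
    '192.0.2.2 - - [23/Apr/2008:10:05:00 +0200] "GET /quotes/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [24/Apr/2008:09:00:00 +0200] "GET /other/ HTTP/1.1" 200 100 "-" "Mozilla/5.0"',
]
scanner_hits, other_hits = count_daily_accesses(sample, "/quotes/")
print(scanner_hits, other_hits)
```

Separating own accesses or JavaScript-capable visitors, as the chart does, would need further filters that are not shown here.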

In the meantime, AVG has changed LinkScanner’s behavior, and the “false traffic” has decreased greatly again.

But it actually had a positive side effect: You could get a rough impression of how often your site shows up on search result pages without users clicking through…