I actually like Compete. I think it’s quite a bit more accurate than Alexa or other competing services. However, when I see comments like this on their blog, it makes me question some of their conclusions:
Every search query we see in our data is a query performed by an actual person.
via compete blog
OK, Compete gathers their data in a few different ways. First, they have a toolbar which lets them gather data, which is easy enough to fake the same way Alexa is faked. However, they also use other methods, like clickstream data from ISPs (like Hitwise), and then weight the data according to their algorithm.
Not to rain on their parade, but people sometimes send automated queries to search engines for any number of purposes: some benign, like ranking reports, others more nefarious, like trying to influence things such as Google Trends. Of course a human being does “push the button,” so if you wanted to get technical about it, yes, they are all human generated. But I think the essence of your statement is that each query is done by a person individually, not en masse. It gets even more difficult when sophisticated people start generating massive amounts of “pseudo human” traffic and throttle it with variable time intervals so it appears more natural. Compete, you really think you’ve got that one beat, do you … c’mon now … (wink wink nudge nudge).
No doubt you are better than many of the others, but let’s be real: adversarial information retrieval can get very tricky at times, and I think declaring “Every search query we see in our data is a query performed by an actual person” would rank right up there with building an unsinkable ship … and removing spam from the internet.
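To make the throttling point concrete, here is a rough sketch of why variable time intervals are a problem for a simple filter. This is not how Compete screens its data (they haven't said). The function names, the 0.1 threshold, and the simulated traffic are all made up for illustration. The idea: a naive rule flags a user as scripted when the gaps between their queries are almost perfectly regular.

```python
import random
import statistics


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive queries."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = statistics.mean(gaps)
    return statistics.pstdev(gaps) / mean_gap if mean_gap > 0 else 0.0


def looks_scripted(timestamps, min_cv=0.1):
    """Hypothetical rule: flag users whose query gaps barely vary.

    min_cv is an arbitrary illustrative threshold, not a known industry value.
    """
    if len(timestamps) < 3:
        return False  # too little data to judge
    return interval_cv(timestamps) < min_cv


def timestamps_from_gaps(gaps):
    """Turn a list of gaps (in seconds) into cumulative query timestamps."""
    t, out = 0.0, [0.0]
    for gap in gaps:
        t += gap
        out.append(t)
    return out


rng = random.Random(0)  # seeded so the simulated traffic is repeatable

# A crude bot: one query every 30 seconds, like clockwork.
fixed_bot = timestamps_from_gaps([30.0] * 50)

# A throttled bot: the same 50 queries, with gaps drawn at random
# between 5 and 120 seconds.
jittered_bot = timestamps_from_gaps([rng.uniform(5.0, 120.0) for _ in range(50)])

print("fixed bot flagged:   ", looks_scripted(fixed_bot))
print("jittered bot flagged:", looks_scripted(jittered_bot))
```

The clockwork bot gets caught and the jittered one walks right through. A real filter would look at far more than timing, but each extra signal is one more thing a determined person can imitate.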
{ 10 comments }
Did I land on a different Planet? I thought you had wordpress before.
no i wuz tinkering
Michael,
Thanks for writing about Compete. I’m the Founder of Compete, which we started 7 years ago, so the “wet behind the ears” comment made me smile.
Thanks for the good words, we’ve been at it a long time and help some of the largest brands with our data.
Also, it’s always nice to meet a fellow Queens College alumnus.
Cheers,
David
hi David, thanks for stopping by. So, care to share how you’re dealing with “pollution” in the data stream without giving away the secret sauce?
A big difference between Compete and others (Alexa, Quantcast, etc.) is that we select the users who make it into our panel. We start each month by selecting _only_ a subset of the users for whom we have data. The selection process is secret, but it includes using demographic variables, browsing habits, activity metrics, and a preference for those who were chosen in past months. Also, a user has to have been active for months before we will include them in our sample.
This allows us to normalize our data to project the total internet browser population in the US, and it avoids the “anyone can download a toolbar and game the system” problem that others have.
We are also constantly checking our data for outlier users, e.g. those who show bot-like behavior, and either filtering their data or removing them.
Happy to answer any other questions that come up.
David
Michael,
Thanks for reading the post. David actually covered most of the comments I was going to post in response. I guess the founder of Compete knows a little something about what we do … go figure.
One thing I didn’t mention in my post that I probably should have called out: we actually did some additional screening on the data to remove outliers. Anything that looked like a clear case of scripted search testing we removed from the analysis. I’m still a relative rookie in this world … but I’m not that wet behind the ears.
Although I suppose if I had commented on that in my post I might not have gotten the link from you.
thanks again
David & Jeremy – first off, I think it’s terrific that you’re out here responding to posts in the blogosphere. Inspiring, really.
My issue with Compete.com is the same I see for the other services (Alexa, Quantcast, Hitwise) – the numbers for visitors are off, yes, but even the ratio of which sites are more/less popular and by how much appears to be frequently wrong. We get to see the real stats for many, many domains and when we compare them in Compete, the numbers just don’t add up.
This is way cool… The people behind the product interacting and covering their assets.
I was surprised to read you consider Compete more accurate.
One of my sites has a 46k rank on Alexa (that I find pretty accurate), and Compete ranks it at 500k. More alarming than that, they add insult to injury by pretending to know the site’s traffic, which they guesstimate at 2k per month.
Granted, the site is in Spanish, but it does receive plenty of visits from the US market, more like 2k per day than per month (average 10k visits per day with all regions included). They are basically getting my traffic wrong by a ratio of 150X – nowhere near excusable in my opinion.
I think Compete should stick to “Can’t say because we don’t have enough data for this site”, as other tools do (although I think any respectable tool should have data on a site that displays a million pages per month to over 200k uniques). They mention not having enough data, but do go ahead and guesstimate some ridiculous number.
Perhaps Compete has tailored its algorithms to serve too narrow a market.
it does not really matter what they specifically say about their numbers, we already know how they are calculated and the possible sources of error
but first, thanks to all the analytics applications for producing such a wide variance in statistical results.
i might be sunk if the case were otherwise, then again, i should also thank the many developers building non-spiderable sites
secondly, defending your site in the blogosphere these days is necessary, not forward-thinking. if you walked around the iron-age with a bronze sword, then…
that being said, I applaud Compete for trying to end the discrepancies swept under the rug by Alexa. but there is a lot of work to do…