How to Turn Google Knols into an MFA Scraper

Michael Gray

In Google  

The following is purely hypothetical and should not be taken as advice or a suggestion.


Since all of the content on Wikipedia is free for public use as long as you credit it, what’s to prevent anyone from copying large chunks and importing them into Google Knols? Let’s say you took the top 1,000 pages from Wikipedia, got some cheap labor from a country with a weak economy like India, China, Russia, or the United States, and had them copy/scrape all the pages and create Knol pages for them. You could get around the dupe content issues by rolling back 3-6 months of revisions on Wikipedia. Beg, borrow, spam, or rent yourself enough links to get them indexed. Wait 9-12 months for Knol’s domain authority to kick in and then turn on the AdSense.

Again this is all purely hypothetical and not to be interpreted as a suggestion or advice.

Just saying, I don’t think Google thought this Knol thing through completely, because there are all sorts of holes in it, like Mack truck and Sherman tank sized holes, not teeny tiny mouse holes.

{ 16 comments }

aaron wall July 23, 2008 at 8:38 pm

Thanks for the ideas. I own KnolGenerator.com and I think I can help a lot of people provide useful information to a lot of other people ;)

Brian Provost July 23, 2008 at 9:10 pm

I only comment on blogs like 2x a year, but this drew me out. God bless you, you filthy, filthy spammer.

me July 23, 2008 at 11:03 pm

I don’t think you thought this post through. The adsense account will be banned in seconds due to it inherently being an MFA site. Please don’t say this was just a hypothetical scenario… why even start with this in a blog posting.

I’m very close to removing you from my feed. Just haven’t seen anything useful in the last 6-12 months. :(

Joe Hall July 24, 2008 at 1:38 am

I really like this idea.

Internet Eyer July 24, 2008 at 5:42 am

By chance, do you know how and why they chose the name knol? Thx

Michael Gray July 24, 2008 at 6:49 am

@me: ok here’s the thing you are 100% within your full legal rights to copy the entire wiki as long as you give it credit and allow other people to copy it or any derivations from you.

So you may not like it but it’s not illegal in any way.

Will July 24, 2008 at 9:50 pm

“I own KnolGenerator.com and I think I can help a lot of people provide useful information to a lot of other people ;)”

But don’t you realize you’ll be killing the dream of Mahalo?

Demerzel July 24, 2008 at 11:28 pm

@graywolf

Let’s be clear though, 100% legal to take Wiki’s content, but also 100% legal for Google to remove your content.

That’s not to say it won’t be a large MFA site (I noted that on my blog as well). Scope and scale will certainly become an issue when you have to keep track of millions of people (theoretically) with more than millions of pages being created.

Yowza July 25, 2008 at 12:38 am

Remember folks, when building knols, make your name a keyword as well because your name appears in the URL. So if you’re trying to rank for buttermilk pancakes, the knol URL should be http://knol.google.com/k/buttermilk-pankakes/buttermilk-pancakes/

Let the games begin…

paisley July 25, 2008 at 9:58 am

I was going to ask about the Creative Commons info, and you posted it.. LMAO..

LH July 25, 2008 at 10:52 am

It’s funny someone else mentioned Mahalo. That was my first thought as well.

Gab Goldenberg July 25, 2008 at 2:07 pm

So let’s see:

1) Google is pushing adoption of Knols by blatantly giving them ranking boosts. This means the system gets adopted by regular folks finding stuff in search results AND webmasters hungry for traffic.
2) It can be monetized (syndk8: gentlemen, start your engines!)
3) Wikipedia’s content can be copied by anyone.

Shit, wonder what’ll be the most popular free online encyclopedia in 2 years from now?

Am I the only one who notices that Google’s consistently using its own search results to push its products and kill competitors? Mapquest, other video sites, now Knol, hosting providers are all teaming up with it to give you “new, free Google Apps” (whupee 8-| ) or even have it be entirely hosted by the borg (weird to pay for hosting then be told G will do it instead, but wtv.)

The problem with being a web-only business is that Google can knock off your code and gain much faster adoption because it controls the flow of information. It literally can enter any web-only market within a year and dominate. Scary what control over the flow of information can do, huh?

Michael Gray July 25, 2008 at 2:34 pm

@Gab Goldenberg: kinda scary when the lights go on and you connect the dots isn’t it

Yeeks July 26, 2008 at 2:33 am

“1) Google is pushing adoption of Knols by blatantly giving them ranking boosts.”

To be fair, Knol should be ranked highly according to Google’s algo. Knol is a subdomain of Google, and Google has given Google.com the highest trust rank of any site on the Internet, so of course Knol pages should rank highly based on Google’s algorithm. Google’s algo wouldn’t be doing its job if it didn’t give knol pages high ranking, you see.

me July 27, 2008 at 12:22 am

“ok here’s the thing you are 100% within your full legal rights to copy the entire wiki as long as you give it credit and allow other people to copy it or any derivations from you.”

Michael: We understand the legality of reproducing content under creative commons. Please re-read my original statement. MFA (made for adsense) sites are against the adsense TERMS OF POLICY. Your adsense account can be easily banned and the corporation used blacklisted from the program.

Michael Gray July 27, 2008 at 8:42 am

@me: I don’t see anything here https://www.google.com/adsense/localized-terms about low quality or MFA sites. The problem you are confusing it with is DMCA. If you copy a page W/O permission then they are within TOS to not only delist you but drop you from AdSense. The entire point of Creative Commons is to encourage people to share and reuse the content. That includes ways you may not have thought of, like, or approve of.

From an algo perspective trusted sites “get away” with duplicate content, where low quality sites don’t.

Comments on this entry are closed.