Michael Gray

How to Turn Google Knols into an MFA Scraper

Posted on July 23rd, 2008
by Michael Gray in Google




The following is purely hypothetical and should not be taken as advice or a suggestion.


Since all of the content on Wikipedia is free for public use as long as you credit it, what’s to prevent anyone from copying large chunks and importing them into Google Knols? Let’s say you took the top 1,000 pages from Wikipedia, got some cheap labor from a country with a weak economy like India, China, Russia, or the United States, and had them copy/scrape all the pages and create Knol pages for them. You could get around the dupe content issues by rolling back 3-6 months of revisions on Wikipedia. Beg, borrow, spam, or rent yourself enough links to get them indexed. Wait 9-12 months for Knol’s domain authority to kick in and then turn on the AdSense.

Again this is all purely hypothetical and not to be interpreted as a suggestion or advice.

Just saying, I don’t think Google thought this Knol thing through completely, ’cause there are all sorts of holes in it: Mack truck and Sherman tank sized holes, not teeny tiny mouse holes.



16 Responses to “How to Turn Google Knols into an MFA Scraper”

  1. aaron wall Says:

    Thanks for the ideas. I own KnolGenerator.com and I think I can help a lot of people provide useful information to a lot of other people ;)

  2. Brian Provost Says:

    I only comment on blogs like 2x a year, but this drew me out. God bless you, you filthy, filthy spammer.

  3. me Says:

    I don’t think you thought this post through. The adsense account will be banned in seconds due to it inherently being an MFA site. Please don’t say this was just a hypothetical scenario… why even start with this in a blog posting.

    I’m very close to removing you from my feed. Just haven’t seen anything useful in the last 6-12 months. :(

  4. Joe Hall Says:

    I really like this idea.

  5. Internet Eyer Says:

    By chance, do you know how and why they chose the name knol? Thx

  6. Michael Gray Says:
    @me: ok here’s the thing you are 100% within your full legal rights to copy the entire wiki as long as you give it credit and allow other people to copy it or any derivations from you.

    So you may not like it but it’s not illegal in any way.

  7. Will Says:

    “I own KnolGenerator.com and I think I can help a lot of people provide useful information to a lot of other people ;)”

    But don’t you realize you’ll be killing the dream of Mahalo?

  8. Demerzel Says:

    @graywolf

    Let’s be clear though, 100% legal to take Wiki’s content, but also 100% legal for Google to remove your content.

    That’s not to say it won’t be a large MFA site (I noted that on my blog as well)–scope and scale will certainly become an issue when you have to keep track of millions of people (theoretically) with more than millions of pages being created.

  9. Yowza Says:

    Remember folks, when building knols, make your name a keyword as well because your name appears in the url. So if you’re trying to rank for buttermilk pancakes, the knol url should be http://knol.google.com/k/buttermilk-pankakes/buttermilk-pancakes/

    Let the games begin…

  10. paisley Says:

    i was going to ask about the Creative Commons info, and you posted it.. LMAO..

  11. LH Says:

    It’s funny someone else mentioned Mahalo. That was my first thought as well.

  12. Gab Goldenberg Says:

    So let’s see:

    1) Google is pushing adoption of Knols by blatantly giving them ranking boosts. This means the system gets adopted by regular folks finding stuff in search results AND webmasters hungry for traffic.
    2) It can be monetized (syndk8: gentlemen, start your engines!)
    3) Wikipedia’s content can be copied by anyone.

    Shit, wonder what’ll be the most popular free online encyclopedia in 2 years from now?

    Am I the only one who notices that Google’s consistently using its own search results to push its products and kill competitors? Mapquest, other video sites, now Knol, hosting providers are all teaming up with it to give you “new, free Google Apps” (whupee 8-| ) or even have it be entirely hosted by the borg (weird to pay for hosting then be told G will do it instead, but wtv.)

    The problem with being a web-only business is that Google can knock off your code and gain much faster adoption because it controls the flow of information. It literally can enter any web-only market within a year and dominate. Scary what control over the flow of information can do, huh?

  13. Michael Gray Says:
    @Gab Goldenberg: kinda scary when the lights go on and you connect the dots isn’t it
  14. Yeeks Says:

    “1) Google is pushing adoption of Knols by blatantly giving them ranking boosts.”

    To be fair, Knol should be ranked highly according to Google’s algo. Knol is a subdomain of Google, and Google has given Google.com the highest trust rank of any site on the Internet, so of course Knol pages should rank highly based on Google’s algorithm. Google’s algo wouldn’t be doing its job if it didn’t give knol pages high ranking, you see.

  15. me Says:

    ok here’s the thing you are 100% within your full legal rights to copy the entire wiki as long as you give it credit and allow other people to copy it or any derivations from you.

    Michael: We understand the legality of reproducing content under creative commons. Please re-read my original statement. MFA (made for adsense) sites are against the adsense TERMS OF POLICY. Your adsense account can be easily banned and the corporation used blacklisted from the program.

  16. Michael Gray Says:
    @me: I don’t see anything here https://www.google.com/adsense/localized-terms about low quality, or MFA sites. The problem you are confusing it with is DMCA. If you copy a page W/O permission then they are within TOS to not only delist you but drop you from adsense. The entire point of Creative Commons is to encourage people to share and reuse the content. That includes ways you may not have thought of, like or approve.

    From an algo perspective trusted sites “get away” with duplicate content, where low quality sites don’t.