When Google Gets Duplicate Content Wrong
Posted on May 14th, 2008 by Michael Gray in Google
There’s lots of hand-wringing among publishers about duplicate content: when Google gets it wrong, other sites, both those authorized to syndicate and scrapers who aren’t, often outrank the original source. Here’s an example in action.
I have an agreement with Web Pro News: they can pick and choose any of my content that they like and republish it on their website. This is fully authorized and sanctioned by me, and I’m glad they do it. The problem is with Google, whose ranking algo IMHO places too much of a bias on domain trust and authority. Since WPN has more trust and authority than I do, they will outrank me for my own content. I was looking for something this morning that I knew I wrote and couldn’t find it. Eventually I did, but only after realizing Google wasn’t giving me credit. Here’s an example:
OK, that wasn’t where I started, but I chose that multi-phrase term to prove a point: Google sometimes gets it wrong. I’ve been blogging for quite some time, and this domain has a decent amount of links and trust (even though MC has only linked to me once) and gets good traffic, but in comparison to WPN I’m lower down the food chain. So if I’m having duplicate content filters applied to me, imagine how much harder it is for newer domains.
This trust and authority part of the algo isn’t unique to Google. I’ve had other stories that were syndicated on WPN make it to Techmeme, and their algo also didn’t give me credit.
If you are having dupe content issues, here’s what you should do:
- If you are specifically writing content to be syndicated, don’t put the exact same copy on your website if you can avoid it; write a different article about the same thing.
- If that’s not practical or feasible, try to get a link back to the source article, preferably with the title or some other KW-rich text.
- If you are pushing out a full blog feed and getting scraped, start using the RSS Footer plugin from Joost de Valk. It automatically adds links back to the source post and your blog.
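The idea behind that last tip also works outside WordPress. Here is a minimal Python sketch of it. It is not the RSS Footer plugin's actual code. It assumes a plain RSS 2.0 feed where each item has a title, link and description, and the function name add_source_footer and the footer wording are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET


def add_source_footer(rss_xml: str, blog_name: str, blog_url: str) -> str:
    """Append an attribution link to every item description in an RSS 2.0 feed.

    Hypothetical helper for illustration only; the footer wording is an assumption.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        desc = item.find("description")
        # Skip items we cannot attribute (no description or no permalink).
        if desc is None or not link:
            continue
        footer = (
            '<p>Originally published as <a href="{link}">{title}</a> '
            'on <a href="{home}">{blog}</a>.</p>'
        ).format(
            link=html.escape(link, quote=True),
            title=html.escape(title),
            home=html.escape(blog_url, quote=True),
            blog=html.escape(blog_name),
        )
        # ElementTree escapes the markup on output, which is how RSS
        # descriptions normally carry HTML.
        desc.text = (desc.text or "") + footer
    return ET.tostring(root, encoding="unicode")
```

A scraper that republishes the feed as-is then carries a link back to the original post, with the post title as the link text, plus a link to the blog itself.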
You can also check out this post on SEL: Search Illustrated - How A Search Engine Determines Duplicate Content
PS: Please note that nowhere in this post did I use the word penalty; I only used filter. Yes, there is a difference, and no, the two terms aren’t interchangeable.










May 14th, 2008 at 10:24 am
I’ve wondered about that for a couple of my clients who syndicate their content elsewhere, and figured when the other domain has a higher trust rank by Google, the client’s original site’s content gets a bit pooched.
May 14th, 2008 at 12:14 pm
lol, this page now outranks the other page thanks to the “also trying really hard to get into bookmarking space with their latest attempt Google shared stuff” link, but WPN are still ahead X(
May 14th, 2008 at 1:18 pm
Best part of the post . . .
“PS: Please note nowhere in this post did I use the word penalty I only used filter. Yes there is a difference and no the two terms aren’t interchangeable.”
So true!
Brent D. Payne
May 14th, 2008 at 4:42 pm
“I have an agreement with Web Pro News, they can pick and choose any of my content that they like and republish it on their website. This is fully authorized and sanctioned by me, and I’m glad they do it. However the problem is with Google, their ranking algo IMHO places too much of a bias on domain trust and authority. Since WPN has more trust and authority than I do they will outrank me for my own content.”
So funny, the same thing happens to me with WPN. Even when my post gets better and more links and the WPN copy has only internal links to the article, it can still outrank me. Now how is that fair or right? It’s proof that once you get to a certain level of authority, your internal links can far outweigh someone else’s external links.
May 14th, 2008 at 6:51 pm
Happens to me all the time with SitePronews as well. Our site typically resurfaces in 48 hours, though it’s frustrating at times. I agree.
May 16th, 2008 at 2:38 am
I see this a great deal with not just legitimate syndication, but also scraping. If someone, for example, sets up a spam blog on Blogspot.com or another trusted free blog host (even if that trust is not deserved), I’ve seen the search engines rank its scraped copies higher than the originals.
Likewise, I’ve seen spammers invest in expired domains and use the trust of those domains to outrank original authors.
The severity of this problem is roughly inversely proportional to your PageRank, but it is a serious problem for new and smaller blogs.
Though the trust system of search was a vast improvement, it has its limitations. Sadly, this is one of them.
May 25th, 2008 at 3:51 pm
Is there an argument here for some kind of tag to specify syndication, with a from-to attribute?