Do Spelling and Grammar Matter?
Posted on March 8th, 2007 by Michael Gray in Random Thoughts
Recently DigitalGhost made a post that ended with the Gunning Fog number for the post. For you lazy SEO types, Gunning fog is a reading-level score for a written work like a book or article. Which got me thinking: does Google care about spelling, grammar and reading levels?
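For the curious, here is a rough Python sketch of how a Gunning fog score can be computed. The formula is 0.4 times the sum of the average sentence length and the percentage of words with three or more syllables. The function names are made up for illustration, and the syllable counter is a crude vowel-group heuristic I am assuming for the sake of the example. The full method also excludes proper nouns, compound words and common suffixes, which this sketch skips, so treat the numbers as approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, then drop a trailing silent 'e'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # "score" has two vowel runs but one syllable; "-le" endings keep theirs.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def gunning_fog(text: str) -> float:
    """Simplified Gunning fog index:
    0.4 * (words per sentence + 100 * complex words / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        return 0.0
    # "Complex" here simply means three or more estimated syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)


if __name__ == "__main__":
    sample = "The cat sat on the mat. Spelling is complicated."
    print(round(gunning_fog(sample), 1))
```

The score is meant to approximate the years of schooling needed to follow the text, so short sentences made of short words score low and long sentences stuffed with $5 words score high.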
Now I fully admit I haven’t tested pretty much any of this recently, so this post represents little more than me speculating, but since I don’t have anything else in mind to write about and look at this week, I’m going to run with it.
Recently I evaluated Open Office, and one of the things I found sorely lacking was its grammar checking. Possibly it’s because my grammar is so poor and is in need of some extra nudging, but Microsoft Word was vastly superior in its grammar checking ability. I suspect developing a grammar checking module is a fairly complex undertaking. While Google has the financial and personnel resources to throw at something like that, since we aren’t seeing it in Gmail or Google Documents it’s probably not getting a lot of attention.
One area Google has put resources behind is spelling, much to the dismay of many SEOs. However, looking at spelling, there appear to be several distinct modules operating in isolation. Gmail and Google Documents seem to share similar functionality and operation (see why it’s important to play with Google’s toys). Google as a search engine is much more advanced. Google learns, or perhaps more accurately adapts to, new words. For example, Google Search “learned” that [stuntdubl] was a word and stopped asking ‘did you mean’, while Gmail and Google Docs remain less informed.
Penalizing based on misspellings is a tricky issue. I would never imagine that Google would become the spelling police and that a simple typo would doom you to page two and beyond. However, it might not be a bad idea to consider excessively bad spelling as a signal of poor quality. But what about technical and scientific material? It uses $5 words that only PDF search patent lovers can appreciate. Which brings us back to Gunning Fog and reading levels.
Is it possible that a document could be filled with a high percentage of misspellings yet have a very high reading level and still be quality? I’d like to think so. So I’ve come up with a little test. If you remember, my Aequeosalinocalcalinosetaceoaluminosocupreovitriolic post tested an insanely high density level and ranks right behind some wiki, so here’s a test where I misspelled everything except the 4 keywords: Ichthyosaur, Didgeridoo, Asterism, Velutinous. Here’s a screen shot of the SERP when I started. Considering there are only 14 results, mostly college dictionary files (Google, why are you indexing those?), I should be able to rank pretty easily, especially with a little help from the people who scrape me daily.
Update:
Looks like bad spelling doesn’t matter for ranking purposes.

March 8th, 2007 at 6:17 am
You know that Google and other SEs already use semantics in their systems to do stuff like determining the language the content is written in, for their advanced search features and of course their different language sites (e.g. Google.de, Google.fr etc.). They are also using it with more or less success to identify auto-generated pages that are scrambled content from numerous different sources. Other stuff such as stemming is already so mainstream that it is part of pretty much every enterprise search solution out there.
That means that they already check individual words, detect the language, and find derivations to be able to consider the content for search results of keywords that are not part of the content, but mean the same thing (including singular and plural).
The amount of scrambled scraper junk that still ranks gives you an indication about the ability and application of advanced grammatical and semantic checks and verifications.
Think of the structure of whole sentences, punctuation and capitalization, or “Passive Voice” (Word constantly has this problem with my writing).
I think that those things might be or will be part of the algo. It should not be or become a major ranking factor for reasons that you already pointed out, but I think it will be used a lot for spam detection and removal (automated or just raising flags to trigger a human review).
My 2 cents.
March 8th, 2007 at 6:39 am
Whaddya upp tooo? Weada itt wanks higghly orr nott iss irevewent iff itt don’t convvert, itts a exersize in googel futtillitty.
March 8th, 2007 at 10:04 am
those wordlists are gold! google keep indexing them!
March 8th, 2007 at 11:13 am
Misspellings are traffic friendly!
March 8th, 2007 at 11:14 am
Have you ever seen this:
http://www.google.com/jobs/britney.html
Some eye opening stuff. If only there was a good misspelling generator.
March 8th, 2007 at 11:51 am
Gungan Language Experiments, Duncan? I bet it sounds like Jar Jar Binks when processed through talkr.com hehe.
March 8th, 2007 at 4:42 pm
In the end, given the complexities and nuances of online societies, I cannot foresee a search engine used by the public at large ever penalizing for grammar and misspelling. Their mission, if you think about it, is to match the web page with whatever occupies the space between the keyboard and the seat …in the US, that seatblob has a fairly high likelihood of using text constructs fraught with poor grammar, typos, and misspellings. So, from an SEO standpoint, I expect the algos to remain passive.
As for hurting readership of a blog, a subject you’ve explored frequently; yes, it’ll hurt. I know it does in my case though it’s often a slow death (a thousand epaper-cuts).
As for DG’s 12.2 experiment, I liked the article and could comprehend it with no problem, but I’ll report here that I felt impatient with it. I’m in speed mode while on my daily online reading regime, and 12.2 seemed an unnecessarily high path relative to the topic. I knew that going in, having first read about it here, but it still felt stilted.
March 8th, 2007 at 4:46 pm
Duncan - nothing developed by an American could ever hope to cope with the West Australian version of ‘Strine
March 8th, 2007 at 10:57 pm
I wonder how much longer it took you to write this post.
March 9th, 2007 at 8:27 am
Are you talking about US English spellings or British English spellings?
Awareness of US spellings has been a concern in UK SEO for some years now.
March 10th, 2007 at 4:29 am
“Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Amzanig huh?”
March 10th, 2007 at 7:13 am
Brian: “Awareness of US spellings has been a concern in UK SEO for some years now”
No wonder, you poor folks over at the island, who live in flats and not in apartments, have to read all the “misspellings” every day. Pants? What? “Speak English! It is called “trousers” you dumb .&#%@.” hehe.
Dave: got it
and that even though English is my second language. Very nice. Aloha!
June 30th, 2007 at 8:22 pm
I think spelling should matter… people should care more about writing things correctly.
August 15th, 2007 at 11:13 pm
Google still will never manage language semantics. It is NLA, but it should be NLG - read Wiki on this matter. Google must be slayed.