Google algorithm update slams duplicate content

03 Feb 2011

Adam Bunn

Just last month, Greenlight, a leading search marketing specialist and technology firm, pointed out in its 2011 top 10 predictions for natural and paid search that, among other things, search engines would take an even more aggressive stance towards duplicate and low-value content this year (See: Display marketing to mirror search in practice: Greenlight's predictions for 2011). It has now been almost a week since Google's Matt Cutts announced an algorithm update that does just that. Already, sites have reported significant shifts in their rankings.

This move will be welcome news to many Google users. But as ever, when Google makes a relatively big change to its algorithm, there have been reports of "collateral damage" where the change has affected sites whose owners feel Google has mistakenly identified their site as having duplicate or low-value content.
 
The update, which somewhat unusually for a Google algorithm change does not yet appear to have been christened with a catchy moniker, tightens duplicate content filters and tries to do a better job of identifying the original source of duplicate material. An extract from Cutts's post reads: "My [previous] post mentioned that 'we're evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others' content and sites with low levels of original content.' That change was approved at our weekly quality launch meeting last Thursday and launched earlier this week."
 
How are your rankings?
Greenlight advises businesses that have recently lost rankings to identify their affected pages and then search Google for some short snippets of text from those pages, enclosed in quotes. If the page does not rank first, or does not appear at all, that is a strong signal the problem could be related to duplicate content.
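
The check described above can be partly automated. The following is a minimal sketch, not a Greenlight tool: the function names `snippet_queries` and `search_url`, the snippet length and the number of snippets are all illustrative assumptions. It only builds the quoted queries and the matching Google search URLs; judging whether the page ranks first is still done by eye.

```python
from urllib.parse import quote_plus


def snippet_queries(page_text, words_per_snippet=8, max_snippets=3):
    """Build exact-phrase queries from evenly spaced runs of words.

    The defaults (8 words, 3 snippets) are assumptions for illustration.
    """
    words = page_text.split()
    if len(words) < words_per_snippet:
        return []

    last_start = len(words) - words_per_snippet
    count = min(max_snippets, last_start + 1)
    if count == 1:
        starts = [0]
    else:
        # Spread the snippet start positions across the whole page text.
        starts = [round(i * last_start / (count - 1)) for i in range(count)]

    queries = []
    for start in dict.fromkeys(starts):  # drop repeated start positions
        phrase = " ".join(words[start:start + words_per_snippet])
        # Remove embedded double quotes so the phrase stays one quoted unit.
        phrase = phrase.replace('"', "")
        queries.append(f'"{phrase}"')
    return queries


def search_url(query):
    """Return a Google search URL for the given query string."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # Hypothetical page text; in practice, paste text from an affected page.
    text = "one two three four five six seven eight nine ten"
    for q in snippet_queries(text, words_per_snippet=4, max_snippets=2):
        print(q, search_url(q))
```

Opening each URL in a browser and noting where the affected page appears gives the signal described above.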
 
If firms have noticed significant changes in their rankings since this Monday that do not recover within the next few days, this suggests the changes are not normal, temporary ranking fluctuations and that the site may have been affected by the update. One possible cause is that the site's content is not entirely original, in which case the site will need to take steps to correct this.
 
Alternatively, it could be the result of someone else having the site's content and Google mistakenly assuming that the other site is the original source. In this case, the affected site's appeal options are limited, and it is often easier to change the content anyway.
 
What next?
Because search engines base their judgement of what is unique on the content of the page, the order in which duplicates were discovered, and the volume and quality of links and citations, Greenlight's Bunn advises that the best way to insulate against any future steps search engines may take against duplicate content is for sites to ensure their pages:

* have a sufficient amount of original text content, supported by images, videos and other multimedia as appropriate. 

* are rapidly indexed by the search engines. To achieve this, the site should be regularly linked to, which calls for some kind of link acquisition strategy, and new pages should be submitted to the engines via XML sitemaps and featured on the homepage or another highly authoritative hub page on the site (such as a category homepage) until they have been indexed. If the site has a blog, make sure it pings the search engines when a new post is published (most do), and then use the blog to publish or link to new content on the site.

* are linked to and/or cited directly by third-party sites. Since it is rarely practical or economical to actively build links for every page on a site, regular consideration should be given to why a third party would naturally link to the site's pages or share them on Twitter, for example. If a firm cannot think of a good reason, it may need to go back to the drawing board.
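
For the second point in the list above, submitting new pages via XML sitemaps, a sitemap file can be generated with a few lines of code. This is a minimal sketch using the Python standard library and the `urlset`, `url`, `loc` and `lastmod` elements of the sitemaps.org protocol; the function name `build_sitemap` and the example.com URLs are hypothetical placeholders, not anything Greenlight recommends specifically.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    last_modified is expected as an ISO date string such as "2011-02-03".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
        ET.SubElement(entry, "lastmod").text = last_modified
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical new pages on a placeholder domain.
    new_pages = [
        ("https://www.example.com/new-page", "2011-02-03"),
        ("https://www.example.com/blog/latest-post", "2011-02-03"),
    ]
    print(build_sitemap(new_pages))
```

The resulting file would typically be saved at the site root and listed in robots.txt or submitted through the search engines' webmaster tools, so that new pages are discovered soon after publication.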

"This problem (low-value and duplicate content) is one of such perpetual nature that the search engines are sure to revisit it again in the future."