Dupe content a.k.a. Google doesn't care if you own the content

admin · #1 01-15-2010, 08:19 AM

Ideally this would be a well-edited editorial on our new blog, but I don't want to wait 4-6 more weeks to get this written up.

In one of many business ventures I'm involved in, a particular project has me working with some field experts. Several of them are willing to write content for the related websites, which will help us both in the long term.

..... However, the first one submitted copies of papers he already had published!

Most SEO experts already know where I'm going with this, but I've decided to share it here, as I know there are (at minimum) several dozen folks reading in on this new web content/planning forum regularly -- and they're probably not SEO experts (yet).

Although Google has a very complex algorithm in place for locating content and doling out search result rankings via PageRank, it also uses what I'd refer to as "caveman logic". Google is both incredibly smart and incredibly stupid at the same time. All search engines are, in one way or another, but I'll only be picking on Google today.

So here I am with an article that was published on www.ezines.com already, and then stolen from there and used on at least 10 other sites -- something confirmed by using a quick www.copyscape.com search. There are probably more, but I didn't want to pay for the Copyscape search -- 10 was enough.

None of the "copier" sites returned by Copyscape came up in Google, likely meaning they've been punished. Although a few were scraper sites, others were not. In fact one of the sites was for a venture similar to our own! If that site was punished, odds are we would be too!
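For anyone wondering how a machine can even tell that two pages carry "the same" article, here is a rough Python sketch of one generic technique: break each text into overlapping word groups ("shingles") and measure the overlap. To be clear, this is only an illustration. I have no idea what Copyscape or Google actually run internally, and the function names, the sample sentences, and the shingle size of 3 are all my own assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of overlapping k-word groups found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text too short for a full shingle: treat it as one group (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(text_a, text_b, k=3):
    """Jaccard overlap of the two shingle sets: 0.0 = nothing shared, 1.0 = identical."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample texts, for illustration only.
original = "Google is both incredibly smart and incredibly stupid at the same time"
light_edit = "Google is both incredibly smart and incredibly dumb at the same time"
unrelated = "Shakespeare wrote plays long before search engines existed"

print("light edit vs original:", round(similarity(original, light_edit), 2))
print("unrelated vs original:", round(similarity(original, unrelated), 2))
```

Note what this kind of check can and cannot do: it can say two pages match, but it has no way of saying who wrote the text first.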

(Not to mention, if we had wanted just any old article, we could have gone to a free content site long ago, and bypassed the time required in schmoozing and sometimes coddling the "experts".)

The immediate response from the expert was:

Quote:

But I am the freaking author...I wrote these word for word. Ezines has no copyrights.

Google is just a dumb smart network of machines; it doesn't know the difference between Shakespeare and a high school kid with a popular site that used Shakespeare quotes. If the kid has a more popular site, then logically that dumb bastard Shakespeare copied this kid. I know that sounds ridiculous, but such are the follies of letting algorithms determine ownership by using "popularity" as a qualification. (I think I'm having flashbacks to high school. Ugh.)

This is an excerpt from an internal email I wrote to one of the partners, who didn't quite understand why publishing the already-published stories as-is was just not going to help us. I'd rather not publish them than be punished by Google.

Quote:

Google uses caveman-style logic:

ezines "older site" - older is better
ezines "more popular" - popular is better

our sites "new" - older takes priority, therefore same content = you copied it
our sites not yet "popular" - popular takes priority, therefore same content = you copied

copying bad
your site bad
bad site not listed in google
me punish.
you screwed

"copyright" big word, me no understand, me no care. you still bad.
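The caveman logic above can be written out as a toy Python sketch. This is strictly my cartoon version, not Google's actual algorithm: the Site fields, the numbers, and the "older first, popularity as tie-breaker" ordering are all assumptions made up for illustration. The point to notice is the author_owned field, which exists on every site and is never looked at.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    age_years: float    # made-up stand-in for "older is better"
    popularity: int     # made-up stand-in for "popular is better"
    author_owned: bool  # does the real author publish here? Never consulted below.


def caveman_rank(sites_with_same_text):
    """Split sites carrying identical text into (presumed original, presumed copiers)."""
    # Older wins; popularity breaks ties. Authorship is not part of the key.
    ordered = sorted(
        sites_with_same_text,
        key=lambda s: (s.age_years, s.popularity),
        reverse=True,
    )
    return ordered[0], ordered[1:]


# Hypothetical numbers, for illustration only.
article_site = Site("big article directory", age_years=10, popularity=9000, author_owned=False)
our_site = Site("our new niche site", age_years=0.5, popularity=40, author_owned=True)

winner, losers = caveman_rank([our_site, article_site])
print("treated as original:", winner.name)
for site in losers:
    note = " (even though the author publishes here)" if site.author_owned else ""
    print("treated as copier:", site.name + note)
```

Swap the ages and popularity numbers around and the "copier" label moves with them, no matter who actually wrote the text.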

Worse than this, sometimes the owner of content can be plagiarized, and the ORIGINAL SITE can be punished!

Here's a perfect example, a comment I read on Matt Cutts' blog:

Quote:

Michael February 1, 2008 at 7:51 am
I have a poetry blog. I uploaded a couple of new poems lately, and I included links to the original content in my RSS feed, like you had suggested, because another service was scraping my feed. It did no good, though. The other service now ranks for my poems, and the original content cannot be pulled up in Google.

There is a solution, of course, when somebody steals your content without permission (DMCA takedown notices, C&D letters). But I'll save that one for another day.

.... oh, did I mention that the expert submitted multi-page novel-length "articles" for our blog? We're trying to get him to edit about 5 pages down to 5 paragraphs. If he's not willing, then we'll just move without him.

That alone will probably solve the "dupe content" issue.

admin · #2 07-25-2013, 07:04 PM

Looking back at some old posts...

Thankfully Panda and other Google updates have somewhat addressed this now.

The "big, old, popular" sites that pumped out mountains of useless crap have been downgraded. This means all the "article" sites are going to have a hard time getting to rank for certain words. EzineArticles, in the example above, barely ranks these days. The author, for example, would no longer have to fear the "multi-content" sites, as they're not as valued. The in-house niche sites would be able to later publish the content and be just fine.

As of 2013, at least.

And it's not always true, but it can be now.