RSS Scraping, Scavenging, Stealing, And Content Theft…Get Thee Behind Me!

I’ve been noticing a number of Auto Blogs, Content Thieves, and SPLOGS stealing my content, word for word, even going as far as stealing my Post Slugs verbatim! Over the next month I am going to change a number of the ways in which I syndicate this site in an attempt to deal with these outcasts.

I’ve recently installed the “Anti Leech” plugin from Owen Winkler and will be considering a number of other changes to my content, including a specific policy with regard to content usage from MusTech.Net. It is very hard to stop these sites from “snagging” your content, especially if you use a full RSS feed like I do (yes, I’m even considering turning that back to a partial feed). One of the big issues with this kind of vile operation is that legitimate people looking for your content will find it on another’s site, and that site ends up getting the benefits of your hard work.

Although it’s relatively easy for me to find out who is copying my blog information virtually verbatim (I have linkbacks, headers and footers in the RSS feed), it is hard to stop them, as many of the sites do not respond to any form of contact whatsoever. It’s very hard to find the actual IP addresses of those stealing the content because they are using RSS scraping bot agents that utilize different IP addresses than those of their sites (even using Feedburner’s tools and my server’s access logs, finding them out is a challenge!).
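For anyone who wants to try the access-log route, here is a minimal Python sketch of one way to tally feed requests by IP address and user agent. It assumes the server writes the Apache-style "combined" log format and that the feed lives under `/feed`; both are assumptions, as are the function name `feed_clients` and the sample log lines, which are made up for illustration. Since scrapers often fetch from different IP addresses than the ones their sites use, a tally like this can only narrow the search, not settle it.

```python
import re
from collections import Counter

# Assumes the Apache-style "combined" log format; adjust for your server.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def feed_clients(lines, feed_prefix="/feed"):
    """Count requests whose path starts with feed_prefix, per (IP, user agent)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(feed_prefix):
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts

# Hypothetical sample lines (documentation IP ranges, invented user agents).
sample = [
    '203.0.113.7 - - [10/Oct/2008:13:55:36 -0400] "GET /feed/ HTTP/1.1" 200 5120 "-" "ExampleScraper/1.0"',
    '203.0.113.7 - - [10/Oct/2008:14:55:36 -0400] "GET /feed/ HTTP/1.1" 200 5120 "-" "ExampleScraper/1.0"',
    '198.51.100.4 - - [10/Oct/2008:14:00:01 -0400] "GET /about/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

# List the heaviest feed clients first.
for (ip, agent), hits in feed_clients(sample).most_common():
    print(ip, agent, hits)
```

Clients that pull the feed unusually often, or that identify themselves with an unfamiliar user agent, are the ones worth comparing against the sites that are republishing the content.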

I would recommend reading Lorelle’s GREAT article on “What Do You Do WHEN Someone Steals Your Content” as a starter for anyone experiencing this type of illicit activity. I will also keep you posted as to what steps I’m implementing over the next month or so to thwart these undesirables.

    Dr. Joseph M. Pisano Ph.D. 

Joseph M. Pisano, Ph.D. is the creator of many education websites, a lecturer, clinician, trumpeter, and conductor. He is currently the Associate Chair of Music and Director of Bands in the Calderwood School of Arts at Grove City College in PA. He has been named a TI:ME Teacher of the Year, received the JEN Jazz Educator Award and the PA Citation of Excellence. He is a past Vice President of the Technology Institute for Music Educators and the current Vice-President of the PA Intercollegiate Bandmasters Association. He also writes for DCI Magazine, Teaching Music Magazine, and is the Educational Editor for In-Tune Monthly Magazine; he has contributed hundreds of articles to various publications. Find out more at his website jpisano.com.
  • Erinn Wrobel

    This is disturbing. Thanks for letting us know about it.

  • A couple of things you might want to try are numerous links to previous articles on your own site, as well as monetized affiliate links (e.g., Amazon). Both of those might make your site less attractive to the troglodytes that scrape feeds, as their potential readers may follow the links either back to your site or to your monetized links.

  • Thanks guys,

    I’ve actually been dealing with these content-thief types for over a year now in various ways. I’ve been avoiding putting ads in the RSS feed, but I may, just for these reasons. About a year ago I made the decision to include full-post RSS, but I may be changing that. We’ll see.

    I typically will let an RSS Leecher go if they have a lower page rank or Google rank than me, but if it’s higher, that means people using a search engine will find my article on their site before they find it on mine. That’s when I start getting really mad!

  • You know it is sad when you see this very article, links, copyright notice, header and all sitting squarely on someone else’s site…

  • Joe,

    I feel your pain. I ran into this a couple of months ago (http://www.musicedmagic.com/tales-from-the-podium/plagiarism-hits-home.html) and did my best to stamp it out. I really didn’t get very far, but my complaints did get heard by at least one of the plagiarist’s ISPs, resulting in his account being cancelled. These days I use Copyscape once in a while to check on things.

    There is also an article at Blogherald (http://www.blogherald.com/2007/05/28/how-to-stop-plagiarism-cold/) on how to help stop it as well.

  • Joe: I am very sorry to hear about your recent troubles here. The Anti-leech plugin is a very good first step but, in reality, shutting down most of these spammers is pretty easy. I can help you if you want. I’ve stopped over 600 plagiarists of my own content and will certainly do what I can to assist.

    Just drop me a line if there is anything I can do. You can either use the email address here or the contact form on my site, both go to the same place.

    No matter what, best of luck with this!

  • Kevin

    Stop syndicating then. Quit whining and be happy that others get to view your work, whether it’s on your website or someone else’s….You’re not the glorious thinker or content provider of all time…neither am I. It’s just another viewpoint that you have on any particular subject. Don’t think that just because you wrote it, it has merit and needs to be copyrighted or put on a pedestal.

    First of all, by now there are no original thoughts on this subject or any subject. You’re not the first to have the thoughts about someone having their content used on another website. I’m sure you’ve read an article about someone else having their content stolen and now you’re carrying on. Grow up! Even that article, that you might have read about content “stealing”, might need to be forwarded on and on…ironic huh?

    If you’re syndicating, I say you’re opening yourself up to having others use your content…cut and dried. Otherwise quit syndicating.

    I can’t wait to hear the simple-minded, one-dimensional people respond negatively to my comments. I’m sure it will include jokes about my grammar or that I’m simple-minded. Please don’t be that simple-minded soul.

    Just admit that your so-called hard work has been based solely on others’ hard work previously. It’s not original content and let’s keep the internet open and free….even if it means others reprinting your work.

  • Kevin,

    While you have some real and partially valid opinions about syndication, I feel that you are “anti-intellectual property from the outset,” so whatever I type in response will no doubt carry little weight with you.

    First off, syndication through use of an RSS feed does not give any individual the right to re-publish the RSS feed in part or in its entirety. That right is freely given, or not given, by the publisher. What syndication does do is give any person or entity the right to read the feed, subscribe to the feed, or utilize the feed in any way that does not publish or re-publish it.

    As far as no new subject material, or “nothing new under the sun,” there may be partial truth to that, but credit is usually given to where things are taken from: links for blogs, footnotes for books, and an annotated bibliography for other works (that being said, I do believe that there are new thoughts, and new ways to look at others’ works). What really bugs me about SCRAPERS is that they steal your content for the sole purpose of increasing their exposure to search engines to “sell” their goods (Adsense, ads, etc.), to make a quick buck, to get notoriety, truly without any NEW thoughts (usually through bots or automation), further pushing legitimate sites’ information to the hard-to-find places of the internet search engines, where many people looking for real information or a “real site” never find them.

    Let me assure you, I’m all for an open internet and the increasing availability of previously inaccessible information. In part, this is one of the main reasons I started to blog years ago. There are many journals that I feel should be allowing their information to be freely accessed by those on the internet.

    I very much agree with the context of your last statement, but I would be totally against having a journal, like, say, the Music Educators Journal, put their information online and having some idiot re-print the article verbatim in their own blog or site. It doesn’t make sense and I believe it’s ethically wrong. Personally, I want people to read what I write and to chime in in the comment sections. Most of what I write, I freely license for people to print, utilize and share in closed environments, but I don’t want to see some fool printing my stuff verbatim, without my permission, so they can get a few clicks off of Adsense and keep people from finding a legitimate resource. Come to my site, it’s here, it’s FREE already and there is a community talking about it. Not unlike you.

    Regards,

    J. Pisano

  • Kevin

    J. Pisano…Very positive comments…I appreciate that. Mine were not so positive, and some of them I regret. Silly me!

    Having said that, yes it’s probably true that I don’t hold full regard for the fact that someone would lay claim to a written piece. I feel that once written and published on the internet, it becomes the property of the people and not of any single individual or entity that would try to lay claim. You publish on the internet…you lose control. That’s the theory. No I’m definitely not communist…more free commerce even at the expense of the individual. (Wow…kind of funny! Communism and Capitalism have something in common….the sacrifice of the individual.)

    An aside…recently I was floored when I had to pay to drive in to see the Grand Canyon. Should I pay the government each time I open my door and go into my home? The Grand Canyon belongs to all our citizens, possibly even to the world citizens….why should I pay? Should God receive a patent or copyright and we shouldn’t be allowed to enjoy without paying for it? The Creator loves for us to enjoy His work…I would assume the same from the creator of writing.

    I think we’re taking way too many steps to limit the creativity of individuals and businesses in this country. I’m sure Eli Whitney didn’t do a patent search when he started working on the Cotton Gin. He saw a need, filled the need and became successful for it. Today, if he hadn’t checked to make sure there weren’t already patents, he might have been stifled.

    Those that are starting websites, whether for commercial or for other egotistical fulfilling needs, need content….plain and simple. They either create or provide others’ content or both. They see a way to get content for their readers and they provide it and become successful for it. I say cool!!! I’m sure most of the time it’s content related to their website. I would think it would match the reader demographics or why use it? So, the reader might also be the one served. I also think that’s cool!!!

    Take your local paper for instance. Where would they be without the AP Wire Services? The local paper wouldn’t have enough content to publish a decent paper. They would have to fill the paper with some other content somehow. A newspaper is already stretched to the limits for manpower so increasing their productivity is out of the question. So, using someone else’s material is the only option….granted that’s a paying arrangement but never to the original writer…am I (w)right?

    Are you upset at the AP Wire Services for sucking the blood from poor underpaid writers at newspapers? Those writers again will never profit from their work like the AP Wire Services or the newspaper’s stockholders will. Those writers only get paid a flat salary most of the time.

    And as for an individual worrying about whether their website is at the top of a search engine because of any particular article, that is self-serving and egotistical. I personally think that those that write obviously write to serve mankind. If someone else is better at getting your article to the top of the heap and you did a lousy job…who is better served? I think the writer and the reader win.

    In that particular case, the fight goes to the fittest and the individual’s website be damned if he wasn’t smart enough to use technology to get his “intellectual content” to the top of the heap…but someone smarter did. Why not collaborate with that smarter website to get a link off of their site and be better for it? Work with the best, not damn the best….that’s a simple-minded tactic.

    I believe in marketing email (some call it spam) but I don’t believe in worms and viruses. The first should be taught to be used properly (narrowing demographics and better software and infrastructure) and the other is harmful and vicious.

    Again, as you can tell, I’m all for freedom for the internet.

  • Kevin,

    I appreciate your comments and can’t say I agree with you on everything, but I wish you well and appreciate the discourse.

    Regards,

    J. Pisano

  • I use Joomla CMS for my website, and I’ve been using it for several years now. Joomla has a newsfeed component which takes the RSS feed and places it on the page. This morning I added the mustech.net feed and saw the big warning at the end of the article, which caused me to stop and look into this more.
    First, I cut down the number of articles the component will show from five to one. Then I went back and looked at what was happening to various feeds. Some of them show just the first part of the article, followed by a “more” type of ellipsis (like this one: http://pedaplus.com/index.php?option=com_newsfeeds&task=view&feedid=67&Itemid=149)

    Others show a full article. Obviously, with a truncated feed the reader will need to follow the link to the site to read the rest of the article. This seems like one of the best solutions – you get exposure on someone else’s site, but the reader is motivated to follow the link to the original site.

    Jon

  • Tonight I went back and reworked my website because of inconsistent results with the Newsfeed component. I changed all of the RSS feeds to regular links so that each feed opens in a new window or tab. Thanks for the heads-up on this, and I think I’ll sleep better tonight not worrying whether someone will think I’m plagiarizing!

    Jon

  • Pingback: Great sites for bloggers to help fight content theft | UK airports information blog