
Fighting Hotlinkers with htaccess

If you have a blog with reasonably well-conceived content, and have had it for any length of time, you will undoubtedly encounter content scrapers: bottom-feeding scumbags who rip off your content, either from your feed or from your site, to populate their own "made for AdSense" sites. Often the only recourse left to you is to file a DMCA complaint, which can take over a month to resolve.

If the content scraper is hotlinking your images (pulling the images directly from the files on your server) you can at least shut down the images with a command in your htaccess file if your site is hosted on an Apache server. This is sort of a sledgehammer approach, but if they have scraped your entire site (which happened to me recently), it's a lot easier than changing the file names of every image on your site and rewriting your html code for every entry.

[Image: content-scraper.jpg]

In the LMT article I wrote about spam years ago, there is a section on "referrer spam". In that section there are a few lines of code that you can add to the htaccess file of your blog:


# The HTTP header name is spelled "Referer" (one r in the middle); "Referrer" matches nothing
SetEnvIfNoCase Referer ".*(casino|gambling|poker|porn|sex|nude|xxx).*" BadReferrer
order deny,allow
deny from env=BadReferrer

This code blocks any request that arrives with a referrer containing one of the listed words. Find the most distinctive string of characters in the domain name of the content scraping site and add it to this list (or replace one of the spammy words), separated by a vertical bar (|). If that site has been hotlinking images from your site, it will no longer be able to do so. Note that ANY website with that string in its address will be blocked as a referrer, so be cautious in the words or characters you select. I once had my recipe site pulled out of a completely legitimate feed service because the service's name was "Food Porn watch". Had to remove that one little p-word from my htaccess code.
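
As a rough sketch of what the edited list might look like, suppose the scraper's domain is the made-up scraper-example.com (a placeholder, not a real offender). The version below also wraps the deny rule in a FilesMatch block so that only image requests are refused, which is a narrower variant of the code above and not what the original spam article does. The order and deny directives are Apache 2.2 style; on Apache 2.4 they only work if mod_access_compat is loaded, so check with your host before relying on this.

# "scraper-example" is a hypothetical stand-in for the scraper's domain string
SetEnvIfNoCase Referer "(casino|gambling|poker|scraper-example)" BadReferrer
# Refuse only image files (lowercase extensions) to flagged referrers
<FilesMatch "\.(gif|jpe?g|png)$">
order deny,allow
deny from env=BadReferrer
</FilesMatch>

Requests that send no referrer at all are not matched by this rule, so a visitor who types an image URL directly into the browser will still see the image.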

Like all advice given here on LMT regarding htaccess, know what you're doing before messing with this file. Or have someone who knows what she or he is doing do it for you.
