Spam crawlers trawl the internet, hopping from page to page and searching your source code for unprotected email addresses. If they find yours, you had better hope you have good filtering, because that address will be squirrelled away and will appear on thousands of spam mailing lists for years to come. I recently logged into an email account at HotPop that I abandoned almost five years ago, only to discover that it is still being inundated with spam every day.
Fortunately, these email harvesters are by their very nature cheaply designed and relatively inefficient. There are enough websites and forums with unprotected email addresses (Usenet groups in particular) that the developer has no incentive to collect more, or to improve what is essentially nothing more than a string-matching algorithm. Of course, this isn’t totally true: new bots and evolutions of old ones appear frequently, and there is no denying that their behaviour is getting more and more dexterous. Still, the fact that some webmasters don’t bother to protect email addresses at all acts as a sort of protection for the rest of us, because those sites are far more likely to be targeted than yours if you put in a little protection of your own.
There are a lot of different ways around this, and the most common has been to replace your email address with an image of your email address. This leads to a few problems: visitors to your site can’t simply click the link and fire off an email to you. Instead, they have to open their email client and copy your address out manually, letter for letter and symbol for symbol. That is an arduous task which reduces the chances of anybody actually bothering to contact you, and increases the risk of human error.
The most effective way I’ve found, and the one I use wherever possible in my own development work, is to replace each character of the email address with its Unicode character reference (for example, &#64; in place of @). The theory is that although your visitor’s web browser will render it as a normal email address, to the bots crawling across your site it doesn’t match their algorithms and is ignored. These bots could easily be rewritten to pay attention to encoded email addresses, but that would require more effort from the developer and more processing power from the computer, when they could be elsewhere collecting email addresses with ease at half the relative ‘cost’.
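To make the theory concrete, here is a rough sketch in Python of how a naive harvester might behave. The regular expression and the `harvest` function are my own assumptions about what a simple string-matching bot could look like, not code taken from any real harvester, and the address is a placeholder.

```python
import html
import re

# Hypothetical pattern a simple harvester might use; real bots will vary.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(source):
    """Return every substring of the raw page source that looks like an email address."""
    return EMAIL_PATTERN.findall(source)


# A plain, unprotected mailto link.
plain = '<a href="mailto:someone@example.com">someone@example.com</a>'

# The same address with every character replaced by a decimal character reference.
encoded = "".join(f"&#{ord(ch)};" for ch in "someone@example.com")

print(harvest(plain))                   # the address is found twice
print(harvest(encoded))                 # nothing matches the raw encoded source
print(harvest(html.unescape(encoded)))  # a bot that decodes first still finds it
```

The last line shows the limit of the approach: a harvester that decodes character references before matching will still find the address, which is exactly the extra effort most of them don't seem to bother with.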
As an example, here’s my email address with a mailto link: mail@pixelcounter.co.uk. It doesn’t look any different from a normal address, and you’re more than welcome to give it a click and send me an email if you like. There’s no lost usability, and it takes an extra ten seconds of development time to convert your email address using the very useful Ishida conversion tool.
The code itself looks like this:
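<a href="mailto:&#109;&#97;&#105;&#108;&#64;&#112;&#105;&#120;&#101;&#108;&#99;&#111;&#117;&#110;&#116;&#101;&#114;&#46;&#99;&#111;&#46;&#117;&#107;">&#109;&#97;&#105;&#108;&#64;&#112;&#105;&#120;&#101;&#108;&#99;&#111;&#117;&#110;&#116;&#101;&#114;&#46;&#99;&#111;&#46;&#117;&#107;</a>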
It looks ugly, but it does the trick and is well worth considering in your next project. To be totally safe you should use a contact form with a CAPTCHA, but when the client insists they know best (as was the case with one of my recent projects), this is an easy middle ground.
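If you would rather script the conversion than paste your address into an online tool, something along these lines should do the job. This is only a minimal sketch: `encode_email` and `mailto_link` are names I've made up for illustration, and I'm assuming decimal character references, although hexadecimal ones would work just as well.

```python
import html


def encode_email(address):
    """Replace every character with its decimal character reference, e.g. '@' becomes '&#64;'."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address):
    """Build a mailto anchor whose address is encoded in both the href and the link text."""
    encoded = encode_email(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'


link = mailto_link("mail@pixelcounter.co.uk")
print(link)

# Decoding the references gives back the ordinary link a browser would render.
print(html.unescape(link))
```

Paste the first line of output into your page source; the second line is just a sanity check that nothing was lost in the conversion.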
Sadly though, no action will totally protect your email address from the spammers. They will still get hold of it: from other people’s compromised email accounts, from the headers of those annoying “Microsoft will give you money if you forward this email” type forwards, from black voodoo and magic. The only real way to protect yourself and your productivity is to have a good spam filter set up, and to purge it regularly. Don’t forget that time spent dealing with spam is time you’re not earning: predictions put the cost of spam at an average of £85 per person per year.
Incidentally, Google’s Gmail spam filter is excellent, and you can host email addresses for your own domain name using their Gmail system. It’s worth a look, if only for the user interface.