DMI News

Anti-Spam Solutions

August 23, 2008 21:56

I'm currently working on a number of changes to the site, some of which will not be directly visible to the average visitor. I'm going to spend some time discussing each of the changes, or planned changes, as I work on them. The first thing I'm going to discuss is a set of anti-spam techniques.

Everyone knows what email spam is. Pretty much everyone is familiar with various anti-spam software solutions to filter email for spam... or sadly, as is more the norm, to filter spam for legitimate email. However these filters work, they all suffer from one significant shortcoming: namely, that you've already got the spam, and there's no way to stop receiving it. You might be able to filter out 99% of it, but some will always get through, and there is always the risk that you'll lose legitimate messages. Filtering is very much an inexact science. You're much better off if you can avoid getting spam in the first place or, if you DO receive it, easily disable the source.

The first idea, which I've already implemented for myself, is to simply create a unique email alias for every entity you provide your email address to. My site lists one address, and every time I register on someone else's site and they require an email address, I create a new alias specifically for them. All of the email gets forwarded to a single address that doesn't get posted anywhere (and therefore shouldn't get any spam). Should I get spam through any of the aliases, I have two things working in my favor. First, I know exactly where the spam came from, and second, I can disable that single alias and all of its spam goes away. If someone is selling email addresses, I now know exactly who it is. Of course, most spam comes from (or more exactly comes TO) the email address posted on my site.
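As a rough illustration of the alias idea, here is a minimal Python sketch. The `AliasRegistry` class, its method names, and the `example.com` addresses are all made up for this example; a real setup would live in the mail server's alias or forwarding configuration rather than in a script.

```python
import secrets


class AliasRegistry:
    """Maps one throwaway alias per entity to a single private inbox."""

    def __init__(self, domain, private_inbox):
        self.domain = domain
        self.private_inbox = private_inbox
        self.aliases = {}  # alias address -> entity it was given to

    def create(self, entity):
        # A random suffix keeps the alias from being guessed from the entity name.
        local = f"{entity.lower()}-{secrets.token_hex(3)}"
        address = f"{local}@{self.domain}"
        self.aliases[address] = entity
        return address

    def route(self, to_address):
        # Returns (forward_to, entity), or None if the alias is unknown or disabled.
        entity = self.aliases.get(to_address)
        if entity is None:
            return None
        return self.private_inbox, entity

    def disable(self, address):
        # Dropping the alias stops all mail (spam included) sent to it.
        self.aliases.pop(address, None)


registry = AliasRegistry("example.com", "private@example.com")
shop_alias = registry.create("someshop")
```

Mail sent to `shop_alias` routes to the private inbox along with the name of the entity that was given the alias, which is what identifies the leak. Calling `registry.disable(shop_alias)` cuts off just that one source.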

Obviously, I can just change the address on my site whenever the spam level gets unreasonable; that IS, after all, the whole point of setting up the aliasing the way I have. And it works well enough. But let's see if we can do one better. What if the email address is clearly displayed on the page for a visitor to see, but the spam harvesters can't grab it? One way would be to use a captcha (or any image) to display the address instead of writing it out in text. A human could obviously enter the address manually, but an automated harvester would have no luck with it. However, this is tedious: the address no longer has a mailto: link, so someone has to type it in by hand instead of just clicking.

So let's give them the link. The harvester is going to grab the page's HTML code and search it for email addresses. However, there is no reason the address has to be encoded in the page in a format that the harvester can easily read. An address that's encrypted can be decrypted with JavaScript and displayed on the page at the time it's loaded, and the mailto: link can be displayed normally. The harvester will only see gibberish. Alternatively, we can simply not include the address in the page code at all. After the page gets loaded, have some AJAX code fetch the address from the server and display it in real time. Once again, the harvester won't be able to glean the address from the page code.
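To make the encoded-address idea concrete, here is a small Python sketch of the server side, with the decode step mirrored in Python only to show the round trip (on a real page the same steps would be ported to JavaScript). The XOR-plus-base64 scheme, the key, the `data-token` attribute, and the harvester regex are all assumptions picked for illustration. This is obfuscation rather than real encryption.

```python
import base64
import re

KEY = 0x5A  # hypothetical single-byte key; obfuscation, not real encryption


def encode_address(address, key=KEY):
    # XOR each byte with the key, then base64 so the result is safe inside HTML.
    scrambled = bytes(b ^ key for b in address.encode("utf-8"))
    return base64.b64encode(scrambled).decode("ascii")


def decode_address(token, key=KEY):
    # The same steps in reverse; a page script would do this at load time.
    scrambled = base64.b64decode(token)
    return bytes(b ^ key for b in scrambled).decode("utf-8")


def render_placeholder(address):
    # The page carries only the token; a script fills in the mailto: link later.
    return f'<a id="contact" data-token="{encode_address(address)}">email me</a>'


# A simple pattern of the kind a harvester might use to scan page source.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

page_html = render_placeholder("webmaster@example.com")
```

Because base64 output never contains an `@`, `EMAIL_RE.search(page_html)` finds nothing, while `decode_address` recovers the original address from the token. A harvester that actually executes the page's scripts would still see the address, so this only stops the ones that scan raw HTML.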

Let's take it one step further and combine several of these ideas. When you visit the contact page, you get an address that has been specifically created for you, and you alone. Base it on either a cookie value or an IP address, as long as it's something unique. If a spammer sends an email to me using that address, I'll know the IP address of the harvester. From that point on, that IP address will always receive specially crafted pages. Whether it'll be a tarpit or a page of spambot snacks made up of many, many, MANY invalid email addresses has yet to be determined.
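Here is one way the per-visitor address could be sketched in Python, assuming the address is derived from the visitor's IP with a keyed hash. The secret, the domain, the `contact-` prefix, and the function names are placeholders, and the in-memory dictionaries stand in for whatever storage the site would actually use.

```python
import hashlib
import hmac

SECRET = b"change-me"   # hypothetical server-side secret
DOMAIN = "example.com"  # placeholder domain

issued = {}          # local part -> IP address it was shown to
flagged_ips = set()  # IPs whose issued address later received spam


def address_for(ip):
    # Derive a stable, hard-to-guess local part from the visitor's IP.
    tag = hmac.new(SECRET, ip.encode("ascii"), hashlib.sha256).hexdigest()[:12]
    local = f"contact-{tag}"
    issued[local] = ip
    return f"{local}@{DOMAIN}"


def report_spam(to_address):
    # Spam arriving at an issued address identifies the IP that harvested it.
    local = to_address.split("@", 1)[0]
    ip = issued.get(local)
    if ip is not None:
        flagged_ips.add(ip)
    return ip


def page_for(ip):
    # Flagged harvesters get decoy addresses; everyone else gets a personal one.
    if ip in flagged_ips:
        return [f"decoy{i}@decoy.invalid" for i in range(5)]
    return [address_for(ip)]
```

The same IP always gets the same address, so a spam message to `contact-...@example.com` can be traced with `report_spam`, after which `page_for` serves that IP nothing but decoys. A cookie value could be swapped in for the IP without changing the structure.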

Next time I'll talk a bit about my efforts at syncing databases over multiple servers.
