Avoiding spam-bots

31 Aug 2004
Prevent spam-bots from harvesting email addresses in web pages.

Introduction

Spam-bots scan the web and harvest email addresses from web pages, newsgroups, and other sources. This article shows a simple technique you can use in web pages to make email addresses harder for spam-bots to harvest. I use the idea in the FotoVision sample I created, but I thought it would be useful to discuss this particular piece on its own. The idea is simple: instead of storing the real email address in the HTML, store an encoded version of the address and decode it on the client when it is needed.

Step 1. Encode the email address

First, the email address needs to be encoded. The encoded string can be pre-calculated or calculated dynamically on the server. The following function uses the BitConverter class to write each ASCII byte of the address as two hexadecimal digits, so derf@example.com becomes the string 64657266406578616D706C652E636F6D.

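// C#
// requires: using System; using System.Text;

string EncodeEmailAddress(string email)
{
  // BitConverter.ToString returns "64-65-72-..."; strip the dashes
  return BitConverter.ToString(
    ASCIIEncoding.ASCII.GetBytes(email)).Replace("-", "");
}

' VB.NET
' requires: Imports System and Imports System.Text

Function EncodeEmailAddress(ByVal email As String) As String
  ' BitConverter.ToString returns "64-65-72-..."; strip the dashes
  Return BitConverter.ToString( _
    ASCIIEncoding.ASCII.GetBytes(email)).Replace("-", "")
End Function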
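
If you want to pre-calculate the value without .NET, the same encoding can be sketched in JavaScript. This is an illustrative equivalent rather than part of the original sample: the name encodeEmailAddress simply mirrors the server-side function, and the sketch assumes the address contains only ASCII characters.

```javascript
// Hypothetical JavaScript counterpart of EncodeEmailAddress
// (not part of the sample download).
// Writes each character code as two uppercase hex digits, matching the
// BitConverter output with the dashes removed.
function encodeEmailAddress(email)
{
  var encoded = "";

  for (var i = 0; i < email.length; i++)
  {
    var code = email.charCodeAt(i);

    // assumption: ASCII only, so every character fits in two hex digits
    if (code > 127)
    {
      throw new Error("Non-ASCII character at position " + i);
    }

    var hex = code.toString(16).toUpperCase();
    encoded += (hex.length < 2 ? "0" : "") + hex;
  }

  return encoded;
}
```

Calling encodeEmailAddress("derf@example.com") should return the same string as the server-side function, 64657266406578616D706C652E636F6D.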

Step 2. Use the encoded email in the HTML

Instead of using the real email address in the HTML link, use the encoded value. For example:

<a href="javascript:sendEmail('64657266406578616D706C652E636F6D')">Email Derf</a>

I considered using HTML entity encoding for the email address instead, but spam-bots are more likely to decode a standard encoding like that, so I think a custom encoding algorithm is the better solution.
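
To illustrate that reasoning, here is a rough sketch of how a naive harvester might treat three forms of the same link. The regular expression and the functions decodeNumericEntities and findAddresses are hypothetical stand-ins, not taken from any real spam-bot; actual harvesters vary and may be more sophisticated.

```javascript
// Hypothetical, simplified harvester: decode numeric HTML entities,
// then look for anything shaped like an email address.
function decodeNumericEntities(html)
{
  return html.replace(/&#(\d+);/g, function (match, code)
  {
    return String.fromCharCode(parseInt(code, 10));
  });
}

function findAddresses(html)
{
  var decoded = decodeNumericEntities(html);
  var matches = decoded.match(/[\w.+-]+@[\w-]+(\.[\w-]+)+/g);
  return matches || [];
}

// the same link written three ways
var plainLink = '<a href="mailto:derf@example.com">Email Derf</a>';
var entityLink = '<a href="mailto:derf&#64;example&#46;com">Email Derf</a>';
var hexLink = '<a href="javascript:sendEmail(\'' +
  '64657266406578616D706C652E636F6D\')">Email Derf</a>';
```

Under these assumptions, findAddresses recovers the address from plainLink and entityLink but returns an empty array for hexLink. A bot that executes script or recognizes this particular hex scheme could still recover the address.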

Step 3. Decode the email address on the client

When the link is clicked, the client-side function sendEmail decodes the email address and opens the user's email application through a mailto: link. The script contains the following:

// open the client email application with the specified address
function sendEmail(encodedEmail)
{
  // navigate to the mailto: link
  location.href = "mailto:" + decodeEmail(encodedEmail);
}

// return the decoded email address
function decodeEmail(encodedEmail)
{
  // holds the decoded email address
  var email = "";

  // each character is stored as two hex digits
  for (var i = 0; i < encodedEmail.length; i += 2)
  {
    var letter = encodedEmail.charAt(i) + encodedEmail.charAt(i + 1);

    // build the real email address
    email += String.fromCharCode(parseInt(letter, 16));
  }

  return email;
}

That's it. The text derf@example.com never appears in the HTML, so spam-bots that scan pages for plain-text addresses will not pick it up, but the email link still works as expected (clicking it opens the email program with the correct address).
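
If an encoded value might be mistyped when it is pasted into the HTML, a stricter decoder can reject malformed input instead of producing a garbage address. The following is only a sketch: decodeEmailStrict is a hypothetical name, and returning null for bad input is one design choice among several.

```javascript
// Hypothetical stricter variant of decodeEmail: returns null unless the
// input is a non-empty, even-length string of hex digits.
function decodeEmailStrict(encodedEmail)
{
  if (typeof encodedEmail !== "string" ||
      encodedEmail.length === 0 ||
      encodedEmail.length % 2 !== 0 ||
      !/^[0-9A-Fa-f]+$/.test(encodedEmail))
  {
    return null;
  }

  var email = "";

  // each pair of hex digits is one character code
  for (var i = 0; i < encodedEmail.length; i += 2)
  {
    var letter = encodedEmail.substring(i, i + 2);
    email += String.fromCharCode(parseInt(letter, 16));
  }

  return email;
}
```

A caller such as sendEmail could then check for null before changing location.href.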

Step 4. Optionally, update the status area

You can extend the link by handling the mouseover and mouseout events to display the email address in the status area. The updated HTML link looks like the following:

<a href="javascript:sendEmail('64657266406578616D706C652E636F6D')"
  onmouseover="displayStatus('64657266406578616D706C652E636F6D'); return true;"
  onmouseout="clearStatus(); return true;">
  Email Derf</a>

And two functions are added to the client-side script:

// display the email address in the status bar
function displayStatus(encodedEmail)
{
  window.status = "mailto:" + decodeEmail(encodedEmail);
}

// clear the status bar message
function clearStatus()
{
  window.status = "";
}
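
Many current browsers ignore script changes to window.status, so one alternative is to show the decoded address as the link's tooltip instead. This sketch is not part of the original sample: showAddressTooltip is a hypothetical helper, and decodeEmail is repeated from Step 3 only so that the block runs on its own.

```javascript
// decodeEmail repeated from Step 3 so this sketch is self-contained
function decodeEmail(encodedEmail)
{
  var email = "";

  for (var i = 0; i < encodedEmail.length; i += 2)
  {
    var letter = encodedEmail.charAt(i) + encodedEmail.charAt(i + 1);
    email += String.fromCharCode(parseInt(letter, 16));
  }

  return email;
}

// Hypothetical helper: set the link's tooltip to the decoded address.
// 'link' is any object with a title property, such as an <a> element.
function showAddressTooltip(link, encodedEmail)
{
  link.title = "mailto:" + decodeEmail(encodedEmail);
}
```

In the link, this could be wired up as onmouseover="showAddressTooltip(this, '64657266406578616D706C652E636F6D')", so the address is decoded only when the mouse moves over the link.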

Now the real email address is displayed in the status area when the mouse moves over the link, in browsers that allow scripts to change the status bar text.

Sample code and encoding web page

There are two files in the sample code. The file email.js contains the client-side script functions that you can include in your HTML pages. The file test.html is a sample HTML page that uses the email.js file.

The encoded email address can be calculated dynamically on the server, but that's not necessary; you can also pre-calculate the encoded value and use it directly in the HTML. I created an encoding web page that produces the encoded form of an email address, which you can then paste into your HTML code. If your site contains a lot of email links, it would be easy to create a control that takes an email address and emits HTML containing the encoded link.
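
As a rough illustration of such a control, the following sketch builds the Step 2 markup from an address and link text. It is written in JavaScript as a stand-in for whatever server-side or build-time code you use; buildEmailLink, escapeHtml, and encodeEmailAddress are hypothetical helper names, and the encoder assumes ASCII-only addresses.

```javascript
// Hypothetical encoder: two uppercase hex digits per character.
function encodeEmailAddress(email)
{
  var encoded = "";

  for (var i = 0; i < email.length; i++)
  {
    // assumption: ASCII input, so each code fits in two hex digits
    var hex = email.charCodeAt(i).toString(16).toUpperCase();
    encoded += (hex.length < 2 ? "0" : "") + hex;
  }

  return encoded;
}

// escape the visible link text so it cannot break the markup
function escapeHtml(text)
{
  return text.replace(/&/g, "&amp;")
             .replace(/</g, "&lt;")
             .replace(/>/g, "&gt;")
             .replace(/"/g, "&quot;");
}

// Hypothetical helper: emit a link that calls sendEmail with the encoded address.
function buildEmailLink(email, linkText)
{
  return '<a href="javascript:sendEmail(\'' + encodeEmailAddress(email) +
    '\')">' + escapeHtml(linkText) + '</a>';
}
```

For example, buildEmailLink("derf@example.com", "Email Derf") produces the same link markup shown in Step 2. The encoded value contains only hex digits, so it needs no further escaping inside the href attribute.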

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
