
Simple Microsoft Outlook 2007 Spam Filter

15 Mar 2010 | CPOL | 2 min read
A very simple Outlook spam filter

Introduction

Have you ever wondered whether your Outlook spam filter could be enhanced with your own rules?

If so, this article is for you. The code is easier than you might think.

Background

The right way to do this is to create an Outlook 2007 add-in from Visual Studio. The problem is that the predefined Visual Studio template looks too hard the first time you see it.

The template looks like this:

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;
using Outlook = Microsoft.Office.Interop.Outlook;
using Office = Microsoft.Office.Core;

namespace OutlookSpamFilter
{
    public partial class ThisAddIn
    {
        private void ThisAddIn_Startup(object sender, System.EventArgs e)
        { 
        }

        private void ThisAddIn_Shutdown(object sender, System.EventArgs e)
        {
        }

        #region VSTO

        /// <summary>
        /// [INTERNAL]
        /// </summary>
        private void InternalStartup()
        {
            this.Startup += new System.EventHandler(ThisAddIn_Startup);
            this.Shutdown += new System.EventHandler(ThisAddIn_Shutdown);
        }

        #endregion
    }
}	 

On closer inspection, though, the add-in exposes "this.Application.NewMailEx", an Outlook event raised when new email arrives. This article is built around that event.

Using the Code

The code is very simple once the new email event has been detected. There are two options:

  1. Run a full scan of all inbox folders each time an email arrives (an option I strongly advise against)
  2. Check only the new email items for a spam condition

The most simplistic spam check condition is finding a special string within the typical email text fields, so we need a list of spam strings.

C#
SpamTokens = new string[] { "Viagra", ".ru/", "Can't view email?","SpamTest" };

When a string like that is found in an email, this email is probably spam.

The email fields to look for would be:

  • SenderName: The name of the sender
  • Body: The TEXT email body
  • HTMLBody: The HTML body (if the email has HTML content)
  • SenderEmailAddress: The address of the sender
  • Subject: The subject of the email

The filter checks every email object it is given: for each email and each spam string it tests the fields above, and on the first match it moves the email to the Junk E-mail folder. "ToUpperInvariant" is applied to both the field and the spam string so that the comparison ignores case.

C#
private void CheckSpamCondition(MAPIFolder mfJunkEmail, System.Object oBoxItem)
{
	// only mail items are checked; other item types are ignored
	Outlook.MailItem miEmail = oBoxItem as Outlook.MailItem;
	if (miEmail == null)
		return;

	// text fields to search; a field may be null, so null is treated as empty below
	string[] fields = new string[]
	{
		miEmail.SenderName, miEmail.Body, miEmail.HTMLBody,
		miEmail.SenderEmailAddress, miEmail.Subject
	};

	foreach (string SpamToken in SpamTokens)
	{
		string strSpamToken = SpamToken.ToUpperInvariant();
		foreach (string field in fields)
		{
			if ((field ?? string.Empty).ToUpperInvariant().Contains(strSpamToken))
			{
				// the first match is enough: move the email to junk and stop
				miEmail.Move(mfJunkEmail);
				return;
			}
		}
	}
}
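The matching rule itself can be tried outside Outlook. The following is only a minimal sketch: "SpamMatcher" and "ContainsSpamToken" are hypothetical names that are not part of the add-in, and the sketch uses "IndexOf" with "StringComparison.OrdinalIgnoreCase" instead of "ToUpperInvariant" to ignore case. It also assumes that empty tokens should be skipped, because an empty string would match every field.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the add-in: the token check on plain strings.
public static class SpamMatcher
{
    // Returns true when any non-empty token occurs in any field, ignoring case.
    // Null fields and null or empty tokens are skipped.
    public static bool ContainsSpamToken(IEnumerable<string> fields, IEnumerable<string> tokens)
    {
        foreach (string token in tokens)
        {
            if (string.IsNullOrEmpty(token))
                continue; // an empty token would match every field

            foreach (string field in fields)
            {
                if (field != null &&
                    field.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0)
                    return true;
            }
        }
        return false;
    }
}
```

With a helper like this, the add-in would only have to pass the five email fields as an array and move the email when the result is true.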

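In addition, a very simple token database was added: the token list is kept in a file and reloaded when Outlook starts.

Loading is straightforward: if the file exists, its non-empty lines are loaded; otherwise the list starts empty. The list is cached in memory after the first read. Setting the property updates the in-memory copy and saves it to disk. The code relies on File and Path, so it needs "using System.IO;", which is not among the template's using directives.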

C#
#region Token database
/// <summary>
/// Cached spam tokens
/// </summary>
string[] g_CachedSpamTokens = null;
/// <summary>
/// Tokens considered to mark mails as spam
/// </summary>
public string[] SpamTokens
{
	get
	{
		string strDbPath = TokenDatabasePath();
		if (g_CachedSpamTokens == null)
		{
			if (File.Exists(strDbPath))
			{// recover saved database
				List<string> g_SpamTokens = new List<string>();
				foreach (string strToken in 
				    File.ReadAllLines(strDbPath, Encoding.UTF8))
				{
					if (strToken != null && 
						strToken.Trim().Length > 0)
						g_SpamTokens.Add(strToken);
				}
				g_CachedSpamTokens = g_SpamTokens.ToArray();
			}
			else
			{// no saved database
				g_CachedSpamTokens = new string[0];
			}
		}

		return g_CachedSpamTokens;
	}
	set
	{
		g_CachedSpamTokens = value;
		File.WriteAllLines(TokenDatabasePath(), value, Encoding.UTF8);
	}
}
/// <summary>
/// Path to the Spam Token list database
/// </summary>
/// <returns></returns>
private static string TokenDatabasePath()
{
	return Path.Combine(Environment.GetFolderPath
	(Environment.SpecialFolder.LocalApplicationData), "OutlookSpamFilter.db");
}
#endregion	 
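The load and save rules can also be checked without Outlook. This is a minimal sketch under the same rules as the property above (a missing file gives an empty list, blank lines are skipped); "TokenFile", "Load" and "Save" are hypothetical names, and the path is passed in as a parameter instead of coming from "TokenDatabasePath()".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the add-in: file handling for the token list.
public static class TokenFile
{
    // Reads one token per line and skips blank lines.
    // A missing file gives an empty list.
    public static string[] Load(string path)
    {
        if (!File.Exists(path))
            return new string[0];

        List<string> tokens = new List<string>();
        foreach (string line in File.ReadAllLines(path, Encoding.UTF8))
        {
            if (line.Trim().Length > 0)
                tokens.Add(line);
        }
        return tokens.ToArray();
    }

    // Writes one token per line, replacing any existing file.
    public static void Save(string path, string[] tokens)
    {
        File.WriteAllLines(path, tokens, Encoding.UTF8);
    }
}
```

Keeping the path as a parameter makes it easy to point the code at a temporary file while testing.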

First spam check option (full scan) (Worst performance)

This option handles the NewMail event, which carries no arguments, and performs a full scan of every item in the inbox and its subfolders.

C#
private void ThisAddIn_Startup(object sender, System.EventArgs e)
{
	// register the new email event
	this.Application.NewMail += new ApplicationEvents_11_NewMailEventHandler
		(Application_NewMail); // (Full-scan office 2003 version)	
}
/// <summary>
/// When new email arrives (Full-scan office 2003 version)
/// </summary>
void Application_NewMail()
{
	try
	{
		MAPIFolder mfInbox = this.Application.ActiveExplorer().
		Session.GetDefaultFolder(Outlook.OlDefaultFolders.olFolderInbox);

		MAPIFolder mfJunkEmail = this.Application.ActiveExplorer().
		Session.GetDefaultFolder(Outlook.OlDefaultFolders.olFolderJunk);

		Queue<MAPIFolder> qInboxSubfolderIteration = new Queue<MAPIFolder>();
		qInboxSubfolderIteration.Enqueue(mfInbox);
		List<MAPIFolder> lInboxSubfolders = new List<MAPIFolder>();
		while (qInboxSubfolderIteration.Count > 0)
		{
			MAPIFolder mf = qInboxSubfolderIteration.Dequeue();
			lInboxSubfolders.Add(mf);

			foreach (MAPIFolder mf2 in mf.Folders)
				qInboxSubfolderIteration.Enqueue(mf2);
		}

		foreach (MAPIFolder mf in lInboxSubfolders)
		{
			// scan the items of each collected folder, not only those of the inbox
			foreach (System.Object oBoxItem in mf.Items)
			{
				CheckSpamCondition(mfJunkEmail, oBoxItem);
			}
		}
	}
	catch (System.Exception ex)
	{// if something fails do not bother user
		Console.WriteLine(ex.Message);
		Console.WriteLine(ex.StackTrace);
	}
} 
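The queue-based walk that collects the inbox and all its subfolders can be sketched without Outlook. In this minimal example, "Folder" is a hypothetical stand-in for MAPIFolder (only a name and a list of child folders), and "FolderWalker.Flatten" is a hypothetical name for the collection loop.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Outlook's MAPIFolder: a name plus child folders.
public class Folder
{
    public string Name;
    public List<Folder> Folders = new List<Folder>();

    public Folder(string name)
    {
        Name = name;
    }
}

public static class FolderWalker
{
    // Returns the root folder and all its descendants in breadth-first order.
    public static List<Folder> Flatten(Folder root)
    {
        List<Folder> result = new List<Folder>();
        Queue<Folder> pending = new Queue<Folder>();
        pending.Enqueue(root);

        while (pending.Count > 0)
        {
            Folder current = pending.Dequeue();
            result.Add(current);

            // queue the children so they are visited after the current level
            foreach (Folder child in current.Folders)
                pending.Enqueue(child);
        }
        return result;
    }
}
```

A queue visits the folders level by level; a stack would work as well if the visiting order does not matter.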

Second spam check option (only new email) (Best performance)

This option uses the NewMailEx event, which passes a comma-separated list of the entry IDs of the newly received items. A simple split by comma yields the IDs, and Application.Session.GetItemFromID converts each ID into an item.

C#
private void ThisAddIn_Startup(object sender, System.EventArgs e)
{
	// register the new email event
	this.Application.NewMailEx += 
		new ApplicationEvents_11_NewMailExEventHandler(Application_NewMailEx);
}

/// <summary>
/// When new mail has arrived
/// </summary>
/// <param name="EntryIDCollection">A comma-separated ID collection as string</param>
void Application_NewMailEx(string EntryIDCollection)
{
	// use Application.Session directly: ActiveExplorer() returns null when no explorer window is open
	MAPIFolder mfJunkEmail = this.Application.Session.
		GetDefaultFolder(Outlook.OlDefaultFolders.olFolderJunk);

	string[] ids = EntryIDCollection.Split(',');
	foreach (string s in ids)
	{
		object oBoxItem = Application.Session.GetItemFromID(s, Type.Missing);

		CheckSpamCondition(mfJunkEmail, oBoxItem);
	}
} 
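The splitting step can be made a little more defensive. This is only a sketch: "EntryIdParser" is a hypothetical name, and the handling of a null string, blank entries and surrounding spaces is an assumption about what might arrive, not something observed in this demo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the add-in: splits the NewMailEx ID list.
public static class EntryIdParser
{
    // Splits a comma-separated ID list, trims each entry and drops blank ones.
    // A null or empty input gives an empty array.
    public static string[] Split(string entryIdCollection)
    {
        List<string> ids = new List<string>();
        if (string.IsNullOrEmpty(entryIdCollection))
            return ids.ToArray();

        foreach (string part in entryIdCollection.Split(','))
        {
            string id = part.Trim();
            if (id.Length > 0)
                ids.Add(id);
        }
        return ids.ToArray();
    }
}
```

Each returned ID can then be passed to Application.Session.GetItemFromID as in the handler above.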

Points of Interest

This code has been published to be changed; it is only a demo. If you want to use the token-string-list database, remember to remove the assignment under the "// debug" comment in the source code, because it overwrites the saved token list every time Outlook starts.

C#
private void ThisAddIn_Startup(object sender, System.EventArgs e)
{
	// register the new email event
	//this.Application.NewMail += new ApplicationEvents_11_NewMailEventHandler
	//	(Application_NewMail); // (Full-scan office 2003 version)
	this.Application.NewMailEx += new ApplicationEvents_11_NewMailExEventHandler
		(Application_NewMailEx);

	// debug
	SpamTokens = new string[] { "Viagra", ".ru/", "Can't view email?","SpamTest" };
} 

History

  • 2010-03-14: Some explanations added

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)