Filter unwanted usernames or offensive or dirty words

8 Feb 2006
Login names, article titles, and other user input on public sites rated G need help in filtering out the trash.

Introduction

Public sites, and those "rated G" in particular, are no place for words we all consider offensive. Comedian George Carlin made a lot of money with his Seven Dirty Words, and the rest of humanity seems to want to top him. If you have a site where users can create input, sometimes you will want to filter it.

WordFilter Project

This project is simple, with nothing fancy at all. Unzip the files into a new folder and create a virtual directory in IIS that points to the folder. Since this is a .NET 2.0 project, you will need to set the ASP.NET version to 2.0 on the ASP.NET tab of the virtual directory's properties in IIS. Make sure the virtual directory is an application and that it is configured to run scripts.

The working code is in a class file in the App_Code folder. That file contains two versions of the word list to test against.

The test is performed in two stages. First, the input word is checked against a string of words to see whether it matches one of the entries. If it does, the function returns "true". If that quick match fails, a slightly more exhaustive check is performed: each word in the exclusion list is tested against the input string, looking for a match. Again, if a match occurs, "true" is returned.

These methods are able to match on a word stem. For example, with the stem "test", the inputs test, tests, tested, testing, and intestinal will all return "true".

I have been asked, "Why have two methods?" There are many times when a stem would produce a match inside a perfectly acceptable word, such as the name Mitchell. The first test looks for whole-word matches, while the second test performs stem/embedded matching. In my example project, I have the same words in both lists, but that is not required. Often, you will want to filter specific words as whole words only, without also matching them as stems or embedded strings.
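The two stages described above might be sketched roughly as follows. This is not the code from the download: the class name WordFilterSketch, the method IsBlocked, the "|" delimiter, and the placeholder words are all assumptions made for illustration. The word "hell" appears only in the whole-word list, which is one way a name such as Mitchell could pass.

```csharp
using System;

// Hypothetical sketch of a two-stage check; not the article's wordfilter.cs.
public static class WordFilterSketch
{
    // Stage 1 list (assumed format): whole words in one delimited string.
    private const string ExactWords = "|heck|darn|hell|";

    // Stage 2 list (assumed): stems rejected wherever they appear in the input.
    private static readonly string[] EmbeddedWords = new string[] { "heck", "darn" };

    public static bool IsBlocked(string input)
    {
        if (string.IsNullOrEmpty(input)) return false;

        string word = input.Trim().ToLowerInvariant();
        if (word.Length == 0) return false;

        // Stage 1: quick whole-word lookup. The delimiters on both sides
        // prevent partial matches; input containing the delimiter is skipped.
        if (word.IndexOf('|') < 0 &&
            ExactWords.IndexOf("|" + word + "|", StringComparison.Ordinal) >= 0)
        {
            return true;
        }

        // Stage 2: slower pass that looks for each stem inside the input.
        foreach (string stem in EmbeddedWords)
        {
            if (word.IndexOf(stem, StringComparison.Ordinal) >= 0) return true;
        }

        return false;
    }
}
```

With these placeholder lists, "darned" is caught by the embedded pass and "hell" by the whole-word pass, while "Mitchell" is accepted because "hell" is not in the embedded list.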

Limitations

A big list can take a long time to check. Also, no "pre-filtering" is done in this project. In a real-world project, you will most likely want to do some initial range test or regular expression test before you run the filter. I run this filter after everything else has cleared.
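As an illustration of the kind of pre-filtering mentioned above, here is a minimal sketch of a length range test followed by a regular expression test. The class name InputPreFilter, the 3 to 20 character limits, and the allowed character set are assumptions for a hypothetical login-name field; the project itself contains no such code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pre-filter; the limits and pattern are assumed, not from the project.
public static class InputPreFilter
{
    // Letters, digits, and underscore only. \A and \z anchor the whole
    // string, so a trailing newline is rejected (a plain $ would allow it).
    private static readonly Regex AllowedPattern =
        new Regex(@"\A[A-Za-z0-9_]+\z");

    public static bool PassesBasicChecks(string input)
    {
        if (input == null) return false;

        // Cheap range test first: reject inputs that are too short or too long.
        if (input.Length < 3 || input.Length > 20) return false;

        // Then the regular expression test on the allowed characters.
        return AllowedPattern.IsMatch(input);
    }
}
```

Only input that passes these cheap checks would then go on to the more expensive word-list test.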

Naturally, this is not a complete list. You will certainly want to modify it for your own use.

Possibilities

The first thing that comes to mind is that you could simply drop the wordfilter.cs file directly into an existing project, and it is ready to go. You could also implement it as a static class, so that it would not need to be instantiated in your code.

You could implement different lists that are applicable to the type of input expected.
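The two ideas above, a static class and separate lists per type of input, could be combined along these lines. This is only a sketch: the InputKind enum, the WordLists class, the ContainsListedWord method, and the sample words are hypothetical and do not come from wordfilter.cs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input categories; a real site would define its own.
public enum InputKind { LoginName, ArticleTitle }

// Static class: callers use it directly without creating an instance.
public static class WordLists
{
    private static readonly Dictionary<InputKind, string[]> Lists = BuildLists();

    private static Dictionary<InputKind, string[]> BuildLists()
    {
        Dictionary<InputKind, string[]> lists = new Dictionary<InputKind, string[]>();
        // Assumed example: login names also reject reserved account names.
        lists[InputKind.LoginName] = new string[] { "admin", "root", "darn" };
        lists[InputKind.ArticleTitle] = new string[] { "darn" };
        return lists;
    }

    // Returns true when the input contains any word from the list for its kind.
    public static bool ContainsListedWord(InputKind kind, string input)
    {
        if (string.IsNullOrEmpty(input)) return false;

        string[] words;
        if (!Lists.TryGetValue(kind, out words)) return false;

        foreach (string word in words)
        {
            if (input.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

A call such as WordLists.ContainsListedWord(InputKind.LoginName, name) needs no instance, and each kind of input is only checked against the list that applies to it.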

Summary

I hope you find the code interesting and useful.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
