A Naive Bayesian Classifier in C#

28 May 2006 | CPOL

Introduction

I was looking for a way to classify short texts into several categories. Naive Bayesian classification seemed a simple but probably sufficient method. Searching for existing code, I found many implementations in Perl and Java. The only CLR implementation I could find was NClassifier, but it did not support classification into multiple classes. Therefore, I decided to write my own.

Background

There is plenty of information on the Internet describing the theory of Bayesian classification. Wikipedia has a good introduction.
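
As a brief reminder of that theory, the decision rule of a naive Bayesian classifier can be written as follows. The notation is the usual textbook one (a text made of words w_1 ... w_n, a set of categories C) and is an assumption for illustration; it is not taken from the source code of this library.

```latex
\begin{align*}
P(c \mid w_1,\dots,w_n) &= \frac{P(c)\,P(w_1,\dots,w_n \mid c)}{P(w_1,\dots,w_n)}
  && \text{(Bayes' theorem)} \\
P(w_1,\dots,w_n \mid c) &\approx \prod_{i=1}^{n} P(w_i \mid c)
  && \text{(naive independence assumption)} \\
\hat{c} &= \arg\max_{c \in C}\; P(c)\prod_{i=1}^{n} P(w_i \mid c) \\
        &= \arg\max_{c \in C}\; \Bigl[\log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c)\Bigr]
\end{align*}
```

The denominator is the same for every category, so it can be dropped when comparing categories. The logarithmic form is often preferred in practice because a product of many small probabilities quickly underflows towards zero.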

Using the Code

First, create an instance of BayesClassifier.Classifier.

C#
BayesClassifier.Classifier m_Classifier = new BayesClassifier.Classifier();

Tip: You can experiment with BayesClassifier.ExcludedWords to define the words that you consider irrelevant for your classification. This can lead to smaller dictionaries and therefore faster classification.
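
To illustrate why excluding words shrinks the dictionary, here is a minimal, self-contained sketch of stop-word filtering. It does not use the library: the class ExcludedWordsSketch, its word list, and the regex-based tokenizer are all assumptions made up for this example, and the real BayesClassifier.ExcludedWords may work differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExcludedWordsSketch
{
    // Hypothetical stop-word list for illustration; not the library's ExcludedWords.
    public static readonly HashSet<string> Excluded =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "the", "a", "is", "and", "of" };

    // Split text into lower-cased word tokens and drop the excluded ones.
    public static List<string> Tokenize(string text)
    {
        return Regex.Matches(text.ToLowerInvariant(), @"\w+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .Where(w => !Excluded.Contains(w))
                    .ToList();
    }

    public static void Main()
    {
        List<string> tokens = Tokenize("The cat is a friend of the dog");
        Console.WriteLine(string.Join(" ", tokens)); // prints "cat friend dog"
    }
}
```

Only three of the eight words survive the filter here, so fewer distinct words have to be stored and looked up per category.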

Then define the categories and teach each category:

C#
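// 'file' is a text source accepted by StreamReader (a file path or a Stream).
m_Classifier.TeachCategory("Cat1", new System.IO.StreamReader(file));
// Categories can also be taught from phrases held in memory.
m_Classifier.TeachPhrases("Cat2", new string[] { "Hi", "HoHo" });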

Finally, the method BayesClassifier.Classifier.Classify returns the classification result.

C#
Dictionary<string, double> score = 
    m_Classifier.Classify(new System.IO.StreamReader(file));
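
Since the result is a Dictionary<string, double>, picking a single category takes one more step. The sketch below assumes that the keys are category names and that a larger score means a better match; the article does not state either, so check this against your own results. The class BestCategorySketch and the sample scores are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BestCategorySketch
{
    // Return the category with the highest score.
    // Assumption: a larger score means a better match; the article does not say.
    // Throws InvalidOperationException if the dictionary is empty.
    public static string BestCategory(Dictionary<string, double> score)
    {
        return score.OrderByDescending(kv => kv.Value).First().Key;
    }

    public static void Main()
    {
        // Made-up scores standing in for the result of m_Classifier.Classify(...).
        var score = new Dictionary<string, double>
        {
            { "Cat1", -12.5 },
            { "Cat2", -3.2 }
        };
        Console.WriteLine(BestCategory(score)); // prints "Cat2"
    }
}
```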

Let me know if you have any questions or suggestions. I would also like to hear about your experience with the applicability of the naive Bayesian approach, since its (wrong) assumption of word independence might influence the results.
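
For readers who want to experiment with the independence assumption on toy data, a small stand-alone classifier may help. The following is a minimal sketch of a multinomial naive Bayes with add-one smoothing and log scores. It is not the BayesClassifier source: the class TinyNaiveBayes, its members, and the training phrases are invented for illustration, and the library may compute its scores differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TinyNaiveBayes
{
    // Word counts per category, number of taught phrases per category, and all known words.
    private readonly Dictionary<string, Dictionary<string, int>> counts =
        new Dictionary<string, Dictionary<string, int>>();
    private readonly Dictionary<string, int> docs = new Dictionary<string, int>();
    private readonly HashSet<string> vocabulary = new HashSet<string>();

    public void Teach(string category, string phrase)
    {
        if (!counts.ContainsKey(category))
        {
            counts[category] = new Dictionary<string, int>();
            docs[category] = 0;
        }
        docs[category]++;
        foreach (string w in Words(phrase))
        {
            int c;
            counts[category].TryGetValue(w, out c); // c stays 0 for unseen words
            counts[category][w] = c + 1;
            vocabulary.Add(w);
        }
    }

    // Returns log P(category) + sum of log P(word | category) for each category.
    public Dictionary<string, double> Classify(string text)
    {
        var result = new Dictionary<string, double>();
        int totalDocs = docs.Values.Sum();
        foreach (string category in counts.Keys)
        {
            int totalWords = counts[category].Values.Sum();
            double logScore = Math.Log((double)docs[category] / totalDocs);
            foreach (string w in Words(text))
            {
                int c;
                counts[category].TryGetValue(w, out c);
                // Add-one (Laplace) smoothing avoids log(0) for unseen words.
                logScore += Math.Log((c + 1.0) / (totalWords + vocabulary.Count));
            }
            result[category] = logScore;
        }
        return result;
    }

    private static IEnumerable<string> Words(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ', '\t', '\n', '\r', '.', ',', '!', '?' },
                          StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        var nb = new TinyNaiveBayes();
        nb.Teach("greeting", "hello hi there");
        nb.Teach("greeting", "hi friend");
        nb.Teach("farewell", "bye see you");
        nb.Teach("farewell", "goodbye friend");

        Dictionary<string, double> score = nb.Classify("hi hello");
        string best = score.OrderByDescending(kv => kv.Value).First().Key;
        Console.WriteLine(best); // prints "greeting"
    }
}
```

Each word contributes its own factor regardless of its neighbours, which is exactly the independence assumption. The scores in this sketch are sums of logarithms of probabilities, so they are always negative, and only their relative order is meaningful.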

History

  • 28th May, 2006: Version 1.0

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Germany

Comments and Discussions

 
Answer: Re: I ve a Question
Junaid_Arif_Mufti26-Apr-07 20:05
Question: Questions/suggestions - please contact me
MalteSteckmeister17-Dec-06 14:23
Question: how to use your classifier
rkamalakar12-Dec-06 0:57
Answer: Re: how to use your classifier
ErichG2-Jan-07 5:19
Question: How can I use your classifier to classify 20NewsGroup?
johnny198329-Nov-06 13:49
Answer: Re: How can I use your classifier to classify 20NewsGroup?
ErichG2-Jan-07 6:02
Question: Re: How can I use your classifier to classify 20NewsGroup?
Tokes Erno25-Apr-07 13:29
General: why results tends to negative
abdo12345678930-May-06 12:00
My first question:

Why do the results tend to zero, and why are they not normalized?

My second question:
Why don't you report that cat1 is relevant with 10%, cat2 with 20%, and so on?

My last question:
Is naive Bayes suitable for story link detection, i.e. assigning all the relevant documents to category 1 and everything else to category 2?

Thanks a lot



AG
General: Re: why results tends to negative
ErichG1-Jun-06 22:39
General: 5-stars all the same
Grav-Vt28-May-06 5:01
General: Re: 5-stars all the same
ErichG28-May-06 21:06
