Introduction
The DBXParser reads an Outlook Express DBX file and extracts the raw MIME content of each message to an EML file; you can then use a MIME parser to get the actual subject, body, and attachments. As far as Google tells me, it's the first and only pure C# source code that does this: free, with simple methods to read and extract, and no heavyweight third-party DLLs. ;)
Background
I searched for source code to read an Outlook Express DBX file for a very long time (believe me, it's really long). I found quite a few DBX readers in other languages, such as PHP and C++.
But none of them were pure C#, so I made my own. Porting similar code from another language is a good way to go; I chose the DBX Parser (PHP) because it's much simpler than the C++ version. :)
Where Are My DBX Files?
Your DBX files will normally be found here:
C:\Documents and Settings\{UserName}\Local Settings\Application Data\Identities\{Guid}\Microsoft\Outlook Express
Note that {Guid} differs from computer to computer, so the simplest approach is to use Directory.GetFiles() to search through all the subfolders for *.dbx files.
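A minimal sketch of that search, assuming the Windows XP folder layout shown above. DbxLocator, FindDbxFiles, and DefaultIdentitiesRoot are hypothetical names used for illustration only; they are not part of DBXParser.

```csharp
using System;
using System.IO;

static class DbxLocator
{
    // Returns every *.dbx file under the given root, including sub-folders.
    // An empty array is returned when the root folder does not exist.
    public static string[] FindDbxFiles(string root)
    {
        if (!Directory.Exists(root))
            return new string[0];
        return Directory.GetFiles(root, "*.dbx", SearchOption.AllDirectories);
    }

    // Assumption: the Identities folder sits under the user's local
    // application data folder, as in the path shown above.
    public static string DefaultIdentitiesRoot()
    {
        string localAppData = Environment.GetFolderPath(
            Environment.SpecialFolder.LocalApplicationData);
        return Path.Combine(localAppData, "Identities");
    }
}
```

Calling DbxLocator.FindDbxFiles(DbxLocator.DefaultIdentitiesRoot()) should then list the DBX files of every identity without needing to know any {Guid} in advance.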
File Format
I am not going to go deep into the file format here, because quite a few documents already cover it elsewhere.
How It Works
It reads the raw DBX file directly, without any third-party DLLs or DLLs provided by Outlook Express. It reads through the file, decoding it byte by byte, and internally stores the positions and sizes of the chunks of each message for later use by the Extract method. Because it stores only the positions, it uses very little memory.
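The idea of keeping only positions can be sketched as follows. This is an illustration under stated assumptions, not the DBXParser source: Chunk and MessageIndex are hypothetical names, and the sketch says nothing about how the chunk positions are actually found in a DBX file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical type: one piece of a message inside the file.
struct Chunk
{
    public long Offset;  // where the chunk's data starts in the file
    public int Length;   // how many bytes of message data it holds
    public Chunk(long offset, int length) { Offset = offset; Length = length; }
}

// Hypothetical index: remembers where each message lives, not its bytes.
class MessageIndex
{
    private readonly List<List<Chunk>> messages = new List<List<Chunk>>();

    public int Count { get { return messages.Count; } }

    public void Add(List<Chunk> chunks) { messages.Add(chunks); }

    // Copies the chunks of one message, in order, from source to destination.
    public void Extract(int index, Stream source, Stream destination)
    {
        foreach (Chunk chunk in messages[index])
        {
            byte[] buffer = new byte[chunk.Length];
            source.Seek(chunk.Offset, SeekOrigin.Begin);
            int read = 0;
            while (read < buffer.Length)
            {
                int n = source.Read(buffer, read, buffer.Length - read);
                if (n == 0)
                    throw new EndOfStreamException("Chunk runs past the end of the file.");
                read += n;
            }
            destination.Write(buffer, 0, buffer.Length);
        }
    }
}
```

With this layout the memory cost per message is a handful of offsets and lengths, and the message bytes are only touched when Extract is called.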
Using the Code
First create a new instance of DBX, then call the Parse function to read the file; it returns the number of messages the DBX file contains. If it returns -1, something is wrong with the file. You can then use the Extract method to save the content to a file or to read it into memory.
Here is some sample code:
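    using (DBX dbx = new DBX())
    {
        // Parse returns the number of messages, or -1 if the file could not be read.
        int count = dbx.Parse(@"test.dbx");
        if (count > 0)
        {
            for (int i = 0; i < count; i++)
            {
                // Save message i as 1.eml, 2.eml, ...
                dbx.Extract(i, (i + 1) + ".eml");
            }
        }
    }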
How to Decode MIME (EML File)?
Choose whichever of the following you like:
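For a quick check of an extracted file before handing it to a full MIME parser, the header block can be read with a few lines of code. This is only a sketch, not a MIME decoder: it joins folded header lines but does not decode encoded words, multipart bodies, or attachments. EmlHeaders and ReadHeaders are hypothetical names and not part of DBXParser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class EmlHeaders
{
    // Reads the header block, i.e. everything before the first blank line.
    // Folded lines (starting with a space or tab) are joined to the header
    // they continue. For repeated headers only the first value is kept.
    public static Dictionary<string, string> ReadHeaders(TextReader reader)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string current = null;  // header currently being collected, if any
        string line;
        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            if (line[0] == ' ' || line[0] == '\t')
            {
                // Continuation line: append only if its header is being kept.
                if (current != null)
                    headers[current] += " " + line.Trim();
                continue;
            }
            current = null;
            int colon = line.IndexOf(':');
            if (colon <= 0)
                continue;  // not a header line; skip it
            string name = line.Substring(0, colon).Trim();
            if (headers.ContainsKey(name))
                continue;  // repeated header; keep the first value only
            headers[name] = line.Substring(colon + 1).Trim();
            current = name;
        }
        return headers;
    }
}
```

To use it on a file written by Extract, wrap the file in a StreamReader, for example new StreamReader("1.eml"), pass it to EmlHeaders.ReadHeaders, and look up "Subject" or "From" in the result.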
Points of Interest
Because I suffered a lot while looking for such code, I am contributing it here so that others won't have to go crazy looking for it. After porting the code from PHP, I feel that PHP is really mature. There is all kinds of PHP code out there, and I was really surprised to see that even PHP has sample code for this, while Java, VB, and C# do not. I have tried to keep this code as simple as possible: read and extract, that's all, which I think fits most situations. If you have any comments or suggestions, please feel free to tell me, or just modify the code yourself.
History
- Version 1.0 - 2009-5-1
- Version 1.1 - 2009-5-5: some code cleanup, added VC++ code
- Version 1.2 - 2009-7-2: fixed a problem where the exact number of mails was not returned correctly in some special situations, thanks to Cato.