FileDiff Contest Entry

12 Aug 2009
Text Difference between two files

Introduction

This is a contest entry for finding the differences between two files.

Using the Code

This application is pretty basic. It uses FileStream objects to perform its task.

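// Requires: using System; using System.IO; using System.Text;

// Used to turn the changed bytes into text for the output.
ASCIIEncoding encode = new ASCIIEncoding();

// The two files to compare are passed on the command line.
FileStream fileA = File.OpenRead(args[0]);
FileStream fileB = File.OpenRead(args[1]);

int b = 0;   // last byte read from file A
int l = 0;   // length of the current run of changed bytes

fileA.Position = 0;
fileB.Position = 0;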

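We start by opening both files and setting their positions to 0. (A newly opened stream already starts at position 0, so the two assignments are redundant but harmless.) The variable b holds the last byte read from file A, and the variable l holds the length of the current run of changed bytes.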
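As a variation on the setup above, the streams can be wrapped in using statements so that they are closed even if an exception is thrown partway through. This is only a sketch and not the contest entry's code: the class OpenSketch and the method Lengths are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;

static class OpenSketch
{
    // Hypothetical helper: opens both files read-only and returns their lengths.
    public static long[] Lengths(string pathA, string pathB)
    {
        // Each using block disposes (and therefore closes) its stream on exit,
        // whether the block finishes normally or throws.
        using (FileStream fileA = File.OpenRead(pathA))
        using (FileStream fileB = File.OpenRead(pathB))
        {
            // Both streams start at Position 0 right after being opened.
            return new long[] { fileA.Length, fileB.Length };
        }
    }
}
```

With this shape, the explicit Close() calls at the end of the program are no longer needed.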

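while (fileA.Position < fileA.Length)
{
    b = fileA.ReadByte();

    if (fileB.Position < fileB.Length)
    {
        // Remember where this byte of B starts in case it differs.
        long start = fileB.Position;

        if (b != fileB.ReadByte())
        {
            l = 1;
            bool matched = false;

            // Scan B for the next byte equal to the byte read from A,
            // counting the differing bytes along the way.
            while (fileB.Position < fileB.Length)
            {
                if (fileB.ReadByte() == b)
                {
                    matched = true;
                    break;
                }
                l += 1;
            }

            // Go back and read exactly the bytes that differed.
            byte[] s = new byte[l];
            fileB.Seek(start, SeekOrigin.Begin);
            fileB.Read(s, 0, l);

            // Step over the matching byte again so both streams stay in sync.
            if (matched)
                fileB.ReadByte();

            Console.WriteLine("FileDiff Pos:{0}, Len:{1}, Str:{2}",
                              fileA.Position,
                              l,
                              encode.GetString(s));
        }
    }
}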

fileA.Close();
fileB.Close(); 

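This is the main application loop. As you can see, it steps through file A byte by byte, reading one byte from file B for each. When the two bytes differ, it stops comparing and scans stream B for the next byte equal to the byte just read from stream A; the bytes skipped in B are reported as the difference.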
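To try the scanning idea without files on disk, the same approach can be sketched against any Stream, such as a MemoryStream. This is an illustrative sketch rather than the contest entry itself: the class ScanSketch, the method Scan, and the "offset:length:text" result format are assumptions, and the offset reported here is the position in B where the difference starts.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ScanSketch
{
    // Hypothetical helper: returns one "offset:length:text" entry per difference.
    public static List<string> Scan(Stream a, Stream b)
    {
        List<string> found = new List<string>();
        int next;

        // ReadByte returns -1 at the end of the stream.
        while ((next = a.ReadByte()) != -1)
        {
            long start = b.Position;
            int other = b.ReadByte();

            // Nothing to report if B is exhausted or the bytes agree.
            if (other == -1 || other == next) continue;

            // Collect bytes from B until one equals the byte from A, or B ends.
            List<byte> changed = new List<byte>();
            while (other != -1 && other != next)
            {
                changed.Add((byte)other);
                other = b.ReadByte();
            }

            found.Add(start + ":" + changed.Count + ":" +
                      Encoding.ASCII.GetString(changed.ToArray()));
        }
        return found;
    }
}
```

For example, scanning "abcd" against "abXXcd" (both wrapped in a MemoryStream) yields the single entry "2:2:XX": the two inserted bytes are skipped, and the streams line up again at "c".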

Points of Interest

This contest entry is written in C# with .NET v2. The source is 73 lines long, counting blank lines, comments, formatted code, the timers, etc. Of those, 28 lines are not overhead, blank lines, or comments. It uses 4,384K of memory (Private Working Set), and the EXE is 5.5k.

Run-time is roughly 30 milliseconds. The output is the position in the file, the length of the difference, and its textual representation.
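The article does not show its timing code, so the following is only one plausible way to obtain a run-time figure like the one above, using the Stopwatch class from System.Diagnostics. The class TimingSketch, the delegate type Work, and the method MeasureMilliseconds are hypothetical names for illustration.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical delegate type for the piece of work being timed.
    public delegate void Work();

    // Runs the work once and returns the elapsed wall-clock time in milliseconds.
    public static long MeasureMilliseconds(Work work)
    {
        Stopwatch timer = Stopwatch.StartNew();
        work();
        timer.Stop();
        return timer.ElapsedMilliseconds;
    }
}
```

A call might look like `long ms = TimingSketch.MeasureMilliseconds(delegate { /* run the diff here */ });`. A single measurement includes start-up effects such as JIT compilation, so repeated runs may give different numbers.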

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
