
File Compare Utility for Lean and Mean Contest

27 Aug 2009
File Compare Utility for entry in the Lean and Mean competition

Introduction 

This is a candidate entry for The Code Project's Lean and Mean competition to write an efficient text diffing tool.

The problem can be described like this: create an application that will calculate and display the changes between two text files as fast as possible, using the least amount of memory possible. Provide timing data and maximum memory use data to prove you're the leanest and meanest.

My solution is written in C# and compares two files. The resulting program is compact and fast, and because it reads both files as streams, its memory use does not grow with file size.

Screen Shots

[Screenshot: the application window at startup]

[Screenshot: the result of comparing two different files]

Analysis 

To keep the application fast and simple, we first retrieve the locations of the two files. If both the location and the name are the same, the two references point to the same file, so the content must be the same and nothing needs to be read.
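The path check above is a plain string comparison, so two different spellings of the same path (for example `a.txt` and `.\a.txt`) would not be recognised as the same file. A minimal sketch of a stricter check follows; `PathCheck` and `IsSamePath` are hypothetical names that do not appear in the article, and the case-insensitive comparison assumes a Windows-style file system.

```csharp
using System;
using System.IO;

static class PathCheck
{
    // Hypothetical helper, not part of the original article.
    // Returns true when both strings resolve to the same absolute path.
    public static bool IsSamePath(string path1, string path2)
    {
        // GetFullPath resolves relative segments such as "." and "..".
        string full1 = Path.GetFullPath(path1);
        string full2 = Path.GetFullPath(path2);

        // Assumption: a case-insensitive file system, as is typical on Windows.
        return string.Equals(full1, full2, StringComparison.OrdinalIgnoreCase);
    }
}
```

This still treats hard links and symbolic links to the same file as different paths; in that case the comparison simply falls through to reading the content.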

In the second step, if the names of the two files differ, we retrieve the length of each file. If the lengths differ, the function returns false, and we calculate the elapsed time and report that the files are not the same.

If the lengths are the same, we start reading both files and comparing them consecutively. As long as both files are identical, we continue without any problem. As soon as there is a mismatch, a full diff has to decide whether the current line in file1 has disappeared, the current line in file2 has been inserted, or one line has been edited to become the other. The FileCompare function shown in this article stops at the first mismatch and reports only whether the files are equal. We show the result in a message box.
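To illustrate the line-oriented comparison described above, here is a minimal sketch that reports where two text files first diverge. It is not the article's implementation: `LineCompare` and `FirstDifferentLine` are hypothetical names, and the sketch does not classify the mismatch as an insertion, deletion, or edit.

```csharp
using System.IO;

static class LineCompare
{
    // Hypothetical helper: returns the 1-based number of the first line
    // that differs, or 0 when the files are identical line for line.
    public static int FirstDifferentLine(string file1, string file2)
    {
        using (StreamReader r1 = new StreamReader(file1))
        using (StreamReader r2 = new StreamReader(file2))
        {
            int lineNumber = 0;
            while (true)
            {
                string line1 = r1.ReadLine();
                string line2 = r2.ReadLine();
                lineNumber++;

                // Both files ended at the same point: no difference found.
                if (line1 == null && line2 == null)
                {
                    return 0;
                }

                // A differing line, or one file ending early, is a mismatch.
                if (line1 != line2)
                {
                    return lineNumber;
                }
            }
        }
    }
}
```

Because `ReadLine` strips line terminators, this sketch treats files that differ only in their line endings as identical, which a byte-level comparison would not.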

Using the Code

The core of the application is a single function, FileCompare:

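// Requires "using System.IO;" at the top of the file.
// Returns true when both paths hold byte-for-byte identical content.
private bool FileCompare(string file1, string file2)
{
    // The same path means the same file, so nothing needs to be read.
    if (file1 == file2)
    {
        return true;
    }

    // Open read-only so that write-protected files can be compared too.
    // The using blocks close both streams even if an exception is thrown.
    using (FileStream fs1 = new FileStream(file1, FileMode.Open, FileAccess.Read))
    using (FileStream fs2 = new FileStream(file2, FileMode.Open, FileAccess.Read))
    {
        // Files of different lengths cannot be equal.
        if (fs1.Length != fs2.Length)
        {
            return false;
        }

        int file1byte;
        int file2byte;

        // Read one byte from each file until the bytes differ
        // or the end of the file (-1) is reached.
        do
        {
            file1byte = fs1.ReadByte();
            file2byte = fs2.ReadByte();
        }
        while ((file1byte == file2byte) && (file1byte != -1));

        // Equal only if the loop ended at the end of the file,
        // not at a mismatch.
        return file1byte == file2byte;
    }
}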

This function takes the paths of the two files to compare as input and returns a Boolean value: true if the files are identical, false otherwise.

The sample code that is described in this article performs a byte-by-byte comparison until it finds a mismatch or it reaches the end of the file.

The code also performs two simple checks to increase the efficiency of the comparison:

  • If both file references point to the same file, the two files must be equal.
  • If the size of the two files is not the same, the two files are not the same.
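As a variation on the byte-by-byte loop, the same comparison can read both files in blocks. The sketch below is an alternative technique, not the article's code: `BlockCompare`, `FilesEqual`, `FillBuffer`, and the 4096-byte block size are assumptions made for illustration. Whether it is measurably faster depends on the files and the runtime, since `FileStream` already buffers reads internally.

```csharp
using System.IO;

static class BlockCompare
{
    // Hypothetical variant of FileCompare that reads blocks instead of single bytes.
    public static bool FilesEqual(string file1, string file2)
    {
        const int BufferSize = 4096; // assumed block size, not taken from the article

        using (FileStream fs1 = new FileStream(file1, FileMode.Open, FileAccess.Read))
        using (FileStream fs2 = new FileStream(file2, FileMode.Open, FileAccess.Read))
        {
            // Same shortcut as in the article: different lengths mean different files.
            if (fs1.Length != fs2.Length)
            {
                return false;
            }

            byte[] buffer1 = new byte[BufferSize];
            byte[] buffer2 = new byte[BufferSize];

            while (true)
            {
                int read1 = FillBuffer(fs1, buffer1);
                int read2 = FillBuffer(fs2, buffer2);

                if (read1 != read2) return false; // one file ended early
                if (read1 == 0) return true;      // both files ended without a mismatch

                for (int i = 0; i < read1; i++)
                {
                    if (buffer1[i] != buffer2[i]) return false;
                }
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    private static int FillBuffer(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = stream.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

Memory use stays constant at two small buffers regardless of file size, which keeps the approach in line with the contest's goal.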

Points of Interest

This functionality is similar to the MS-DOS-based Fc.exe utility that is included with various versions of Microsoft Windows and Microsoft MS-DOS, and with some development tools.

The application takes less than 10 milliseconds to compare two files and uses about 40 KB of memory for the full run.
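One way such figures could be collected is sketched below, using `Stopwatch` for elapsed time and the process's peak working set for memory. This is an assumption about how to measure, not a description of how the numbers above were obtained; `Measure`, `Time`, and `PeakWorkingSetBytes` are hypothetical names.

```csharp
using System;
using System.Diagnostics;

static class Measure
{
    // Hypothetical helper: runs the given action once and returns the elapsed time.
    public static TimeSpan Time(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    // Hypothetical helper: peak physical memory used by the current process, in bytes.
    public static long PeakWorkingSetBytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.PeakWorkingSet64;
        }
    }
}
```

A call such as `Measure.Time(() => FileCompare(path1, path2))` would time a single comparison. The peak working set includes the .NET runtime itself, so it is usually larger than the memory the comparison alone needs.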

History

  • 28th August, 2009: Initial post

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
