Introduction
The code snippet presented here compares two files by implementing IEqualityComparer<FileInfo>.
The comparison does the following:
- Use the == operator to check whether both references point to the same instance of the FileInfo class (FileInfo does not overload ==, so this is a reference comparison); if they do, true is returned
- If only one of the objects is null, they can't represent the same file, thus false is returned
- If the file path is the same for both objects, true is returned since they must refer to the same file
- If the file sizes differ, the files can't be identical either, thus false is returned
- Finally, we resort to comparing the MD5 hashes of both files.
Please keep in mind that MD5 hashing is an expensive operation, which makes it unsuitable for comparing a lot of files or large files. The code presented here was initially intended to be used within an integration test. If you need to compare a lot of files (or very large ones), you may want to write your own implementation; this stackoverflow discussion is a good place to start.
Background
Please keep in mind that this implementation has to read the entire contents of each file to compute its MD5 hash. If you're trying to compare very large files, this may slow down your application considerably.
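If hashing turns out to be too slow, one possible alternative is to compare the two files chunk by chunk and stop at the first difference. The following is only a minimal sketch of that idea, not part of the original code: the class name StreamedFileComparer, the method name ContentEquals and the 64 KB buffer size are assumptions chosen for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original article.
public static class StreamedFileComparer
{
    // Assumed chunk size; tune it for your environment.
    private const int BufferSize = 64 * 1024;

    public static bool ContentEquals(string pathX, string pathY)
    {
        // Different sizes rule out identical content without reading anything.
        if (new FileInfo(pathX).Length != new FileInfo(pathY).Length)
        {
            return false;
        }

        using (var streamX = File.OpenRead(pathX))
        using (var streamY = File.OpenRead(pathY))
        {
            var bufferX = new byte[BufferSize];
            var bufferY = new byte[BufferSize];
            while (true)
            {
                int readX = ReadBlock(streamX, bufferX);
                int readY = ReadBlock(streamY, bufferY);
                if (readX != readY)
                {
                    return false;
                }
                if (readX == 0)
                {
                    // Both streams reached the end without a mismatch.
                    return true;
                }
                for (int i = 0; i < readX; i++)
                {
                    if (bufferX[i] != bufferY[i])
                    {
                        // Early exit at the first differing byte.
                        return false;
                    }
                }
            }
        }
    }

    // Stream.Read may return fewer bytes than requested,
    // so keep reading until the buffer is full or the stream ends.
    private static int ReadBlock(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                break;
            }
            total += read;
        }
        return total;
    }
}
```

Unlike a hash comparison, this approach never has to read past the first differing chunk, but it still reads both files completely when they are identical.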
The code
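using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public class FileMd5EqualityComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        // Same instance (or both null). FileInfo does not overload ==,
        // so this is a reference comparison.
        if (x == y)
        {
            return true;
        }
        // Only one of the two is null.
        if (x == null || y == null)
        {
            return false;
        }
        // Same path means same file.
        if (x.FullName == y.FullName)
        {
            return true;
        }
        // Different sizes rule out identical content without reading the files.
        if (x.Length != y.Length)
        {
            return false;
        }
        // Last resort: compare the MD5 hashes of the contents.
        var md5X = GetMd5(x.FullName);
        var md5Y = GetMd5(y.FullName);
        return md5X == md5Y;
    }

    public int GetHashCode(FileInfo obj)
    {
        if (obj == null)
        {
            throw new ArgumentNullException(nameof(obj));
        }
        // Must agree with Equals: files that compare equal have the same length,
        // whereas obj.GetHashCode() differs between instances for the same file.
        return obj.Length.GetHashCode();
    }

    private static string GetMd5(string filePath)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(filePath))
        {
            // Hex-encode the hash; decoding raw hash bytes as text is lossy.
            return BitConverter.ToString(md5.ComputeHash(stream));
        }
    }
}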
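Since the comparer was written with integration tests in mind, here is a hedged sketch of how any IEqualityComparer<FileInfo> could be used to check that two directories contain the same files. The class DirectoryComparison and its method HaveSameFiles are hypothetical names introduced for this example only; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original article.
public static class DirectoryComparison
{
    // Pairs the files of both directories by sorted file name and compares
    // each pair with the given comparer. The names themselves are not
    // compared, only what the comparer decides about each pair.
    public static bool HaveSameFiles(
        string directoryA,
        string directoryB,
        IEqualityComparer<FileInfo> comparer)
    {
        var filesA = new DirectoryInfo(directoryA)
            .GetFiles()
            .OrderBy(f => f.Name, StringComparer.Ordinal);
        var filesB = new DirectoryInfo(directoryB)
            .GetFiles()
            .OrderBy(f => f.Name, StringComparer.Ordinal);

        // SequenceEqual returns false if the counts differ or any pair is unequal.
        return filesA.SequenceEqual(filesB, comparer);
    }
}
```

In a test this could be called as DirectoryComparison.HaveSameFiles(expectedDir, actualDir, new FileMd5EqualityComparer()), where expectedDir and actualDir are placeholders for your own paths.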
History
2018-05-25 Initial version