Find files that have the same contents

1 Sep 2012
Find files that have the same contents under a specified folder

Introduction 

This article shows how to find files that have the same contents, even when their names differ.

Background 

I found that my computer holds many files with the same contents but different file names. I wanted to clean them up and keep just one copy of each, so I wrote this small tool to find the identical files.

Using the code 

I use a dictionary to store each file's mark (calculated with the MD5 algorithm) together with the list of files that share it:

 Dictionary<string, List<string>> dtFiles = new Dictionary<string, List<string>>();

The tool then visits each file under the folder and computes its mark. A byte[] key would be compared by reference rather than by content, so in order to use the Dictionary.ContainsKey method I convert the byte[] to a Base64 string.

                // Body of FindSameFile(strFolder_, delStart_, dtFiles_); the method header is not shown here.
                string[] strFiles = Directory.GetFiles(strFolder_);
                foreach (string strFullFile in strFiles)
                {
                    // Stop early when the stop flag is set
                    if (_bToStop)
                        return;

                    try
                    {
                        // Hash the file and use the Base64 text of the hash as the dictionary key
                        byte[] byMd5 = Xugd.Hash.XMd5.CalcFile(strFullFile);
                        string strMd5 = XConvert.BytesToString(byMd5);

                        List<string> lstFiles;
                        if (!dtFiles_.TryGetValue(strMd5, out lstFiles))
                        {
                            // First file with this hash: start a new group
                            lstFiles = new List<string>(2);
                            dtFiles_.Add(strMd5, lstFiles);
                        }
                        lstFiles.Add(strFullFile);
                    }
                    catch { } // Any exception is swallowed, so a file that cannot be hashed is skipped
                }

                // Recurse into subdirectories
                string[] strDirs = Directory.GetDirectories(strFolder_);
                foreach (string strSub in strDirs)
                    FindSameFile(strSub, delStart_, dtFiles_);
            }
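The source of the Xugd.Hash.XMd5.CalcFile and XConvert.BytesToString helpers is not shown in this article. As a rough sketch of what such a helper pair might do, the following uses only standard .NET classes. The class name FileFingerprint and the method name Md5Base64 are made up for illustration and are not part of the original tool.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileFingerprint
{
    // Hypothetical stand-in for XMd5.CalcFile + XConvert.BytesToString:
    // hashes the file's bytes with MD5 and returns the digest as Base64 text.
    public static string Md5Base64(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            // ComputeHash reads the stream to its end and returns 16 bytes
            byte[] hash = md5.ComputeHash(stream);
            return Convert.ToBase64String(hash);
        }
    }
}
```

Two files with the same bytes give the same 24-character string, and strings are compared by value, so the result can be used directly as a dictionary key.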

After all files have been processed, we can check each item in the dictionary. Any item whose list holds more than one path is a group of files that have the same contents.

                foreach (KeyValuePair<string, List<string>> kvFile in dtFiles)
                {
                    if (kvFile.Value.Count > 1)
                    {
                        // Found duplicates: kvFile.Value lists the files that share this mark
                    }
                }
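One possible way to fill in the "found" branch is to collect the groups first and report them afterwards. The sketch below assumes the dictionary layout described above (Base64 mark mapped to a list of paths). The DuplicateReport class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateReport
{
    // Returns only the groups that contain more than one file path.
    public static List<List<string>> GetDuplicateGroups(
        Dictionary<string, List<string>> filesByHash)
    {
        List<List<string>> groups = new List<List<string>>();
        foreach (KeyValuePair<string, List<string>> entry in filesByHash)
        {
            if (entry.Value.Count > 1)
                groups.Add(entry.Value);
        }
        return groups;
    }

    // Writes each group to the console, one path per line,
    // with an empty line between groups.
    public static void Print(List<List<string>> groups)
    {
        foreach (List<string> group in groups)
        {
            foreach (string path in group)
                Console.WriteLine(path);
            Console.WriteLine();
        }
    }
}
```

Keeping the filtering separate from the output means the same groups could feed a list view or a delete step instead of the console.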

 Points of Interest  

This short article is written for developers who want to find files that have the same contents.

History

31 August 2012: First version  

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
