
Writing a Folder Synchronization Application

8 Nov 2010 | CPOL | 8 min read
This article uses light threading, a mutex, and a simple algorithm to determine which files to synchronize.

Getting Started

To try the application without building the source, run the downloaded installer, then start the application from the desktop shortcut. Open the second tab and click the Add button at the bottom. Type the source folder location and the destination folder location, then click Save. Wait for the processing window to close, then click the Hide button. The application then synchronizes the two folders in the background. Note that both folders must exist. If one or both of them do not exist, the application assumes that the location is offline and keeps monitoring until it becomes available. Once it is available, the application searches for differences and starts the syncing task.

Introduction

The application attached to this article is fully functional. I wrote it because I have many files that need to be backed up on both my office and home computers, and I often forget whether the copy on my home computer or the one in the office is the latest. Sometimes I have made changes on both computers, so I needed a way to synchronize the two folders.

That is how FolderSync started; this is the very first version, 1.0. The application synchronizes two folders either in both directions (source to destination and destination to source) or in one direction only (source to destination).

Background

The core of this application is a class named FolderSynchronization. When started, this class starts three threads. The first is the Syncing Thread, which pops each new FileOperation from a queue. A FileOperation is an object with a polymorphic DoOperation() method; the Syncing Thread calls this method to perform an operation such as copying a file from one folder to another, or creating a folder if it does not exist. These FileOperation objects are created by the second thread and pushed to the queue.

The second thread, the Scanning Thread, scans the two folders and searches for differences. For each difference it finds, it creates a FileOperation object and pushes it to the queue so that the first thread can process it. If the source folder, the destination folder, or both do not exist, the Scanning Thread instead queues a monitor job for the third thread. The third thread, the Monitor Thread, pops jobs from its queue and checks whether the source and destination folders both exist. If they do not, it keeps monitoring until they do; once both exist, it searches for differences and pushes FileOperation objects to the queue.
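
The hand-off between the Scanning Thread and the Syncing Thread can be pictured with a small producer/consumer sketch. This is not the FolderSync source: the class names ending in Sketch are hypothetical, and the use of lock with Monitor.Wait and Monitor.Pulse is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical base class: the article only states that a FileOperation
// has a polymorphic DoOperation() method.
public abstract class FileOperationSketch
{
    public abstract void DoOperation();
}

// One concrete operation: create a folder if it does not exist.
public class CreateFolderOperationSketch : FileOperationSketch
{
    private readonly string _path;
    public CreateFolderOperationSketch(string path) { _path = path; }
    public override void DoOperation() { Directory.CreateDirectory(_path); }
}

// Another concrete operation: copy a file, overwriting the target.
public class CopyFileOperationSketch : FileOperationSketch
{
    private readonly string _from;
    private readonly string _to;
    public CopyFileOperationSketch(string from, string to) { _from = from; _to = to; }
    public override void DoOperation() { File.Copy(_from, _to, true); }
}

// A queue guarded by a lock; the consumer sleeps until work arrives.
public class SyncQueueSketch
{
    private readonly Queue<FileOperationSketch> _queue = new Queue<FileOperationSketch>();
    private readonly object _gate = new object();
    private bool _stopped;

    // Called by the producer (the scanning side).
    public void Enqueue(FileOperationSketch op)
    {
        lock (_gate) { _queue.Enqueue(op); Monitor.Pulse(_gate); }
    }

    // Ask the consumer to exit once the queue has been drained.
    public void Stop()
    {
        lock (_gate) { _stopped = true; Monitor.PulseAll(_gate); }
    }

    // Body of the consumer (the syncing side).
    public void Run()
    {
        while (true)
        {
            FileOperationSketch op;
            lock (_gate)
            {
                while (_queue.Count == 0 && !_stopped) Monitor.Wait(_gate);
                if (_queue.Count == 0) return; // stopped and nothing left to do
                op = _queue.Dequeue();
            }
            op.DoOperation(); // run outside the lock so producers are not blocked
        }
    }
}
```

A consumer thread would be started with new Thread(queue.Run).Start(), after which any other thread can call Enqueue().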

Using the Code

First, create an instance of the synchronization object and start it. Then pass a FolderSynchronizationScannerItem to the object's AddScan method, and it will begin scanning and synchronizing both folders:

C#
FolderSynchronization _Sync = new FolderSynchronization();
_Sync.Start();

FolderSynchronizationScannerItem fssi = new FolderSynchronizationScannerItem();
// Verbatim strings (@"...") keep the backslashes in the paths literal.
fssi.Source = @"C:\users\username\desktop\Folder1";
fssi.Destination = @"C:\users\username\desktop\Folder2";
fssi.Option = FolderSynchorizationOption.Both; // synchronize in both directions
fssi.Monitor = true;
_Sync.AddScan(fssi);

How the Code Works

As already explained, three threads are running after you call _Sync.Start. When you call _Sync.AddScan, the calling thread adds the FolderSynchronizationScannerItem to _ScanQueue. The Scanning Thread dequeues the item and calls the Sync() method of the FolderSynchronizationScanner class, which keeps working on that thread until all directories and subdirectories have been visited, then returns a MultiKeyCollection<FileOperation>. MultiKeyCollection is basically a wrapper for Dictionary<string, Object>; the difference is that it builds the string key from the properties of the stored object. If you create a MultiKeyCollection with string[] {"PropA", "PropB", "PropC"}, then every time you add an object to the collection, it creates a key from the PropA, PropB, and PropC properties of that object, so the object must have those properties.

The Scanning Thread then loops over all the FileOperation objects returned by FolderSynchronizationScanner and queues them in the _SyncQueue of the FolderSynchronization class. The Syncing Thread automatically dequeues them and processes them in the background.
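
To make the idea of a key built from property names concrete, here is a minimal sketch of such a collection. It is not the MultiKeyCollection from the download: the names MultiKeyCollectionSketch, BuildKey, TryAdd, and SampleItem are hypothetical, and looking up the properties through reflection is an assumption about how this could be done.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class MultiKeyCollectionSketch<T>
{
    private readonly Dictionary<string, T> _items = new Dictionary<string, T>();
    private readonly PropertyInfo[] _keyProperties;

    // Resolve the named properties once, and fail early if one is missing.
    public MultiKeyCollectionSketch(params string[] propertyNames)
    {
        _keyProperties = new PropertyInfo[propertyNames.Length];
        for (int i = 0; i < propertyNames.Length; i++)
        {
            PropertyInfo property = typeof(T).GetProperty(propertyNames[i]);
            if (property == null)
                throw new ArgumentException(
                    "Type " + typeof(T).Name + " has no property " + propertyNames[i]);
            _keyProperties[i] = property;
        }
    }

    // Join the property values with '#' to form the dictionary key.
    public string BuildKey(T item)
    {
        string[] parts = new string[_keyProperties.Length];
        for (int i = 0; i < parts.Length; i++)
            parts[i] = Convert.ToString(_keyProperties[i].GetValue(item, null));
        return string.Join("#", parts);
    }

    // Returns true when the item was added, false when its key already exists.
    public bool TryAdd(T item)
    {
        string key = BuildKey(item);
        if (_items.ContainsKey(key)) return false;
        _items.Add(key, item);
        return true;
    }

    public int Count { get { return _items.Count; } }
}

// Hypothetical item type used only to exercise the sketch.
public class SampleItem
{
    public string PropA { get; set; }
    public string PropB { get; set; }
}
```

With new MultiKeyCollectionSketch<SampleItem>("PropA", "PropB"), two items whose PropA and PropB values match produce the same key, so the second TryAdd() call returns false.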

The interesting part of the scanning process is this method:

C#
protected void StartFolder(string folder, int level)
{
    string sourcePath = Path.Combine(_Source, folder);
    string destinationPath = Path.Combine(_Destination, folder);

    // Queue a create-folder task for whichever side is missing.
    if (!Directory.Exists(sourcePath)) AddCreateFolderTask(sourcePath);
    if (!Directory.Exists(destinationPath)) AddCreateFolderTask(destinationPath);

    // Compare files source -> destination, then the reverse when syncing both ways.
    if (Directory.Exists(sourcePath))
        LoadFilesInFirstPath(sourcePath, destinationPath, false);
    if (_Options == FolderSynchorizationOption.Both && Directory.Exists(destinationPath))
        LoadFilesInFirstPath(destinationPath, sourcePath, true);

    // Recurse into the subfolders found on the source side.
    if (Directory.Exists(sourcePath))
    {
        foreach (string subfolder in Directory.GetDirectories(sourcePath))
        {
            string shortfoldername = subfolder.GetLastPathName(level + 1);
            StartFolder(shortfoldername, level + 1);
        }
    }

    // When syncing both ways, also recurse into the destination's subfolders.
    if (_Options == FolderSynchorizationOption.Both && Directory.Exists(destinationPath))
    {
        foreach (string subfolder in Directory.GetDirectories(destinationPath))
        {
            string shortfoldername = subfolder.GetLastPathName(level + 1);
            StartFolder(shortfoldername, level + 1);
        }
    }
}

This method runs as part of the scan that AddScan() triggers. It searches for differences between the two folders, taking into account whether the sync option is Both or Destination. The first call is StartFolder("", -1), which denotes the root folder. The level value tells GetLastPathName how many trailing folder names to take from a path. For example, calling stringObject.GetLastPathName(0) where stringObject is "c:\users\username\John" returns "John"; calling it with level 1 returns "username\John", and so on. Starting at -1 means that the subfolders directly under the root are processed at level 0. This extension method returns the last few subdirectories of a path as a string. That string is later combined with the Source and Destination folders to check whether the directory exists on each side; if it does not, a FileOperation task for creating the folder is added.
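
The behaviour described for GetLastPathName can be sketched as the extension method below. The real implementation ships with the download and may differ; the class name PathSketchExtensions is hypothetical, and splitting on both '\' and '/' and returning an empty string for a negative level are assumptions.

```csharp
using System;
using System.IO;

public static class PathSketchExtensions
{
    // Returns the last (level + 1) segments of a path.
    // Level 0 gives the last folder name, level 1 the last two, and so on.
    public static string GetLastPathName(this string path, int level)
    {
        if (path == null) throw new ArgumentNullException("path");

        string[] parts = path.Split(
            new[] { '\\', '/' }, StringSplitOptions.RemoveEmptyEntries);

        // Never take more segments than the path contains.
        int count = Math.Min(level + 1, parts.Length);
        if (count <= 0) return string.Empty; // e.g. level -1 stands for the root

        return string.Join(
            Path.DirectorySeparatorChar.ToString(),
            parts, parts.Length - count, count);
    }
}
```

On Windows, calling GetLastPathName(1) on the path from the example above yields the last two folder names joined by a backslash; the result can then be passed to Path.Combine() with either root folder.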

The comparison of two files is done using ComponentLibrary. ComponentLibrary deserves a discussion of its own, but in short, it queries all DLL files inside a folder and checks whether any type in each DLL matches the template (generic) type. Here is the code:

C#
ComponentLibrary<IFileComparer> libraries = new ComponentLibrary<IFileComparer>();

When the above code is executed, it searches all DLL files in the executable's directory and checks whether any type inside each DLL implements IFileComparer. If one is found, it creates an instance of that type and adds it to the collection. This libraries object is then wrapped in a class named MultiFileComparer. Here is the code:

C#
MultiFileComparer<IFileComparer> _Comparer =
    new MultiFileComparer<IFileComparer>(libraries);
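
For readers curious about the DLL scanning step, here is a minimal sketch of a plug-in loader in the spirit of ComponentLibrary. It is an assumption about how such a class could work, not the class from the download: ComponentLibrarySketch, AddFromDirectory, AddFrom, ISamplePlugin, and SamplePlugin are all hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public class ComponentLibrarySketch<T> where T : class
{
    private readonly List<T> _components = new List<T>();
    public IList<T> Components { get { return _components; } }

    // Load every DLL in a folder and collect the matching types.
    public void AddFromDirectory(string directory)
    {
        foreach (string file in Directory.GetFiles(directory, "*.dll"))
        {
            try { AddFrom(Assembly.LoadFrom(file)); }
            catch (BadImageFormatException) { /* not a .NET assembly, skip it */ }
        }
    }

    // Instantiate every concrete type in the assembly that is assignable to T
    // and has a public parameterless constructor.
    public void AddFrom(Assembly assembly)
    {
        foreach (Type type in assembly.GetTypes())
        {
            if (type.IsAbstract || type.IsInterface || type.ContainsGenericParameters) continue;
            if (!typeof(T).IsAssignableFrom(type)) continue;
            if (type.GetConstructor(Type.EmptyTypes) == null) continue;
            _components.Add((T)Activator.CreateInstance(type));
        }
    }
}

// Hypothetical plug-in contract and implementation used to exercise the sketch.
public interface ISamplePlugin { string Name { get; } }
public class SamplePlugin : ISamplePlugin
{
    public string Name { get { return "sample"; } }
}
```

Calling AddFromDirectory(AppDomain.CurrentDomain.BaseDirectory) would scan the folder the executable runs from. The MultiFileComparer constructed above is handed such a library.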

MultiFileComparer is basically a wrapper for the ComponentLibrary class. It has a method named Compare(), which loops through all instances that implement IFileComparer inside the ComponentLibrary, executes the Compare() method of each instance, sums the results, and returns the total. If the result is < 0, the source file needs to be overwritten; if the result is > 0, the destination file needs to be overwritten; 0 means the files are treated as identical. DefaultSyncComparer is the class that implements IFileComparer, and it resides in the FolderSync.Library.Comparer project. It lives in a separate DLL so that it is extendable: programmers like you can write your own rules for when a file is overwritten. Here is the MultiFileComparer class source:

C#
public class MultiFileComparer<T>
{
    protected ComponentLibrary<T> _ComponentList = null;

    public MultiFileComparer(ComponentLibrary<T> _componentList)
    {
        _ComponentList = _componentList;
    }

    public int Compare(FileInfo file1, FileInfo file2)
    {
        int result = 0;
        if (_ComponentList == null || _ComponentList.Components.Count <= 0)
            return result;
        foreach (IFileComparer _comparer in _ComponentList.Components)
        {
            result = result + _comparer.Compare(file1, file2);
        }
        return result;
    }
}

In short, MultiFileComparer adds up the values returned by all IFileComparer instances, and the sign of the total (< 0 or > 0) decides the result. This means you can create a type that returns a large value such as -1000 for a condition that must always overwrite the source file; when it is added to the values from other types, the total will still be negative. In other words, you can write your own extension for file comparison conditions by returning values of different magnitude.
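
The weighting idea can be illustrated with two small comparers. This is a sketch under assumptions: the interface IFileComparerSketch mirrors the Compare(FileInfo, FileInfo) call seen in MultiFileComparer but is not the real IFileComparer, and the rule "an empty source file must always be overwritten" is invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Assumed shape of the comparer contract: negative means the source file
// (file1) should be overwritten, positive means the destination (file2).
public interface IFileComparerSketch
{
    int Compare(FileInfo file1, FileInfo file2);
}

// Date rule: the older file loses. Returns -1, 0, or +1.
public class NewestDateComparerSketch : IFileComparerSketch
{
    public int Compare(FileInfo file1, FileInfo file2)
    {
        return Math.Sign(file1.LastWriteTimeUtc.CompareTo(file2.LastWriteTimeUtc));
    }
}

// Hypothetical high-priority rule: an empty source never wins over a
// non-empty destination, whatever the dates say.
public class EmptySourceComparerSketch : IFileComparerSketch
{
    public int Compare(FileInfo file1, FileInfo file2)
    {
        return (file1.Length == 0 && file2.Length > 0) ? -1000 : 0;
    }
}

public static class ComparerSumSketch
{
    // Same summing idea as MultiFileComparer.Compare().
    public static int Sum(IEnumerable<IFileComparerSketch> comparers,
                          FileInfo file1, FileInfo file2)
    {
        int result = 0;
        foreach (IFileComparerSketch comparer in comparers)
            result += comparer.Compare(file1, file2);
        return result;
    }
}
```

If the source file is newer but empty, the date rule contributes +1 and the empty-file rule contributes -1000, so the total stays negative and the source is still the file that gets overwritten.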

How Synchronization Works

When StartFolder() is called, it loops through all files and subdirectories of the source folder and checks whether each one exists in the destination. If not, it adds a task to create the file or directory; if it exists, it checks which copy has the latest date. The DefaultSyncComparer checks only the latest date, not the size; you can extend it with your own logic using FolderSync.Library.Comparer, and let me know if you do. When all files and directories in the source folder have been compared with the destination folder, we are only half done, because the destination folder may have files or folders that the source folder does not have. When the sync option is set to Both in the FolderSynchronizationScannerItem.Option property, the scanner also loops through all files and folders of the destination folder and compares them with the source.

For example, we have a case like this:

  • Source Folder has a file a.txt which is dated 1 January 2011
  • Destination Folder has a file a.txt too which is dated 2 January 2011

In the first loop, the scanner gets all files from the source folder, which includes a.txt. It then checks the destination folder and finds that the source file needs to be overwritten. It creates a FileOperation object and adds it to the collection; the collection checks that no similar key already exists, so the object is added. The key is created from SourceFileName, followed by the delimiter '#', followed by DestinationFileName.

Then the second loop runs, this time with the parameters swapped (destination to source). It compares the destination a.txt with the source a.txt and reaches the same result: the source file needs to be overwritten. When it calls _SyncCollection.GetAddObject, the _SyncCollection (which is a MultiKeyCollection) determines that an object with the same key already exists, so it ignores the second FileOperation task and does not add it to the collection; the file is not copied twice. Refer to FolderSynchronizationScanner.cs in the source code for more detail. The second loop matters when a file exists in the destination folder but not in the source folder; in that case, a FileOperation task is created to copy the file from the destination folder to the source folder.
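
The two loops and the duplicate check can be traced with a small in-memory sketch. It does not touch the disk and is not the scanner from FolderSynchronizationScanner.cs: TwoPassScanSketch and its methods are hypothetical, files are represented only by name and date, and the planned copies are stored under a "from#to" key as described above.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassScanSketch
{
    // Returns the planned copies, keyed by "from#to"; the value is the target path.
    public static Dictionary<string, string> Plan(
        IDictionary<string, DateTime> source,
        IDictionary<string, DateTime> destination,
        string sourceRoot, string destinationRoot)
    {
        var plan = new Dictionary<string, string>();
        Pass(source, destination, sourceRoot, destinationRoot, plan); // first loop
        Pass(destination, source, destinationRoot, sourceRoot, plan); // second loop
        return plan;
    }

    // Walk the files of the first folder and decide the copy direction for each.
    private static void Pass(
        IDictionary<string, DateTime> first, IDictionary<string, DateTime> second,
        string firstRoot, string secondRoot, Dictionary<string, string> plan)
    {
        foreach (KeyValuePair<string, DateTime> file in first)
        {
            string inFirst = firstRoot + "\\" + file.Key;
            string inSecond = secondRoot + "\\" + file.Key;

            DateTime otherDate;
            bool exists = second.TryGetValue(file.Key, out otherDate);

            if (!exists || file.Value > otherDate) Add(plan, inFirst, inSecond);
            else if (file.Value < otherDate) Add(plan, inSecond, inFirst);
            // equal dates: nothing to do
        }
    }

    // Ignore an operation whose key is already planned, so no file is copied twice.
    private static void Add(Dictionary<string, string> plan, string from, string to)
    {
        string key = from + "#" + to;
        if (!plan.ContainsKey(key)) plan.Add(key, to);
    }
}
```

For the a.txt case above, both loops produce the same key (destination a.txt to source a.txt), so the plan ends up with a single entry. A file that exists only in the destination is picked up by the second loop alone.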

Points of Interest

If you have any input, ideas, or improvements, please write to my email, jokenjp@yahoo.com, and prepend [FolderSync] to the subject.

After testing, I found several bugs related to UAC. Can anyone help me sort them out? I am planning to have the application impersonate the logged-on user, so it might need to ask for a password, because it is troublesome if the user is prompted to run the application as administrator every time it runs at startup. Does anyone have any ideas? Thanks.

History

This is the very first version 1.0. The features are:

  1. Running at startup
  2. Synchronizing two folders in the background
  3. Single-instance application
  4. Extendable: the algorithm that determines synchronization can be extended
  5. Fully object-oriented code

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)