Scan directories using recursion

15 Jul 2008
A generic class for scanning directories using recursion and events

Introduction

The class described in this article allows you to scan through a directory and its subdirectories. The class uses recursion and events. The events keep the class unaware of whatever you want to do with the directory tree, so you can build all sorts of functionality that requires scanning (sub)directories. The downloadable class has the following features:

  • Public methods to start scanning a directory (and its subdirectories) by providing a path as a string, or a DirectoryInfo object.
  • An event raised for each file found during the scan, providing file details in a FileInfo object.
  • An event raised when a scan enters or leaves a directory, providing directory details in a DirectoryInfo object.
  • Arguments in the events to notify the class to stop the scan.
  • Overridable methods which allow you to handle found files and/or directories without the events.
  • Wildcard search pattern.

In this article I will only explain the basics of how I built this class. The combined code snippets here will not reveal the complete implementation. For that please refer to the downloads and the XML documentation tags in the code.

Possible uses

I've used the class described here in several projects already. Some possible uses for this class:

  • Find files in a given directory that match a certain regular expression, rather than a wildcard.
  • Delete all files that are 0 bytes long.
  • Change the last modified date for all files in a specified directory.
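The first two uses above boil down to a test on each FileInfo the scan reports. The following is only a sketch of such tests; the class name FileFilters and its method names are made up for illustration and are not part of the download. They could be called from a FileEvent handler or from an overridden method.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helpers, not part of the downloadable class.
public static class FileFilters
{
    // True when the file name (without its path) matches the regular expression.
    public static bool NameMatches(FileInfo file, Regex pattern)
    {
        return pattern.IsMatch(file.Name);
    }

    // True when the file exists and is 0 bytes long.
    // The Exists check avoids the exception that Length throws for a missing file.
    public static bool IsEmpty(FileInfo file)
    {
        return file.Exists && file.Length == 0;
    }
}
```

A handler could, for instance, call FileFilters.IsEmpty(e.Info) and delete the file only when it returns true.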

Scanning the directories

To scan through the directories I first created a simple method which uses recursion to "walk" the tree. The basic class looks like this:

public class ScanDirectory
{
    public void WalkDirectory(string directory)
    {
        WalkDirectory(new DirectoryInfo(directory));
    }

    private void WalkDirectory(DirectoryInfo directory)
    {
        // Scan all files in the current path
        foreach (FileInfo file in directory.GetFiles())
        {
            // Do something with each file.
        }

        DirectoryInfo [] subDirectories = directory.GetDirectories();

        // Scan the directories in the current directory and call this method 
        // again to go one level into the directory tree
        foreach (DirectoryInfo subDirectory in subDirectories)
        {
            WalkDirectory(subDirectory);
        }
    }
}        

The public method WalkDirectory calls the private method WalkDirectory with a new instance of the DirectoryInfo object. The private method uses recursion to walk through all the levels of the directory tree.
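To see the recursion do something observable, here is a minimal, self-contained sketch that uses the same two loops but collects the full path of every file it meets. The names RecursiveWalkSketch and CollectFiles are assumptions made for this example; the downloadable class raises events instead of building a list.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the downloadable class: it applies the
// same recursion as WalkDirectory but collects the path of every file found.
public static class RecursiveWalkSketch
{
    public static List<string> CollectFiles(string root)
    {
        List<string> found = new List<string>();
        Walk(new DirectoryInfo(root), found);
        return found;
    }

    private static void Walk(DirectoryInfo directory, List<string> found)
    {
        // Handle the files in the current directory first.
        foreach (FileInfo file in directory.GetFiles())
        {
            found.Add(file.FullName);
        }

        // Then go one level deeper for every subdirectory.
        foreach (DirectoryInfo subDirectory in directory.GetDirectories())
        {
            Walk(subDirectory, found);
        }
    }
}
```

Because each call handles a directory's files before descending, files in a parent directory always appear in the list before files in its subdirectories.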

At this point I have a very simple way of scanning all files in a directory tree. But I still needed a way to allow an action to take place for each file without having to implement it in this method. For that I created an event which is raised for each file found during the scan. The event has to send information about the current file to the calling application. So first define the FileEventArgs object which will be sent to the calling application when the event is raised:

public class FileEventArgs : EventArgs
{
    /// <summary>
    /// Defined internal to prevent construction outside this assembly.
    /// </summary>
    /// <param name="fileInfo"><see cref="FileInfo"/> object for the current file.</param>
    internal FileEventArgs(FileInfo fileInfo)
    {
        if (fileInfo == null) throw new ArgumentNullException("fileInfo");

        // Store the file information
        _fileInfo = fileInfo;
    }

    private FileInfo _fileInfo;

    /// <summary>
    /// Gets the current file information.
    /// </summary>
    /// <value>The <see cref="FileInfo"/> object for the current file.</value>
    public FileInfo Info
    {
        get { return _fileInfo; }
    }
}

Now that this is done, the event can be implemented in the basic class. To do this, we first need to add a delegate to the code, which is basically the prototype definition for the event.

/// <summary>
/// Definition for the FileEvent.
/// </summary>
public delegate void FileEventHandler(object sender, FileEventArgs e);

Then we can define a public event to allow the calling application to connect an event handler to this class.

/// <summary>
/// Event is raised for each file in a directory.
/// </summary>
public event FileEventHandler FileEvent;

A method is implemented in this class to make sure the event can be raised safely. It checks whether the calling application has connected an event handler to this class. If so, the event argument object is instantiated and the event is raised.

/// <summary>
/// Raises the file event.
/// </summary>
/// <param name="fileInfo"><see cref="FileInfo"/> object for the current file.</param>
private void RaiseFileEvent(FileInfo fileInfo)
{
    // Only do something when a handler has been attached to the event.
    if (FileEvent != null)
    {
        // Create a new argument object for the file event.
        FileEventArgs args = new FileEventArgs(fileInfo);

        // Now raise the event.
        FileEvent(this, args);
    }
}

The last thing to do is to raise the event when a file has been found. The loop in the WalkDirectory method that scans the files in a specific directory will now look like this:

// Scan all files in the current path
foreach (FileInfo file in directory.GetFiles())
{
    // Raise the event for the current file.
    RaiseFileEvent(file);
}        

This is the basic code for the ScanDirectory class. The complete code, which is in the download, also features a DirectoryEvent, an appropriate EventArgs object for that event, and properties in both event arguments that allow you to stop the scan. All methods and properties have XML comments.
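The stop properties are not shown in this article, so the following is only a sketch of how such a property could work. The names CancelableFileEventArgs, CancelableScanner and Cancel are assumptions made for this example and may differ from what the download uses.

```csharp
using System;
using System.IO;

// Hypothetical variant of FileEventArgs with a flag a handler can set.
public class CancelableFileEventArgs : EventArgs
{
    private readonly FileInfo _fileInfo;
    private bool _cancel;

    public CancelableFileEventArgs(FileInfo fileInfo)
    {
        if (fileInfo == null) throw new ArgumentNullException("fileInfo");
        _fileInfo = fileInfo;
    }

    public FileInfo Info
    {
        get { return _fileInfo; }
    }

    // Set to true by a handler to ask the scanner to stop.
    public bool Cancel
    {
        get { return _cancel; }
        set { _cancel = value; }
    }
}

// Hypothetical scanner fragment showing only how the flag is read back.
public class CancelableScanner
{
    public event EventHandler<CancelableFileEventArgs> FileEvent;

    // Returns true when the scan may continue, false when a handler cancelled it.
    public bool RaiseFileEvent(FileInfo fileInfo)
    {
        EventHandler<CancelableFileEventArgs> handler = FileEvent;
        if (handler == null)
        {
            return true;
        }

        CancelableFileEventArgs args = new CancelableFileEventArgs(fileInfo);
        handler(this, args);
        return !args.Cancel;
    }
}
```

The scanning loop can then break as soon as RaiseFileEvent returns false, which is how a handler's request to stop reaches the recursion.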

Overridable methods

I introduced overridable methods for both the DirectoryEvent and the FileEvent because of a discussion on the message board below about performance issues with the event-driven model. I tested both scenarios and found the overridable methods to be 50% faster than the events.

The class has two overridable methods. The first is called when the scan enters or leaves a directory:

/// <summary>
/// Processes the directory.
/// </summary>
/// <param name="directoryInfo">The directory info.</param>
/// <param name="action">The action.</param>
/// <returns><see langword="true"/> when the scan is allowed to continue; otherwise, <see langword="false"/>.</returns>
public virtual bool ProcessDirectory(DirectoryInfo directoryInfo, ScanDirectoryAction action)
{
    if (DirectoryEvent != null)
    {
        return RaiseDirectoryEvent(directoryInfo, action);
    } 
    return true;
}

The second overridable method is called for each file found in a directory during the scan:

/// <summary>
/// Processes the file.
/// </summary>
/// <param name="fileInfo">The file info.</param>
/// <returns><see langword="true"/> when the scan is allowed to continue; otherwise, <see langword="false"/>.</returns>
public virtual bool ProcessFile(FileInfo fileInfo)
{
    // Only do something when the event has been declared.
    if (FileEvent != null)
    {
        RaiseFileEvent(fileInfo);
    }
    return true;
}

As you can see, the overridable methods call the default RaiseFileEvent and RaiseDirectoryEvent methods. This way, the class still supports the event-driven model while introducing the option to inherit from the base class and roll your own directory and file handling without events.
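As an illustration of rolling your own handling, the sketch below stops the scan at the first file with a given extension by returning false from ProcessFile. To keep the example self-contained it uses a stripped-down stand-in base class; MiniScanDirectory and FirstMatchFinder are hypothetical names, and the real ScanDirectory class does more than this stand-in.

```csharp
using System;
using System.IO;

// Minimal stand-in for ScanDirectory, reduced to what this example needs.
public class MiniScanDirectory
{
    public virtual bool ProcessFile(FileInfo fileInfo)
    {
        return true;
    }

    // Returns true when the whole tree was scanned, false when stopped early.
    public bool WalkDirectory(DirectoryInfo directory)
    {
        foreach (FileInfo file in directory.GetFiles())
        {
            if (!ProcessFile(file)) return false;
        }

        foreach (DirectoryInfo subDirectory in directory.GetDirectories())
        {
            if (!WalkDirectory(subDirectory)) return false;
        }

        return true;
    }
}

// Hypothetical subclass: remembers the first file with the wanted extension
// and then asks the base class to stop scanning.
public class FirstMatchFinder : MiniScanDirectory
{
    private readonly string _extension;
    private FileInfo _match;

    // The extension includes the leading dot, for example ".log".
    public FirstMatchFinder(string extension)
    {
        _extension = extension;
    }

    public FileInfo Match
    {
        get { return _match; }
    }

    public override bool ProcessFile(FileInfo fileInfo)
    {
        if (string.Equals(fileInfo.Extension, _extension, StringComparison.OrdinalIgnoreCase))
        {
            _match = fileInfo;
            return false; // stop the scan
        }

        return true; // keep scanning
    }
}
```

Returning false as soon as the answer is known avoids walking the rest of the tree, which matters on large directory structures.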

SearchPattern property

Last, but not least, is the SearchPattern property. You can specify a search pattern, and the class will make sure that only files matching that pattern are scanned and returned to your application. This works for the event-driven model as well as for the overridable methods. You can specify more than one pattern by separating the patterns with semi-colons. The code for the SearchPattern property looks like this:

/// <summary>
/// Gets or sets the search pattern.
/// </summary>
/// <example>
/// You can specify more than one search pattern by separating the patterns with a semi-colon.
/// </example>
/// <value>The search pattern.</value>
public string SearchPattern
{
	get { return _searchPattern;  }
	set 
	{
		// When an empty value is specified, the search pattern will be the default (= *.*)
		if (value == null || value.Trim().Length == 0)
		{
			_searchPattern = _patternAllFiles;
		}
		else
		{
			_searchPattern = value; 
			// make sure the pattern does not end with a semi-colon
			_searchPattern = _searchPattern.TrimEnd(new char [] {';'});
		}
	}
}

The reason for trimming any trailing semi-colons will be evident when you look at the following code. This shows the final implementation of the method which is responsible for scanning all files in a directory:

/// <summary>
/// Scans the files in the specified directory, once for each search pattern.
/// </summary>
/// <param name="directory"><see cref="DirectoryInfo"/> object for the current path.</param>
/// <returns><see langword="true"/> when the scan is allowed to continue; <see langword="false"/> when it was cancelled.</returns>
private bool WalkFilesInDirectory(DirectoryInfo directory)
{
	bool continueScan = true;

	// Break up the search pattern in separate patterns
	string [] searchPatterns = _searchPattern.Split(';');

	// Try to find files for each search pattern
	foreach (string searchPattern in searchPatterns)
	{
		if (!continueScan)
		{
			break;
		}
		// Scan all files in the current path
		foreach (FileInfo file in directory.GetFiles(searchPattern))
		{
			if (!(continueScan = this.ProcessFile(file))) 
			{
				break;
			}
		}
	}
	return continueScan;
}

As you can see, the value of the SearchPattern property is split into an array of separate patterns. If a trailing semi-colon were still present in the SearchPattern, the array would contain an empty pattern at the end. That would result in a search that never returns any results, so trimming is a small optimization for that case. The DirectoryInfo object will not throw an error when you call GetFiles() with an empty search pattern; you just get an empty FileInfo collection.
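The effect of the trailing semi-colon is easy to check in isolation. The sketch below repeats the setter's logic in a standalone helper; the class name SearchPatternSketch and its method names are made up for this example, and the "*.*" default is taken from the comment in the setter.

```csharp
using System;

// Hypothetical standalone copy of the SearchPattern logic, for illustration only.
public static class SearchPatternSketch
{
    // Mirrors the setter: fall back to "*.*" for empty input,
    // otherwise strip any trailing semi-colons.
    public static string Normalize(string value)
    {
        if (value == null || value.Trim().Length == 0)
        {
            return "*.*";
        }

        return value.TrimEnd(new char[] { ';' });
    }

    // Mirrors the split performed in WalkFilesInDirectory.
    public static string[] SplitPatterns(string pattern)
    {
        return pattern.Split(';');
    }
}
```

Splitting "*.txt;*.doc;" directly yields three entries, the last one being an empty string. Splitting the normalized value yields only "*.txt" and "*.doc".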

Using the code

Below you will find two examples of how this code could be used.

Using the event driven model

If, for example, you just want to list all the files that can be found in the Program Files directory, you can use the ScanDirectory class as follows:

// Create a new ScanDirectory object
ScanDirectory scanDirectory = new ScanDirectory();

// Add a FileEvent to the class
scanDirectory.FileEvent += new ScanDirectory.FileEventHandler(scanDirectory_FileEvent);

scanDirectory.WalkDirectory("C:\\Program Files");        

The scanDirectory_FileEvent handler will then look like this:

private void scanDirectory_FileEvent(object sender, FileEventArgs e)
{
    Console.WriteLine(e.Info.FullName);
}        

The demo project features a Windows Forms application which adds nodes to a TreeView control.

Using overridable methods

The code below shows an example of a class that inherits from ScanDirectory to write the names of the files found to the console:

public class ShowInConsole : ScanDirectory
{
    /// <summary>
    /// Processes the file.
    /// </summary>
    /// <param name="fileInfo">The file info.</param>
    /// <returns>
    /// <see langword="true"/> when the scan is allowed to continue; otherwise, <see langword="false"/>.
    /// </returns>
    public override bool ProcessFile(FileInfo fileInfo)
    {
        Console.WriteLine(fileInfo.FullName);
        return true;
    }
}

History

  • 01/06/2006 - New feature SearchPattern.
  • 31/05/2006 - Published update with the following changes:
  • 30/05/2006 - First version of this article published at CodeProject.

Things to consider

When processes take too long to finish, you might run into a problem where the enumeration of directories gets lost. I personally have not yet encountered that problem and have not made provisions for that situation in the code. More information on this issue can be found here.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
