The demo application, QuickOpenFiles.
Introduction
Microsoft provided a sample code for Listing Files in a Directory, but that example code cannot be used to list files in the sub-directory, so how can we do that?
Enumerating/searching files and sub folders is in fact a rather basic skill, people have posted many articles on that, you can see the other examples here. However, many of those implementations enumerate files in a recursive way. As a beginner in programming, I just wonder how to do that non-recursively.
Recursive Implementation
In many cases, recursive implementation can be really elegant, if you already know how to do this recursively, then you may just simply skip this part.
To search files in Windows, we can use the APIs FindFirstFile
/FindNextFile
(or SHGetDataFromIDList
, or the FileSystem
library in boost), and a typical implementation is to do this recursively:
#include "strsafe.h"
bool EnumerateFolder(LPCTSTR lpcszFolder, int nLevel = 0)
{
WIN32_FIND_DATA ffd;
TCHAR szDir[MAX_PATH];
HANDLE hFind = INVALID_HANDLE_VALUE;
StringCchCopy(szDir, MAX_PATH, lpcszFolder);
StringCchCat(szDir, MAX_PATH, TEXT("\\*"));
hFind = FindFirstFile(szDir, &ffd);
if (INVALID_HANDLE_VALUE == hFind)
{
return false;
}
TCHAR szOutLine[MAX_PATH] = {0};
for (int ii = 0; ii < nLevel; ++ii)
szOutLine[ii] = _T('\t');
do
{
if (ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
{
if ( _T('.') != ffd.cFileName[0] )
{
_tprintf(TEXT("%s%s <DIR>\n"),
szOutLine, ffd.cFileName);
StringCchCopy(szDir+_tcslen(lpcszFolder)+1,
MAX_PATH, ffd.cFileName);
EnumerateFolder(szDir, nLevel+1); }
}
else
{
_tprintf(TEXT("%s%s\n"), szOutLine, ffd.cFileName);
}
}
while (FindNextFile(hFind, &ffd) != 0);
FindClose(hFind);
return true;
}
The Issues of Recursion
The readability of recursive implementation is usually good, but I started to think about the potential Stack Overflow problem.
In Windows, each thread is granted a default stack size of 1 MB, and that might be big enough in most cases.
In the above function EnumerateFolder()
, it will take around 580 bytes for the local variables (ignore the szOutLine
). That means, in theory, we can enumerate up to 1808 levels of subdirectories.
However, since the buffer size (MAX_PATH
) of the path is limited to 260 characters, it is not possible to run into that amount of levels.
In fact, Microsoft does provide the mechanism to break that limitation.
In the ANSI version of this function, the name is limited to MAX_PATH
characters. To extend this limit to 32,767 widecharacters, call the Unicode version of the function and prepend "\\?\" to the path.
Although some people said that 260 characters is good enough, and the stack might not that easily overflow, I guess avoiding recursion when possible can't be a bad idea; am I correct? This implementation is more like a practice for beginners like me, so let's just have fun coding it.
Iterative Implementation
In general, there are two ways to do a searching iteratively:
- Depth-first searching (DFS)
- Breadth-first searching (BFS)
Since Raymond Chen explained that breadth-first searching is better for file system tree walking, I decided to post the BFS implementation snippet here.
For those who still want to know how to implement this using DFS, you can dig into the demo code in this page and open the source file FileEnumerator.h, then you will see that I have prepared some #ifdef
switch there, you can simply try comment out/uncomment this macros:
#ifndef FILEENUMERATOR_RECURSION
#define FILEENUMERATOR_BFS
#endif
Now let's write a BFS implementation. First, we need a class that encapsulates the basic operations:
class CFileFinder
{
public:
CFileFinder(LPCTSTR lpcszInitDir, FindFileData& fileInfo)
: m_fileInfo(fileInfo)
{
Init(lpcszInitDir);
}
public:
inline bool FindFirst()
{
return EnumCurDirFirst();
}
inline bool FindCurDirNext()
{
bool bRet = ::FindNextFile(m_hFind, &m_fileInfo) != FALSE;
if ( bRet )
{
m_szPathBuffer.resize(m_nFolderLen);
m_szPathBuffer += m_fileInfo.cFileName;
}
else
{
::FindClose(m_hFind);
m_hFind = INVALID_HANDLE_VALUE;
}
return bRet;
}
virtual bool Finish() const
{ return INVALID_HANDLE_VALUE == m_hFind; }
inline LPCTSTR GetPath() const
{return STRBUFFER(m_szPathBuffer) + EXTEND_FILE_PATH_PREFIX_LEN;}
inline const FindFileData& GetFileInfo() const { return m_fileInfo; }
inline bool IsDirectory() const
{ return (m_fileInfo.dwFileAttributes &
FILE_ATTRIBUTE_DIRECTORY) == FILE_ATTRIBUTE_DIRECTORY; }
inline bool IsDot() const
{ return (m_fileInfo.cFileName[0] == '.') &&
((m_fileInfo.cFileName[1] == '.') ||
(m_fileInfo.cFileName[1] == '\0')); }
protected:
virtual bool EnumCurDirFirst()
{
m_szPathBuffer.resize(m_nFolderLen+2);
m_szPathBuffer[m_nFolderLen++] = _T('\\');
m_szPathBuffer[m_nFolderLen] = _T('*');
HANDLE hFind = ::FindFirstFile(STRBUFFER(m_szPathBuffer), &m_fileInfo);
bool bRet = hFind != INVALID_HANDLE_VALUE;
if ( bRet )
{
m_hFind = hFind;
m_szPathBuffer.resize(m_nFolderLen);
m_szPathBuffer += m_fileInfo.cFileName;
}
return bRet;
}
void Init(LPCTSTR lpcszInitDir)
{
m_nFolderLen = _tcslen(lpcszInitDir);
m_szPathBuffer = lpcszInitDir;
if ( m_szPathBuffer[m_nFolderLen-1] == _T('\\') )
{
m_szPathBuffer.erase(m_nFolderLen-1);
--m_nFolderLen;
}
}
protected:
FindFileData& m_fileInfo;
tstring m_szPathBuffer;
UINT m_nFolderLen;
HANDLE m_hFind;
};
Then we need to put the pending sub-directories into a queue:
#include <boost/shared_ptr.hpp>
typedef boost::shared_ptr<CFileFinder> FileFindPtr;
typedef std::queue<FileFindPtr> FileFindQueue;
bool CFileEnumeratorBase::EnumerateBFS
( LPCTSTR lpcszInitDir, FindFileData& findFileData, HANDLE hStopEvent )
{
FileFindPtr finder = NULL;
try
{
finder = new CFileFinder(lpcszInitDir, findFileData);
}
catch (bad_alloc&)
{
CFE_ASSERT(0);
return false;
}
bool bRet = true;
FileFindQueue finderQueue;
if ( !finder->FindFirst() )
{
m_dwLastError = ::GetLastError();
OnError(finder->GetPath());
return false;
}
else
{
while( !finder->Finish() && !IsStopEventSignaled() )
{
const FindFileData& fileInfo = finder->GetFileInfo();
if( finder->IsDirectory() )
{
if ( !finder->IsDot() )
{
if ( CheckUseDir
(finder->GetPath(), fileInfo) )
{
HandleDir(finder->GetPath(),
fileInfo);
FileFindPtr newFinder = NULL;
try
{
newFinder =
new CFileFinder
(finder->GetPath(),
findFileData);
finderQueue.push
(newFinder);
}
catch (bad_alloc&)
{
CFE_ASSERT(0);
}
}
}
}
else
{
if ( CheckUseFile(finder->GetPath(), fileInfo) )
{
HandleFile(finder->GetPath(), fileInfo);
}
}
if ( !finder->FindCurDirNext() )
{
FinishedDir( finder->GetPath() );
if ( finderQueue.empty() )
break;
else
{
while ( !IsStopEventSignaled() )
{
FileFindPtr nextFinder =
finderQueue.front();
finderQueue.pop();
finder = nextFinder;
if ( !finder->FindFirst() )
{
m_dwLastError =
::GetLastError();
if ( !OnError
(finder->GetPath()) )
{
return false;
}
}
else
break;
}
}
}
}
}
return bRet;
}
Using the Code
The initial design of classes was inspired by Andreas Saurwein Franci Gonçalves' article, please read his article first! Then you need to implement your own class derived from either CFileEnumeratorBase
or CFilteredFileEnumerator
.
CFileEnumeratorBase
is provided for basic file enumeration, and CFilteredFileEnumerator
is provided for pattern filtering enumeration.
Following is the example class derived from CFilteredFileEnumerator
:
class CFileInfoEnumerator : public CFilteredFileEnumerator
{
public:
CFileInfoEnumerator();
virtual ~CFileInfoEnumerator();
public:
enum FileInfoType
{
FIT_INVALID = -1,
FIT_FIRST,
FIT_FILENAME = FIT_FIRST,
FIT_FILEPATH,
FIT_FILEEXT,
FIT_MODIFIED,
FIT_SIZE,
};
struct FileInfo
{
std::string sFileInfo[FIT_SIZE];
};
public:
int GetFileCount() const { return m_arrFileInfo.size(); }
const FileInfo* GetFileInfo(int nIndex);
protected:
virtual void HandleFile(LPCTSTR lpcszPath, const FindFileData& ffd);
protected:
typedef std::vector<fileinfo> FileInfoArray;
FileInfoArray m_arrFileInfo;
};
#define DEFAULT_SORT_ASCENDING true
CFileInfoEnumerator::CFileInfoEnumerator()
: m_nSortInfoType(FIT_INVALID)
, m_bSortAscending(DEFAULT_SORT_ASCENDING)
{
}
CFileInfoEnumerator::~CFileInfoEnumerator()
{
}
void CFileInfoEnumerator::HandleFile( LPCTSTR lpcszPath, const FindFileData& ffd )
{
FileInfo fileInfo;
fileInfo.sFileInfo[FIT_FILENAME] = ffd.cFileName;
fileInfo.sFileInfo[FIT_FILEPATH] = lpcszPath;
std::string::size_type nDotPos = fileInfo.sFileInfo[FIT_FILENAME].rfind(_T('.'));
if ( nDotPos >= 0 )
fileInfo.sFileInfo[FIT_FILEEXT] =
fileInfo.sFileInfo[FIT_FILENAME].c_str()+nDotPos+1;
SYSTEMTIME st;
#define FTIME_BUFFER_SIZE 255
TCHAR szModified[FTIME_BUFFER_SIZE+FTIME_BUFFER_SIZE],
szLocalDate[FTIME_BUFFER_SIZE], szLocalTime[FTIME_BUFFER_SIZE];
FILETIME ft = ffd.ftLastWriteTime;
FileTimeToLocalFileTime( &ft, &ft );
FileTimeToSystemTime( &ft, &st );
GetDateFormat( LOCALE_USER_DEFAULT, DATE_SHORTDATE,
&st, NULL, szLocalDate, FTIME_BUFFER_SIZE );
GetTimeFormat( LOCALE_USER_DEFAULT, TIME_NOSECONDS,
&st, NULL, szLocalTime, FTIME_BUFFER_SIZE );
_sntprintf(szModified, FTIME_BUFFER_SIZE+FTIME_BUFFER_SIZE,
_T("%s %s"), szLocalDate, szLocalTime);
fileInfo.sFileInfo[FIT_MODIFIED] = szModified;
m_arrFileInfo.push_back(fileInfo);
}
const CFileInfoEnumerator::FileInfo* CFileInfoEnumerator::GetFileInfo( int nIndex )
{
if ( 0 <= nIndex && nIndex < (int)m_arrFileInfo.size() )
{
return &m_arrFileInfo.at(nIndex);
}
return NULL;
}
And to use it:
CFileInfoEnumerator enumerator;
TCHAR chDir[MAX_PATH];
GetCurrentDirectory(MAX_PATH, chDir);
enumerator.SetFilterPatterns(_T("*.exe;*.dll;*.h;*.c;*.cpp;"));
enumerator.Enumerate(chDir);
About the Demo
I wrote a console application to demonstrate the use of the source code.
Other Projects in CodeProject that Inspired Me
- CEnum - File Enumeration and File Globbing Class
- File and Directory Enumeration
- XEditPrompt - CEdit-derived control with web-like prompt
History
- 2010-7-10: Initial release
- 2010-11-25: Provided the DFS implementation
- 2010-12-29: Skip the junction (reparse point) to avoid infinite loop