Introduction
This article and its accompanying code show how to write a class that wraps two different APIs and exposes them through a single interface. As a concrete case, I build a class that manages processes and modules in the Windows operating system.
The Windows API provides two different libraries for managing processes:
- PSAPI functions (in psapi.dll) - available in Windows NT and 2000/XP
- ToolHelp functions (in kernel32.dll) - available in Windows 9x/ME/2000/XP
As you can see, if you want to support Windows 9x you must code against the ToolHelp API, and if you want to support Windows NT you have to code against PSAPI. Also note that the two libraries use different data structures to describe and manage processes.
In this article I will show you how to:
- Load DLL functions dynamically.
- Combine two data structures and define a common structure based on what the different data structures share.
- Use the Standard Template Library to support your code.
- Write the C++ class, and design the data structures and the functions to meet your requirements.
By the end of this article, you will not only learn how to use this code or manage processes and modules, but you will also learn how to write your own code for any other functionality that is present through two or more interfaces. I will be explaining the data structures I used and the functions I wrote in order to give you the experience and ideas on how to build anything like that on your own.
Background Information
For this article it is recommended that you know:
- What a process is
- What a process module is
- The C/C++ language
- A little about the concept of enumeration
- STL vectors
For your convenience, I briefly explain each of these concepts:
Term |
Explanation |
Process |
A process, in the simplest terms, is an executing program. A process is identified by the process ID that the system assigns when it creates the process.
To create a process we mainly use CreateProcess().
You cannot work with a process through its process ID (PID) directly; instead you need a handle to the process, which can be obtained with OpenProcess(). |
Module |
A module is a piece of executable code that does not run by itself. Instead it is loaded and called by a running process. You can think of a module as a library such as kernel32.dll or user32.dll.
Every process has its own module dependencies. A dependency can be declared statically in the program (in its import table) or established dynamically through LoadLibrary().
A module, once loaded in memory, is represented by an HMODULE. An HMODULE is a handle to a module; its value is the base address of the module. |
Enumeration concept |
Many Windows APIs provide a service to enumerate objects, such as:
- window enumeration through EnumWindows()
- process enumeration through the PSAPI / ToolHelp functions
- file and directory enumeration through the FindFirstFile() family
Usually, the enumeration is coupled with a callback mechanism: the developer supplies a function that is called whenever a new object is enumerated.
Callbacks in this context let a developer write extensible code, and release it without source, while still allowing other developers to use it in a flexible manner.
If no callback is offered, the enumeration is instead driven by a pair of first/next functions, such as Process32First() and Process32Next().
More about the enumeration concept will become clear through this article. |
STL's vector |
A dynamic array of variable size. We add to the array with the push_back() method, get the number of elements with size(), and retrieve a given element with at(). |
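To make the callback style of enumeration concrete, here is a minimal, portable sketch. It does not call any Windows API: ItemCallback, EnumerateItems and CollectUntilStop are hypothetical names invented for this illustration, loosely modelled on the way EnumWindows() hands each object to a user function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback signature: return false to stop the enumeration early.
typedef bool (*ItemCallback)(const std::string &item, void *userData);

// Hypothetical enumerator: calls `cb` once per item and reports how many
// items were handed to the callback.
std::size_t EnumerateItems(const std::vector<std::string> &items,
                           ItemCallback cb, void *userData)
{
    std::size_t visited = 0;
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        ++visited;
        if (!cb(items[i], userData))
            break; // the callback asked us to stop
    }
    return visited;
}

// Example callback: collects names into a vector until it sees "stop".
bool CollectUntilStop(const std::string &item, void *userData)
{
    if (item == "stop")
        return false;
    static_cast<std::vector<std::string> *>(userData)->push_back(item);
    return true;
}
```

The enumerator knows nothing about what the callback does with each item, which is what lets such code be shipped without its source and still be extended by its users.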
For more information about these subjects, please refer to the MSDN Library.
More specifically, these sections:
- Windows Base Services / Performance Monitoring / Process Status Helper (for PSAPI reference)
- Windows Base Services / Performance Monitoring / Tool Help Library (for ToolHelp reference)
Please note that you will need the Platform SDK to compile this code, or at least its PSAPI.H header. If you cannot obtain PSAPI.H, write your own header containing the definitions of the two or three structures used in this code (look up the SDK documentation for their definitions).
The design
I looked in the MSDN library for the description and specification of the functions used to manage processes. They are summarized below:
Task/Provider |
PSAPI |
ToolHelp |
Enumerate processes |
EnumProcesses() - returns an array of the IDs of all currently running processes. This function provides no further information about the processes. |
CreateToolhelp32Snapshot() with the flag TH32CS_SNAPPROCESS, combined with the functions Process32First() and Process32Next(). These last two functions return much more than just the process ID: they also provide the executable file name, the parent process ID, the thread count, ... |
Enumerate modules |
EnumProcessModules() - returns an array with the HMODULE of every module loaded in a process. Again, no other meaningful information is provided by this function. |
CreateToolhelp32Snapshot() with the flag TH32CS_SNAPMODULE, combined with the functions Module32First() and Module32Next(). These two functions return enough information about a given module that we need look no further. |
As you can clearly see, the PSAPI functions do not provide much information about either a process or a module; that is why we will have to use additional PSAPI functions to get almost the same information that the ToolHelp API gives us.
I will now present the data structures involved.
The fields that matter to us are the process ID, the file name, the module base address, and the module size; from them I will build my own data structures that depend on neither PSAPI nor ToolHelp.
Context/Provider |
PSAPI |
ToolHelp |
Processes |
PSAPI has no single function that fills a data structure fitting our needs. That is why I additionally use the following PSAPI function:
GetModuleFileNameEx() - given a process handle and a module handle, it returns the module's full path. Since EnumProcesses() already gives us the process ID, we now have the same process information we took from the ToolHelp side. |
typedef struct tagPROCESSENTRY32 { DWORD dwSize; DWORD cntUsage; DWORD th32ProcessID; ULONG_PTR th32DefaultHeapID; DWORD th32ModuleID; DWORD cntThreads; DWORD th32ParentProcessID; LONG pcPriClassBase; DWORD dwFlags; TCHAR szExeFile[MAX_PATH]; } PROCESSENTRY32, *PPROCESSENTRY32; |
Modules |
All we have is the array of HMODULEs returned by EnumProcessModules(); that is why I look for the remaining information using these PSAPI functions:
GetModuleFileNameEx() - as explained above.
GetModuleInformation() - fills a MODULEINFO structure, from which we take the base address and the image size:
typedef struct _MODULEINFO { LPVOID lpBaseOfDll; DWORD SizeOfImage; LPVOID EntryPoint; } MODULEINFO, *LPMODULEINFO; |
typedef struct tagMODULEENTRY32 { DWORD dwSize; DWORD th32ModuleID; DWORD th32ProcessID; DWORD GlblcntUsage; DWORD ProccntUsage; BYTE* modBaseAddr; DWORD modBaseSize; HMODULE hModule; TCHAR szModule[MAX_MODULE_NAME32 + 1]; TCHAR szExePath[MAX_PATH]; } MODULEENTRY32, *PMODULEENTRY32; |
As the table above shows, we have identified the information about a process and a module that is common to the data structures of both PSAPI and ToolHelp. I now present my own data structures:
struct tProcessInfo
{
    DWORD pid;                  // process ID
    TCHAR FileName[MAX_PATH];   // path or name of the executable
};
struct tModuleInfo
{
    LPVOID ImageBase;           // base address of the loaded module
    DWORD ImageSize;            // size of the module image in bytes
    TCHAR FileName[MAX_PATH];   // path of the module file
};
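As a rough illustration of how two provider-specific records collapse into one common record, here is a portable sketch. FakeProcessEntry32 is a simplified stand-in for PROCESSENTRY32, and ProcessInfo, CopyName, FromToolHelp and FromPsApi are hypothetical names for this example only; the real code works with DWORD, TCHAR and MAX_PATH from the Windows headers.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t kMaxPath = 260; // stand-in for MAX_PATH

// Simplified stand-in for the ToolHelp record (assumption: only the
// fields relevant to this sketch are reproduced).
struct FakeProcessEntry32
{
    unsigned long th32ProcessID;
    unsigned long th32ParentProcessID;
    char szExeFile[kMaxPath];
};

// The common record both providers are converted into.
struct ProcessInfo
{
    unsigned long pid;
    char FileName[kMaxPath];
};

// Copies a name, truncating if needed and always terminating the result.
void CopyName(char (&dst)[kMaxPath], const char *src)
{
    std::strncpy(dst, src, kMaxPath - 1);
    dst[kMaxPath - 1] = '\0';
}

// ToolHelp side: everything we need is already in one record.
ProcessInfo FromToolHelp(const FakeProcessEntry32 &pe)
{
    ProcessInfo info = ProcessInfo();
    info.pid = pe.th32ProcessID;
    CopyName(info.FileName, pe.szExeFile);
    return info;
}

// PSAPI side: the ID and the path come from two separate calls.
ProcessInfo FromPsApi(unsigned long pid, const char *modulePath)
{
    ProcessInfo info = ProcessInfo();
    info.pid = pid;
    CopyName(info.FileName, modulePath);
    return info;
}
```

Whichever provider is used, the caller only ever sees ProcessInfo, so the rest of the program needs no knowledge of where the data came from.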
Basically, the presented structures are all we need to enumerate either processes or modules. As for the class design and what functions it should export, I concluded the following:
Function |
Description |
Init |
This function is used to detect and dynamically load either PsApi or ToolHelp functions |
ProcessesGetList() / ModulesGetList() |
These functions will take a snapshot of all modules or processes. |
ProcessesWalk() / ModulesWalk() |
These functions will allow us to walk into the enumerated process or modules list. |
ProcessesCount() / ModulesCount() |
These functions will return us the count of the enumerated items. |
ProcessesFreeList() / ModulesFreeList() |
Free a previously created snapshot made by xxxxGetList functions. |
Building the code
The code is written as a C++ class called CProcessApi.
Here is the class interface; we will shortly explore how each of these functions is built:
class CProcessApi
{
public:
    // Common records returned to the caller
    struct tProcessInfo
    {
        DWORD pid;
        TCHAR FileName[MAX_PATH];
    };
    struct tModuleInfo
    {
        LPVOID ImageBase;
        DWORD ImageSize;
        TCHAR FileName[MAX_PATH];
    };
private:
    typedef std::vector<tProcessInfo> tProcessesList;
    typedef std::vector<tModuleInfo> tModulesList;
    // A list plus the current walk position
    struct tProcessesData
    {
        DWORD Pos;
        tProcessesList *pl;
    };
    struct tModulesData
    {
        DWORD Pos;
        tModulesList *ml;
    };
    // PSAPI function pointer types
    typedef BOOL (WINAPI *t_psapi_EnumProcesses)(
        DWORD *lpidProcess,
        DWORD cb,
        DWORD *cbNeeded
    );
    typedef BOOL (WINAPI *t_psapi_EnumProcessModules)(
        HANDLE hProcess,
        HMODULE *lphModule,
        DWORD cb,
        LPDWORD lpcbNeeded
    );
    typedef DWORD (WINAPI *t_psapi_GetModuleFileNameEx)(
        HANDLE hProcess,
        HMODULE hModule,
        LPTSTR lpFilename,
        DWORD nSize
    );
    typedef BOOL (WINAPI *t_psapi_GetModuleInformation)(
        HANDLE hProcess,
        HMODULE hModule,
        LPMODULEINFO lpmodinfo,
        DWORD cb
    );
    // PSAPI function pointers, filled in at run time
    t_psapi_GetModuleFileNameEx psapi_GetModuleFileNameEx;
    t_psapi_EnumProcessModules psapi_EnumProcessModules;
    t_psapi_EnumProcesses psapi_EnumProcesses;
    t_psapi_GetModuleInformation psapi_GetModuleInformation;
    // ToolHelp function pointer types
    typedef HANDLE (WINAPI *t_tlhlp_CreateToolhelp32Snapshot)(
        DWORD dwFlags,
        DWORD th32ProcessID
    );
    typedef BOOL (WINAPI *t_tlhlp_Process32First)(
        HANDLE hSnapshot,
        LPPROCESSENTRY32 lppe
    );
    typedef BOOL (WINAPI *t_tlhlp_Process32Next)(
        HANDLE hSnapshot,
        LPPROCESSENTRY32 lppe
    );
    typedef BOOL (WINAPI *t_tlhlp_Module32First)(
        HANDLE hSnapshot,
        LPMODULEENTRY32 lpme
    );
    typedef BOOL (WINAPI *t_tlhlp_Module32Next)(
        HANDLE hSnapshot,
        LPMODULEENTRY32 lpme
    );
    // ToolHelp function pointers, filled in at run time
    t_tlhlp_CreateToolhelp32Snapshot tlhlp_CreateToolhelp32Snapshot;
    t_tlhlp_Process32First tlhlp_Process32First;
    t_tlhlp_Process32Next tlhlp_Process32Next;
    t_tlhlp_Module32First tlhlp_Module32First;
    t_tlhlp_Module32Next tlhlp_Module32Next;
    // Library handles and "which API is loaded" flags
    HMODULE m_hPsApi;
    HMODULE m_hTlHlp;
    bool m_bPsApi;
    bool m_bToolHelp;
    bool Load_TlHlp();
    bool Load_PsApi();
    DWORD ProcessesPopulatePsApi(tProcessesData *pd);
    DWORD ProcessesPopulateToolHelp(tProcessesData *pd);
    DWORD ModulesPopulatePsApi(DWORD pid, tModulesData *md);
    DWORD ModulesPopulateToolHelp(DWORD pid, tModulesData *md);
public:
    // Error codes stored in LastError
    enum
    {
        paeSuccess = 0,
        paeNoApi,
        paeNoEntryPoint,
        paeNoMem,
        paeNoSnap,
        paeNoMore,
        paeOutOfBounds,
        paeYYY
    };
    DWORD LastError;
    CProcessApi();
    ~CProcessApi();
    bool Init(bool bPsApiFirst = true);
    DWORD ProcessesGetList();
    bool ProcessesWalk(DWORD lid, tProcessInfo *pi, DWORD Pos = (DWORD)-1);
    DWORD ProcessesCount(DWORD lid) const;
    void ProcessesFreeList(DWORD lid);
    DWORD ModulesGetList(DWORD ProcessID);
    bool ModulesWalk(DWORD lid, tModuleInfo *mi, DWORD Pos = (DWORD)-1);
    DWORD ModulesCount(DWORD lid) const;
    void ModulesFreeList(DWORD lid);
};
Error codes
Most functions return a boolean value to denote success or failure. Additional information about an error can be read from the LastError member variable. The error codes are named CProcessApi::paeXXXX.
What library to use and how to dynamically load it?
As stated in the problem definition, which library we use depends on what is available on the system.
I do not link statically against any of the PSAPI or ToolHelp functions. Instead I define prototypes (function pointer types) for their functions, and fill in pointers to those functions at run time according to which helper library is available.
typedef HANDLE (WINAPI *t_tlhlp_CreateToolhelp32Snapshot)(
    DWORD dwFlags,
    DWORD th32ProcessID
);
This typedef describes the signature of the CreateToolhelp32Snapshot function. And this:
t_tlhlp_CreateToolhelp32Snapshot tlhlp_CreateToolhelp32Snapshot;
is a variable of the t_tlhlp_CreateToolhelp32Snapshot function pointer type. We (classically) initialize this variable as:
HMODULE hMod = LoadLibrary(_T("kernel32.dll"));
if (!hMod)
    return FALSE;
// GetProcAddress always takes a narrow (char) export name
t_tlhlp_CreateToolhelp32Snapshot tlhlp_CreateToolhelp32Snapshot =
    (t_tlhlp_CreateToolhelp32Snapshot)GetProcAddress(hMod,
        "CreateToolhelp32Snapshot");
if (tlhlp_CreateToolhelp32Snapshot == NULL)
{
    FreeLibrary(hMod);
    return FALSE;
}
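The same pattern (describe the signature with a typedef, look the function up by name at run time, check for failure, cast, call) can be tried without Windows at all. In the sketch below FakeGetProcAddress is a hypothetical stand-in for GetProcAddress backed by a small table, and t_Add, RealAdd and ResolveAdd are names invented for the example.

```cpp
#include <cstddef>
#include <cstring>

// The signature we expect the looked-up function to have.
typedef int (*t_Add)(int, int);

// Generic function pointer, playing the role of FARPROC.
typedef void (*GenericProc)();

static int RealAdd(int a, int b) { return a + b; }

struct ExportEntry
{
    const char *name;
    GenericProc proc;
};

// Hypothetical stand-in for GetProcAddress: returns 0 for unknown names.
GenericProc FakeGetProcAddress(const char *name)
{
    static const ExportEntry exports[] = {
        { "Add", reinterpret_cast<GenericProc>(&RealAdd) },
    };
    for (std::size_t i = 0; i < sizeof(exports) / sizeof(exports[0]); ++i)
    {
        if (std::strcmp(exports[i].name, name) == 0)
            return exports[i].proc;
    }
    return 0;
}

// Resolves an export into a typed pointer; reports failure instead of
// leaving the caller with a pointer that cannot be called.
bool ResolveAdd(const char *exportName, t_Add *out)
{
    GenericProc p = FakeGetProcAddress(exportName);
    if (!p)
        return false;
    *out = reinterpret_cast<t_Add>(p); // cast back to the real signature
    return true;
}
```

The important habit is the check before the cast: a missing export must be reported, never called.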
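As you see, this last snippet has to be repeated for every function we load dynamically. That is why I simplified the task with a macro that is UNICODE aware:
// Suffix of the character-set specific exports
// (GetProcAddress itself always takes a narrow string)
#ifdef _UNICODE
#define Modifier "W"
#else
#define Modifier "A"
#endif
PVOID p;
// ToolHelp variant: resolves the export named modname + Mod from m_hTlHlp
// into tlhlp_<modname>. On failure it unloads the library and returns
// false from the calling function.
#define DynamicGetProcAddress(modname, Mod) \
    p = GetProcAddress(m_hTlHlp, #modname Mod); \
    if (!p) { FreeLibrary(m_hTlHlp); \
        return (LastError = paeNoEntryPoint, false); } \
    tlhlp_##modname = (t_tlhlp_##modname)p;
As you see, this macro simply saves me from writing the same piece of code over and over. It also checks for failure and returns to the caller with the appropriate error code. The macro as shown targets the ToolHelp library (m_hTlHlp and the tlhlp_ prefix); the PSAPI functions need the same macro defined against m_hPsApi and the psapi_ prefix. With that, the task of dynamically loading functions is reduced to: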
// In Load_TlHlp(): the suffix is a narrow string, because the macro
// pastes it onto a narrow export name
DynamicGetProcAddress(CreateToolhelp32Snapshot, "");
DynamicGetProcAddress(Process32First, "");
DynamicGetProcAddress(Process32Next, "");
DynamicGetProcAddress(Module32First, "");
DynamicGetProcAddress(Module32Next, "");
// In Load_PsApi():
DynamicGetProcAddress(GetModuleFileNameEx, Modifier);
DynamicGetProcAddress(EnumProcessModules, "");
DynamicGetProcAddress(EnumProcesses, "");
DynamicGetProcAddress(GetModuleInformation, "");
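The macro relies on two preprocessor operators: # (stringizing) and ## (token pasting). The following portable sketch isolates both. EXPORT_NAME, MEMBER and the two _value variables are hypothetical names made up for this illustration; only the export names themselves come from the article.

```cpp
#include <string>

// '#name' turns the macro argument into a string literal; the adjacent
// literal 'suffix' is then concatenated to it by the compiler.
#define EXPORT_NAME(name, suffix) #name suffix

// 'prefix##_value' pastes two tokens into a single identifier.
#define MEMBER(prefix) prefix##_value

static int psapi_value = 1; // hypothetical variables, present only to
static int tlhlp_value = 2; // give the pasted identifiers a target

// Builds "GetModuleFileNameExA" at compile time.
std::string AnsiExportName() { return EXPORT_NAME(GetModuleFileNameEx, "A"); }

// An empty suffix leaves the name unchanged.
std::string PlainExportName() { return EXPORT_NAME(EnumProcesses, ""); }

// Expands to 'psapi_value' and 'tlhlp_value' respectively.
int ReadPsApi() { return MEMBER(psapi); }
int ReadTlHlp() { return MEMBER(tlhlp); }
```

This is why one macro argument is enough to produce both the export name passed to GetProcAddress and the name of the member that receives the pointer.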
The code that loads the functions dynamically is split across two functions: Load_PsApi() and Load_TlHlp(). Init(bool bPsApiFirst) calls Load_PsApi() or Load_TlHlp(), falling back to the other if the first fails, and ends up with one set of functions ready to use. The internal boolean flags m_bPsApi / m_bToolHelp are used throughout the code to determine which functions are to be used internally.
bool CProcessApi::Init(bool bPsApiFirst)
{
    bool loaded;
    if (bPsApiFirst)
    {
        loaded = Load_PsApi();
        if (!loaded)
            loaded = Load_TlHlp();
    }
    else
    {
        loaded = Load_TlHlp();
        if (!loaded)
            loaded = Load_PsApi();
    }
    return (loaded ? (LastError = paeSuccess, true) : (LastError = paeNoApi,
        false));
}
The internal data structures used to describe, manage, and enumerate processes and modules:
typedef std::vector<tProcessInfo> tProcessesList;
typedef std::vector<tModuleInfo> tModulesList;
struct tProcessesData
{
    DWORD Pos;              // current walk position
    tProcessesList *pl;     // the enumerated processes
};
struct tModulesData
{
    DWORD Pos;              // current walk position
    tModulesList *ml;       // the enumerated modules
};
I am using the STL vector container to store all the enumerated information. As you see, tProcessesList / tModulesList are vectors of the previously explained structures tProcessInfo and tModuleInfo.
tModulesData and tProcessesData are internal data structures used to drive the enumeration functions xxxWalk / xxxFreeList. Each is composed of an element list (processes or modules) and a DWORD Pos that serves as the current index into the list.
ProcessesGetList returns a DWORD called 'lid' (list id). This list id is actually a tProcessesData pointer cast to a DWORD. The same applies to ModulesGetList. The list id returned to the user is then passed to functions like xxxWalk to get the desired information. Inside xxxWalk I cast the list id (a DWORD) back into a tProcessesData or tModulesData pointer and then access the vector inside that structure to serve the user. Note that this round trip is only safe where a pointer fits in a DWORD, that is, in 32-bit builds.
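A DWORD is 32 bits wide, so a pointer stored in one is truncated in a 64-bit build. One possible variation, shown here as a portable sketch rather than as the article's code, is to give the list id an integer type that is wide enough for a pointer (the Windows headers offer DWORD_PTR for this; the sketch uses std::uintptr_t). ListData, ListId, CreateList, ListCount and FreeList are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Internal record hidden behind the id, like tProcessesData.
struct ListData
{
    std::size_t pos;
    std::vector<int> items;
};

// Opaque id: an integer wide enough to hold a pointer on the target.
typedef std::uintptr_t ListId;

// Allocates a list holding n sample items and returns it as an opaque id.
ListId CreateList(int n)
{
    ListData *d = new ListData;
    d->pos = 0;
    for (int i = 0; i < n; ++i)
        d->items.push_back(i * 10);
    return reinterpret_cast<ListId>(d);
}

// Casts the id back to the internal record to answer the query.
std::size_t ListCount(ListId lid)
{
    return reinterpret_cast<ListData *>(lid)->items.size();
}

// Releases what CreateList allocated.
void FreeList(ListId lid)
{
    delete reinterpret_cast<ListData *>(lid);
}
```

The caller still sees nothing but a number, so the public interface keeps its shape while the cast stays lossless.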
DWORD CProcessApi::ProcessesGetList()
{
    tProcessesData *pd = new tProcessesData;
    if (!pd)
        return (LastError = paeNoMem, NULL);
    pd->pl = new tProcessesList;
    if (!pd->pl)
    {
        delete pd;
        return (LastError = paeNoMem, NULL);
    }
    if (m_bPsApi)
        LastError = ProcessesPopulatePsApi(pd);
    else if (m_bToolHelp)
        LastError = ProcessesPopulateToolHelp(pd);
    return (DWORD) pd;
}
First we allocate a new tProcessesData, then a new tProcessesList, and then we call the appropriate function to fill in the list. We return the tProcessesData pointer cast to a DWORD; to the caller it is an opaque list id, but to us it is a meaningful pointer to a tProcessesData struct.
The code to 'walk' in the list goes like this:
bool CProcessApi::ProcessesWalk(DWORD lid, tProcessInfo *pi, DWORD Pos)
{
    tProcessesData *pd = reinterpret_cast<tProcessesData *>(lid);
    // No explicit position: continue from the list's internal cursor
    if (Pos == (DWORD)-1)
        Pos = pd->Pos;
    // Pos is unsigned, so only the upper bound needs checking
    if (Pos > pd->pl->size())
        return (LastError = paeOutOfBounds, false);
    else if (Pos == pd->pl->size())
        return (LastError = paeNoMore, false);
    *pi = pd->pl->at(Pos);
    pd->Pos++;
    return (LastError = paeSuccess, true);
}
As you noticed, it does not depend on any other calls; it simply returns data from the already filled list. I first resolve the start position requested by the user and then check the boundaries. If no position is specified I use the internal position, which allows us to later use the code as:
// papi is a CProcessApi instance, lid a list id from ProcessesGetList()
while (papi.ProcessesWalk(lid, &procinf))
{
    // use procinf here
}
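The walk logic can be exercised in isolation with a plain vector. The sketch below is a simplified, portable variant and not the article's exact code: Walker, WalkResult, Walk and SumAll are hypothetical names, and unlike the code above it moves the cursor to just after the item returned, even when an explicit position was given.

```cpp
#include <cstddef>
#include <vector>

// Sentinel meaning "continue from the internal cursor".
const std::size_t kUseCursor = static_cast<std::size_t>(-1);

struct Walker
{
    std::size_t pos;        // internal cursor
    std::vector<int> items; // the already filled list
};

enum WalkResult { walkOk, walkNoMore, walkOutOfBounds };

// Returns the item at `pos` (or at the cursor) through `out`.
WalkResult Walk(Walker &w, int *out, std::size_t pos = kUseCursor)
{
    if (pos == kUseCursor)
        pos = w.pos;
    if (pos > w.items.size())
        return walkOutOfBounds; // far past the end: caller error
    if (pos == w.items.size())
        return walkNoMore;      // exactly at the end: normal termination
    *out = w.items[pos];
    w.pos = pos + 1;            // next implicit walk continues from here
    return walkOk;
}

// The typical loop: keep walking until the list is exhausted.
int SumAll(Walker &w)
{
    int value = 0;
    int sum = 0;
    while (Walk(w, &value) == walkOk)
        sum += value;
    return sum;
}
```

Distinguishing "no more items" from "out of bounds" lets a simple while loop terminate cleanly while still flagging a genuinely bad index.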
The xxxxFreeList function simply deletes the data structure that was allocated.
void CProcessApi::ProcessesFreeList(DWORD lid)
{
    tProcessesData *pd = reinterpret_cast<tProcessesData *>(lid);
    delete pd->pl;
    delete pd;
}
Enumerating using PSAPI:
I will illustrate processes enumeration using PSAPI. For more information refer to the source code and the SDK manual, you will also find lots of examples there.
DWORD CProcessApi::ProcessesPopulatePsApi(tProcessesData *pd)
{
    DWORD nProcess,
        nCount(4096);
    // Large buffer that EnumProcesses fills with process IDs
    DWORD *processes = new DWORD[nCount];
    if (!psapi_EnumProcesses(processes, nCount * sizeof(DWORD), &nProcess))
    {
        delete [] processes;
        return paeNoSnap;
    }
    // EnumProcesses reports bytes written; convert to a count of IDs
    nProcess /= sizeof(DWORD);
    tProcessInfo pi = {0};
    for (DWORD i = 0; i < nProcess; i++)
    {
        HANDLE hProcess;
        hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
            FALSE, processes[i]);
        // Skip processes we are not allowed to open
        if (!hProcess)
            continue;
        DWORD nmod;
        HMODULE mod1;
        // The first module is usually the process executable itself
        if (!psapi_EnumProcessModules(hProcess, &mod1, sizeof(mod1), &nmod))
            _tcscpy(pi.FileName, _T("-"));
        else
            // nSize is a character count, not a byte count
            psapi_GetModuleFileNameEx(hProcess, mod1, pi.FileName,
                sizeof(pi.FileName) / sizeof(pi.FileName[0]));
        pi.pid = processes[i];
        pd->pl->push_back(pi);
        CloseHandle(hProcess);
    }
    pd->Pos = 0;
    delete [] processes;
    return paeSuccess;
}
- Call EnumProcesses(), which takes a large array of DWORDs and fills it with process IDs
- Loop over the returned process IDs
- Get the first module of each process; usually the first module is the process executable itself
- Retrieve the process name
- Store both the process ID and the process name in a tProcessInfo and add it to the list
- Reset the list position with pd->Pos = 0. We could have set the position to 3, for example, to skip system processes.
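The code above uses one fixed buffer of 4096 IDs. EnumProcesses() cannot report that its buffer was too small; it simply fills what fits, so a full buffer may mean a truncated result. A common refinement is to retry with a larger buffer whenever the buffer comes back full. The sketch below shows that loop in portable form: FakeEnumProcesses is a hypothetical stand-in that pretends ten processes exist, and EnumAllPids is a name invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef unsigned long Pid;

// Hypothetical stand-in for EnumProcesses: copies at most `capacity` ids,
// reports how many were copied, and never signals truncation.
bool FakeEnumProcesses(Pid *out, std::size_t capacity, std::size_t *copied)
{
    const std::size_t kTotal = 10; // pretend ten processes are running
    const std::size_t n = std::min(capacity, kTotal);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<Pid>(100 + i);
    *copied = n;
    return true;
}

// Grows the buffer until the enumerator leaves some of it unused, which
// is the only evidence that nothing was cut off.
std::vector<Pid> EnumAllPids(std::size_t initialCapacity)
{
    std::vector<Pid> buf(initialCapacity ? initialCapacity : 1);
    for (;;)
    {
        std::size_t copied = 0;
        if (!FakeEnumProcesses(&buf[0], buf.size(), &copied))
            return std::vector<Pid>(); // enumeration failed
        if (copied < buf.size())
        {
            buf.resize(copied);        // keep only the valid entries
            return buf;
        }
        buf.resize(buf.size() * 2);    // buffer was full: retry larger
    }
}
```

Using a vector for the buffer also removes the need to match every new[] with a delete[] on each exit path.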
Enumerating using ToolHelp:
I will illustrate processes enumeration using ToolHelp API.
DWORD CProcessApi::ProcessesPopulateToolHelp(tProcessesData *pd)
{
    HANDLE hSnap = tlhlp_CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (hSnap == INVALID_HANDLE_VALUE)
        return paeNoSnap;
    BOOL bMore;
    tProcessInfo pi = {0};
    // dwSize must be set before calling Process32First
    PROCESSENTRY32 pe32 = {sizeof(PROCESSENTRY32), 0};
    pd->pl->clear();
    pd->Pos = 0;
    bMore = tlhlp_Process32First(hSnap, &pe32);
    while (bMore)
    {
        pi.pid = pe32.th32ProcessID;
        _tcscpy(pi.FileName, pe32.szExeFile);
        pd->pl->push_back(pi);
        bMore = tlhlp_Process32Next(hSnap, &pe32);
    }
    CloseHandle(hSnap);
    return paeSuccess;
}
- Create a process snapshot
- Start walking the list with Process32First()
- Copy the relevant information from PROCESSENTRY32 into a tProcessInfo struct
- Get the next process using Process32Next()
Notice that throughout these functions I never call the ToolHelp or PSAPI functions, such as EnumProcesses or Process32Next, directly; instead I call them through the dynamically resolved function pointers tlhlp_XXXX or psapi_XXXX.
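The first/next shape of the ToolHelp loop is worth isolating, because the same shape recurs for modules. Here is a portable sketch in which FakeSnapshot, FakeNext, FakeFirst and CollectAll are hypothetical stand-ins; they only imitate the calling convention of Process32First()/Process32Next() and do not touch the real API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a snapshot handle: frozen data plus a read cursor.
struct FakeSnapshot
{
    std::vector<std::string> names;
    std::size_t cursor;
};

// Returns the next entry, or false when the snapshot is exhausted.
bool FakeNext(FakeSnapshot &s, std::string *out)
{
    if (s.cursor >= s.names.size())
        return false;
    *out = s.names[s.cursor++];
    return true;
}

// Rewinds the snapshot and returns its first entry, if there is one.
bool FakeFirst(FakeSnapshot &s, std::string *out)
{
    s.cursor = 0;
    return FakeNext(s, out);
}

// The canonical loop: one First call, then Next until it reports false.
std::vector<std::string> CollectAll(FakeSnapshot &s)
{
    std::vector<std::string> result;
    std::string name;
    bool more = FakeFirst(s, &name);
    while (more)
    {
        result.push_back(name);
        more = FakeNext(s, &name);
    }
    return result;
}
```

Because the snapshot is frozen at creation time, the loop sees a consistent list even if the underlying set of objects changes while it runs.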
The count functionality:
I simply return the size of the item vector:
DWORD CProcessApi::ProcessesCount(DWORD lid) const
{
    return (reinterpret_cast<tProcessesData *>(lid))->pl->size();
}
- Cast the list id into a tXXXXData pointer
- Return the vector size
Using the code
All of the long, hard work has been done so that users of the class can easily access processes and modules. Below is a simple console application (proc1.cpp) that lists processes and their modules.
#include <stdio.h>
#include <stdlib.h>
#include "processapi.h"
int main(void)
{
    CProcessApi papi;
    if (!papi.Init(false))
    {
        printf("Failed to load either of process api libraries!");
        return 1;
    }
    DWORD pl = papi.ProcessesGetList();
    if (pl)
    {
        CProcessApi::tProcessInfo pi;
        while (papi.ProcessesWalk(pl, &pi))
        {
            printf("Process[%d]: %s\n", pi.pid, pi.FileName);
            CProcessApi::tModuleInfo mi = {0};
            DWORD ml = papi.ModulesGetList(pi.pid);
            while (papi.ModulesWalk(ml, &mi))
            {
                printf(" Module(%08X/%08X): %s\n", mi.ImageBase, mi.ImageSize,
                    mi.FileName);
            }
            papi.ModulesFreeList(ml);
        }
        // Free the list only if it was actually created
        papi.ProcessesFreeList(pl);
    }
    return 0;
}
- Define a CProcessApi instance (say papi)
- Initialize the papi instance with papi.Init(false); 'false' means try to load ToolHelp first
- Get the processes list with: DWORD pl = papi.ProcessesGetList()
- Check papi.LastError before proceeding
- Define a tProcessInfo struct: CProcessApi::tProcessInfo pi
- Start walking the processes list
- For each process, display some information and then build a modules list
- Get the modules list with: DWORD ml = papi.ModulesGetList(processID)
- Define a module information struct: CProcessApi::tModuleInfo mi
- Walk the modules list and display information
- Free the modules list when done
- Free the processes list when done
Here is another example (proc2.cpp) that shows how to walk the list backwards (using PsApi):
#include <stdio.h>
#include <stdlib.h>
#include "processapi.h"
int main(void)
{
    CProcessApi papi;
    if (!papi.Init(true))
    {
        printf("Failed to load either of process api libraries!");
        return 1;
    }
    DWORD pl = papi.ProcessesGetList();
    int i, j;
    if (pl)
    {
        CProcessApi::tProcessInfo pi;
        // Walk from the last process to the first using explicit positions
        for (i = papi.ProcessesCount(pl) - 1; i >= 0; i--)
        {
            if (!papi.ProcessesWalk(pl, &pi, i))
            {
                printf("failed to process walk @ %d\n", i);
                continue;
            }
            printf("Process[%d]: %s\n", pi.pid, pi.FileName);
            CProcessApi::tModuleInfo mi = {0};
            DWORD ml = papi.ModulesGetList(pi.pid);
            for (j = papi.ModulesCount(ml) - 1; j >= 0; j--)
            {
                if (!papi.ModulesWalk(ml, &mi, j))
                {
                    printf("failed to module walk @ %d\n", j);
                    continue;
                }
                printf(" Module(%08X/%08X): %s\n", mi.ImageBase, mi.ImageSize,
                    mi.FileName);
            }
            papi.ModulesFreeList(ml);
        }
        // Free the list only if it was actually created
        papi.ProcessesFreeList(pl);
    }
    return 0;
}
Extending the code
As you have seen, the design of the CProcessApi class allows us to add any new function that is available in either PsApi or ToolHelp.
For example, try to add functions like:
DWORD FindProcessID(LPTSTR ImagePath): given an image path (the path to the exe), this function returns the ID of the running process, or (DWORD)-1 if the process is not running.
bool GetProcessNameByPID(DWORD pid): returns a process image name given its PID.
I leave additional utility functions to your imagination. Good luck!
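As a starting point for the first suggestion, here is one way the search itself might look once a process list has been filled. It is a portable sketch under several assumptions: ProcessRecord stands in for tProcessInfo, the list is passed in directly instead of being fetched through a list id, and the comparison ignores ASCII case on the grounds that Windows paths are case-insensitive. kNoProcess and EqualsNoCase are hypothetical helper names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for tProcessInfo.
struct ProcessRecord
{
    unsigned long pid;
    std::string fileName;
};

// Mirrors the (DWORD)-1 "not found" value suggested in the text.
const unsigned long kNoProcess = static_cast<unsigned long>(-1);

// ASCII-only, case-insensitive comparison (assumption: good enough for
// illustration; real paths may need locale-aware handling).
static bool EqualsNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Returns the ID of the first process whose file name matches, or
// kNoProcess when none does.
unsigned long FindProcessID(const std::vector<ProcessRecord> &list,
                            const std::string &imagePath)
{
    for (std::size_t i = 0; i < list.size(); ++i)
    {
        if (EqualsNoCase(list[i].fileName, imagePath))
            return list[i].pid;
    }
    return kNoProcess;
}
```

Inside the class, the same loop would be driven by ProcessesGetList(), ProcessesWalk() and ProcessesFreeList() instead of a vector handed in by the caller.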
Points of Interest
While writing this code I learned, and practiced once more, how to design such a unifying layer using my own data structures and the lovely STL containers.
I would be happy to read your comments or bug fixes. I hope you also learn something from this article.