Introduction
This article is a follow-up to my Mobile Processor Usage tip, which shows how to get the overall processor usage. Here, we will examine how to get per-process (and even per-thread) usage statistics.
The basic algorithm is fairly simple:
- Query each running thread for the amount of time spent in kernel and user space. Call this query A.
- Wait for some predetermined amount of time.
- Repeat the query and call this query B.
- Subtract the time from query A from the time in query B. That tells us the amount of time that thread spent using the processor during the wait.
We will simply apply that algorithm to every thread in every process in the system!
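As a rough, portable sketch of the arithmetic behind these steps, the percentage is the CPU time gained between query A and query B divided by the length of the wait. The names `Sample` and `UsagePercent` are illustrative assumptions and do not appear in the demo applications.

```cpp
#include <stdint.h>

// Hypothetical record of one thread's accumulated CPU time, in milliseconds.
struct Sample
{
    uint32_t kernel_ms;
    uint32_t user_ms;
};

// Percentage of the wait interval that the thread spent on the processor.
// a is the earlier query, b is the later one, and wait_ms is the time between them.
float UsagePercent( const Sample& a, const Sample& b, uint32_t wait_ms )
{
    if( 0 == wait_ms )
        return 0.0f;  // avoid dividing by zero when no time has passed

    uint32_t busy_ms = ( b.kernel_ms - a.kernel_ms ) + ( b.user_ms - a.user_ms );
    return busy_ms / static_cast< float >( wait_ms ) * 100.0f;
}
```

A thread that gained 1000 ms of kernel time and 500 ms of user time over a 3000 ms wait would be reported at 50 percent.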
We will begin with the secret sauce that tells us the amount of time a thread has spent running, GetThreadTimes.
As a side note, the attached demo applications use boost and WTL. I considered writing them using only what comes with the Windows Mobile 6 SDK, but the boost libraries do such an incredible job of making the code more readable, exception-safe, and usable that I decided to use them. If you're not using boost (and you really should!), the concepts presented in this article will still apply. Note that the article itself only uses C++03 and the Windows Mobile SDK.
GetThreadTimes
GetThreadTimes gives us a set of general information about each running thread in the system. In particular, we are interested in knowing how much time the thread has spent executing in kernel mode and how much time the thread has spent executing in user mode. GetThreadTimes gives us these values as FILETIME structures. To make use of them, we will have to convert them to milliseconds.
DWORD GetThreadTick( const FILETIME& time )
{
    // FILETIME counts 100-nanosecond intervals; 10,000 of them make 1 ms.
    __int64 tick = MAKEDWORDLONG( time.dwLowDateTime, time.dwHighDateTime );
    return static_cast< DWORD >( tick / 10000 );
}

FILETIME creation = { 0 },
         exit = { 0 },
         kernel = { 0 },
         user = { 0 };
::GetThreadTimes( ( HANDLE )thread_id,
                  &creation,
                  &exit,
                  &kernel,
                  &user );
DWORD kernel_tics = GetThreadTick( kernel );
DWORD user_tics = GetThreadTick( user );
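One detail of the conversion is worth spelling out: a DWORD of milliseconds wraps after roughly 49.7 days of accumulated time. The sketch below shows a 64-bit variant of the same conversion. `FileTimeLike` is a stand-in with the same two fields as FILETIME so that the example compiles without the Windows headers, and the function names are assumptions made for illustration.

```cpp
#include <stdint.h>

// Stand-in with the same two 32-bit fields as the Win32 FILETIME structure.
struct FileTimeLike
{
    uint32_t dwLowDateTime;
    uint32_t dwHighDateTime;
};

// Combine the two halves into one 64-bit count of 100-nanosecond units.
uint64_t ToHundredNanoseconds( const FileTimeLike& time )
{
    return ( static_cast< uint64_t >( time.dwHighDateTime ) << 32 ) |
           time.dwLowDateTime;
}

// 10,000 units of 100 ns make one millisecond.
uint64_t ToMilliseconds( const FileTimeLike& time )
{
    return ToHundredNanoseconds( time ) / 10000;
}
```

Keeping the result in 64 bits costs little and avoids the wrap-around; whether it matters depends on how long the monitored threads run.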
Now that we can calculate the time each thread spends using the processor, we must find a way to list every thread running in a process. For that, we will use the ToolHelp API.
ToolHelp API
The ToolHelp API is a set of diagnostic tools in the Windows Mobile Core OS that allows us to take a snapshot of the heaps, modules, and threads used by running processes at a single point in time. In this example, we are able to iterate over every thread running in the system. The THREADENTRY32 structure tells us both the ID of each thread and the ID of its parent process.
HANDLE snapshot = ::CreateToolhelp32Snapshot( TH32CS_SNAPTHREAD, 0 );
if( INVALID_HANDLE_VALUE != snapshot )
{
    THREADENTRY32 te = { 0 };
    te.dwSize = sizeof( THREADENTRY32 );
    if( ::Thread32First( snapshot, &te ) )
    {
        do
        {
            // te.th32ThreadID and te.th32OwnerProcessID are available here.
        } while( ::Thread32Next( snapshot, &te ) );
    }
    ::CloseToolhelp32Snapshot( snapshot );
}
Unfortunately, if you were to actually run this code on your system as it is, you would quickly find that only the threads from two processes are shown: your process and NK.exe. To get around this limitation in permissions, we will look at the SetProcPermissions API.
Permissions
SetProcPermissions is part of the Platform Builder pkfuncs.h API and is typically only used by drivers needing access to the entire virtual address space. To ensure we use these phenomenal cosmic powers judiciously, we will define a structure that guarantees we restore our original permissions when our function completes, even if it throws an exception.
struct CosmicPowers
{
    CosmicPowers()
    {
        // SetProcPermissions returns the previous permissions mask.
        old_permissions_ = ::SetProcPermissions( 0xFFFFFFFF );
    }
    ~CosmicPowers()
    {
        ::SetProcPermissions( old_permissions_ );
    }
private:
    DWORD old_permissions_;
};
All we have to do now is create an instance of CosmicPowers in our thread iteration function and we should see threads from every process in the system. If you don't have access to Platform Builder, that's okay. MSDN gives you the function signature and tells you it's exported by coredll.lib (which everybody has).
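One caveat with a guard like this is that it can be copied, and each copy would restore the old permissions again when destroyed. The sketch below is one way to rule that out in C++03 by declaring the copy operations private. To keep it runnable anywhere, `FakeSetProcPermissions` and `g_permissions` are stand-ins invented for this example; on a device the guard would call `::SetProcPermissions` instead. `PermissionGuard` and `MaskInsideGuard` are also illustrative names.

```cpp
#include <stdint.h>
#include <stdexcept>

// Stand-in for the real permissions mask and for ::SetProcPermissions: it
// stores the new mask and returns the previous one, which the guard relies on.
uint32_t g_permissions = 0x00000001;

uint32_t FakeSetProcPermissions( uint32_t new_permissions )
{
    uint32_t old_permissions = g_permissions;
    g_permissions = new_permissions;
    return old_permissions;
}

// Same idea as CosmicPowers, but non-copyable so that a stray copy cannot
// restore the old mask a second time.
class PermissionGuard
{
public:
    PermissionGuard()
        : old_permissions_( FakeSetProcPermissions( 0xFFFFFFFF ) )
    {
    }

    ~PermissionGuard()
    {
        FakeSetProcPermissions( old_permissions_ );
    }

private:
    PermissionGuard( const PermissionGuard& );             // not implemented
    PermissionGuard& operator=( const PermissionGuard& );  // not implemented

    uint32_t old_permissions_;
};

// Returns the mask seen while the guard is alive. Throws on request to show
// that the destructor still runs during stack unwinding.
uint32_t MaskInsideGuard( bool fail )
{
    PermissionGuard guard;
    if( fail )
        throw std::runtime_error( "simulated failure" );
    return g_permissions;
}
```

After `MaskInsideGuard` returns or throws, `g_permissions` is back to its original value, which is the guarantee the article's structure is after.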
Now, we have all the components we need to begin gathering the thread processor usage statistics!
Gathering Thread Statistics
Now we must consider how we want to store the thread usage statistics. For this example, I've chosen nested std::map associative containers. They give us easy array-style access to the data, which keeps our code simple and elegant.
struct thread_times {
    FILETIME kernel;
    FILETIME user;
};
typedef std::map< DWORD, thread_times > Threads;
typedef std::map< DWORD, Threads > Processes;
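To make the array-style access concrete, here is a small portable sketch of the same nesting. It uses a simplified, hypothetical `ThreadMs` record in place of thread_times, and `Record` and `ThreadCount` are helper names invented for this example. Note that `operator[]` creates missing entries, so read-only code is better served by `find()`.

```cpp
#include <stdint.h>
#include <cstddef>
#include <map>

// Hypothetical simplified record; the article's thread_times holds FILETIMEs.
struct ThreadMs
{
    uint32_t kernel_ms;
    uint32_t user_ms;
};

typedef std::map< uint32_t, ThreadMs > ThreadMap;    // thread ID -> times
typedef std::map< uint32_t, ThreadMap > ProcessMap;  // process ID -> threads

// operator[] creates the inner map (and the thread entry) on first use.
void Record( ProcessMap& processes, uint32_t pid, uint32_t tid,
             uint32_t kernel_ms, uint32_t user_ms )
{
    ThreadMs t = { kernel_ms, user_ms };
    processes[ pid ][ tid ] = t;
}

// Read-only lookup uses find() so that a missing PID does not insert anything.
std::size_t ThreadCount( const ProcessMap& processes, uint32_t pid )
{
    ProcessMap::const_iterator p = processes.find( pid );
    return ( p == processes.end() ) ? 0 : p->second.size();
}
```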
Now we are ready to put it all together. In this example, we will iterate over every thread running in the system, get the time that thread has spent using the CPU and return our container associating process IDs to thread IDs to thread usage.
Processes GetProcessList()
{
    Processes process_list;
    // Raise our permissions for the lifetime of this function.
    CosmicPowers we_are_powerful;
    HANDLE snapshot = ::CreateToolhelp32Snapshot( TH32CS_SNAPTHREAD, 0 );
    if( INVALID_HANDLE_VALUE != snapshot )
    {
        THREADENTRY32 te = { 0 };
        te.dwSize = sizeof( THREADENTRY32 );
        if( ::Thread32First( snapshot, &te ) )
        {
            do
            {
                FILETIME creation = { 0 },
                         exit = { 0 },
                         kernel = { 0 },
                         user = { 0 };
                if( ::GetThreadTimes( ( HANDLE )te.th32ThreadID,
                                      &creation,
                                      &exit,
                                      &kernel,
                                      &user ) )
                {
                    // Store the times under process ID, then thread ID.
                    thread_times t = { kernel, user };
                    process_list[ te.th32OwnerProcessID ][ te.th32ThreadID ] = t;
                }
            } while( ::Thread32Next( snapshot, &te ) );
        }
        ::CloseToolhelp32Snapshot( snapshot );
    }
    return process_list;
}
Right now, our statistics are keyed entirely by 32-bit process identifiers, which isn't very useful to anybody who wants to know how much processor time a particular application is taking. So, we need a way to associate PIDs with process names.
Associating PIDs with Useful Names
Because users know their applications by name and not by PID, it would be nice for us to associate our information with that process name. In keeping with our earlier usage of an associative container for storing data, we will use another one here. This one will associate the unique 32-bit process identifiers with their executable's name.
typedef std::map< DWORD, std::wstring > ProcessNames;
We will again turn to the ToolHelp API, but this time we will get a snapshot of the running processes rather than the much larger list of threads.
ProcessNames GetProcessNameList()
{
    ProcessNames name_list;
    HANDLE snapshot = ::CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS |
                                                  TH32CS_SNAPNOHEAPS,
                                                  0 );
    if( INVALID_HANDLE_VALUE != snapshot )
    {
        PROCESSENTRY32 pe = { 0 };
        pe.dwSize = sizeof( PROCESSENTRY32 );
        if( ::Process32First( snapshot, &pe ) )
        {
            do
            {
                name_list[ pe.th32ProcessID ] = pe.szExeFile;
            } while( ::Process32Next( snapshot, &pe ) );
        }
        ::CloseToolhelp32Snapshot( snapshot );
    }
    return name_list;
}
For a trivial example, if you wanted to print the names of every process in the process list, you could now do something like this:
Processes procs = GetProcessList();
ProcessNames names = GetProcessNameList();
for( Processes::const_iterator p = procs.begin(); p != procs.end(); ++p )
    NKDbgPrintfW( L"%s\r\n", names[ p->first ].c_str() );
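Note that indexing the name map with `operator[]` inserts an empty name for any PID it does not already contain, for example a process that started between the two snapshots. A lookup based on `find()` avoids that side effect. The helper below is a hypothetical sketch, with `NameMap` mirroring ProcessNames and `uint32_t` standing in for DWORD so that it compiles without the Windows headers.

```cpp
#include <stdint.h>
#include <map>
#include <string>

typedef std::map< uint32_t, std::wstring > NameMap;  // mirrors ProcessNames

// Returns the executable name for pid, or a placeholder when the PID is
// unknown. Unlike operator[], find() never modifies the map.
std::wstring NameOrPlaceholder( const NameMap& names, uint32_t pid )
{
    NameMap::const_iterator found = names.find( pid );
    return ( found != names.end() ) ? found->second
                                    : std::wstring( L"<unknown>" );
}
```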
Now we have a complete snapshot of the per-thread processor usage statistics of the system up to this point in time. We even know the names of every process in that snapshot. But, to get a percent usage, we must know how much time a thread spends running over a fixed period of time. So, now we must calculate the process usage statistics.
Calculating Process Usage Statistics
While the code below looks large and intimidating, it is mostly statistical calculations. The algorithm it follows is fairly straightforward:
- Get an initial list of the names and PIDs of all running processes.
- Get an initial list of how long each PID has spent in kernel and user time.
- After the delay interval has expired, get another list of each PID and its kernel and user time.
- Calculate how much time was spent for each process in kernel and user space during the wait interval.
- If any process PID isn't in our list of names, refresh our process name list. It means we have a new process.
- Report that statistical information to the user somehow.
- Repeat from the third step (waiting for the delay interval).
DWORD interval = 3000;
PI::ProcessNames names = PI::GetProcessNameList();
PI::Processes old_list = PI::GetProcessList();
DWORD start = ::GetTickCount();
while( true )
{
    Sleep( interval );
    PI::Processes new_list = PI::GetProcessList();
    DWORD duration = ::GetTickCount() - start;
    DWORD system_total = 0;
    for( PI::Processes::const_iterator p2 = new_list.begin();
         p2 != new_list.end();
         ++p2 )
    {
        // Only processes present in both snapshots have a baseline.
        PI::Processes::const_iterator p1 = old_list.find( p2->first );
        if( p1 != old_list.end() )
        {
            DWORD user_total = 0;
            DWORD kernel_total = 0;
            for( PI::Threads::const_iterator t2 = p2->second.begin();
                 t2 != p2->second.end();
                 ++t2 )
            {
                PI::Threads::const_iterator t1 = p1->second.find( t2->first );
                if( t1 != p1->second.end() )
                {
                    kernel_total += PI::GetThreadTick( t2->second.kernel ) -
                                    PI::GetThreadTick( t1->second.kernel );
                    user_total += PI::GetThreadTick( t2->second.user ) -
                                  PI::GetThreadTick( t1->second.user );
                }
            }
            float user_percent = ( user_total ) /
                                 static_cast< float >( duration ) * 100.0f;
            float kernel_percent = ( kernel_total ) /
                                   static_cast< float >( duration ) * 100.0f;
            system_total += user_total + kernel_total;
            // A PID missing from the name list means a new process has started.
            PI::ProcessNames::const_iterator found_name = names.find( p2->first );
            if( found_name == names.end() )
            {
                names = PI::GetProcessNameList();
                found_name = names.find( p2->first );
                if( found_name == names.end() )
                    continue;
            }
            // Report found_name->second, user_percent, and kernel_percent here.
        }
    }
    // Overall usage across all processes for this interval.
    float percent_used = system_total / static_cast< float >( duration ) * 100.0f;
    old_list = new_list;
    start = ::GetTickCount();
}
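The core of that loop can also be read as a pure function of two snapshots, which makes it easier to test away from the device. The following is a sketch under simplifying assumptions: times are already in milliseconds, `uint32_t` stands in for DWORD, and the names `CpuMs`, `SystemSnapshot`, and `ProcessUsage` are invented for this example and are not part of the attached sample.

```cpp
#include <stdint.h>
#include <map>

// Hypothetical simplified snapshot types: CPU time already in milliseconds.
struct CpuMs
{
    uint32_t kernel_ms;
    uint32_t user_ms;
};
typedef std::map< uint32_t, CpuMs > ThreadSnapshot;           // thread ID -> times
typedef std::map< uint32_t, ThreadSnapshot > SystemSnapshot;  // PID -> threads
typedef std::map< uint32_t, float > PercentByPid;             // PID -> percent

// For every process present in both snapshots, add up the CPU time gained by
// threads present in both, and express it as a percentage of duration_ms.
PercentByPid ProcessUsage( const SystemSnapshot& older,
                           const SystemSnapshot& newer,
                           uint32_t duration_ms )
{
    PercentByPid result;
    if( 0 == duration_ms )
        return result;

    for( SystemSnapshot::const_iterator p2 = newer.begin();
         p2 != newer.end(); ++p2 )
    {
        SystemSnapshot::const_iterator p1 = older.find( p2->first );
        if( p1 == older.end() )
            continue;  // new process: no baseline yet

        uint32_t busy_ms = 0;
        for( ThreadSnapshot::const_iterator t2 = p2->second.begin();
             t2 != p2->second.end(); ++t2 )
        {
            ThreadSnapshot::const_iterator t1 = p1->second.find( t2->first );
            if( t1 == p1->second.end() )
                continue;  // new thread: no baseline yet
            busy_ms += ( t2->second.kernel_ms - t1->second.kernel_ms ) +
                       ( t2->second.user_ms - t1->second.user_ms );
        }
        result[ p2->first ] =
            busy_ms / static_cast< float >( duration_ms ) * 100.0f;
    }
    return result;
}
```

Processes and threads that appear only in the newer snapshot are skipped for one interval, matching the behavior of the loop above.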
Conclusion
In this article, we've discussed how to get permission to gather information on all running threads, how to get the time each thread has spent using the processor, which data structures are useful for organizing that information, and finally how to use that information to calculate per-process CPU usage statistics. For an example of how to display that information to the user, please take a look at the attached sample application.
History
- 18th February, 2011: Initial version