Performance diagnosis is a very important aspect of a software creation process. Higher and higher requirements for application responsiveness and speed should encourage us to start collecting performance metrics even during the application development phase. Fortunately in Windows we have a big number of sources that we can query in order to retrieve the performance and trace logs, including: Event Tracing for Windows (ETW), Event Logging and Performance Counters. Today I would like to write about the last item of this list. Performance counters are used to provide information how well the operating system or an application, service, or driver is performing. Therefore they can often help you determine system bottlenecks or application performance problems. I split the post in four parts: the first part is a general description of performance counters architecture, in the second part you can read how to collect data provided by the counters, in the third how to filter and organize it and finally, in the forth part, how to display and analyze it.
Choosing performance counters
Each performance counter in the system is defined by its unique path:
\\Computer\PerfObject(ParentInstance/ObjectInstance#InstanceIndex)\Counter
PerfObject
is a name of the performance object to query. It may be either a physical component, such as disk or processor, or a system (application) object, such as a process or a thread.
ObjectInstance
is a name of the object that will provide the performance counter data (we can use a wildcard sign (*) here to include all objects with their instances). If the given object has multiple instances we may want to specify, through the
InstanceIndex
field, which instance we would like to measure. The
Counter
field is a name of the performance counter that is defined in the performance object. To better understand the counter path syntax let’s look at few examples:
- \Process(*)\% Processor Time – shows the percentage of time that the processor spent on executing a non-idle thread. Here, we do not specify a particular instance of the object so we will gather performance data from all the running processes in the system.
- \Process(svchost#2)\% Processor Time – the same counter as above, but we narrow our performance investigation to the third instance of the svchost.exe process (first instance does not have an instance numer – you will learn more about instances later in this post)
- \Thread(notepad/0#1)\% Processor Time – shows the percentage of time that the processor spent running this thread. By using this path we specify that we are interested in the % of the processor time used by the first thread of the second instance of the notepad process
In order to list all the performance counters installed in the system you may use typeperf -q command (or if you’re more of a GUI person the “Add Counters Dialog Box” in the perfmon). As the number might be overwhelming I usually also specify the performance object I would like to query, eg.
C:\>typeperf -q Process
\Process(*)\% Processor Time
\Process(*)\% User Time
\Process(*)\% Privileged Time
\Process(*)\Virtual Bytes Peak
\Process(*)\Virtual Bytes
...
\Process(*)\IO Data Bytes/sec
\Process(*)\IO Other Bytes/sec
\Process(*)\Working Set - Private
The command completed successfully.
The list of the system performance objects with counters that they define can be found
on technet site. The most interesting system performance objects from the developer perspective are:
Process
and
Thread
objects. Later come performance counters installed by the libraries or frameworks that you host in your system (such as .NET Framework CLR counters, WCF counters, ASP.NET counters etc.). For different diagnostics scenarios you usually use different sets of performance counters.
Matching the process PID with the counter object instance
It is common in software diagnostics that you will be interested in counters for one particular process (sometimes even thread). As you usually know only the identifier of the interesting process (PID) you need to match it with the performance object instance. Fortunately each of the system performance objects usually has a special counter which will do this job for you. It’s enough to collect just one sample from the counter to find the instance index. Let’s see how it works for the Process
object:
C:>typeperf -c "\Process(notepad*)\ID Process" -sc 1
"(PDH-CSV 4.0)","\\PECET\Process(notepad#1)\ID Process","\\PECET\Process(notepad)\ID Process"
"04/14/2012 17:27:54.235","2528.000000","5224.000000"
The command completed successfully.
In the snippet above we’ve used
typeperf -c <counter-path> command to collect the performance data and the
-sc 1 (sample count) switch which instructs typeperf to collect just one data sample. Now, we can read from the output that the notepad process with PID 2528 has a Process performance counter path:
\Process(notepad#1)\<counter-name>
. In
.NET diagnosis you will use
\.NET CLR Memory(*)\Process ID
to find the instance index of the managed process that you’d like to track. The snippet below shows also a way how to collect performance data in
Powershell (using
Get-Counter cmdlet):
PS C:\temp> get-counter "\.NET CLR Memory(*)\Process ID" -MaxSamples 1
Timestamp CounterSamples
--------- --------------
2012-04-16 00:27:29 \\pecet\.net clr memory(_global_)\process id :
0
\\pecet\.net clr memory(powershell)\process id :
4724
\\pecet\.net clr memory(testprogram)\process id :
6496
Start collecting performance data
We’ve already started collecting some performance data so it’s time to dwell on different tools and their options to control this process. We will diagnose a very simple .NET application that has a high memory usage:
using System;
using System.Collections.Generic;
public static class Program
{
static IList<byte[]> mems = new List<byte[]>();
public static void Main(String[] args) {
Console.WriteLine("Press any key to start...");
Console.ReadKey();
for (int i = 0; i < 100000; i++) {
byte[] mem = new byte[5120];
if ((i & 0x1) == 0x0) {
mems.Add(mem);
Console.WriteLine(i);
}
}
}
}
We will run only one instance of the application so we can safely use only the object name in our performance counters paths. As memory is our subject of investigation we will use performance counters that present information related to it. Our counters set would be as follows:
\.NET CLR Memory(TestProgram)\Process ID
\.NET CLR Memory(TestProgram)\Gen 0 heap size
\.NET CLR Memory(TestProgram)\Gen 1 heap size
\.NET CLR Memory(TestProgram)\Gen 2 heap size
\.NET CLR Memory(TestProgram)\# Gen 0 Collections
\.NET CLR Memory(TestProgram)\# Gen 1 Collections
\.NET CLR Memory(TestProgram)\# Gen 2 Collections
\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 0
\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 1
\.NET CLR Exceptions(TestProgram)\# of Exceps Thrown
\Process(TestProgram)\Working Set
\Process(TestProgram)\% Processor Time
Save this list to a
memcounters.txt
file and let’s collect some data.
typeperf -c <counter-path> [options]
The first tool we will use for this purpose it typeperf which is available with every installation of Windows. As we’ve already seen it has two query options: -q [PerfObject] and -qx [PerfObject] – the former lists all performance counters (with their paths) installed on the machine, the latter makes this list even longer by adding, for each performance counter, information about all its accessible instances. You can start collecting performance data with typeperf by using -c <counter-path> [second-counter] … or -cf <counters-file-path> switch. If you have a set of performance counters that you would like to query (as in our case), the second option might be more comfortable to use. Run the following command and start the application:
typeperf -cf memcounters.txt -si 1 -sc 10 -f CSV -o memperfdata.csv
This command instructs the typeperf to collect
10 samples (-sc 10) of performance data provided by the
counters defined in the memcounters.txt file (-cf memcounters.txt). The data will be collected
every second (-si 1) (which is a minimum for performance counters) and
saved to the memperfdata.csv file (-o memperfdata.csv) in a form of
a comma separated file (-f CSV). Now, open the memperfdata.csv in your favorite text editor and you should see something similar to the text below:
"(PDH-CSV 4.0) (Central European Daylight Time)(-120)","\\PECET\.NET CLR Memory(TestProgram)\Process ID","\\PECET\.NET CLR Memory(TestProgram)\Gen 0 heap size","\\PECET\.NET CLR Memory(TestProgram)\Gen 1 heap size","\\PECET\.NET CLR Memory(TestProgram)\Gen 2 heap size","\\PECET\.NET CLR Memory(TestProgram)\# Gen 0 Collections","\\PECET\.NET CLR Memory(TestProgram)\# Gen 1 Collections","\\PECET\.NET CLR Memory(TestProgram)\# Gen 2 Collections","\\PECET\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 0","\\PECET\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 1","\\PECET\.NET CLR Exceptions(TestProgram)\# of Exceps Thrown","\\PECET\Process(TestProgram)\Working Set","\\PECET\Process(TestProgram)\% Processor Time"
"04/17/2012 06:44:15.869"," "," "," "," "," "," "," "," "," "," "," "," "
"04/17/2012 06:44:16.893"," "," "," "," "," "," "," "," "," "," "," "," "
"04/17/2012 06:44:17.931"," "," "," "," "," "," "," "," "," "," "," "," "
"04/17/2012 06:44:18.951"," "," "," "," "," "," "," "," "," "," "," "," "
"04/17/2012 06:44:19.983","1000","6291456","12573168","50113928","23","8","3","3137840","0","0","78536704"," "
"04/17/2012 06:44:21.022","1000","6291456","6300064","112969496","42","14","4","3137840","0","0","138080256","28.498552246289755"
"04/17/2012 06:44:22.050","1000","6291456","12575744","163257040","61","19","5","3137840","12535928","0","196440064","22.78365172471138"
"04/17/2012 06:44:23.080","1000","6291456","9437904","226109840","80","24","5","3137840","0","0","254152704","33.318696629872328"
"04/17/2012 06:44:24.102"," "," "," "," "," "," "," "," "," "," "," "," "
"04/17/2012 06:44:25.139"," "," "," "," "," "," "," "," "," "," "," "," "
You probably noticed that the counters:
Promoted Memory from Gen 0
and
Promoted Memory from Gen 1
in our output are not very accurate. This is probably because they are not cumulative and display the values observed at the end of the last garbage collection. In our case, as memory is eagerly allocated, garbage collections occur quite often (80 for gen0, 24 for gen1 and 5 for gen2) which forces counters to reset their values. In most application those counters will bring more meaningful values (the list of .NET memory counters can be found on
msdn and I strongly encourage you to experiment with them). In the next paragraph we will display the collected data in an easy to read format. Let’s now switch to Powershell window and look at its typeperf rival – Get-Counter cmdlet.
Powershell: Get-Counter <counters> [options]
To collect 10 samples with the interval of 1s of data provided by our counters we will type:
Get-Counter (gc .\memcounters.txt) -sampleinterval 1 -maxsamples 10
Unfortunately this command will break if the counter path is not valid (i.e. our TestProgram is not running). So if you are going to use this cmdlet remember to start your application first. This command returns an array of
Microsoft.PowerShell.Commands.GetCounter.PerformanceCounterSampleSet
instances which you can further pass through the pipeline. To receive the same output as in the typeperf example, we would need to issue:
Get-Counter (gc .\memcounters.txt) -sampleinterval 1 -maxsamples 10 | Export-Counter memperfdata.csv -FileFormat CSV
Performance monitor
That’s probably the most common tool used to collect the performance data. It’s a powerful GUI application that makes controlling the performance counters really easy. Though, I would say that this tool is more useful to system admins than developers. In order to create the same CSV file as we prepared above you would need to create a data collector set , run it and then export the data. But it might help you a lot when choosing performance counters for your scripts or when monitoring the overall system performance, as it has a great live preview.
After having collected the performance data it’s time to analyze it. In our example we can easily open the generated text file in a text editor and read the interesting information, but in scenarios when you use tens of performance counters and hundreds of samples this approach may not be the best option. Fortunately we have tools that will help us analyze this amount of data.
Manipulating the collected data
relog [options] <data-file>
If we want to remove garbage data (unnecessary counters or samples) from our performance data file or, for some reason, change the file format then what we are looking for is the relog application. The first interesting option it the -q (query) switch which shows summary information about the performance data file. For instance in our case:
C:\> relog -q memperfdata-blog.csv
Input
----------------
File(s):
memperfdata-blog.csv (CSV)
Begin: 2012-4-17 6:44:15
End: 2012-4-17 6:44:25
Samples: 10
\\PECET\Process(TestProgram)\% Processor Time
\\PECET\Process(TestProgram)\Working Set
\\PECET\.NET CLR Exceptions(TestProgram)\# of Exceps Thrown
\\PECET\.NET CLR Memory(TestProgram)\Gen 0 heap size
\\PECET\.NET CLR Memory(TestProgram)\# Gen 1 Collections
\\PECET\.NET CLR Memory(TestProgram)\Process ID
\\PECET\.NET CLR Memory(TestProgram)\Gen 1 heap size
\\PECET\.NET CLR Memory(TestProgram)\# Gen 2 Collections
\\PECET\.NET CLR Memory(TestProgram)\Gen 2 heap size
\\PECET\.NET CLR Memory(TestProgram)\# Gen 0 Collections
\\PECET\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 0
\\PECET\.NET CLR Memory(TestProgram)\Promoted Memory from Gen 1
The command completed successfully.
From the output we can read what is the format of our file (CSV), the begin and end date of the data collection and, most importantly, a list of counters from which we’ve collected the data. With this information we can filter our log to contain data from just 3 counters:
\\PECET\.NET CLR Memory(TestProgram)\Gen 0 heap size
\\PECET\.NET CLR Memory(TestProgram)\Gen 1 heap size
\\PECET\.NET CLR Memory(TestProgram)\Gen 2 heap size
Let’s also
change the format (-f) to the tab separated file and filter the results
from (-b) time 06:44:19
till (-e) 06:44:23:
> relog memperfdata-blog.csv -c "\\PECET\.NET CLR Memory(TestProgram)\Gen 0 heap size" "\\PECET\.NET CLR Memory(TestProgram)\Gen 1 heap size" "\\PECET\.NET CLR Memory(TestProgram)\Gen 2 heap size" -b "2012-04-17 06:44:19" -e "2012-04-17 06:44:24" -y -f TSV -o out.tsv
Input
----------------
File(s):
memperfdata-blog.csv (CSV)
Begin: 2012-4-17 6:44:15
End: 2012-4-17 6:44:25
Samples: 10
100.00%
Output
----------------
File: out.tsv
Begin: 2012-4-17 6:44:19
End: 2012-4-17 6:44:24
Samples: 4
The command completed successfully.
Another great side of the relog application is the ability to merge different performance data files into one result set.
Powershell: Import-Counter, Export-Counter
Powershell equivalents of the relog command are Import-Counter and Export-Counter cmdlets. The former is able to read, filter and convert performance data from the file to a PerformanceCounterSampleSet
array. The latter can transform this array back to a performance file (it has interesting options: -Circular and -MaxSize which may be very useful when saving big amounts of samples). Combining them in a pipeline will produce the same result file as the relog example above:
PS C:\> import-counter -Counter @("\\pecet\.net clr memory(testprogram)\gen 0 heap size","\\pecet\.net clr memory(testprogram)\gen 1 heap size","\\pecet\.net clr memory(testprogram)\gen 2 heap size") -StartTime (Get-Date '2012-04-17 06:44:19') -EndTime (Get-Date '2012-04-17 06:44:24') -Path .\memperfdata-blog.blg | Export-Counter -Force -FileFormat TSV -Path out.tsv
I run into some issues when using Import-Counter on CSV files (sometimes it reported
Import-Counter : The data in one of the performance counter samples is not valid. View the Status property for each PerformanceCounterSample object to make sure it contains valid data
). That’s why I’m using BLG file (which was converted from CSV using relog) in the command above.
Publishing and displaying the collected data
Till now we should have collected and prepared our performance data so it’s time to analyze and display it.
Performance Monitor
We will start with the Performance Monitor (perfmon.exe) which is installed with Windows. It is not only able to display live performance counter data (feature I really like) but it can also read .CSV and .BLG files. Just double click the .BLG file or, for our .CSV file, type:
perfmon /sys /open c:\logs\memperfdata-blog.csv
and you should see a chart with all the counters from the file. As different counters collect data using different formats (raw numbers, percentage) it’s quite difficult to present them on the same chart. That’s why you can implicitly set a scale for each performance counter in the list. Just select a counter, right click on it and choose properties. At the bottom of the dialog there should be a scale drop box. After you select a scale, numbers provided by the selected counter will be multiplied by the scale value, thus allowing you to display different counters data formats using the same Y axis. To avoid confusion just use the same scale for counters that display data in the same format.
Logparser
Another tool we will focus on is a swiss knife in the log analysis world and it’s called logparser. Logparser is freely available for download from Microsoft. Although it parses a great bunch of log file formats, the .BLG is not yet supported, thus we need to use CSV performance data files. I hope you won’t have any problems converting .BLG files to this format after reading the previous paragraph of this post:). Let’s start from selecting heap size counters for records which have % Processor Time greater than 25%:
>logparser "select [(PDH-CSV 4.0) (Central European Daylight Time)(-120)] as ts, [\\PECET\.NET CLR Memory(TestProgram)\Gen 0 heap size] as gen0, [\\pecet\.net clr memory(testprogram)\gen 1 heap size] as gen1, [\\pecet\.net clr memory(testprogram)\gen 2 heap size] as gen2, to_real([\\PECET\Process(TestProgram)\% Processor Time]) as proc into test.gif from memperfdata-blog.csv where proc >= 25.0"
-o:DATAGRID
Statistics:
-----------
Elements processed: 10
Elements output: 2
Execution time: 176.51 seconds (00:02:56.51)
The results of this command are displayed in a grid window:
This was just a simple query but imagine all TSQL queries that you can run on your data files…:) I strongly encourage you to have a look at logparser’s .chm help file as it’s a great, very-detailed source of information about this tool. To interest you even more have a look at this query and its result (but first install Office 2003 Web Components to draw the chart):
>logparser "select [(PDH-CSV 4.0) (Central European Daylight Time)(-120)], to_real([\\pecet\.net clr memory(testprogram)\gen 0 heap size]), to_real([\\pecet\.net clr memory(testprogram)\gen 1 heap size]), to_real([\\pecet\.net clr memory(testprogram)\gen 2 heap size]) into test.gif from memperfdata-blog.csv" -o:CHART
Statistics:
-----------
Elements processed: 10
Elements output: 10
Execution time: 0.05 seconds
SQL Server
If you don’t feel comfortable with the logparser interface you can always save your collected data in a SQL Server database and query it using your favorite SQL Server client. Surprisingly, relog is able to do that! First you need to create a database and then a System DSN (Data Source Name) – detailed explanation can be found here. Assuming that you named the DSN Test issue the following command:
> relog -f SQL -o SQL:Test!fd .\memperfdata-blog.csv
Input
----------------
File(s):
.\memperfdata-blog.csv (CSV)
Begin: 2012-4-17 6:44:15
End: 2012-4-17 6:44:25
Samples: 10
100.00%
Output
----------------
File: SQL:Test!fd
Begin: 2012-4-17 6:44:15
End: 2012-4-17 6:44:25
Samples: 4
The command completed successfully.
In the database that you have specified in the DSN (I hope you did that:)) relog will create three new tables:
- CounterData – stores counter values
- CounterDetails – stores metadata information about each counter
- DisplayToID – stores information about log session (the DisplayString column contains the string that we specified after the exclamation mark in the relog command above)
As pictures sometimes tell more than a thousand words just have a look at the one below (click on the image to enlarge it):
Excel
The last tool that I wanted to mention is our old friend – Excel. It easily opens .CSV files and loads values to worksheet cells (thus sometimes it might require some data conversions to make excel recognize the datetime or decimal values in columns). You are then free to use any excel functions, charts and all other facilities.
Conclusion
In this post I’ve tried to present you with different ways of collecting and analyzing performance data. I hope that you find this information useful and that you are eager to start using the performance counters to their fullest. Happy profiling!