
How the Windows built-in watchdog infrastructure can monitor performance counters and trigger alerts

9 Mar 2010
Use the new PLA interface to monitor performance counters and trigger alerts.

Introduction

Two built-in tools are available for systems administrators to perform detailed diagnostic analyses:

  • Performance monitor
  • Resource monitor

These tools are part of the Microsoft Management Console (MMC) Snap-in named «Performance Monitor». The name of this snap-in has unfortunately changed. It used to be called «Reliability and Performance Monitor» (RPM), which better describes its purpose. Starting with Windows 7, this tool has been renamed «Performance Monitor» (PM). In this article, I'll stick with the RPM name!

Stop writing your own background programs to monitor the system and launch other utilities for diagnostic, performance, and intrusion analysis. Start using the extended RPM infrastructure, which already provides these watchdog mechanisms.

Using RPM, you can define complex criteria that can trigger any kind of action you define.

Programmatic interface to RPM

Performance Logs and Alerts (PLA) is a new interface for accessing the RPM programmatically. PLA is a collection of DCOM objects. It has existed since Windows Vista and was extended in Windows 7 and Windows Server 2008. Using PLA, you can do things that are not even possible with RPM.

PLA_Layers.png

Fundamentally, two kinds of tasks can be accomplished (manually) with RPM or (programmatically) with PLA:

  1. Configure data collection into log files
  2. Define triggers based on performance counters

Create_a_new_Data_Collector_Set.png

RPM and PLA are two interfaces to the same technology that allows the generation of log files and the generation of alerts based on performance counters.

Goal

The goal of this article is to introduce PLA and its potential. Additionally, we will present a small project that uses one of the *many* aspects of PLA to configure an alert that starts an application (like notepad.exe) when CPU consumption exceeds a selected percentage.

Potential

The potential of PLA can be divided into five categories:

PLA_-_Potential.png

These categories are implemented as Data Collectors in the RPM parlance.

Some data collectors can be set up and configured manually using RPM, or programmatically using PLA. One data collector is only accessible using PLA, and is therefore only available to software developers.

For a detailed description of each collector type, please refer to the PLA specification which is available at http://msdn.microsoft.com/en-us/library/cc238489(PROT.10).aspx.

Configuration Data Collectors

In this section, we concentrate on the Configuration data collectors.

The following types of configuration data can be managed using RPM or using PLA:

  • Registry: the contents of the given Registry key are copied to a given log file.
    Registry.png

  • WMI: the results of the given WMI queries are copied to a given log file.
    WMI_Groups.png

  • Files capture: the given directories or files are copied (backed up) to a chosen location. This feature can be used to capture a chosen set of files and/or directories before installing a new program. Using this feature, you can even create your own Windows backup system based on a trigger.
  • Network adapters: the configuration of the existing network interfaces is gathered and copied to the given log file.
    Network_adapter.png

A single configuration data collector does not have to manage all of these types. One configuration data collector might be used only to gather Registry keys. Once the collected data is in a log file, it is up to you to decide what to do with it. You might transform it using XPath to fill a nicely formatted report, or feed a repository that records the configuration of the Windows machines in a subnet. Make information out of your data!
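As a minimal sketch of the XPath idea, the following C# helper pulls name/value pairs out of a collected file. The XML layout it expects (a `Report` root with `Registry/Value` elements carrying a `name` attribute) is purely hypothetical and stands in for whatever your collector actually writes; `ConfigReportReader` and `ReadRegistryValues` are illustrative names, not part of PLA.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ConfigReportReader
{
    // Returns one entry per <Value name="..."> element found under
    // /Report/Registry. The element and attribute names are assumptions
    // made for this sketch, not a documented PLA report schema.
    public static Dictionary<string, string> ReadRegistryValues(string reportXml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(reportXml);

        var values = new Dictionary<string, string>();
        foreach (XmlNode node in doc.SelectNodes("/Report/Registry/Value"))
        {
            XmlAttribute name = node.Attributes["name"];
            if (name != null)
            {
                // Later duplicates overwrite earlier ones.
                values[name.Value] = node.InnerText;
            }
        }
        return values;
    }
}
```

Adapting the XPath expression to the real file layout is the only change needed; the resulting dictionary can then feed a report template or a central repository.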

Alert Data Collectors

As stated above, the goal of this article is to show how to program PLA to configure an alert. Using RPM, you can monitor one or more Windows or application performance counters and trigger any application via the task scheduler. This framework is very powerful, and can be used to cascade data collections and other activities. Complex chain reactions can be created when given performance counter thresholds are reached. Depending on the selected performance counters, an alert can be a positive or negative message.

PLA_-_Alert.png

An alert can also be configured to record the reason for the trigger in the Event Log.

Event_Log.png

Sample

App_ui.png

As mentioned earlier, PLA is a collection of DCOM objects and is not directly available in the .NET Framework Class Library (FCL). For this reason, you have to go through the usual .NET COM interop wrappers to use them.

To take advantage of the PLA objects, within Visual Studio 2008, do the following:

  1. Reference the «Performance Data Service» COM library. Referencing this library automatically imports the type library in the background (as TLBIMP.EXE would) and creates the expected .NET wrappers.
    Add_Reference_-_PLA.png

  2. Import the PLA namespace.
    // Use the referenced Performance Logs and Alerts Library
    using PlaLibrary;
  3. As with other administrative tools (e.g., Event Log, Disk Management), PLA must be run with administrative credentials. Take this into account and start Visual Studio with these credentials; otherwise, you will struggle with «Access denied» problems.
  4. Since we use PLA to trigger an alert, we must integrate the Task Scheduler API into our project, which is also a COM library!
    Add_Reference_-_Task_Scheduler.png

  5. As with the previous library, we import its namespace with a using statement.
    // Use the Windows Task Scheduler COM Library
    using TaskScheduler;

Once these preconditions are set, we can program our Alert service using PLA.

  1. Create an alert by setting up the appropriate IAlertDataCollector object. In our sample, we use the (well-known) "% Processor Time" Windows performance counter. As mentioned previously, the PLA infrastructure lets you create an alert based on any performance counter. Once the collector is created, we start it programmatically; otherwise, it would exist but never trigger the alert.
    // Create and configure the Alert Data Collector
    IDataCollectorSet dataCollectorSet = new DataCollectorSetClass();
    IAlertDataCollector alert =
      (IAlertDataCollector)
      dataCollectorSet.DataCollectors.CreateDataCollector(DataCollectorType.plaAlert);

    // Set its name
    alert.name = plaAlertName;
    // Write Event to the Event Log?
    alert.EventLog = checkBoxWriteEventToEventLog.Checked;
    // Task Scheduler task name selected
    alert.Task = comboBoxSchedulerTasks.Text;
    // Poll the Performance counter once per second
    alert.SampleInterval = 1;
    // Update the Data Collector Set
    dataCollectorSet.DataCollectors.Add(alert);
    // Threshold syntax: counter path, comparison operator, limit
    // (the combo box text is inserted as-is)
    string[] thresholds = new string[1];
    thresholds[0] = string.Format("\\Processor(_Total)\\% Processor Time>{0}",
                                  comboBoxCpuThreshold.Text);
    alert.AlertThresholds = thresholds;
    // Validate it first
    dataCollectorSet.Commit("service\\" + alert.name, null,
                            CommitMode.plaValidateOnly);
    // Save it...
    dataCollectorSet.Commit("service\\" + alert.name, null,
                            CommitMode.plaCreateOrModify);
    // Start it...
    dataCollectorSet.start(true);
  2. As mentioned, the PLA alert mechanism is based on performance counters and triggers a task that has been previously registered using the Windows Task Scheduler. We use the TaskSchedulerClass API to enumerate the registered tasks and fill the UI.
    // Enumerate the registered Task Scheduler tasks
    ITaskService taskService = new TaskSchedulerClass();
    taskService.Connect(null, null, null, null);
    // We are now connected to the Task Scheduler
    ITaskFolder folder = taskService.GetFolder("\\");
    IRegisteredTaskCollection collection = folder.GetTasks(0);
    foreach (IRegisteredTask item in collection)
    {
      comboBoxSchedulerTasks.Items.Add(item.Name);
    }
  3. Once the Task Scheduler task names are enumerated, we ask the PLA framework whether our own PLA Data Collector Set exists. If it is not found, we receive an exception and can react appropriately.
    // Check the existence of our PLA Alert Data Collector Set.
    dataCollectorSet = new DataCollectorSetClass();
    try
    {
      dataCollectorSet.Query("service\\" + plaAlertName, null);
      // At this point our PLA Alert Data Collector Set has been found.
      buttonCreateAlert.Text = PLA_BUTTON_TEXT_DELETE;
    }
    catch (Exception)
    {
      // Query throws when the set does not exist: nothing to delete yet.
    }
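The threshold expression used in the first step is a plain string: a counter path, a comparison operator, and a limit. A small helper can build it from user input and reject obviously malformed values before they reach `Commit`. This is only a sketch: `AlertThreshold`, `Build`, and `CpuAbove` are hypothetical names, and the `<` operator is assumed here to mirror the Above/Below choice that the RPM user interface offers.

```csharp
using System;
using System.Globalization;

public static class AlertThreshold
{
    // Builds "<counter path><operator><limit>", for example
    // "\Processor(_Total)\% Processor Time>50".
    public static string Build(string counterPath, bool above, int limit)
    {
        if (string.IsNullOrEmpty(counterPath) || counterPath[0] != '\\')
        {
            throw new ArgumentException(
                "A counter path starts with a backslash.", "counterPath");
        }
        string op = above ? ">" : "<";
        return counterPath + op + limit.ToString(CultureInfo.InvariantCulture);
    }

    // Accepts user input such as "50", "50%" or "50 %" and returns the
    // threshold for total CPU usage above that percentage.
    public static string CpuAbove(string percentText)
    {
        string digits = percentText.Trim().TrimEnd('%').Trim();
        int percent = int.Parse(digits, CultureInfo.InvariantCulture);
        if (percent < 0 || percent > 100)
        {
            throw new ArgumentOutOfRangeException("percentText");
        }
        return Build(@"\Processor(_Total)\% Processor Time", true, percent);
    }
}
```

With this helper, the array assigned to `alert.AlertThresholds` could be written as `new[] { AlertThreshold.CpuAbove(comboBoxCpuThreshold.Text) }`.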

Tests

To test this dummy Alert, do the following:

  • Create a new task using the Windows Task Scheduler.
  • Start our sample and select the CPU consumption to use as the threshold.
  • Select the task that should be triggered as soon as the threshold is reached.
  • Use an application (why not mspaint.exe?) to increase the CPU consumption above the threshold you set.

Once the threshold is reached, you will see the registered task scheduler task triggered!
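If you prefer a reproducible load over a desktop application, a few lines of C# are enough to push the CPU above the threshold for a fixed time. The sketch below keeps one thread per logical processor busy; `CpuLoad` and `Burn` are hypothetical names chosen for this example.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CpuLoad
{
    // Keeps every logical processor busy for roughly the given duration,
    // then returns once all worker threads have finished.
    public static void Burn(TimeSpan duration)
    {
        var threads = new Thread[Environment.ProcessorCount];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                var clock = Stopwatch.StartNew();
                while (clock.Elapsed < duration)
                {
                    // Busy wait: the loop itself is the load.
                }
            });
            threads[i].IsBackground = true;
            threads[i].Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
    }
}
```

Calling `CpuLoad.Burn(TimeSpan.FromSeconds(30))` from a console application should keep total CPU usage high long enough for an alert that samples once per second to fire.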

Once registered (and running), a PLA data collector set will continue to alert you whenever the CPU reaches the threshold you set. Don't forget to stop and/or delete it using this sample application, or using PERFMON.EXE directly.
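The cleanup can also be done in code. The following sketch assumes the same PlaLibrary interop assembly as the sample and the same "service\" naming convention; `AlertCleanup` and `StopAndDelete` are hypothetical names. It needs the COM reference and administrative credentials, so treat it as an outline rather than a drop-in implementation.

```csharp
using System;
using PlaLibrary; // interop assembly for the «Performance Data Service» COM library

public static class AlertCleanup
{
    // Stops the data collector set if it is running, then removes it.
    // Query throws if no set with this name is registered.
    public static void StopAndDelete(string plaAlertName)
    {
        IDataCollectorSet set = new DataCollectorSetClass();
        set.Query("service\\" + plaAlertName, null);

        if (set.Status == DataCollectorSetStatus.plaRunning)
        {
            // true = wait until the set has actually stopped
            set.Stop(true);
        }
        set.Delete();
    }
}
```

Stopping before deleting is a cautious choice made for this sketch: it avoids asking PLA to remove a set that is still collecting.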

Conclusion

I hope this article has motivated you to take a serious look at the potential of the performance counters and at the PLA. Consider using the PLA interfaces for diagnostic purposes. The time you'll invest will be worthwhile!

We have used the percentage of CPU consumption as the typical performance counter in this sample. You might take a look at a comprehensive list of performance counters that I have made available on my web page (www.winssential.net/pdf/Windows7UltimateEngPerformanceCounters.pdf). You can base alerts on any of the thousands of performance counters available on a system: for example, the number of JIT compilations performed by .NET, the total elapsed time a thread has been running, the number of receive failures for a PNRP cloud, the rate of read operations on a disk (Disk Reads/sec), or the total number of inbound packets successfully processed by IPsec.

Version

  • 1.0 - March 10, 2010.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.