
App for CPU Temps, Fan Speeds, etc., Part II

20 Feb 2019 | CPOL | 17 min read
App to monitor a system's sensors, with added features

Introduction

In a previous article, App for CPU Temps, Fan Speeds, etc., I developed a minimalist app to fully exploit OpenHardwareMonitorLib.dll (referred to here as OHM) to monitor temperatures, fan speeds, etc. It includes the ability to set maximum and minimum thresholds which, when exceeded, ring an alarm to warn that a component is operating outside its acceptable range. As I mentioned there, the app alerted me to the fact that I had a CPU fan that was operating intermittently, and possibly failing. It also occurred to me that it might not be the fan that was failing, but rather the fan speed sensor built into the motherboard, which jacked up my fear index quite a bit. I really don't want to have to replace the mobo.

To diagnose the situation, I wanted to look at a plot of the reported fan speeds over time. And given that I might be looking at several hours of data at 1 sample per second (3600 samples/hr.), I would need the ability to zoom into the graph data by selecting an area with the mouse. Microsoft's chart control includes a zoom feature, but it is extremely non-intuitive, buggy and sadly deficient.

I also occasionally got an alarm for one or more sensors. If I was in another room when I heard the submarine dive klaxon shrieking at me, by the time I got to my computer, the alarm had ended, and I didn't know which sensor had triggered it. So in addition to the ability to track and graph a sensor's history, I decided to add an automatic alarm log that records alarm events.

Since either one is a major undertaking, and both together are twice that, I thought it best to include their addition here in a separate article. Hence, Part II.

Using the Code

The code is implemented as a Visual Studio 2015 self-contained project. It includes OpenHardwareMonitorLib.dll version 0.8.0 Beta. You should be able to download the code file, unzip it, load it into VS, and compile and execute it.

You can also copy the salient executables and DLLs out of the bin/Release directory and use them, though I do believe you must install the appropriate .NET Runtime Libraries.

To understand the basic functions of the app and how to configure it, go back to the previous article, App for CPU Temps, Fan Speeds, etc. There, I describe all the basic features and configuration settings, while in this article, I will only be discussing the new features developed here:

  1. Background recording and tracking of a sensor's history
  2. Storing an alarm log for sensors that operate outside acceptable thresholds, and
  3. Adding visual indicators to the sensor widgets to warn that a sensor has triggered an alarm

Additional Main Menu Items

To accommodate the new features, I added two sub items to the main context menu View item, History and Alarm Log, as illustrated here:

Image 1

 

Tracking a Sensor's History

When you select the View>History menu item, the History Tracking dialog opens, and if no sensors are being tracked, then it appears with the selection pane open as illustrated here:

Image 2

The selection pane is on the right and it contains a check box for each sensor being monitored—those selected earlier in the Select Sensors dialog. At this point, all you have to do is:

  1. check the boxes for the sensors you want to track,
  2. right-click on the selection pane and click on the Close Selection menu item.

In the image below, I checked all of the Temperature and Fan sensors, then closed the selection pane:

Image 3

This is also how the dialog will appear when opened with sensors already selected for tracking.

Note that there is one chart for each type of sensor, with all sensors of that type displayed on the same plot. Also note that values on all x-axes line up—more on how to do that later.

By default, the Update Continuously box is checked, which means that during each timer tick sent by the OHM data tree (see App for CPU Temps, Fan Speeds, etc.), any new data available will be added to the chart. If you want to freeze the chart display, uncheck this box. In the background, the app will continue to accumulate new data on each timer tick, but the chart won't change until you re-check the box.

Memory demand is fairly simple. Each data pair consists of a double time value and a float sensor value, or a total of 12 bytes. In this case, we're monitoring six sensors, with updates each second (3600 upd/hr.), so

(6 sensors) x (12 Bytes/smpl) x (3600 smpl/hr.) = 259,200 B/hr.

which, when divided by 1024, yields about 253 KB/hr. My computer is usually on all day, which means that after 12 hours the app requires roughly 3 MB to maintain all the data, not a serious demand for any computer nowadays.

This computation does ignore a small amount of overhead for the lists containing the stored data.
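The arithmetic above can be checked with a few lines of C#. This is only a sketch: the class and method names are mine, not taken from the project, and like the estimate above it ignores list overhead.

```csharp
using System;

static class HistoryMemoryEstimate
{
    // Bytes per stored sample: one double (time) plus one float (sensor value).
    public const int BytesPerSample = sizeof(double) + sizeof(float); // 8 + 4 = 12

    // Raw bytes needed to hold the history, ignoring List<T> overhead.
    public static long Bytes(int sensors, int samplesPerHour, double hours)
    {
        return (long)(sensors * (long)BytesPerSample * samplesPerHour * hours);
    }

    static void Main()
    {
        long perHour = Bytes(6, 3600, 1.0);     // 259,200 bytes
        long halfDay = Bytes(6, 3600, 12.0);    // 3,110,400 bytes
        Console.WriteLine("Per hour: " + perHour + " B (" + (perHour / 1024.0) + " KB)");
        Console.WriteLine("After 12 hours: " + halfDay + " B");
    }
}
```

Six sensors at one sample per second come to 259,200 bytes per hour and 3,110,400 bytes after 12 hours, which is roughly 3 MB.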

Zoom in on Details

In the previous image, I'm in the process of zooming in on certain details. This is accomplished by pressing the Ctrl key while Left-Clicking the mouse and dragging the dashed zoom rectangle around the area of interest. Note that I've dragged the zoom rectangle to include both charts, which zooms in on both at the same time, keeping the x-axes scales and values aligned. The result appears as follows:

Image 4

Only those charts where the zoom rectangle overlaps a portion of the inner plot area will zoom. For example, if I had three charts showing, and the zoom rectangle overlapped only two of them, only those two would zoom. After zooming in on a chart, you can zoom in again, and again, displaying finer detail on the data at each zoom level.
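The overlap rule can be sketched as a rectangle intersection test. This is an illustration, not the app's actual code: ZoomHitTest and ChartsToZoom are hypothetical names, and I am assuming the zoom rectangle and each chart's inner plot rectangle are expressed in the same coordinate system (for example, that of the chart panel).

```csharp
using System.Collections.Generic;
using System.Drawing;

static class ZoomHitTest
{
    // Returns the indices of the charts whose inner plot rectangle overlaps
    // the zoom rectangle. All rectangles are assumed to share one
    // coordinate system.
    public static List<int> ChartsToZoom(Rectangle zoomRect, IList<Rectangle> innerPlotRects)
    {
        var hits = new List<int>();
        for (int i = 0; i < innerPlotRects.Count; i++)
            if (zoomRect.IntersectsWith(innerPlotRects[i]))
                hits.Add(i);
        return hits;
    }
}
```

With three stacked inner plot areas at y = 20, 220 and 420 (each 160 pixels high) and a zoom rectangle covering y = 100 to 300, only the first two charts are returned, so only those two would zoom.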

Note the context menu that appears when I right-click on a chart. The functions of the three items shown are as follows:

  • Expand will hide all of the other charts and expand the one chart to fill the entire chart panel. When a chart is expanded in this way, the Expand menu item is replaced with View All to allow you to return to showing all charts.
  • Zoom Out only appears if a chart has been zoomed into a narrower range of data. Selecting it will zoom the chart all the way back out to the full view.
  • Select Sensors will open the select sensors pane, allowing you to change the sensors you are tracking. Note that if you change your selection, when you close the selection pane, all prior history data is lost and the tracking history starts over.

Organizing Multiple Charts

The chart pane and selection pane are panel controls contained within an upper panel control and named pnlCharts and pnlSelection respectively. The Dock property of pnlCharts is set to Fill, and the Dock property of pnlSelection is set to Right. During construction of the form, labels for each type of sensor and check boxes for each sensor are added to pnlSelection, then its width is adjusted to just accommodate the labels and check boxes, with 10 pixels of padding. This is done with the following code:

C#
//Set the selection panel's width so it is just wide enough 
//to display all check boxes with 10 pixels of padding
Control.ControlCollection cc = pnlSelection.Controls;
int xMax = 0;
for (int i = 0; i < cc.Count; i++)
    if (cc[i].Right > xMax)
        xMax = cc[i].Right;
pnlSelection.Width = xMax + 10;

This allows the charts to occupy the maximum amount of space when the selection panel is visible. The selection panel is opened or closed in the code by setting its Visible property to true or false.

When all charts are visible, their Dock properties are set to None, and they are all given an equal amount of vertical space by adjusting their size during the Resize event for pnlCharts, as follows:

C#
private void pnlCharts_Resize(object sender, EventArgs e)
{
    //If all charts are showing, charts[0] is visible and its Dock property is None.
    //If charts[0] is hidden or its Dock property is Fill, one chart has been
    //expanded, so exit without sizing the charts.
    if (charts == null || charts.Length == 0 || 
        !charts[0].Visible || charts[0].Dock == DockStyle.Fill)
        return;

    Rectangle ca = pnlCharts.ClientRectangle;
    int chartH = ca.Height / charts.Length;
    for (int i = 0; i < charts.Length; i++)
    {
        charts[i].Left = 0;
        charts[i].Top = i * chartH;
        charts[i].Width = ca.Width;
        charts[i].Height = chartH;
    }
    //Do this in case rounding error left us 1 or more pixels short of pnlCharts' entire height
    charts[charts.Length - 1].Height = ca.Height - charts[charts.Length - 1].Top;
}

When the Expand context menu item is selected for a chart, that chart is given a Dock property of Fill, the Visible property of the other charts is set to false, and the code exits the Resize event without messing with the chart sizes.
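A minimal sketch of the Expand and View All behavior might look like the following. The class and method names are hypothetical, and I am treating the charts as plain Control objects (the MS Chart control derives from Control), so this is one plausible reading rather than the app's actual handlers.

```csharp
using System.Windows.Forms;

static class ChartLayout
{
    // Expand: dock one chart to fill the panel and hide the rest.
    public static void Expand(Control[] charts, int index)
    {
        for (int i = 0; i < charts.Length; i++)
        {
            charts[i].Visible = (i == index);
            charts[i].Dock = (i == index) ? DockStyle.Fill : DockStyle.None;
        }
    }

    // View All: undock everything and show all charts again. The panel's
    // Resize handler is then expected to redistribute the vertical space.
    public static void ViewAll(Control[] charts)
    {
        foreach (Control c in charts)
        {
            c.Dock = DockStyle.None;
            c.Visible = true;
        }
    }
}
```

Either way an expanded layout trips the guard at the top of pnlCharts_Resize: if charts[0] is the expanded chart, its Dock is Fill; otherwise charts[0] is hidden. After View All, the resize logic needs to run once more to restore the equal-height layout.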

To get the values of the horizontal axes of all charts to line up, when the charts are created in SetupCharts(), the following code snippet is executed:

C#
charts[i].ChartAreas[0].Position.Auto = false;
charts[i].ChartAreas[0].Position.X = 8;
charts[i].ChartAreas[0].Position.Y = 10;
charts[i].ChartAreas[0].Position.Width = 80;
charts[i].ChartAreas[0].Position.Height = 80;
charts[i].ChartAreas[0].InnerPlotPosition.Auto = false;
charts[i].ChartAreas[0].InnerPlotPosition.X = 10;
charts[i].ChartAreas[0].InnerPlotPosition.Y = 10;
charts[i].ChartAreas[0].InnerPlotPosition.Width = 80;
charts[i].ChartAreas[0].InnerPlotPosition.Height = 80;

The Position and InnerPlotPosition properties of the ChartArea object are ElementPosition objects that accept values from (0,0) to (100,100). This code disables auto-positioning and positions the ChartArea within the Chart, and the InnerPlotPosition within the ChartArea, as a percentage of each. And since the charts are always the same width, their X-axes align nicely.
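To see what those percentages mean in pixels, here is a small worked example. It is a sketch under stated assumptions: PlotGeometry and ToPixels are hypothetical helpers, the 1000-pixel chart width is made up, and the chart control applies its own rounding internally, so the real pixel values may differ slightly.

```csharp
using System;

static class PlotGeometry
{
    // Converts a percentage-based position (0..100) into pixels within a
    // parent span that starts at parentStart and is parentSize pixels long.
    public static void ToPixels(float parentStart, float parentSize,
                                float percentPos, float percentSize,
                                out float start, out float size)
    {
        start = parentStart + parentSize * percentPos / 100f;
        size = parentSize * percentSize / 100f;
    }

    static void Main()
    {
        float chartWidth = 1000f;   // hypothetical chart width in pixels
        float areaX, areaW, plotX, plotW;

        // ChartArea: X = 8%, Width = 80% of the chart
        ToPixels(0f, chartWidth, 8f, 80f, out areaX, out areaW);
        // InnerPlotPosition: X = 10%, Width = 80% of the ChartArea
        ToPixels(areaX, areaW, 10f, 80f, out plotX, out plotW);

        Console.WriteLine("Plot spans x = {0} to {1} px", plotX, plotX + plotW);
        // prints "Plot spans x = 160 to 800 px"
    }
}
```

Because every chart uses the same percentages and the same width, every inner plot area starts and ends at the same x pixel, which is why the time axes line up.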

I've toyed with the idea of setting up a more robust algorithm to maximize the width of the inner plot area. It would have to first determine the widths required for Y-axis labels, Y-axis values, and the legend, then set the ChartArea and InnerPlotPosition accordingly. But I haven't had time to do that, and this simpler algorithm is doing the job nicely, so I don't know if I'll ever get around to that.

Charts With Smooth Zoom and Nice Round Numbers

I've experimented several times with the built-in zoom in MS Chart, and for the life of me, I can't follow the logic for the way it works. The zoom rectangle appears to snap to the nearest interval. Sometimes, it insists on covering the entire width or height of the chart, and when it does that, it only zooms in one dimension. So after considerable frustration, I long ago abandoned it, and implemented my own.

Another issue with MS Chart is that when it comes to axis scaling and the grid lines displayed, if you leave it up to the chart control, the axes of your chart will display with odd intervals and nothing close to round numbers.

To see how I implemented a smooth zoom and nice round numbers in this app, go to my earlier article, Smooth Zoom & Round Numbers in MS Chart.

Capturing Alarm Events

To capture alarm events for later review, I created the container class AlarmEvent—I know, I'm not very good at coming up with cool names. It stores several pieces of data for an event, as illustrated in this excerpt from the code:

C#
public enum AlarmStatus { Silent, Sounding, StartDelay, StopDelay, Aborted }

public class AlarmEvent
{
    public DateTime? triggeredDT = null;            //DateTime the event was initially triggered
    public DateTime? startOfStopDelayDT = null;     //DateTime a triggered alarm first 
                                                    //returned to operating within acceptable limits
    public DateTime? clearedDT = null;              //DateTime the triggered event cleared
    public string DisplayName = ""; //May not be unique
    public string sName = "";       //Sensor name, may not be unique
    public string id = "";          //Always unique on a particular system
    public SensorType sType;        //Sensor type
    public float? alarmMin;         //Minimum alarm threshold (nullable)
    public float? alarmMax;         //Maximum alarm threshold (nullable)
    public AlarmStatus alarmStatus;

    //List of (DateTime, Value, AlarmStatus) triplets, one for each timer tick during
    //which the sensor's value was outside of the alarm limits
    public List<AlarmDetail> details = new List<AlarmDetail>(64);
    
    ...
}

An alarm is triggered when a sensor first begins operating outside its accepted limits, and cleared when it returns to normal operation. The configuration for JLDProbeII contains a falseAlarmDelay variable that can be set in the Configure Sensors dialog on the Font & Misc. tab page. If its value is greater than zero, then after an alarm is triggered, the sensor's widget waits that amount of time before flashing red and sounding the alarm. The default value for falseAlarmDelay is 3.0 seconds. This prevents momentary spikes in the data from sounding an alarm for a second or two, which is considered a FALSE alarm.

The startOfStopDelayDT time is used to implement the same delay when an alarm is cleared. If an alarm has been sounding for some time, when the sensor returns to operating within its accepted limits, the alarm is not considered cleared until it has been doing so for a period of at least falseAlarmDelay seconds. This prevents a FALSE stop in which an alarm that is sounding returns to normal operation for just a brief moment.

The AlarmStatus enumeration, in combination with falseAlarmDelay and startOfStopDelayDT, is used to control the logic behind triggering and clearing alarms. Once a sensor begins operating outside its limits, the DateTime, sensor Value, and AlarmStatus for each timer tick are stored in the details list. This includes FALSE alarm events.

Note that you can specify whether or not JLDProbeII will save and store FALSE alarm events with a check box in the Configure Sensors dialog.
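The start and stop delays described above amount to a small state machine. The following is a sketch of one way to express it, not the code in JLDProbeII: the AlarmStatus values are copied from the excerpt above, but the Next function, its parameters, and my reading of Aborted (an alarm that cleared during its start delay, i.e. a FALSE alarm) are assumptions.

```csharp
using System;

public enum AlarmStatus { Silent, Sounding, StartDelay, StopDelay, Aborted }

static class AlarmLogic
{
    // Computes the next status from the current one.
    //   outOfRange      - sensor value is outside its thresholds on this tick
    //   secondsInDelay  - time spent so far in the current StartDelay/StopDelay
    //   falseAlarmDelay - configured delay in seconds (0 disables both delays)
    public static AlarmStatus Next(AlarmStatus current, bool outOfRange,
                                   double secondsInDelay, double falseAlarmDelay)
    {
        switch (current)
        {
            case AlarmStatus.Silent:
            case AlarmStatus.Aborted:
                if (!outOfRange) return AlarmStatus.Silent;
                return falseAlarmDelay > 0 ? AlarmStatus.StartDelay : AlarmStatus.Sounding;

            case AlarmStatus.StartDelay:
                if (!outOfRange) return AlarmStatus.Aborted;    // assumed: FALSE alarm
                return secondsInDelay >= falseAlarmDelay
                    ? AlarmStatus.Sounding : AlarmStatus.StartDelay;

            case AlarmStatus.Sounding:
                if (outOfRange) return AlarmStatus.Sounding;
                return falseAlarmDelay > 0 ? AlarmStatus.StopDelay : AlarmStatus.Silent;

            case AlarmStatus.StopDelay:
                if (outOfRange) return AlarmStatus.Sounding;    // FALSE stop
                return secondsInDelay >= falseAlarmDelay
                    ? AlarmStatus.Silent : AlarmStatus.StopDelay;

            default:
                return current;
        }
    }
}
```

With a delay of 3.0 seconds, a sensor that goes out of range for two ticks and then recovers would move Silent, StartDelay, StartDelay, Aborted and never sound. With a delay of 0, the same spike goes straight to Sounding.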

JLDProbeII maintains two lists of alarms defined in ProbeForm.cs as:

C#
public List<AlarmEvent> alarmEventsCurrent = new List<AlarmEvent>();
public List<AlarmEvent> alarmEventsPast;

Current alarm events are those that have occurred since the application was started. Past alarm events are those that were triggered prior to the start of the present instance of JLDProbeII, and were stored in the log file. When the application exits, it automatically saves all current and past alarm events to a log file named "JLDProbeII_Alarm.log," which is stored in the executable folder, typically the "bin\Release" folder.

To view a list of alarm events, right-click anywhere on the probe form and select the View>Alarm Log menu item to open the following dialog:

Image 5

Past alarm events are displayed in black. Current alarm events are displayed in blue. In both, FALSE alarm events are indicated with a duration displayed in red. If the sensor id associated with an alarm event does not match any of the sensors being monitored, its name is displayed with a StrikeOut font. This can happen if a sensor that had previously triggered an alarm is de-selected in the Select Sensors dialog before opening this dialog, or if the log file was copied from another computer with different sensor ids.

The dialog displays the date and time for each event that was triggered. Both date and time are displayed for events that occurred on previous days, while only time is displayed for events that occurred on the same day the dialog is opened. It also displays the duration, the alarm limits set for each sensor in the Configure Sensors dialog, and the maximum and minimum sensor values that occurred during the event, as taken from the details list in the AlarmEvent object. If you want to see the data stored in the details list, select the alarm events of interest, right-click on the ListView, and select Export to csv. You can specify a file name, and the application will export the events to a tab-delimited CSV text file that can be opened in Microsoft Excel®.
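The display rules above are easy to sketch. In the following, AlarmLogFormat and both methods are hypothetical, the format strings are only illustrative (the dialog may well use different ones), and I pass plain float values rather than the project's AlarmDetail objects, whose definition is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class AlarmLogFormat
{
    // Time only for events on the same calendar day as 'now',
    // date and time otherwise.
    public static string Triggered(DateTime triggered, DateTime now)
    {
        string fmt = triggered.Date == now.Date ? "HH:mm:ss" : "yyyy-MM-dd HH:mm:ss";
        return triggered.ToString(fmt, CultureInfo.InvariantCulture);
    }

    // Minimum and maximum sensor value recorded during an event.
    // Returns false if no values were recorded.
    public static bool MinMax(IEnumerable<float> values, out float min, out float max)
    {
        min = float.PositiveInfinity;
        max = float.NegativeInfinity;
        bool any = false;
        foreach (float v in values)
        {
            any = true;
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return any;
    }
}
```

For a dialog opened on 20 Feb 2019, an event triggered at 14:05:09 that day would show as "14:05:09", while one from 23:59:01 the night before would show as "2019-02-19 23:59:01".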

Note that an alarm event is flagged as FALSE only if it was an event with a duration less than falseAlarmDelay. That means that if you set falseAlarmDelay to zero, there will be no alarm events flagged as FALSE. In the above image, look closely at the two alarm events for the GPU Core that are immediately below the selected sensors. The first had a duration of 2.026 seconds and was flagged as FALSE. The second had a shorter duration of 1.003 seconds but was not flagged as false. This happened because between the two events, I changed falseAlarmDelay from 3.0 to 0.0 seconds.

In this dialog, you can also right-click and delete selected events, which permanently deletes them if you close the dialog by clicking the Ok button. Deletes are ignored if you click the Cancel button.

If you'd like to simulate alarms on your own system, first set maximum alarm thresholds on a few sensors. Then, at the top of ProbeConfig.cs, change #define SimulateAlarmsNo to #define SimulateAlarms, recompile, and run the app. In the first few minutes of operation, it will simulate several types of alarms. Be sure to change it back afterward, or you'll always get those simulated alarms.
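For readers unfamiliar with this trick, the sketch below shows the general mechanism of toggling code with a #define symbol. It does not reproduce what ProbeConfig.cs actually does with the symbol; SimulateDemo and SimulationEnabled are hypothetical names.

```csharp
#define SimulateAlarmsNo   // rename to SimulateAlarms to enable the simulated path

using System;

static class SimulateDemo
{
    public static bool SimulationEnabled()
    {
#if SimulateAlarms
        return true;    // compiled only when SimulateAlarms is defined
#else
        return false;   // SimulateAlarmsNo is a different, unused symbol
#endif
    }

    static void Main()
    {
        Console.WriteLine(SimulationEnabled() ? "Simulating alarms" : "Normal operation");
        // prints "Normal operation" as written
    }
}
```

Appending "No" simply defines a symbol that nothing tests, which is a quick way to switch a code path off without deleting the #define line. Note that #define must appear before any other code in the file.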

Mini-Control Alarm Indicators

To keep the user aware of the status of alarms, I implemented a MiniAlarmControl class to display alarm status on each sensor. It maintains an image of an alarm bell that is green if no alarms have occurred for that sensor, and red if alarms have been triggered. Two MiniAlarmControls are painted on the surface of each SensorWidget, one for past events on the left, and one for current events on the right. The mini-controls appear as in this image:

Image 6

The user can select the position of the mini-controls in the Configure Sensors dialog. In the image above, from left to right, the mini-controls are positioned as None, Top, Middle, and Bottom.

The mini-control also superimposes a transparent Label control over each bell. The label provides a popup tooltip to display the status of the events if the mouse cursor hovers over an alarm bell, as can be seen in the second figure from the right in the above image. The Label control also makes it easy to capture mouse events. Double-click on a red or green bell and the Alarm Log dialog will open to display alarm events for only that sensor.

I could have used a PictureBox instead of a Label control, and then all I would have to do is position it on the SensorWidget, set its Image property, and its tooltip. It would then handle painting, as well as the tooltip popup, and capturing mouse events. But the PictureBox control doesn't provide any flexibility on image painting methods, and since I'm using 64x64 pixel bitmaps, I wanted to control the interpolation method. Hence, the bells are painted to the SensorWidget's surface in its OnPaint method in the following way:

C#
if (miniControlPosition != MiniControlPosition.None)
{
    e.Graphics.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBilinear;
    for (int i = 0; i < miniAlarmControls.Length; i++)
        e.Graphics.DrawImage(miniAlarmControls[i].alarmImage, miniAlarmControls[i].ClientRectF);
}

The scaled down images appear with much better quality when painted in HighQualityBilinear interpolation mode.

These mini-controls allow the user to immediately see if any alarm events have been triggered for a sensor.

Fan Diagnosis: My Problem Fan

I used the new history tracking capability to analyze the issues with the misbehaving CPU fan I discussed in App for CPU Temps, Fan Speeds, etc. The first thing I noticed was that it would momentarily spike up to 50,000 or 60,000 rpms. Now no computer fan is going to spin that fast unless it is powered by a 2,000 horsepower, Rolls Royce, V-12 aircraft engine, like the Rolls Royce Merlin engine used in the Spitfire, the most widely produced and strategically important British single-seat fighter aircraft of World War II—I took a little tangent there, didn't I? But I don't see clouds of exhaust fumes boiling out of my computer, so I guess that must be a false reading.

I swapped the fan out for another that I had sitting around, an old one that had been stored in a box in my garage for about ten years. The CPU Fan curves displayed earlier in this article were recorded with that replacement fan installed. But take a look at those curves again. That one fan still occasionally spikes up to over 6,000 rpm, and I know it's not capable of doing that. It also drops to zero for short moments. But overall, it seems to be working, and I've set a minimum alarm threshold of 100 rpm to catch it if it shuts down. Remember that with a falseAlarmDelay of 3.0 seconds, the alarm won't sound unless the speed remains below that threshold for 3 continuous seconds. I have configured the app on my system to capture FALSE alarm events.

So now, I know that something weird is happening with two different fans, but the symptoms are decidedly less pronounced with the second fan, so it's highly likely the fans are causing the problem, and not the motherboard sensor—whew, hopefully I dodged that bullet. But to be absolutely certain, I bought a new fan and installed it. Here is some tracking history for the new fan:

Image 7

This fan is capable of operating up to 1,200 rpms. It does exceed that on occasion, but not excessively, so I'm not bothered by a few slightly high values that are probably false readings. I had to think carefully about the short instants where the fan speed drops to zero. The BIOS in my Asus P6T motherboard includes algorithms to control CPU fan speed based on CPU temperature. Asus doesn't publish details on the Fan Speed vs. Temperature curve they use, but it obviously cranks the fan speed up when the CPU gets hotter. Since my CPU runs fairly cool most of the time (< 50°C [122°F]), I assume the BIOS algorithm is telling the fan to operate at a speed below its rated minimum, which is 400 rpm, so it drops to zero momentarily. I can live with that. But I have set an alarm minimum for all fans in my system, along with a falseAlarmDelay of 3.0 seconds, so I'll know if a fan shuts down completely.

Interestingly enough, when I installed the new fan, I got a "CPU Fan Error" when I booted my system, which meant the fan wasn't running. I opened the case and visually confirmed that the fan was indeed running, so I ignored the error and let the system complete its boot sequence. Looking again at the above curve, the answer was obvious: my system runs too cool—no pun intended. The typical CPU fan speed is running around 700 rpm, but when I first boot the system and the CPU is at room temperature, the BIOS algorithm probably tells it to run below its minimum, so it stops completely, and I get the boot error. The message here is that minimum rated fan speed can be important as well.

So the problem is solved without pointing a finger at the mobo—whew!

I hope this illustrates how this app can be used to diagnose and protect your system.

To Do

  • If OHM ever implements the ability to set a fan's speed, I'll need to incorporate that into this app.

History

  • 2018.10.06: First implementation and publication
  • 2019.02.20: Fixed a few bugs; see ChangeLog.txt

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)