Introduction
In a previous article, App for CPU Temps, Fan Speeds, etc., I developed a minimalist app to fully exploit OpenHardwareMonitorLib.dll (referred to here as OHM) to monitor temperatures, fan speeds, etc. It includes the ability to set maximum and minimum thresholds which, when exceeded, ring an alarm to warn that a component is operating outside its acceptable range. As I mentioned there, the app alerted me to the fact that I had a CPU fan that was operating intermittently, and possibly failing. It also occurred to me that it might not be the fan that was failing, but rather the fan speed sensor built into the motherboard, which jacked up my fear index quite a bit. I really don't want to have to replace the mobo.
To diagnose the situation, I wanted to look at a plot of the reported fan speeds over time. And given that I might be looking at several hours of data at 1 sample per second (3600 samples/hr.), I would need the ability to zoom into the graph data by selecting an area with the mouse. Microsoft's chart control includes a zoom feature, but it is extremely non-intuitive, buggy and sadly deficient.
I also occasionally got an alarm for one or more sensors. If I was in another room when I heard the submarine dive claxon shrieking at me, by the time I got to my computer, the alarm had ended, and I didn't know which sensor had triggered it. So in addition to the ability to track and graph a sensor's history, I decided to add an automatic alarm log that records alarm events.
Since either one is a major undertaking, and both together are twice that, I thought it best to include their addition here in a separate article. Hence, Part II.
Using the Code
The code is implemented as a Visual Studio 2015 self-contained project. It includes OpenHardwareMonitorLib.dll version 0.8.0 Beta. You should be able to download the code file, unzip it, load it into VS, and compile and execute it.
You can also copy the salient executables and DLLs out of the bin/Release directory and use them, though I do believe you must install the appropriate .NET Runtime Libraries.
To understand the basic functions of the app and how to configure it, go back to the previous article, App for CPU Temps, Fan Speeds, etc. There, I describe all the basic features and configuration settings, while in this article, I will only be discussing the new features developed here:
- Background recording and tracking of a sensor's history
- Storing an alarm log for sensors that operate outside acceptable thresholds, and
- Adding visual indicators to the sensor widgets to warn that a sensor has triggered an alarm
Additional Main Menu Items
To accommodate the new features, I added two sub items to the main context menu View item, History and Alarm Log, as illustrated here:
Tracking a Sensor's History
When you select the View>History menu item, the History Tracking dialog opens, and if no sensors are being tracked, then it appears with the selection pane open as illustrated here:
The selection pane is on the right and it contains a check box for each sensor being monitored—those selected earlier in the Select Sensors dialog. At this point, all you have to do is:
- check the boxes for the sensors you want to track,
- right-click on the selection pane and click on the Close Selection menu item.
In the image below, I checked all of the Temperature and Fan sensors, then closed the selection pane:
This is also how the dialog will appear when opened with sensors already selected for tracking.
Note that there is one chart for each type of sensor, with all sensors of that type displayed on the same plot. Also note that values on all x-axes line up—more on how to do that later.
By default, the Update Continuously box is checked, which means that during each timer tick sent by the OHM data tree (see App for CPU Temps, Fan Speeds, etc.), any new data available will be added to the chart. If you want to freeze the chart display, uncheck this box. In the background, the app will continue to accumulate new data on each timer tick, but the chart won't change until you re-check the box.
Memory demand is fairly simple. Each data pair consists of a double time value and a float sensor value, for a total of 12 bytes. In this case, we're monitoring six sensors, with updates each second (3600 upd/hr.), so
(6 sensors) x (12 Bytes/smpl) x (3600 smpl/hr.) = 259,200 B/hr.
which, when divided by 1024, yields 253 KB/hr. My computer is usually on all day, which means that after 12 hours the app requires about 3 MB to maintain all the data, not a serious demand for any computer nowadays.
This computation does ignore a small amount of overhead for the lists containing the stored data.
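The arithmetic above can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical and not part of the app, and it counts raw payload only, ignoring list overhead and any padding the runtime might add if the pair were stored in a single struct.

```csharp
using System;

// Hypothetical helper, not part of the app: estimates raw history storage.
public static class HistoryMemoryEstimate
{
    // One stored sample: a double time value plus a float sensor value.
    public const int BytesPerSample = sizeof(double) + sizeof(float); // 8 + 4 = 12

    // Raw payload in bytes for the given number of sensors, sample rate, and duration.
    public static long Bytes(int sensorCount, double samplesPerSecond, double hours)
    {
        double samples = samplesPerSecond * 3600.0 * hours;
        return (long)Math.Round(sensorCount * BytesPerSample * samples);
    }
}
```

For six sensors sampled once per second, Bytes(6, 1.0, 1.0) gives the 259,200 bytes per hour computed above, and Bytes(6, 1.0, 12.0) gives 3,110,400 bytes for a 12-hour day.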
Zoom in on Details
In the previous image, I'm in the process of zooming in on certain details. This is accomplished by pressing the Ctrl key while Left-Clicking the mouse and dragging the dashed zoom rectangle around the area of interest. Note that I've dragged the zoom rectangle to include both charts, which zooms in on both at the same time, keeping the x-axes scales and values aligned. The result appears as follows:
Only those charts where the zoom rectangle overlaps a portion of the inner plot area will zoom. For example, if I had three charts showing, and the zoom rectangle overlapped only two of them, only those two would zoom. After zooming in on a chart, you can zoom in again, and again, displaying finer detail on the data at each zoom level.
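The overlap rule can be illustrated with a plain rectangle intersection test. This is a minimal sketch under my own assumptions, not the code used in the app: PixelRect and ZoomHitTest are hypothetical names, and all rectangles are assumed to be expressed in the same pixel coordinate system (for example, that of the chart panel).

```csharp
using System.Collections.Generic;

// Hypothetical rectangle type; edges are in pixels, Right and Bottom are exclusive.
public struct PixelRect
{
    public int Left, Top, Right, Bottom;

    public PixelRect(int left, int top, int right, int bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }

    // True when the two rectangles share a non-empty area.
    public bool Overlaps(PixelRect other)
    {
        return Left < other.Right && other.Left < Right
            && Top < other.Bottom && other.Top < Bottom;
    }
}

public static class ZoomHitTest
{
    // Returns the indices of the charts whose inner plot area overlaps the zoom rectangle.
    public static List<int> ChartsToZoom(PixelRect zoomRect, IList<PixelRect> innerPlotAreas)
    {
        var hits = new List<int>();
        for (int i = 0; i < innerPlotAreas.Count; i++)
            if (zoomRect.Overlaps(innerPlotAreas[i]))
                hits.Add(i);
        return hits;
    }
}
```

With three stacked plot areas and a zoom rectangle that spans the lower part of the first and the upper part of the second, only indices 0 and 1 are returned, so the third chart keeps its current view.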
Note the context menu that appears when I right click on a chart. The functions of the three items shown are as follows:
- Expand will hide all of the other charts and expand the one chart to fill the entire chart panel. When a chart is expanded in this way, the Expand menu item is replaced with View All to allow you to return to showing all charts.
- Zoom Out only appears if a chart has been zoomed into a narrower range of data. Selecting it will zoom the chart all the way back out to the full view.
- Select Sensors will open the select sensors pane, allowing you to change the sensors you are tracking. Note that if you change your selection, when you close the selection pane, all prior history data is lost and the tracking history starts over.
Organizing Multiple Charts
The chart pane and selection pane are panel controls contained within an upper panel control and named pnlCharts and pnlSelection respectively. The Dock property of pnlCharts is set to Fill, and the Dock property of pnlSelection is set to Right. During construction of the form, labels for each type of sensor and check boxes for each sensor are added to pnlSelection, then its width is adjusted to just accommodate the labels and check boxes, with 10 pixels of padding. This is done with the following code:
Control.ControlCollection cc = pnlSelection.Controls;
int xMax = 0;
for (int i = 0; i < cc.Count; i++)
    if (cc[i].Right > xMax)
        xMax = cc[i].Right;
pnlSelection.Width = xMax + 10;
This allows the charts to occupy the maximum amount of space when the selection panel is visible. The selection panel is opened or closed in the code by setting its Visible property to true or false.
When all charts are visible, their Dock properties are set to None, and they are all given an equal amount of vertical space by adjusting their size during the Resize event for pnlCharts, as follows:
private void pnlCharts_Resize(object sender, EventArgs e)
{
    // Nothing to lay out if there are no charts, or if one chart is expanded (docked to Fill).
    if (charts == null || charts.Length == 0 ||
        !charts[0].Visible || charts[0].Dock == DockStyle.Fill)
        return;
    Rectangle ca = pnlCharts.ClientRectangle;
    int chartH = ca.Height / charts.Length;
    for (int i = 0; i < charts.Length; i++)
    {
        charts[i].Left = 0;
        charts[i].Top = i * chartH;
        charts[i].Width = ca.Width;
        charts[i].Height = chartH;
    }
    // Give the last chart any pixels left over from the integer division.
    charts[charts.Length - 1].Height = ca.Height - charts[charts.Length - 1].Top;
}
When the Expand context menu item is selected for a chart, that chart is given a Dock property of Fill, the Visible property of the other charts is set to false, and the Resize handler returns early without changing the chart sizes.
To get the values of the horizontal axes of all charts to line up, when the charts are created in SetupCharts(), the following code snippet is executed:
charts[i].ChartAreas[0].Position.Auto = false;
charts[i].ChartAreas[0].Position.X = 8;
charts[i].ChartAreas[0].Position.Y = 10;
charts[i].ChartAreas[0].Position.Width = 80;
charts[i].ChartAreas[0].Position.Height = 80;
charts[i].ChartAreas[0].InnerPlotPosition.Auto = false;
charts[i].ChartAreas[0].InnerPlotPosition.X = 10;
charts[i].ChartAreas[0].InnerPlotPosition.Y = 10;
charts[i].ChartAreas[0].InnerPlotPosition.Width = 80;
charts[i].ChartAreas[0].InnerPlotPosition.Height = 80;
The Position and InnerPlotPosition properties of the ChartArea object are ElementPosition objects that accept values from (0,0) to (100,100). This code disables auto-positioning and positions the ChartArea within the Chart, and the InnerPlotPosition within the ChartArea, as a percentage of each. And since the charts are always the same width, their X-axes align nicely.
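To see why fixed percentages make the axes line up, the nested percentages can be converted to pixels by hand. The following is a minimal sketch of that arithmetic only; PlotAlignment and InnerPlotSpan are hypothetical names, and nothing here touches the chart control itself.

```csharp
// Hypothetical helper: converts nested percentage positions to pixel positions.
public static class PlotAlignment
{
    // Returns the left edge and width of the inner plot, in pixels, for a chart of the given
    // pixel width. The area values are percentages of the chart; the inner plot values are
    // percentages of the chart area.
    public static (float Left, float Width) InnerPlotSpan(
        float chartWidthPx, float areaXPct, float areaWidthPct,
        float innerXPct, float innerWidthPct)
    {
        float areaLeft = chartWidthPx * areaXPct / 100f;
        float areaWidth = chartWidthPx * areaWidthPct / 100f;
        float left = areaLeft + areaWidth * innerXPct / 100f;
        float width = areaWidth * innerWidthPct / 100f;
        return (left, width);
    }
}
```

With the values used above (X = 8 and Width = 80 for the chart area, X = 10 and Width = 80 for the inner plot), a chart 1000 pixels wide gets an inner plot that starts at pixel 160 and is 640 pixels wide. Every chart of the same width yields the same two numbers regardless of its Y-axis labels, which is what keeps the X-axes aligned.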
I've toyed with the idea of setting up a more robust algorithm to maximize the width of the inner plot area. It would have to first determine the widths required for Y-axis labels, Y-axis values, and the legend, then set the ChartArea and InnerPlotPosition accordingly. But I haven't had time to do that, and this simpler algorithm is doing the job nicely, so I don't know if I'll ever get around to that.
Charts With Smooth Zoom and Nice Round Numbers
I've experimented several times with the built-in zoom in MS Chart, and for the life of me, I can't follow the logic for the way it works. The zoom rectangle appears to snap to the nearest interval. Sometimes, it insists on covering the entire width or height of the chart, and when it does that, it only zooms in one dimension. So after considerable frustration, I long ago abandoned it, and implemented my own.
Another issue with MS Chart is that when it comes to axis scaling and the grid lines displayed, if you leave it up to the chart control, the axes of your chart will display with odd intervals and nothing close to round numbers.
To see how I implemented a smooth zoom and nice round numbers in this app, go to my earlier article, Smooth Zoom & Round Numbers in MS Chart.
Capturing Alarm Events
To capture alarm events for later review, I created the container class AlarmEvent (I know, I'm not very good at coming up with cool names). It stores several pieces of data for an event, as illustrated in this excerpt from the code:
public enum AlarmStatus { Silent, Sounding, StartDelay, StopDelay, Aborted }
public class AlarmEvent
{
    public DateTime? triggeredDT = null;
    public DateTime? startOfStopDelayDT = null;
    public DateTime? clearedDT = null;
    public string DisplayName = "";
    public string sName = "";
    public string id = "";
    public SensorType sType;
    public float? alarmMin;
    public float? alarmMax;
    public AlarmStatus alarmStatus;
    public List<AlarmDetail> details = new List<AlarmDetail>(64);
    ...
}
An alarm is triggered when a sensor first begins operating outside its accepted limits, and cleared when it returns to normal operation. The configuration for JLDProbeII contains a falseAlarmDelay variable that can be set in the Configure Sensors dialog on the Font & Misc. tab page. If its value is greater than zero, then after an alarm is triggered, the sensor's widget waits that amount of time before flashing red and sounding the alarm. The default value for falseAlarmDelay is 3.0 seconds. This prevents momentary spikes in the data from sounding an alarm for a second or two, which is considered a FALSE alarm.
The startOfStopDelayDT time is used to implement the same delay when an alarm is cleared. If an alarm has been sounding for some time, when the sensor returns to operating within its accepted limits, the alarm is not considered cleared until it has been doing so for a period of at least falseAlarmDelay seconds. This prevents a FALSE stop in which a sensor whose alarm is sounding returns to normal operation for just a brief moment.
The AlarmStatus enumeration, in combination with falseAlarmDelay and startOfStopDelayDT, is used to control the logic behind triggering and clearing alarms. Once a sensor begins operating outside its limits, the DateTime, sensor Value, and AlarmStatus for each timer tick are stored in the details list. This includes FALSE alarm events.
Note that you can specify whether or not JLDProbeII will save and store FALSE alarm events with a check box in the Configure Sensors dialog.
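The delay logic described above can be pictured as a small state machine. The sketch below is my own reading of that description, not the code from JLDProbeII: the AlarmDebouncer class, its Tick method, and the exact transitions (in particular, when Aborted is used) are assumptions. Only the AlarmStatus values are taken from the excerpt.

```csharp
using System;

public enum AlarmStatus { Silent, Sounding, StartDelay, StopDelay, Aborted }

// Hypothetical sketch of the start and stop delays; not the app's actual implementation.
public class AlarmDebouncer
{
    private readonly TimeSpan delay;   // plays the role of falseAlarmDelay
    private DateTime phaseStart;       // start of the current StartDelay or StopDelay

    public AlarmStatus Status { get; private set; } = AlarmStatus.Silent;

    public AlarmDebouncer(double falseAlarmDelaySeconds)
    {
        delay = TimeSpan.FromSeconds(falseAlarmDelaySeconds);
    }

    // Call once per timer tick with the time and whether the sensor is outside its limits.
    public AlarmStatus Tick(DateTime now, bool outOfRange)
    {
        switch (Status)
        {
            case AlarmStatus.Silent:
            case AlarmStatus.Aborted:
                if (outOfRange)
                {
                    phaseStart = now;
                    Status = delay > TimeSpan.Zero ? AlarmStatus.StartDelay : AlarmStatus.Sounding;
                }
                else
                    Status = AlarmStatus.Silent;
                break;
            case AlarmStatus.StartDelay:
                if (!outOfRange)
                    Status = AlarmStatus.Aborted;        // spike ended early: a FALSE alarm
                else if (now - phaseStart >= delay)
                    Status = AlarmStatus.Sounding;
                break;
            case AlarmStatus.Sounding:
                if (!outOfRange)
                {
                    phaseStart = now;
                    Status = delay > TimeSpan.Zero ? AlarmStatus.StopDelay : AlarmStatus.Silent;
                }
                break;
            case AlarmStatus.StopDelay:
                if (outOfRange)
                    Status = AlarmStatus.Sounding;       // brief recovery: a FALSE stop
                else if (now - phaseStart >= delay)
                    Status = AlarmStatus.Silent;
                break;
        }
        return Status;
    }
}
```

With a 3.0 second delay and one tick per second, a two-second spike moves through StartDelay to Aborted without ever sounding, while a sensor that stays out of range reaches Sounding on the tick three seconds after the trigger. With a delay of zero, the alarm sounds immediately and nothing is ever aborted.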
JLDProbeII maintains two lists of alarms defined in ProbeForm.cs as:
public List<AlarmEvent> alarmEventsCurrent = new List<AlarmEvent>();
public List<AlarmEvent> alarmEventsPast;
Current alarm events are those that have occurred since the application was started. Past alarm events are those that were triggered prior to the start of the present instance of JLDProbeII, and were stored in the log file. When the application exits, it automatically saves all current and past alarm events to a log file named "JLDProbeII_Alarm.log," which is stored in the executable folder, typically the "bin\Release" folder.
To view a list of alarm events, right-click anywhere on the probe form and select the View>Alarm Log menu item to open the following dialog:
Past alarm events are displayed in black. Current alarm events are displayed in blue. In both, FALSE alarm events are indicated with a duration displayed in red. If the sensor id associated with an alarm event does not match any of the sensors being monitored, its name is displayed with a StrikeOut font. This can happen if a sensor that had previously triggered an alarm is de-selected in the Select Sensors dialog before opening this dialog, or if the log file was copied from another computer with different sensor ids.
The dialog displays the date and time for each event that was triggered. Both date and time are displayed for events that occurred on previous days, while only time is displayed for events that occurred on the same day the dialog is opened. It also displays the duration, the alarm limits set for each sensor in the Configure Sensors dialog, and the maximum and minimum sensor values that occurred during the event, as taken from the details list in the AlarmEvent object. If you want to see the data stored in the details list, select the alarm events of interest, right-click on the ListView, and select Export to csv. You can specify a file name, and the application will export the events to a tab-delimited CSV text file that can be opened in Microsoft Excel®.
Note that an alarm event is flagged as FALSE only if it was an event with a duration less than falseAlarmDelay. That means that if you set falseAlarmDelay to zero, there will be no alarm events flagged as FALSE. In the above image, look closely at the two alarm events for the GPU Core that are immediately below the selected sensors. The first had a duration of 2.026 seconds and was flagged as FALSE. The second had a shorter duration of 1.003 seconds but was not flagged as FALSE. This happened because between the two events, I changed falseAlarmDelay from 3.0 to 0.0 seconds.
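The values shown for each event (duration, minimum, maximum, and the FALSE flag) can be derived from the trigger and clear times plus the recorded sensor values. The sketch below illustrates that derivation under stated assumptions: AlarmSummary and Summarize are hypothetical names, the values are passed as a plain sequence rather than the app's AlarmDetail objects, and the delay argument is whatever falseAlarmDelay was in effect when the event occurred.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: derives the per-event values displayed in the Alarm Log dialog.
public static class AlarmSummary
{
    public static (TimeSpan Duration, float Min, float Max, bool IsFalse) Summarize(
        DateTime triggered, DateTime cleared,
        IEnumerable<float> values, double falseAlarmDelaySeconds)
    {
        List<float> list = values.ToList();
        if (list.Count == 0)
            throw new ArgumentException("At least one sensor value is required.", nameof(values));

        TimeSpan duration = cleared - triggered;
        // An event is FALSE only if it was shorter than the delay in effect at the time.
        bool isFalse = duration.TotalSeconds < falseAlarmDelaySeconds;
        return (duration, list.Min(), list.Max(), isFalse);
    }
}
```

Applied to the two GPU Core events above, a 2.026 second event with a 3.0 second delay is flagged FALSE, while a 1.003 second event with a 0.0 second delay is not, because no duration can be less than zero.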
In this dialog, you can also right-click and delete selected events, which permanently deletes them if you close the dialog by clicking the Ok button. Deletes are ignored if you click the Cancel button.
If you'd like to simulate alarms on your own system, set some maximum alarm thresholds on a few sensors. Then, at the top of ProbeConfig.cs, change #define SimulateAlarmsNo to #define SimulateAlarms, recompile, and run the app. In the first few minutes of operation, it will simulate several types of alarms. But be sure to change it back, or you'll always get those simulated alarms.
Mini-Control Alarm Indicators
To keep the user aware of the status of alarms, I implemented a MiniAlarmControl class to display alarm status on each sensor. It maintains an image of an alarm bell that is green if no alarms have occurred for that sensor, and red if alarms have been triggered. Two MiniAlarmControls are painted on the surface of each SensorWidget, one for past events on the left, and one for current events on the right. The mini-controls appear as in this image:
The user can select the position of the mini-controls in the Configure Sensors dialog. In the image above, from left to right, the mini-controls are positioned as None, Top, Middle, and Bottom.
The mini-control also superimposes a transparent Label control over each bell. The label provides a popup tooltip to display the status of the events if the mouse cursor hovers over an alarm bell, as can be seen in the second figure from the right in the above image. The Label control also makes it easy to capture mouse events. Double-click on a red or green bell and the Alarm Log dialog will open to display alarm events for only that sensor.
I could have used a PictureBox instead of a Label control, and then all I would have to do is position it on the SensorWidget, set its Image property, and its tooltip. It would then handle painting, as well as the tooltip popup, and capturing mouse events. But the PictureBox control doesn't provide any flexibility on image painting methods, and since I'm using 64x64 pixel bitmaps, I wanted to control the interpolation method. Hence, the bells are painted to the SensorWidget's surface in its OnPaint method in the following way:
if (miniControlPosition != MiniControlPosition.None)
{
    e.Graphics.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBilinear;
    for (int i = 0; i < miniAlarmControls.Length; i++)
        e.Graphics.DrawImage(miniAlarmControls[i].alarmImage, miniAlarmControls[i].ClientRectF);
}
The scaled-down images appear with much better quality when painted in HighQualityBilinear interpolation mode.
These mini-controls allow the user to immediately see if any alarm events have been triggered for a sensor.
Fan Diagnosis: My Problem Fan
I used the new history tracking capability to analyze the issues with the misbehaving CPU fan I discussed in App for CPU Temps, Fan Speeds, etc. The first thing I noticed was that it would momentarily spike up to 50,000 or 60,000 rpm. Now no computer fan is going to spin that fast unless it is powered by a 2,000 horsepower, Rolls-Royce, V-12 aircraft engine, like the Rolls-Royce Merlin engine used in the Spitfire, the most widely produced and strategically important British single-seat fighter aircraft of World War II. I took a little tangent there, didn't I? But I don't see clouds of exhaust fumes boiling out of my computer, so I guess that must be a false reading.
I swapped the fan out for another that I had sitting around, an old one that had been stored in a box in my garage for about ten years. The CPU Fan curves displayed earlier in this article were recorded with that replacement fan installed. But take a look at those curves again. That one fan still occasionally spikes up to over 6,000 rpm, and I know it's not capable of doing that. And it also drops to zero for short moments. But overall, it seems to be working, and I've set a minimum alarm threshold of 100 rpm to catch it if it shuts down. Remember that with a falseAlarmDelay of 3.0 seconds, the alarm won't sound unless the fan remains below that threshold for 3 continuous seconds. I have configured the app on my system to capture FALSE alarm events.
So now, I know that something weird is happening with two different fans, but the symptoms are decidedly less pronounced with the second fan, so it's highly likely the fans are causing the problem, and not the motherboard sensor—whew, hopefully I dodged that bullet. But to be absolutely certain, I bought a new fan and installed it. Here is some tracking history for the new fan:
This fan is capable of operating up to 1,200 rpm. It does exceed that on occasion, but not excessively, so I'm not bothered by a few slightly high values that are probably false readings. I had to think carefully about the short instants where the fan speed drops to zero. The BIOS in my Asus P6T motherboard includes algorithms to control CPU fan speed based on CPU temperature. Asus doesn't publish details on the Fan Speed vs. Temperature curve they use, but it obviously cranks the fan speed up when the CPU gets hotter. Since my CPU runs fairly cool most of the time (< 50°C [122°F]), I assume the BIOS algorithm is telling the fan to operate at a speed below its rated minimum, which is 400 rpm, so it drops to zero momentarily. I can live with that. But I have set an alarm minimum for all fans in my system, along with a falseAlarmDelay of 3.0 seconds, so I'll know if a fan shuts down completely.
Interestingly enough, when I installed the new fan, I got a "CPU Fan Error" when I booted my system, which meant the fan wasn't running. I opened the case and visually confirmed that the fan was indeed running, so I ignored the error and let the system complete its boot sequence. Looking again at the above curve, the answer was obvious: my system runs too cool—no pun intended. The typical CPU fan speed is running around 700 rpm, but when I first boot the system and the CPU is at room temperature, the BIOS algorithm probably tells it to run below its minimum, so it stops completely, and I get the boot error. The message here is that minimum rated fan speed can be important as well.
So the problem is solved without pointing a finger at the mobo—whew!
I hope this illustrates how this app can be used to diagnose and protect your system.
To Do
- If OHM ever implements the ability to set a fan's speed, I'll need to incorporate that into this app.
History
- 2018.10.06: First implementation and publication
- 2019.02.20: Fixed a few bugs; see ChangeLog.txt