|
Wjousts,
your signal indeed looks nice.
If it were perfect, all you would have to do is indeed look for changes in the sign of the derivative, i.e. detect monotonic sequences (increasing then decreasing, in alternation).
Added noise causes local perturbations of this ideal situation, resulting in small breaches of monotonicity. The closer you get to the peak, the more likely they become (as the derivative gets smaller and smaller).
My way to deal with that is to consider "quasi-increasing" (resp. quasi-decreasing) sequences, i.e. values that increase overall but are allowed to backtrack by no more than a threshold value. For example, assuming a threshold of 10, the following sequence is quasi-increasing: 0, 22, 31, 27, 45, 63..., while this one is not: 0, 22, 31, 20, 45, 63... (said differently, it stops being quasi-increasing at the value 20).
This approach is more robust than mere derivative computation, and you can also filter on the total increase (or decrease) of the sequence. You can also limit the allowed backtracking to a given number of samples.
When will it detect a peak? When the signal value has decreased to the peak value minus the threshold.
I guess that in any case you cannot avoid having to wait some time after a peak, to be sure it is a true one.
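For illustration only, here is a minimal Python sketch of that idea; the fixed numeric threshold and the symmetric handling of the following trough are my assumptions on top of the description above:

```python
def quasi_increasing_peaks(samples, threshold):
    """Track the running maximum while the signal is quasi-increasing and report a
    peak once the value has dropped more than `threshold` below it; then track the
    running minimum until the signal has risen by `threshold` again."""
    if not samples:
        return []
    peaks = []
    looking_for_max = True
    extreme, extreme_index = samples[0], 0
    for i, value in enumerate(samples):
        if looking_for_max:
            if value > extreme:
                extreme, extreme_index = value, i
            elif extreme - value > threshold:
                peaks.append(extreme_index)          # backtracking exceeded the threshold: confirmed peak
                extreme, extreme_index = value, i
                looking_for_max = False
        else:
            if value < extreme:
                extreme, extreme_index = value, i
            elif value - extreme > threshold:
                extreme, extreme_index = value, i
                looking_for_max = True               # rising again: arm the next peak
    return peaks
```

The peak is only reported once the drop exceeds the threshold, which is the unavoidable "wait a little after the peak" mentioned above.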
|
|
|
|
|
Thanks YDaoust. I like that idea a lot. It looks like it should be able to deal with shoulders or split peaks fairly nicely so long as a sensible threshold is used. I'll experiment with that idea a little.
|
|
|
|
|
I think so too. Your signal is clean, so you should get good results. The nice thing about this approach is that you work on the signal, not its derivative.
The threshold value can be set to a small multiple of the noise amplitude.
|
|
|
|
|
Removing spikes could easily be done with an FFT (converting to the frequency spectrum), a low pass (removing high-frequency components) and a reverse FFT (going back to the time signal again). However, this pretty much kills your response time.
There are low-pass filter implementations in the time domain (Wikipedia 1, Wikipedia 2).
(I don't know enough to recommend a specific algorithm, I am just parroting what "the guys" do.)
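For what it's worth, here is a minimal sketch of a first-order (RC-style) time-domain low-pass of the kind described in the second Wikipedia link; the cutoff frequency and sample rate are placeholders you would supply:

```python
import math

def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order (RC-style) low-pass filter, applied sample by sample so it can
    run in real time with O(1) work and state per sample."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)          # smoothing factor in (0, 1)
    filtered = []
    y = samples[0]
    for x in samples:
        y = y + alpha * (x - y)     # y[i] = y[i-1] + alpha * (x[i] - y[i-1])
        filtered.append(y)
    return filtered
```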
|
|
|
|
|
Thanks Peter. I've used the low-pass filter implementation in the second Wikipedia article before with fairly good results, but I wasn't peak picking, just smoothing. I actually have a simple RC filter to electronically filter the data a little, but my electronics knowledge isn't quite up to understanding it properly. I'm not 100% sure that it's really helping at all, but at least it doesn't use up processor cycles (although maybe it does lag the signal some; like I said, I don't understand it well enough).
|
|
|
|
|
Make several simple calculations and see which one works for you on sample/real data.
*slope / gradient (as mentioned in other replies)
*keep the 6/10/12 etc. most recent highs, to get a rolling average of them
*exponential moving average (more responsive than a moving average)
Collect a few of these and you will find that one of them, in comparison to the others, gives you an indication that the samples are 'out of band', which is what I understand you to be seeking.
Most of the calculations above can be maintained on a rolling basis without having to keep too much data.
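As a rough sketch of keeping such indicators on a rolling basis (Python here; the smoothing factor, window size and 'out of band' rule are illustrative guesses, and for simplicity it keeps the last N samples rather than the last N highs):

```python
from collections import deque

class RollingIndicators:
    """Keeps a few cheap indicators on a rolling basis: an exponential moving
    average (EMA) and a plain moving average over a short window. The values
    of alpha, window and band are illustrative, not recommendations."""

    def __init__(self, alpha=0.2, window=10, band=3.0):
        self.alpha = alpha                  # EMA smoothing factor
        self.band = band                    # deviation that counts as 'out of band'
        self.recent = deque(maxlen=window)  # only the last `window` samples are kept
        self.ema = None

    def update(self, value):
        # EMA: more responsive than the plain moving average.
        self.ema = value if self.ema is None else self.alpha * value + (1 - self.alpha) * self.ema
        self.recent.append(value)
        moving_avg = sum(self.recent) / len(self.recent)
        # Comparing the fast indicator against the slow one hints that the
        # latest samples are 'out of band'.
        out_of_band = abs(self.ema - moving_avg) > self.band
        return self.ema, moving_avg, out_of_band
```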
|
|
|
|
|
Try a running average of the last X samples and look at how it changes from one sample to the next (i.e. subtract the previous average from the current one). When the sign of that change flips from positive to negative it's a peak, and from negative to positive it's a trough. You can change the value of X to filter out the noise from the patient moving.
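A minimal sketch of that rule, assuming a simple boxcar average and leaving the window size X as the knob to tune:

```python
from collections import deque

def find_turning_points(samples, window=5):
    """Smooth with a running average of the last `window` samples, then watch the
    sign of the change between consecutive averages: positive to negative marks a
    peak, negative to positive marks a trough."""
    recent = deque(maxlen=window)
    prev_avg = None
    prev_delta = 0.0
    turning_points = []
    for i, value in enumerate(samples):
        recent.append(value)
        avg = sum(recent) / len(recent)
        if prev_avg is not None:
            delta = avg - prev_avg
            if prev_delta > 0 and delta < 0:
                turning_points.append((i, "peak"))
            elif prev_delta < 0 and delta > 0:
                turning_points.append((i, "trough"))
            if delta != 0:
                prev_delta = delta
        prev_avg = avg
    return turning_points
```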
|
|
|
|
|
I've done this with heart rate variability (HRV) and it's rather simple. Determine the frequency range of your useful data and set up a bandpass IIR or FIR filter to filter out all unwanted noise. Use MATLAB or a similar tool to determine the coefficients of the filter. IIR filters are very easy to implement in code as they are just a few floating-point multiply ops. After your data is filtered, peaks can easily be identified with a simple rule, e.g. if(sample[t] < sample[t-1]). It all comes down to the design of the filter (a compromise between delay and bandwidth).
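For illustration, a small sketch of that pipeline: a generic direct-form IIR filter whose coefficients b and a would come from a design tool such as MATLAB or scipy.signal, followed by the simple turned-down rule; the extra check that the previous sample was still rising is my addition, so it doesn't fire on every downhill sample:

```python
def iir_filter(samples, b, a):
    """Direct-form I IIR filter: y[n] = sum(b[k]*x[n-k]) - sum(a[k]*y[n-k]),
    with a[0] assumed to be 1. The coefficient lists b and a come from the
    filter-design step."""
    x_hist = [0.0] * len(b)        # x[n], x[n-1], ...
    y_hist = [0.0] * (len(a) - 1)  # y[n-1], y[n-2], ...
    out = []
    for x in samples:
        x_hist = [x] + x_hist[:-1]
        y = (sum(bk * xk for bk, xk in zip(b, x_hist))
             - sum(ak * yk for ak, yk in zip(a[1:], y_hist)))
        y_hist = [y] + y_hist[:-1]
        out.append(y)
    return out

def peaks_after_filter(filtered):
    """The simple rule from the post (sample[t] < sample[t-1]), plus an extra
    check that sample[t-1] was not still falling, so only the turning point
    itself is reported."""
    return [t - 1 for t in range(2, len(filtered))
            if filtered[t] < filtered[t - 1] and filtered[t - 1] >= filtered[t - 2]]
```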
|
|
|
|
|
I found the following article useful when I was trying to find peaks & troughs in some data:-
http://billauer.co.il/peakdet.html
Looking at your graph, the only slight hiccup would be at 31s, but I think you can set the delta value appropriately to overcome this.
|
|
|
|
|
Hi,
One good way to remove spikes is to use a median filter. Keep a window of the data, sort the values in the window, and use the one in the center of the range.
Don't know if that is appropriate for a real time application, but we use it for seismic data post-processing all the time.
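A minimal sketch of such a sliding median filter (the window size is an assumption; an odd size keeps the middle element well defined):

```python
from collections import deque

def median_filter(samples, window=5):
    """Sliding median: sort the current window and take the middle value.
    Very good at removing isolated spikes; the output lags by roughly half
    the window."""
    buf = deque(maxlen=window)
    out = []
    for x in samples:
        buf.append(x)
        ordered = sorted(buf)
        out.append(ordered[len(ordered) // 2])
    return out
```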
Is that sleep-study data? I had to have one a few weeks ago, I have sleep apnea!
Regards,
Tom
|
|
|
|
|
I'm sure there's a relatively straightforward way to do this, but my research so far has been a bit overwhelming... Heavy statistics and calculus have never been among my strengths...
Basically, I need to calculate a best-fit curve on a scatter graph... Points are not evenly spaced on either axis, and there's no cost function to work with (These are derivations of stock/bond price histories)... Just raw (X,Y) values.
I've been trying to work out an algorithm that'll start with a second-degree polynomial function, iterate through and minimize the difference, and then jump to third-degree if it's not within tolerance (And so on, up to a maximum degree). Juggling the data around is something I can do blindfolded, but there's one thing I can't figure out...
Given the polynomial coefficients ((a,b,c,d) -> y = a + bx + cx^2 + dx^3) and a function to determine the total difference across the curve (Or the total square difference, or whatever's needed), how do I adjust the coefficients for the next pass? I mean, how do I choose new values?
I was originally going to try to fix the highest-degree coefficient first, then drill down to the zeroth-degree one (I have a root-solving algorithm that I use for yield calculations - Was going to adapt that), but then I looked at it again with the proper amount of caffeine, and realized that solving one coefficient at a time wasn't going to work.
SOLVED: See my reply below... Linear/Polynomial Regression
|
|
|
|
|
Not sure if this will help, but have you tried a spline interpolation algorithm?
|
|
|
|
|
Kind of the opposite of where I'm going here... The goal isn't to pass through every point, but to give more of a trend-line.
Generally going to be working with a few hundred data points, but only looking for maybe a second or third-degree function in most cases.
Proud to have finally moved to the A-Ark. Which one are you in? Author of the Guardians Saga (Sci-Fi/Fantasy novels)
|
|
|
|
|
OK, now I understand it better. Then maybe Bézier curves would be a better choice.
|
|
|
|
|
Hi Ian,
I don't think there are any closed formulas for what you are trying to do. What I would recommend is this, inspired by Newton-Raphson:
1. have some values a1,b1,c1,d1 and the corresponding value of your error function e1.
2. slightly change a1 (say by 1%) and recalculate e; find the "a-sensitivity" as (change in e)/(change in a)
3. same for b1, then c1, then d1
4. now choose new values a2,b2,c2,d2 based on the four sensitivities, maybe like so: a2 = a1 - coef*e/"a-sensitivity", where coef is initially one.
Then iterate until convergence is reached (i.e. sufficiently accurate). Warning: this might diverge right away; if so, try smaller values of coef (you could halve it on each consecutive try).
FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bézier curves; it pays off to have several simple Bézier curves, rather than trying to have a single very complex one (which in general will not be good enough).
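A sketch of steps 1-4 in Python; the step size, stopping tolerance and the divergence safeguard are illustrative choices, not part of the recipe:

```python
def fit_by_sensitivity(coeffs, error_fn, coef=1.0, rel_step=0.01,
                       max_iter=1000, tol=1e-9):
    """coeffs is a list such as [a, b, c, d]; error_fn(coeffs) returns the total
    (squared) difference over all data points. rel_step, max_iter and tol are
    illustrative knobs."""
    coeffs = list(coeffs)
    for _ in range(max_iter):
        e = error_fn(coeffs)
        if e < tol:
            break
        new_coeffs = list(coeffs)
        for i in range(len(coeffs)):
            # Steps 2-3: perturb one coefficient slightly and measure its sensitivity.
            delta = rel_step * coeffs[i] if coeffs[i] != 0 else rel_step
            trial = list(coeffs)
            trial[i] += delta
            sensitivity = (error_fn(trial) - e) / delta
            if sensitivity != 0:
                # Step 4: Newton-style update.
                new_coeffs[i] = coeffs[i] - coef * e / sensitivity
        if error_fn(new_coeffs) > e:
            coef *= 0.5          # crude safeguard: shrink coef if the step made things worse
        else:
            coeffs = new_coeffs
    return coeffs
```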
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
|
|
|
|
|
Hmm, didn't think of the sensitivity thing... Giving that a shot now.
Quote: "FWIW: with more-or-less random data, you'll never find a good fit with a single polynomial. What one normally does is a local approximation, which applies in a small region only. That is similar to a font being drawn as a sequence of Bézier curves; it pays off to have several simple Bézier curves, rather than trying to have a single very complex one (which in general will not be good enough)."
It's not actually random data... It's graphing related values, so it should roughly cluster around y=x unless there's another factor (Which is what I'm trying to illustrate).
|
|
|
|
|
If you suspect some periodicity, I'd suggest you first subtract x, so that, according to what you said, the average becomes zero and the first-order approximation is the x-axis itself. Then consider a Fourier analysis.
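A rough sketch of that suggestion, assuming the x values are close to evenly spaced (a plain FFT needs uniform sampling):

```python
import numpy as np

def dominant_period(x, y):
    """Subtract x so the trend line becomes the x-axis, then look at the spectrum
    of the residual and report the period of the strongest component."""
    residual = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    residual -= residual.mean()                      # drop the zero-frequency component
    spectrum = np.abs(np.fft.rfft(residual))
    freqs = np.fft.rfftfreq(len(residual), d=x[1] - x[0])
    strongest = spectrum[1:].argmax() + 1            # skip the DC bin
    return 1.0 / freqs[strongest]
```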
|
|
|
|
|
Not a cyclic function, really... Bond valuations are a bit odd, so hard to explain... But dropping it to the X-axis does kinda help the initial approximation.
Still needs a lot more work (Got it to approximate the first degree, and working on expanding from there), but the sensitivity thing is working like a charm. Thanks
|
|
|
|
|
you're welcome.
|
|
|
|
|
Assuming that standard numerical methods such as simplex aren't going to work here, and given that you do already have a fitness function available and a few coefficients to tweak, I'm wondering if perhaps a genetic algorithm or something like simulated annealing would work?
|
|
|
|
|
Thanks for the input, guys... Got me thinking along the right lines, and that got me to the right google search...
I was searching for things like "curve fitting" and "best fit algorithm"... Turned out the 'proper' term was "Linear Regression"... And that got me here:
An Algorithm for Weighted Linear Regression
Figures, the solution would be right here on CP
|
|
|
|
|
I'm surprised that with all those answers, no one gave you the right approach: Polynomial Regression: http://en.wikipedia.org/wiki/Polynomial_regression
It will give you a best-fit quadratic or cubic equation (or even higher orders if you want). The Wikipedia link defines it, but the explanation there isn't very intuitive. The basic idea is you get an equation for the error terms, take the derivative, and set it to zero. Solving that gives you the equation with the minimum error.
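As a sketch of the same idea with off-the-shelf tools (numpy here, rather than the CodeProject library actually used in this thread), numpy.polyfit solves exactly that minimum-error system:

```python
import numpy as np

def poly_regression(x, y, degree=3):
    """Least-squares polynomial fit: numpy.polyfit solves the system you get from
    setting the derivative of the squared error to zero. Coefficients come back
    highest power first (numpy's convention)."""
    coeffs = np.polyfit(x, y, degree)
    fitted = np.polyval(coeffs, x)
    return coeffs, fitted

# Example: a cubic fit to noisy data that roughly clusters around y = x.
x = np.linspace(0.0, 10.0, 200)
y = x + 0.5 * np.random.randn(200)
coeffs, fitted = poly_regression(x, y, degree=3)
```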
|
|
|
|
|
Well, to be fair, the answers got me thinking along the right lines, and led me to the right terms, which brought me right back to one of Walt's articles on CP... See my post above.
Marked the original as solved
|
|
|
|
|
But if your data set naturally has a curve, a linear model will be inaccurate.
|
|
|
|
|
Well it's multiple linear regression... It lets me specify the degree.
Using Walt's library, it's looking perfect... Using third-degree (x^3, so a possible S-shape in the curve), and releasing the update next week.
|
|
|
|
|