Introduction
WinRQA is a complete Windows desktop application written in C# that draws recurrence plots and calculates recurrence quantification measures to perform recurrence quantification analysis (RQA) of time series.
This first version only builds recurrence plots and calculates the RQA measures, allowing also cross or joint two different plots, but I plan to go extending it with more features in the future.
The application was created using Visual Studio 2013.
I have added some sample time series, some from the Lorenz, Henon and logistic systems, one with white noise, one with a sinoidal signal and one more with EEG data, that you can use to test the program.
Background
The recurrence analysis is not a trivial matter, so I have to supose that you have some knowledge on the subject. In any case, you can visit this website, devoted to the recurrence analysis, where you can find information and some valuable resources.
You can also visy my blog, where I have some articles related with the theme of complex time series analysis, also in spanish.
Anyway, I will try to provide a small summary of in what consists the recurrence analysis.
The idea is that, if we have a time series with a complex structure, it can form part of a dynamic system with more than one independent variables, and it can contain information that allows to aproximately reconstruct the phase space of the complete system.
The procedure to extend the one-dimensional series to more dimensions is to create a new time series for each extra dimension using the original series, but each displaced forward an integer multiple of a constant delay. So, the first dimension is the original series, the second is the series displaced one delay, the third is the series displaced two delays, and so on.
To build a recurrence plot, first we create a squared array with the distances between each combination of two points in the n-dimensional series, each [i,j] element contains the distance between the i and the j point of the series.
Then, you can simply draw the array of distances, giving each point a color in function of the value of the distance, or you can decide a maximum distance (radius) below which two distances are considered equals (i.e. two points are considered recurrents).
The recurrence plot shows those recurrent points, and we can detect not obvious relationships between the data through the point patterns formed in the plot, or changes of the dynamic in certain points of the series.
Additionally, some quantification measures are calculated, using the structure and distribution of diaginal and vertical lines formed in the plot, along with the count of recurrent points. These measures help to characterize and analize the dynamic sytem being studied.
Using the application
In this version of the application, we can only work with time series contained in text files, with a single value in each line and without any kind of header.
The first thing to do is to open the File menu and select the Open... option. Then, find the data file with the dialog box and the recurrence plot window will appear, empty at first:
There are two tool bars, where you can configure the settings for the plot. In the upper one, set the Delay and Embedded dimension to extend the time series, a maximum Radius to considerate that two points are recurrent (if the value is less than zero, the program shows a colored gradient of the distances), and select a Norm to calculate the distance between points.
In the Lower tool bar, you can define the Start and End point of the window in the time series that you want to use to buid the plot, select the minimum length for diagonal lines (Line) and for the vertical ones (VLine), the distance from the main diagonal used to calculate the Trend measure, and select if you want to Rescale the distances in order to standarize them, dividing by the mean or max distance.
There are also some command buttons. In the upper toolbar, from left to right, there is the button, to draw or redraw the plot, the button, to select a color for the recurrent points, the button is used to show the plot as a gray or color scale, in case that you provide a value for Radius less than zero. The last two buttons are to copy the recurrence plot as an image in the clipboard, and to copy the quantification measures as text, in a tabular form (tab separated), to the clipboard.
In the lower tool bar, the button shows a helper window that gives a visual interface to select the start and end point of the series window using the mouse:
In the upper half of the window there is the complete time series. You can click with the left and right mouse buttons to select the starting and ending point of the window used to build the plot. With the scroll bar in the left of the series you can move the window without altering its size.
The lower half is for the interval selected in the time series, with the scroll bars in their sides you can fine adjust the start and end points. These settings are reflected inmediately in the recurrence plot window, in the Start and End text boxes.
The application is MDI, so you can open more than one time series at once. You can drag one of the recurrence plots and drop it over one of the two gray panels in the right of the recurrence plot window (the size and dimension of the two plots must be compatible).
Select the upper panel if you want to obtain a cross recurrence plot, where the distances are calculated between the points of the two series (so they must have the same embedded dimension).
The lower panel is for the joint recurrence plot, which contains only the points that are recurrent in the same positions of the two recurrence plots.
Understanding the code
The three Forms in the project are the mainForm, which is the MDI parent form, frmRecurrencePlot, where the Recurrence Plot and associated measures are shown, and frmTimeSeries, where the user can select the window of the time series that are used in the recurrence analysis. As those are all only WinForms stuff, I will not deep inside them more than that.
From the four classes in the Data namespace, NamedItem is only a helper to put items in the dropdowns with a friendly name, VectorExport is another helper class to drag & drop plots between forms, DataFileManager encapsulates the logic to read data files, which are text files with a value of a time series in each line.
So, let's examine with detail the last class, RecurrencePlot, which implements the Recurrence Plot itself. First, there are three enums to configure the plot:
public enum Norm
{
Euclidean, Minimum, Maximum
}
public enum Rescaling
{
None, Mean, Max
}
public enum PlotType
{
Normal, Cross, Joint
}
As the plot shows distances between vectors, the Norm enum indicates which norm we use to calculate it. The Euclidean norm is the square root of the sum of the squared differences of the vectors components. The maximum norm is the greather of the distances between the vectors components, and the minimum is the smaller one.
Rescaling is to normalize the distances, is used mainly when you want to cross two plots between them, and consists in divide each distance by the maximum or the mean distance.
PlotType indicates which type of recurrence plot has to be built, Normal, to draw the recurrence of a time series with itself, or Cross / Joint, to draw a cross / joint recurrence plot between two different time series or two parts of the same time series.
The class variables are the following:
private DataFileManager _data = null;
private float _radius = 0f;
private int _delay = 1;
private int _embed = 3;
private int _line = 2;
private int _vline = 2;
private int _trendK = 10;
private int _start = 0;
private int _end = 0;
private Norm _norm = Norm.Euclidean;
private Rescaling _rescaling = Rescaling.None;
private PlotType _type = PlotType.Normal;
private float _maxDistance = 0f;
private float _meanDistance = 0f;
private float _maxJDistance = 0f;
private float _meanJDistance = 0f;
private float[,] _vectors = null;
private float[,] _ovectors = null;
private float[,] _dMatrix = null;
private int[,] _rMatrix = null;
private float _jRadius = 0f;
_data is a DataFileManager, used to read the time series data, _radius is the maximum distance between vectors that is considered a recurrence, _delay is the delay used to extend the time series to more than one dimension, _embed is the embedding dimension, _line the minimum diagonal line length, _vline the minimum vertical line length, _trendK is a constant used to calculate the TREND RQA measure, and is the distance from the main diagonal line to be considerad in this calculus, _start and _end are the start and end points of the window in the time series used to build the plot, _norm contains the type of distance used, _rescaling is used to normalize distances, _type is the type of recurrence plot to build, _maxDistance is the maximum distance in the entire plot and _meanDistance is the mean of all the distances, _maxJDistance and _meanJDistance are the same for the joint recurrence plot version, _vectors are the vectors built with the embedding dimension and the delay, _ovectors are the vectors of another recurrence plot, to cross or joint, _dMatrix is the matrix with the distances, _rMatrix is the recurrence matrix, which contains 0 for the recurrent points, or a value other than zero for the non recurrent points, finally, _jRadius is the radius used to calculate the distance between the vectors of the other plot when building a joint recurrence plot.
The RQA measures are exposed as properties:
public float RR { get; private set; }
public float DET { get; private set; }
public float RATIO { get; private set; }
public float L { get; private set; }
public int LMAX { get; private set; }
public float DIV { get; private set; }
public float ENTR2 { get; private set; }
public float ENTRe { get; private set; }
public float TREND { get; private set; }
public float LAM { get; private set; }
public int VMAX { get; private set; }
public float TT { get; private set; }
For an explanation of what these measures are, you can visit the following page: http://www.recurrence-plot.tk/rqa.php.
Almost all of the global variables are exposed as properties. As there are some restrictions on the acceptable value combination, there is a helper function that allows to give values to all of them at once:
public void SetAllValues(float radius,
int delay,
int embed,
int line,
int vline,
int trendk,
int start,
int end,
Norm norm,
Rescaling
rescaling,
PlotType type)
The boolean property ColorScheme is used to set the color palette (color or gray scale) to draw the plot when there is not defined a minimum radius to consider two points recurrent and the plot is drawn as a distance gradient map. With this function you can convert a distance in the matrix of recurrence in a color:
public Color GetPointColor(int level)
the nucleus of the class is the Process function, which makes all the calculations to build the recurrence plot.The first thing to do is to load the data and construct the array of vectors with the embedded dimension and delay provided:
_vectors = new float[Range, _embed];
_data.Open();
try
{
_data.Seek(_start);
int pos = 0;
float[] values = new float[1000];
int vlength = _data.ReadBuffer(values);
while (pos < Range + (_delay * _embed - 1))
{
int ix = 0;
while (ix < vlength)
{
for (int d = 0; d < _embed; d++)
{
int pd = (ix + pos) - (d * _delay);
if (pd >= 0)
{
if (pd / Range == 0)
{
_vectors[pd % Range, d] = values[ix];
}
}
else
{
break;
}
}
ix++;
}
pos += values.Length;
if (pos < Range + (_delay * _embed - 1))
{
vlength = _data.ReadBuffer(values);
}
}
}
finally
{
_data.Close();
}
The data is read in chunks of 1000 samples. The vectors have all a number of coordinates equal to the embedding dimension, each one containing the value of the time series in the current position plus an offset equals to the delay multiplied by the index of the coordinate (ix, ix + delay, ix + 2*delay and so on).
This means that the same value can form part of more than one vector. To avoid read the data once by each dimension, the inner loop puts the value in all the places where is needed.
The next step is to construct the matrix of distances between these vectors. Each position i,j in the matrix contains the distance between the i and j vectors, so, for the normal and joint plots, the matrix is symmetric respect their main diagonal, called Line Of Identity (LOI), and all the distances in this diagonal are ever 0, as they are the distances between each vector and itself. For the cross recurrence plot are, the distance between the i and j points, and the j and i ones is not the same, so in this case we have to process the entire matrix.
_dMatrix = new float[Range, Range];
_meanDistance = 0f;
_maxDistance = 0f;
float[,] jMatrix = null;
if (_type == PlotType.Joint)
{
jMatrix = new float[Range, Range];
_meanJDistance = 0f;
_maxJDistance = 0f;
}
int ixc = 0;
int iyc = 0;
int maxdim = TypeofPlot == PlotType.Joint ? Math.Max(_embed, _ovectors.GetLength(1)) : _embed;
float _dcount = 0f;
if (TypeofPlot == PlotType.Cross)
{
for (iyc = 0; iyc < Range; iyc++)
{
for (ixc = 0; ixc < Range; ixc++)
{
float sdif = InitialDistance;
for (int cv = 0; cv < maxdim; cv++)
{
sdif = Distance(_vectors[iyc, cv], _ovectors[ixc, cv], sdif);
}
if (_norm == Norm.Euclidean)
{
sdif = (float)Math.Sqrt(sdif);
}
_dcount++;
_dMatrix[ixc, iyc] = sdif;
_maxDistance = Math.Max(_maxDistance, sdif);
_meanDistance += sdif;
}
}
}
else
{
for (iyc = 0; iyc < Range; iyc++)
{
_dMatrix[iyc, iyc] = 0f;
for (ixc = iyc + 1; ixc < Range; ixc++)
{
float sdif = InitialDistance;
float jdif = InitialDistance;
for (int cv = 0; cv < maxdim; cv++)
{
switch (TypeofPlot)
{
case PlotType.Normal:
sdif = Distance(_vectors[iyc, cv], _vectors[ixc, cv], sdif);
break;
case PlotType.Joint:
if (cv < _embed)
{
sdif = Distance(_vectors[iyc, cv], _vectors[ixc, cv], sdif);
}
if (cv < _ovectors.GetLength(1))
{
jdif = Distance(_ovectors[iyc, cv], _ovectors[ixc, cv], jdif);
}
break;
}
}
if (_norm == Norm.Euclidean)
{
sdif = (float)Math.Sqrt(sdif);
}
_dcount++;
_dMatrix[ixc, iyc] = sdif;
_dMatrix[iyc, ixc] = sdif;
_maxDistance = Math.Max(_maxDistance, sdif);
_meanDistance += sdif;
if (TypeofPlot == PlotType.Joint)
{
if (_norm == Norm.Euclidean)
{
jdif = (float)Math.Sqrt(jdif);
}
jMatrix[ixc, iyc] = jdif;
jMatrix[iyc, ixc] = jdif;
_maxJDistance = Math.Max(_maxJDistance, jdif);
_meanJDistance += jdif;
}
}
}
}
_meanDistance /= _dcount;
_meanJDistance /= _dcount;
Depending on the type of recurrence plot we are building, the way to compute distances vary. In the normal plot, the distances between vectors are calculated, but in the case of csoss recurrence plot, we have two different set of vectors, and the distances are calculated between the i vector of one of them and the j vector in the other set.
In the case of the joint recurrence plot, we have also two set of vectors, but we build two separated matrixes of distances, one for each set of vectors.
Along with the matrix, the maximum and mean of the distances are calculated.
The next step is rescaling if needed, dividing by the maximum distance, giving all distances between 0 and 1, or dividing by the mean distance.
switch (_rescaling)
{
case Rescaling.Max:
for (int x = 0; x < Range; x++)
{
for (int y = 0; y < Range; y++)
{
_dMatrix[x, y] /= _maxDistance;
if (jMatrix != null)
{
jMatrix[x, y] /= _maxJDistance;
}
}
}
_meanDistance /= _maxDistance;
_maxDistance = 1f;
break;
case Rescaling.Mean:
for (int x = 0; x < Range; x++)
{
for (int y = 0; y < Range; y++)
{
_dMatrix[x, y] /= _meanDistance;
if (jMatrix != null)
{
jMatrix[x, y] /= _meanJDistance;
}
}
}
_maxDistance /= _meanDistance;
_meanDistance = 1f;
break;
}
Next, the recurrence matrix is built. The recurrent points count is stored in the prec variable, as it will be used later to calculate the quantification measures. Two points are recurrents if their distance is less than the radius. If the radius provided is less than 0, the matrix stores a value proportional to the distance to select a color or gray tone, but if it is greather than zero, the matrix contains a 0 for a recurrent point and 255 for the not recurrent ones. Again, the process is slightly different for cross recurrence plots:
_rMatrix = new int[Range, Range];
int prec = 0;
int tpnt = TypeofPlot == PlotType.Cross ? Range * Range : (Range * (Range - 1)) / 2;
if (TypeofPlot == PlotType.Cross)
{
for (iyc = 0; iyc < Range; iyc++)
{
for (ixc = 0; ixc < Range; ixc++)
{
int level = _radius < 0f ? (int)((_dMatrix[ixc, iyc] * ColorRange) / _maxDistance) : (_dMatrix[ixc, iyc] <= _radius ? 0 : 255);
_rMatrix[ixc, iyc] = level;
if (level == 0)
{
prec++;
}
}
}
}
else
{
for (iyc = 0; iyc < Range; iyc++)
{
for (ixc = iyc; ixc < Range; ixc++)
{
int level = _radius < 0f ? (int)((_dMatrix[ixc, iyc] * ColorRange) / _maxDistance) : (_dMatrix[ixc, iyc] <= _radius ? 0 : 255);
if ((_radius >= 0) && (TypeofPlot == PlotType.Joint))
{
int level2 = jMatrix[ixc, iyc] <= _jRadius ? 0 : 255;
if (level != level2)
{
level = 255;
}
}
_rMatrix[ixc, iyc] = level;
if ((ixc != iyc) && (level == 0))
{
prec++;
}
}
}
}
Now is time to calculate the recurrence quantification measures. These are the involved variables:
int[] pl = new int[Range - _line];
int[] pv = new int[Range - _vline];
float[] rr = new float[Range - 1];
int spl = 0;
int npl = 0;
int spv = 0;
int npv = 0;
int lmax = 0;
int vlmax = 0;
double sppl = 0;
double spple = 0;
float avgl = 0f;
float avgv = 0f;
float avgr = 0f;
float tr = 0f
First, the diagonal and vertical lines longer or equal than the values provided in _line and _vline are counted in the variables npl and npv, and the count of lines for each length stored in the histograms pl and pv. In the case of normal and joint plots, only one half of the matrix is used, as it is symmetric respect to the main diagonal. The main diagonal is not taken into account. The rr variable stores the percentage of recurrent points in each diagonal line, and is used to calculate the trend measure. If we are building a cross recurrence plot, the entire recurrence matrix must be processed.
if (TypeofPlot == PlotType.Cross)
{
for (int l = 1; l < Range; l++)
{
int cl1 = 0;
int cl2 = 0;
for (int i = 0; i <= l; i++)
{
if (_rMatrix[i, i + ((Range - 1) - l)] == 0f)
{
cl1++;
}
else
{
if (cl1 >= _line)
{
pl[cl1 - _line]++;
npl++;
}
cl1 = 0;
}
if (l < Range - 1) {
if (_rMatrix[i + ((Range - 1) - l), i] == 0f)
{
cl2++;
}
else
{
if (cl2 >= _line)
{
pl[cl2 - _line]++;
npl++;
}
cl2 = 0;
}
}
}
if (cl1 >= _line)
{
pl[cl1 - _line]++;
npl++;
}
if (cl2 >= _line)
{
pl[cl2 - _line]++;
npl++;
}
}
for (int l = 0; l < Range; l++)
{
int cv = 0;
for (int i = 0; i < Range; i++)
{
if (_rMatrix[l, i] == 0f)
{
cv++;
}
else
{
if (cv >= _vline)
{
pv[cv - _vline]++;
npv++;
}
cv = 0;
}
}
if (cv >= _vline)
{
pv[cv - _vline]++;
npv++;
}
}
}
else
{
for (int l = 0; l < Range - 1; l++)
{
int cl = 0;
int cv = 0;
float cr = 0f;
for (int i = 0; i <= l; i++)
{
if (_rMatrix[(Range - 1) - i, l - i] == 0f)
{
cl++;
cr += 1f;
}
else
{
if (cl >= _line)
{
pl[cl - _line]++;
npl++;
}
cl = 0;
}
if (_rMatrix[l + 1, i] == 0f)
{
cv++;
}
else
{
if (cv >= _vline)
{
pv[cv - _vline]++;
npv++;
}
cv = 0;
}
}
rr[l] = cr / (float)(l + 1);
if (cl >= _line)
{
pl[cl - _line]++;
npl++;
}
if (cv >= _vline)
{
pv[cv - _vline]++;
npv++;
}
}
}
With these histograms, we can calculate other measures, as mean and max diagonal and vertical line length, the Shannon information entropy and the trend of distribution of diagonal lines in respect the main diagonal. The TREND measure does not apply in the case of the cross recurrence plot.
for (int ip = 0; ip < pl.Length; ip++)
{
spl += (ip + _line) * pl[ip];
if (pl[ip] > 0)
{
avgl += (float)(ip + _line) * pl[ip];
double p = (double)pl[ip] / (double)npl;
sppl += p * Math.Log(p, 2);
spple += p * Math.Log(p);
}
if (pl[ip] > 0)
{
lmax = ip + _line;
}
}
for (int ip = 0; ip < pv.Length; ip++)
{
spv += (ip + _vline) * pv[ip];
if (pv[ip] > 0)
{
avgv += (float)(ip + _vline) * pv[ip];
}
if (pv[ip] > 0)
{
vlmax = ip + _vline;
}
}
sppl = -sppl;
spple = -spple;
if (npl != 0)
{
avgl /= (float)npl;
}
if (npv != 0)
{
avgv /= (float)npv;
}
for (int ir = rr.Length - _trendK; ir < rr.Length; ir++)
{
avgr += rr[ir];
}
if (TypeofPlot != PlotType.Cross)
{
for (int ir = rr.Length - _trendK; ir < rr.Length; ir++)
{
avgr += rr[ir];
}
avgr /= (float)_trendK;
float i2 = 0f;
float n2 = (float)_trendK / 2f;
for (int i = 0; i < _trendK; i++)
{
tr += ((float)(i + 1) - n2) * (rr[i] - avgr);
i2 += ((float)(i + 1) - n2) * ((float)(i + 1) - n2);
}
tr *= 1000f;
tr /= i2;
}
Finally, the calculations are completed and stored in the corresponding properties.
RR = (float)prec / (float)tpnt;
DET = (float)spl / (float)prec;
RATIO = (float)(spl * tpnt) / (float)(prec * prec);
L = avgl;
LMAX = lmax;
DIV = lmax > 0? 1f / (float)lmax: float.NaN;
ENTR2 = (float)sppl;
ENTRe = (float)spple;
TREND = tr;
LAM = (float)spv / (float)prec;
VMAX = vlmax;
TT = avgv;
The plot drawing is performed in a separate function, so you can redraw it without need of recalculate all the distances and measures.
public Bitmap DrawRecurrencePlot(int maxwidth, int maxheight, Color color)
The only thing to comment about this function is that it generates a squared Bitmap with a minimum side size of one pixel for each position in the recurrence matrix. If the width and height parameters (which normally will be the size of the window where you want to draw the Bitmap) allow to use more than one pixel per position, the Bitmap is rescaled to the maximum possible size. The color parameter is used when drawing plots with a radius greather or equal to zero, which are drawn using only two colors (the background is always white).
int scale = Math.Max(1, Math.Min(maxwidth, maxheight) / Range);
Bitmap bmprp = new Bitmap(scale * Range, scale * Range);
And this is all by the moment. My intention is to add in the future some graphic tools to help to decide the parameters used with the plot, as the delay, embedded dimension and radius, and some graphs with the evolution of some RQA measures as some plot settings change.
Thanks for reading !!!