Introduction
In Part 2 of this series, we look at performing linear Granger Causality tests with both C#.NET and the R Statistical Language. By using both, we can prototype statistical algorithms in R with work from Pfaff (2008) and then convert to C# and/or C++ code for optimization. First, we want to have multiple statistical perspectives in C#, so we can handle the communication with R as well as build on the work from Part I. Second, we want to be able to migrate our working code from the open source community into high performance CUDA libraries. This is necessary for Big Data and real-time forecasting needed in neuroscience and commodity price research. One of subjects in this series.
Background
Again, I refer to the many good books on multivariate times series analysis that feature Granger Casuality Tests. Of course, my book-not that one...wait for it...with Springer Science due to be released this year "Neuroinformatics and Computational Neuroscience Research in R and Model Maker 3" has the work in more detail-shameless plug. Anyway, you can read the PDFs for Library 1
Library 1: R Packages available on CRAN
1. tSeries - Diagnostic tests and ARMA Modeling
2. vars - Causality test and multivariate time series modeling (good instruction)
Library 2: R Interface to .NET
R.NET is a good interface to C#.NET and is available on codeplex at http://rdotnet.codeplex.com/. This is much better than StatConnector which was the subject of a previous article and occupied a year or two of my life.
Using the code
In the listing of Figure 1, you have to replace several lines of code with your own data set and variables. The main method in R is causality()
which computes both F and Wald test statistics for Granger (1969) and Instantaneous causality for a VAR(p) model. The cool thing with the REngine is that you can
do string processing within the methods and can build strings based on decision variables. Lots of AI possibilities here and the integration of my neural diagonal thinking
stuff (http://neuraldiagonalthinking.wordpress.com/) with the cross-frequency coupling for forecasting brain wave states
with a combination of HTML5 and ASP.NET-too exciting. Again, another article in the series.
Figure 1.
Granger Causailty Test in R using the Brain data from
the Brainwaver R package
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ABMath.ModelFramework.Models;
using ABMath.ModelFramework.Data;
using MathNet.Numerics.Distributions;
using MathNet.Numerics.RandomSources;
using MathNet.Numerics;
using MathNet.Numerics.LinearAlgebra;
using RDotNet;
namespace TCW_ModelMaker3
{
public class R_GrangerCausality
{
public R_GrangerCausality()
{
}
public void runR_CausalityTest()
{
REngine engine;
REngine.SetDllDirectory(@"C:\Program Files\R\R-2.13.0\bin\i386");
engine = REngine.CreateInstance("RDotNet", new[] { "-q" });
engine.EagerEvaluate("load('C:/2011/The Cromwell Workshop/Research/Data/fmri/brain.RData')");
engine.EagerEvaluate("library(tseries)");
engine.EagerEvaluate("install.packages('vars',,'http://cran.ms.unimelb.edu.au')");
engine.EagerEvaluate("library(vars)");
engine.EagerEvaluate("x<-brain$V2");
engine.EagerEvaluate("y<-brain$V3");
engine.EagerEvaluate("X1<-data.frame(x,y)");
engine.EagerEvaluate("var_Model<-VAR(X1,p=1)");
engine.EagerEvaluate("causTest1<-causality(var_Model, cause='x')");
engine.EagerEvaluate("causStat<-causTest1$Granger$statistic[1,1]");
engine.EagerEvaluate("causValue<-causTest1$Granger$p.value[1,1]");
string statistic = engine.GetSymbol("causStat").AsNumeric().First().ToString();
string pvalue = engine.GetSymbol("causValue").AsNumeric().First().ToString();
string hypothesis = "HO: X Does Not Granger Cause Y";
string decision;
double signif = 0.05;
if (Convert.ToDouble(pvalue) < signif)
{
decision = "Reject";
}
else
{
decision = "Accept";
}
Console.WriteLine("Hypothesis:" + hypothesis);
Console.WriteLine("Statistic:" + statistic + " and " +
"P-Value:" + pvalue + " and " + decision);
Console.ReadKey();
}
}
}
In C#.NET, no I am not going to replicate the same as above. That is your task. Here the process is to fit the VAR(p) model to the data and then estimate AR(p)
models for both the variables X and Y and obtain the variances and compute the test statistic. The other methods are for data generation.
Figure 2.
The Granger Causality class in C#.
public class GrangerCausality
{
public double[] sigma2 { get; set; }
public Matrix s2 { set; get; }
public double[,] resids { set; get; }
public double covariance { set; get; }
public double[,] _data { set; get; }
StatTest st;
public GrangerCausality(int order, int dim, int obs)
{
List<TimeSeries> zt = generateTimeSeries(obs);
VARModel model1 = new VARModel(order, dim);
MVTimeSeries m1 = new MVTimeSeries(zt, false);
resids = model1.resid_Data;
model1.SetInput(0, m1, null);
model1.FitByMethodOfMoments();
model1.ComputeResidualsAndOutputs();
s2 = model1.Sigma;
sigma2 = model1.sigMa2;
covariance = model1.coVariance;
Console.WriteLine(model1.Description);
ARMAModel model2 = new ARMAModel(order, 0);
TimeSeries ts1 = new TimeSeries();
ts1 = zt[0];
model2.SetInput(0, ts1, null);
model2.FitByMLE(10, 10, 10, null);
model2.ComputeResidualsAndOutputs();
Console.WriteLine(model2.Description);
var sigmaVAR_X = sigma2[0];
var sigma_AR_X = model2.Sigma;
ARMAModel model3 = new ARMAModel(order, 0);
TimeSeries ts2 = new TimeSeries();
ts2 = zt[1];
model3.SetInput(0, ts2, null);
model3.FitByMLE(10, 10, 10, null);
model3.ComputeResidualsAndOutputs();
Console.WriteLine(model3.Description);
var sigmaVAR_Y = sigma2[1];
var sigma_AR_Y = model3.Sigma;
st = new StatTest();
var GC_1 = obs*Math.Log(sigma_AR_X / sigmaVAR_X);
var GC_2 = obs*Math.Log(sigma_AR_Y / sigmaVAR_Y);
var GC_3 = obs*Math.Log((sigmaVAR_X * sigmaVAR_Y) / ((sigmaVAR_X * sigmaVAR_Y) - model1.coVariance));
var GC_TL = GC_1 + GC_2 + GC_3;
st._FXToY = GC_1;
st._FYToX = GC_2;
st._FYX = GC_3;
st._FLinear = GC_TL;
Console.WriteLine("GC_XToY:"+st.computeHypothTest(0.10,st._FXToY,1));
Console.WriteLine("GC_YToX:"+st.computeHypothTest(0.10, st._FYToX, 1));
Console.WriteLine("GC_YX:"+st.computeHypothTest(0.10, st._FYX, 1));
Console.WriteLine("GC_FLinear:"+st.computeHypothTest(0.10, st._FLinear, 3));
Console.ReadKey();
}
public List<TimeSeries> generateTimeSeries(int obs)
{
TimeSeries ts1 = new TimeSeries();
TimeSeries ts2 = new TimeSeries();
List<TimeSeries> _lstTime = new List<TimeSeries>();
var current = new DateTime(2001, 1, 1);
_data = generateData(obs, 2);
for (int t = 0; t < obs; ++t)
{
ts1.Add(current, _data[t, 0], false);
ts2.Add(current, _data[t, 1], false);
current = new DateTime(current.AddDays(1).Ticks);
}
_lstTime.Add(ts1);
_lstTime.Add(ts2);
return _lstTime;
}
public double[,] generateData(int obs, int dim)
{
var normalSim = new StandardDistribution();
_data = new double[obs, dim];
for (int t = 0; t < obs; ++t)
{
double s1 = normalSim.NextDouble();
double s2 = normalSim.NextDouble();
_data[t, dim - 2] = s1;
_data[t, dim - 1] = s2;
}
return _data;
}
}
We can then use our open source Math library from Part I and wrap it into a class for getting the test statistics.
I still need to write an article on this...the methodology of specifying, estimating and forecasting linear and nonlinear time series models with agent based model design.
Oh well, I am getting there-rather read it than do it. I won't complain too much here.
Figure 3.
The StatTest
class:
public class StatTest
{
public double _FTest { set; get; }
public double _pvalue { set; get; }
public double _FXToY { set; get; }
public double _FYToX { set; get; }
public double _FYX { set; get; }
public double _FLinear { set; get; }
public string decision { set; get; }
public StatTest()
{
}
public string computeHypothTest(double pvalue, double _Fstat, int dgFreedom)
{
if (_Fstat > 0)
{
ChiSquareDistribution chi = new ChiSquareDistribution();
chi.SetDistributionParameters(dgFreedom);
double x = chi.ProbabilityDensity(_Fstat);
double pval = 1 - chi.CumulativeDistribution(_Fstat);
if (pval > pvalue)
{
decision = "Accept H0: No Granger Causality.";
}
else
{
decision = "Reject H0: There is Granger Causality.";
}
}
else
{
pvalue = 0;
decision = "Accept H0: No Granger Causality.";
}
return decision;
}
}
Of course, this can be extended to multiple multivariate relationships as well as place the above into an artificial intelligence framework as mentioned. In other words,
how many Broadmann areas are connected and at what frequency for a given real-time task? The code is at http://modelmaker3.codeplex.com/
to download.
References
Granger, C.W.J. (1969), "Investigating causal relations be econometric models and cross-spectral methods. Econometrica 37:424-438.
Pfaff, B. (2008), "VAR, SVAR and SVEC Models: Implementation Within R Package vars." Journal of Statistical Software 27(4)
at http://www.jstatsoft.org/v27/i04/.
History
- 5/21/2012 Code added to CodePlex.