Introduction
In this article we will look at Markov models and their application to the classification of discrete sequential data.
- Markov processes are examples of stochastic processes that generate random sequences of outcomes or states according to certain probabilities.
- Markov chains can be considered mathematical descriptions of Markov models with a discrete set of states.
- A Markov chain is an integer-time process \(X_n, n\ge 0\) in which each random variable \(X_n\) is integer valued and depends on the past only through the most recent random variable \(X_{n-1}\), for all integers \(n\ge 1\).
- \(X_n, n\in \mathbb{N}\) is a discrete Markov chain on the state space \(S=\{1,\ldots,M\}\).
- At each time instant \(t\) the system makes a transition, possibly changing its state.
- Markov chains satisfy the Markov property and, here, a stationarity property.
- For a first-order Markov chain, the Markov property states that the state of the system at time \(t+1\) depends only on the state of the system at time \(t\). The chain is said to be memoryless because of this property.
\begin{eqnarray*} & Pr(X_{t+1} = x_{t+1} | X_1 = x_1, \ldots, X_t = x_t) = Pr(X_{t+1} = x_{t+1} | X_t = x_t) \end{eqnarray*}
- A stationarity assumption is also made, which means the transition probabilities do not depend on time.
\begin{eqnarray*} & Pr(X_{t+1} = x_j | X_{t} = x_i) = P_{i,j} & \text{for all $t$ and all $i,j \in \{1, \ldots, M\}$} \end{eqnarray*}
- Thus we are looking at processes whose sample functions are sequences of integers in \(\{1, \ldots, M\}\).
- A Markov chain is therefore parameterized by the transition probabilities \(P_{ij}\) and the initial state probabilities \(\pi(i)\).
- Markov chains can be represented by directed graphs, where each node represents a state and each directed arc represents a non-zero transition probability.
- If a Markov chain has \(M\) states, its transition probabilities can be represented by an \(M \times M\) matrix.
\begin{eqnarray*} &T =\begin{bmatrix} P_{11} & P_{12} & \ldots &P_{1M} \\ P_{21} & P_{22} & \ldots & P_{2M} \\ \vdots \\ P_{M1} & P_{M2} & \ldots & P_{MM} \\ \end{bmatrix} \\ &\sum_{j} P_{ij} = 1 \end{eqnarray*}
- The matrix \(T\) is a stochastic matrix: the elements in each row sum to 1.
- This reflects the fact that from the present state a transition must occur to one of the \(M\) states.
- The probability of a sequence being generated by a Markov chain is given by
\begin{eqnarray*} &P(X) = \pi(x_0)\prod_{t=1}^T P(x_t | x_{t-1}) \end{eqnarray*}
where \(P(x_t | x_{t-1})\) is the probability of observing state \(x_t\) at time \(t\) given that the state at time \(t-1\) is \(x_{t-1}\).
- Let us consider two models with the following initial probability vectors and transition matrices.
\begin{eqnarray*} & \pi_1 = \begin{bmatrix} 1 & 0 & 0 \\ \end{bmatrix} \\ & T_1 = \begin{bmatrix} 0.6 & 0.4 & 0 \\ 0.3 & 0.3 & 0.4 \\ 0.4 & 0.1 & 0.5 \\ \end{bmatrix}
& \pi_2 = \begin{bmatrix} 0.1 & 0.5 & 0.4 \end{bmatrix} \\ & T_2 = \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.3 & 0.1 & 0.6 \\ 0.3 & 0.5 & 0.2 \\ \end{bmatrix} \end{eqnarray*}
Sequence Classification
- Consider the following sequences generated from these two Markov chains:
\begin{eqnarray*} & S_1= \begin{bmatrix} 1 & 2 & 1 & 2 & 3 \\ 3 & 3 & 2 & 1 & 1 \\ \end{bmatrix} \\ & S_2= \begin{bmatrix} 3 & 2 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ \end{bmatrix} \end{eqnarray*}
- In sequence 1, since \(P_{13}=0\) in the first model, we observe no transition from state 1 to state 3.
- In sequence 2, since the dominant transition in the second model is \(1 \to 1\) (probability 0.9), we observe a long run of 1's.
- We can also compute the probability that a sequence was generated by a given Markov process.
- Sequence 1 has probability \(8.6400 \times 10^{-5}\) under the first model and \(2.4300 \times 10^{-7}\) under the second model.
- Sequence 2 has probability 0 under the first model (its initial state 3 has zero initial probability there) and 0.0287 under the second model.
- Thus, given a sequence known to come from one of the two models, we can predict which model generated it by choosing the model that assigns it the maximum probability.
- Markov chains can therefore be used for sequence modelling and classification.
class markovChain
{
public:
    markovChain() {}

    // MxM state-transition probability matrix
    MatrixXf _transition;
    // 1xM initial state probability vector
    MatrixXf _initial;

    // sets the model parameters from OpenCV matrices
    void setModel(Mat transition, Mat initial)
    {
        _transition = EigenUtils::setData(transition);
        _initial = EigenUtils::setData(initial);
    }

    // computes the probability that the chain generates the
    // given sequence of (0-indexed) states
    float computeProbability(vector<int> sequence)
    {
        float res = _initial(0, sequence[0]);
        for (int i = 0; i < (int)sequence.size() - 1; i++)
        {
            res = res * _transition(sequence[i], sequence[i + 1]);
        }
        return res;
    }
};
Generating a Sequence
- The idea behind generating a sequence from a Markov process is to use a uniform random number generator.
- From each row of the initial probability vector or transition matrix, a state is sampled according to the probabilities in that row.
- For example, if the row contains the values \([0.6, 0.4, 0]\):
- If the uniform generator produces a value between 0 and 0.6, state 0 is returned.
- If it produces a value between 0.6 and 1, state 1 is returned.
// samples a state from row `index` of a probability matrix by
// accumulating probabilities until the cumulative sum exceeds
// a uniform random number
int initialRand(MatrixXf matrix, int index)
{
    float u = ((float)rand()) / RAND_MAX;
    float s = matrix(index, 0);
    int i = 0;
    while (u > s && (i < matrix.cols() - 1))
    {
        i = i + 1;
        s = s + matrix(index, i);
    }
    return i;
}
- The first step is to use the above method to select an initial state, passing the initial probability matrix as input.
- Each subsequent state is then sampled from the transition probability matrix, passing the current state as the row index.
// generates a sequence of n states from the Markov chain
vector<int> generateSequence(int n)
{
    vector<int> result;
    result.resize(n);

    // sample the initial state from the initial probability vector
    int index = initialRand(_initial, 0);
    result[0] = index;

    // sample each subsequent state from the transition matrix row
    // corresponding to the current state
    for (int i = 1; i < n; i++)
    {
        index = initialRand(_transition, index);
        result[i] = index;
    }
    return result;
}
Code
The code can be found at https://github.com/pi19404/OpenVision in the files ImgML/markovchain.cpp and ImgML/markovchain.hpp. The article is also available at http://www.codeproject.com/Articles/808292/Markov-chain-implementation-in-Cplusplus-using-Eig and http://pi-virtualworld.blogspot.ca/2014/04/markov-chain-implementation-in-c-using.html.