Introduction
For many people, even for developers, issues related to linear algebra seem to be an obstacle in solving various other problems. To deal with this we search for ready-made tools that we are forced to treat as a black box. The purpose of this article is to show both theory and code implementation of the most common matrix and polynominal algebra cases.
Matrix
The most common definitions of matrix relete to its apperance. We describe the matrix as a set of numbers stored in a rectangular table:
$\begin{aligned} \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \end{aligned}$
In fact, matrix is a function of two variables defined in the Cartesian product of two vectors:
$\begin{aligned} \begin{bmatrix} 1, ..., m \end{bmatrix} \times \begin{bmatrix} 1, ..., n \end{bmatrix} \end{aligned}$
So the matrix describes the relationship of the individual vector elements.
Things that describe the properties of the matrix are: trace, rank, determinant and characteristic polynominal. The typical operations on matrices are: transposition, scaling, addition, subtraction, multiplification and inversion.
Trace
The matrix trace is defined only for the square matrices. It is a sum of all elements placed on the matrix diagonal.
$\begin{aligned} \mathrm{trace}(A_{n \times n}) = \sum^{n}_{i=1} a_{i,i} \end{aligned}$
public Double Trace()
{
if (Rows == Columns)
{
Double Trace = 0;
for (int m = 0; m < Rows; m++)
{
Trace += coefs[m, m];
}
return Trace;
}
else
{
throw new Exception("Matrix must be square.");
}
}
Further, the matrix trace will be used in calculating matrix characteristic polynominal and inverse matrix.
Rank
The rank is one of the most significant matrix features. If we imagine that matrix is built with horizontal (in the row-oriented approach) or vertical (in the column-oriented approach) vectors, then rank answeres a question, how many of these vectors are lineary independent.
The rank of matrix is indementent of orientation we approach, i.e. row rank of the matrix A is equal to the column rank of this matrix.
The rank can be designated to every rectangular matrix, in general for non-zero matrix A in dimension of n x m, its rank is in the range:
$\begin{aligned} 1 \le \mathrm{rank}(\textbf{A}) \le \mathrm{MIN}(m, n) \end{aligned}$
Designating the matrix rank is equivalent to finding the minor with the greatest dimension (its dimension coresponds to the original matrix rank). Due to this, a non-zero square matrix A is full-rank if its determinant is not equal to zero.
However, the minor method is visually elegant, it is not effective when calculating ranks of larger matrices. In that cases we can use the Gauss elimination algorithm.
The crux of this method is to create matrix B from matrix A using only linear transformation on the rows (columns). In order for matrix A to have full rank, matrix B should be a tringular matrix wtih zero elements below its diagonal, where the sum of all elements in each row (or each column) is non-zero, as the rank of A is the number of non-zero-sum rows (columns).
Denote by A matrix with elements:
$\begin{aligned} \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{bmatrix} \end{aligned}$
In the first step we multiply each element of the first row by
$\begin{aligned} c_{i,1} = \frac{a_{i,1}}{a_{1,1}} & & \mathrm{where} & & i = 2, 3, ..., n \end{aligned}$
and subtract the result of multiplification from each i-th row. We receive a new matrix A(1):
$\begin{aligned} \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ 0 & a_{2,2}^{(1)} & \cdots & a_{2,n}^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a_{n,2}^{(1)} & \cdots & a_{n,n}^{(1)} \end{bmatrix} \end{aligned}$
What was actually done, was eliminating all ai,1 elements besides a1,1.
In the second step we multiply each element of the new second row by
$\begin{aligned} c_{i,2} = \frac{a_{i,2}}{a_{2,2}} & & \mathrm{where} & & i = 3, 4, ..., n \end{aligned}$
and subtract the result of multiplification from each i row. We receive new matrix A(2) with eliminated all a(2)i,2 elements besides a(2)2,2.
public int Rank()
{
int rank = 0;
Double[,] mat = coefs;
for (int k = 0; k < Rows; k++ )
{
for (int i = k + 1; i < Rows; i++ )
{
Double c = mat[i, k] / mat[k, k];
for (int k1 = 0; k1 < Columns; k1++ )
{
mat[i, k1] = mat[i, k1] - mat[k, k1] * c;
}
}
Double sum = 0;
for (int i = 0; i < Columns; i++)
{
sum += mat[k, i];
}
if (sum != 0) { rank++; }
}
return rank;
}
Notice that in each k step of this algorithm we need to assume that element a(k)k,k is non-zero (in order to divide by it). However, if we find element a(k)k,k = 0, we can apply the method called the row pivoting (or column pivoting). We choose a first row (from the remaining once) with non-zero a(k)l,l (where l = k+1, k+2, ..., n) and replace position of k and l rows (or columns).
public int Rank()
{
int rank = 0;
Double[,] mat = coefs;
for (int k = 0; k < Rows; k++)
{
for (int i = k + 1; i < Rows; i++)
{
if (mat[k,k] == 0)
{
mat = RowPivot(mat, k);
}
Double c = mat[i, k] / mat[k, k];
for (int k1 = 0; k1 < Columns; k1++)
{
mat[i, k1] = mat[i, k1] - mat[k, k1] * c;
}
}
Double sum = 0;
for (int i = 0; i < Columns; i++)
{
sum += mat[k, i];
}
if (sum != 0) { rank++; }
}
return rank;
}
private Double[,] RowPivot(Double[,] matrix, int k)
{
for (int i = k + 1; i < matrix.GetLength(0); i++ )
{
if (matrix[i, i] != 0)
{
Double[] x = new Double[matrix.GetLength(1)];
for (int j = 0; j < matrix.GetLength(1); j++)
{
x[j] = matrix[k, j];
matrix[k, j] = matrix[i, j];
matrix[i, j] = x[j];
}
break;
}
}
return matrix;
}
As a quick check of our rank designation algorithm, let us consider following examples, where matrix A is built with lineary independent rows, in matrix B rows 1 and 3 are lineary dependent and matrix C starting with 0 element in first row:
$\begin{aligned} (\mathrm{Ex. 1}) & A = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ -1 & 3 & 1 \end{bmatrix} & (\mathrm{Ex. 2}) & B = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ 2 & 4 & 14 \end{bmatrix} & (\mathrm{Ex. 3}) & C = \begin{bmatrix} 0 & 2 & 7 \\ 3 & 4 & 1 \\ -1 & 3 & 1 \end{bmatrix} \end{aligned}$
Following, calculated with our algorithm, matrices in example 1 are:
$\begin{aligned} A^{(0)} = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ -1 & 3 & 1 \end{bmatrix} \end{aligned}$
$\begin{aligned} A^{(1)} = \begin{bmatrix} 1 & 2 & 7 \\ 0 & -2 & -20 \\ 0 & 5 & 8 \end{bmatrix} \end{aligned}$
$\begin{aligned} A^{(2)} = \begin{bmatrix} 1 & 2 & 7 \\ 0 & -2 & -20 \\ 0 & 0 & -42 \end{bmatrix} \end{aligned}$
The sum of all elements in each row is non-zero, so matrix A rank is 3 (full rank).
Following, calculated with our algorithm, matrices in example 2 are:
$\begin{aligned} B^{(0)} = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ 2 & 4 & 14 \end{bmatrix} \end{aligned}$
$\begin{aligned} B^{(1)} = \begin{bmatrix} 1 & 2 & 7 \\ 0 & -2 & -20 \\ 0 & 0 & 0 \end{bmatrix} \end{aligned}$
The number of rows, where the sum of all elements is non-zero, is 2. Matrix B rank is 2.
Following, calculated with our algorithm, matrices in example 3 are:
Step 0: Algorithm found out that element c1,1 is zero, so applied row pivot:
$\begin{aligned} C^{(0)} = \begin{bmatrix} 3 & 4 & 1 \\ 0 & 2 & 7 \\ -1 & 3 & 1 \end{bmatrix} \end{aligned}$
$\begin{aligned} C^{(1)} = \begin{bmatrix} 3 & 4 & 1 \\ 0 & 2 & 7 \\ 0 & 4.33 & 1.33 \end{bmatrix} \end{aligned}$
$\begin{aligned} C^{(2)} = \begin{bmatrix} 3 & 4 & 1 \\ 0 & -2 & -7 \\ 0 & 0 & -13.83 \end{bmatrix} \end{aligned}$
The sum of all elements in each row is non-zero, so matrix C rank is 3 (full rank).
Determinant
According to the formal definition, the matrix determinant is the function assigning to each square matrix An x n {R}, with elements from the commutative ring R, one element from this commutative ring R.
For our purposes, we will define matrix determinant as another function that describes matrix (in fact it describes matrix elements).
In symbolic mathematics, the determinant is calculated using the Leibniz formula (also named as the permutation definition of determinant):
$\begin{aligned} \mathrm{det}A = \sum_{s \in S_n} -1^{In(s)} \prod_{i=1}^{n} a_{i, s_{i}} \end{aligned}$
where Sn is a set of the all possible permutations of set {1, 2, ..., n}, In(s) is a number of possible inversions of permutation s.
Although, the definition above seems to be complicated, we used to use it for calculation of the determinant of 2 x 2 matrices. We have:
$\begin{aligned} \mathrm{det} A = \mathrm{det}\begin{bmatrix} a_{1,1} & a_{1,2}\\ a_{2,1} & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - a_{1,2}a_{2,1} \end{aligned}$
When calculating the determinant of 3 x 3 matrix, all simplifications were already made by Pierre Sarrus:
$\begin{aligned} \begin{matrix} \mathrm{det} A = \mathrm{det}\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} = \\ = (a_{1,1}a_{2,2}a_{3,3} + a_{1,2}a_{2,3}a_{3,1} + a_{1,3}a_{2,1}a_{3,2}) - (a_{1,3}a_{2,2}a_{3,1} + a_{1,1}a_{2,3}a_{3,2} + a_{1,2}a_{2,1}a_{3,3}) \end{matrix} \end{aligned}$
In discrete mathematics, we commonly use the Laplace expansion recursive algorithm, where:
$\begin{aligned} \mathrm{det} A_{n \times n} = \sum_{j=1}^n a_{i,j} A_{i,j} \end{aligned}$
where i is a fixed number of row due to which we make expansion, ai,j is a matrix element with coordinates i,j and Ai,j is an algebraic complement of the element ai,j
The formula above is used recursively, until algebraic complement is in dimension 2 x 2 (then the determinant can be simply calculated by definition). The algebraic complement of element a_{i,j} is defined as:
$\begin{aligned} A_{i,j} = -1^{i+j} M_{i,j} \end{aligned}$
where Mi,j is a minor of the matrix A, created by removing row i and column j.
Code below differs from code in library attached, to better show how the algorithm works.
public Double MatrixDeterminant(Double[,] Matrix)
{
int M = Matrix.GetLength(0);
int N = Matrix.GetLength(1);
if (M == N)
{
Double Det = 0;
if (M == 1)
{
Det = Matrix[0, 0];
}
else if (M == 2)
{
Det = Matrix[0, 0] * Matrix[1, 1] - Matrix[0, 1] * Matrix[1, 0];
}
else
{
Double[,] AlgebraicComplement = new Double[M - 1, M - 1];
for (int m = 0; m < M; m++)
{
int a = 0;
for (int k = 1; k < M; k++)
{
int b = 0;
for (int l = 0; l < M; l++)
{
if (l != m)
{
AlgebraicComplement[a, b] = Matrix[k, l];
b++;
}
}
a++;
}
Det += Math.Pow(-1, m) * Matrix[0, m] * MatrixDeterminant(AlgebraicComplement);
}
}
return Det;
}
else
{
throw new Exception("Matrix must be square.");
}
}
Consider matrix A (example 1 from previous part):
$\begin{aligned} A = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ -1 & 3 & 1 \end{bmatrix} \end{aligned}$
Calculated determinant is 84. But let us recall a structure of this matrix after the Gauss elimination method:
$\begin{aligned} A^{(2)} = \begin{bmatrix} 1 & 2 & 7 \\ 0 & -2 & -20 \\ 0 & 0 & -42 \end{bmatrix} \end{aligned}$
If we multiply all elements on its diagonal we will also get 84. This is one of the properties of the triangular matrix - the determinant is equal to the product of the elements on the diagonal. Due to this, we can use the Gauss elimination method also for calculation of matrix determinant.
Characteristic polynominal
The characteristic polynominal of the square matrix An x n is defined as:
$\begin{aligned} p(x) = \mathrm{det} (A - x \textbf{1}) \end{aligned}$
where 1 is an identity matrix (ones on diagonal).
If we use polynominal notaion, we can rewrite p(x) as:
$\begin{aligned} p(x) = x^n + p_1 x^{n-1} + ... + p_{n-1} x + p_n \end{aligned}$
To calculate p coefficients we can use the Faddeev - Leverrier algorithm, where in each step we 'subtract' coefficient calculated in previous step:
$\begin{aligned} \begin{matrix} A^{(1)} = A & \to & p_1 = \frac{1}{1} \mathrm{trace} A^{(1)} \\ A^{(2)} = A (A^{(1)} - p_1 \textbf{1}) & \to & p_2 = \frac{1}{2} \mathrm{trace} A^{(2)} \\ \vdots & &\vdots \\ A^{(n)} = A (A^{(n-1)} - p_{n-1} \textbf{1}) & \to & p_n = \frac{1}{n} \mathrm{trace} A^{(n)} \end{matrix} \end{aligned}$
The implementation below differs from code in attached library:
public Double[] CharacteristicPolynominalCoefficients(Double[,] Matrix)
{
int M = Matrix.GetLength(0);
int N = Matrix.GetLength(1);
if (M == N)
{
Double[] Coeffs = new Double[M + 1];
Double[] CoeffsSorted = new Double[M + 1];
Double[,] B;
Double LastCoeff;
Coeffs[0] = 1;
B = Matrix;
LastCoeff = MatrixTrace(B);
Coeffs[1] = -LastCoeff;
for (int m = 2; m < M + 1; m++)
{
B = MatrixMultiplication(Matrix, MatrixSubtraction(B, MatrixScaling(LastCoeff, IdentityMatrix(B.GetLength(0)))));
LastCoeff = MatrixTrace(B) / m;
Coeffs[m] = -LastCoeff;
}
for (int m = 0; m < M + 1; m++)
{
CoeffsSorted[m] = Coeffs[M - m];
}
return CoeffsSorted;
}
else
{
throw new Exception("Input data matrix must be square.");
}
}
The roots of the function p(x) are called the eigenvalues of the matrix A, in the following part an algorithm to calculate polynominal roots will be shown
Transpose
The matrix transpose operation, creates new matrix which is the reflection of original matrix over its main diagonal. In other words, the transpose function swaps columns and rows.
$\begin{aligned} B = A^{T} & \to & \bigwedge_{i} \bigwedge_{j} b_{i,j} = a_{j,i} \end{aligned}$
public Matrix Transpose()
{
Double[,] _MatrixT = new Double[Columns, Rows];
for (int m = 0; m < Rows; m++)
{
for (int n = 0; n < Columns; n++)
{
_MatrixT[n, m] = coefs[m, n];
}
}
return new Matrix(_MatrixT);
}
Scaling
The matrix scaling multiplies each matrix element by given scalar:
$\begin{aligned} B = s \cdot A & \to & \bigwedge_{i} \bigwedge_{j} b_{i,j} = s \cdot a_{i,j} \end{aligned}$
public Matrix Scale(Double Coefficient)
{
Matrix result = new Matrix(Rows, Columns, 0);
for (int m = 0; m < Rows; m++)
{
for (int n = 0; n < Columns; n++)
{
result.SetElement(m, n, this.Element(m, n) * Coefficient);
}
}
return result;
}
Addition and subtraction
The matrix addition and subtraction operations are defined only for the matrices of the same dimensions and calculated as shown below:
$\begin{aligned} C_{m \times n} = A_{m \times n} \pm B_{m \times n} & \to & \bigwedge_{i} \bigwedge_{j} c_{i,j} = a_{i,j} \pm b_{i,j} \end{aligned}$
public Matrix Add(Matrix MatrixB)
{
if (Rows == MatrixB.Rows && Columns == MatrixB.Columns)
{
Matrix result = new Matrix(Rows, Columns, 0);
for (int n = 0; n < Rows; n++)
{
for (int m = 0; m < Columns; m++)
{
result.SetElement(n, m, (this.Element(n, m) + MatrixB.Element(n, m)));
}
}
return result;
}
else
{
throw new Exception("Matrices dimensions must be equal.");
}
}
public Matrix Sub(Matrix MatrixB)
{
if (Rows == MatrixB.Rows && Columns == MatrixB.Columns)
{
Matrix result = new Matrix(Rows, Columns, 0);
for (int n = 0; n < Rows; n++)
{
for (int m = 0; m < Columns; m++)
{
result.SetElement(n, m, (this.Element(n, m) - MatrixB.Element(n, m)));
}
}
return result;
}
else
{
throw new Exception("Matrices dimensions must be equal.");
}
}
Multiplification
The matrix multiplification operation is defined using the Cauchy multiplification formula as follows:
$\begin{aligned} C_{m \times p} = A_{m \times n} \cdot B_{n \times p} & \to & \bigwedge_{i: 1 \le i \le m} \bigwedge_{j: 1 \le j \le p} c_{i,j} = \sum_{k=1}^n a_{i,k} b_{k,j} \end{aligned}$
The operation can be applied only if number of columns in matrix A is equal to number of rows in matrix B.
public Matrix Multiply(Matrix MatrixB)
{
if (Columns != MatrixB.Rows)
{
throw new Exception("Number of columns in A matrix must be equal to number of rows in B matrix.");
}
else
{
Double[,] _MatrixAB = new Double[Rows, MatrixB.Columns];
for (int m_a = 0; m_a < Rows; m_a++)
{
for (int n_b = 0; n_b < MatrixB.Columns; n_b++)
{
_MatrixAB[m_a, n_b] = 0;
for (int n_a = 0; n_a < MatrixB.Rows; n_a++)
{
_MatrixAB[m_a, n_b] = _MatrixAB[m_a, n_b] + this.Element(m_a, n_a) * MatrixB.Element(n_a, n_b);
}
}
}
return new Matrix(_MatrixAB);
}
}
Inversion
By the definition matrix An x n is invertible if and only if there exists matrix B that satisfies equation:
$\begin{aligned} A B = B A = \textbf{1}_n \end{aligned}$
Matrix B is named the inverse of matrix A and usualy denoted by A-1. Not every matrix is inversible, to be invertible matrix A needs to be nonsingular - its determinant needs to be non-zero.
We can find many algorithm that calculates inverse matrix, but once we computed Faddeev - Leverrier method, we will use it once again, because inverse of matrix A can be designated using formula:
$\begin{aligned} A^{-1} = \frac{1}{p_n} (A^{(n-1)} - p_{n-1} \textbf{1}) \end{aligned}$
where A(n-1) is a matrix designated in n-1 step of algorithm, pn and pn-1 are the characteristic polynominal coefficients designated in n-1 and n step.
public Matrix Inverse()
{
if (Rows == Columns)
{
if (this.Determinant() != 0)
{
Matrix LastB;
Matrix NextB;
Matrix Inv = new Matrix(Rows, Columns, 0);
Double LastCoeff;
Double NextCoeff;
LastB = this;
LastCoeff = LastB.Trace();
for (int m = 2; m < Rows; m++)
{
LastB = this.Multiply(LastB.Sub(new Matrix(Rows, 1).Scale(LastCoeff)));
LastCoeff = LastB.Trace() / m;
}
NextB = this.Multiply(LastB.Sub(new Matrix(Rows, 1).Scale(LastCoeff)));
NextCoeff = NextB.Trace() / Rows;
Inv = LastB.Sub(new Matrix(Rows, 1).Scale(LastCoeff)).Scale(1 / NextCoeff);
return Inv;
}
else
{
throw new Exception("The matrix is not invertible over commutative ring.");
}
}
else
{
throw new Exception("Input data matrix must be square.");
}
}
As example, we will use once again matrix:
$\begin{aligned} A = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 1 \\ -1 & 3 & 1 \end{bmatrix} \end{aligned}$
Calculated inverse matrix is:
$\begin{aligned} A^{-1} = \begin{bmatrix} 0.0119047619047619 & 0.226190476190476 & -0.30952380952381 \\ -0.0476190476190476 & 0.0952380952380952 & 0.238095238095238 \\ 0.154761904761905 & -0.0595238095238095 & -0.0238095238095238 \end{bmatrix} \end{aligned}$
According to the definition, product of the A A-1 or A-1 A should give us an identity matrix. What we received in practice is:
$\begin{aligned} A A^{-1} = \begin{bmatrix} 1 & 0 & -2.77555756156289E-17 \\ 0 & 1 & -8.32667268468867E-17 \\ 0 & 1.38777878078145E-17 & 1 \end{bmatrix} \end{aligned}$
Polynominals
The polynominal of single variable is a mathematical notation (also named as algebraic sum) for the sum of mononominals:
$\begin{aligned} \sum_{i=0}^n a_i x^i \end{aligned}$
where i, n are integers, ai and x can be complex. Above can be written as:
$\begin{aligned} a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 \end{aligned}$
The degree of the polynominal above is naturally n. When x is a set of values, we are dealing with the polynominal function:
$\begin{aligned} f(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 \end{aligned}$
Evaluation
The polynominal evaluation (or more precisely the polynominal function evaluation) is calculating f(x) value for defined value of x.
public dynamic Evaluate(Complex Value)
{
Complex sum = 0;
for (int i = 0; i < coefs.Length; i++)
{
sum += coefs[i] * Complex.Pow(Value, i);
}
if (sum.Imaginary == 0)
{
return sum.Real;
}
else
{
return sum;
}
}
Addition and subtraction
Adding (subtracting) two polynominal (with we same variable) produces new polynominal with degree of the greatest degree of the addition (subtraction) components. The coefficents coresponding to the same degree of x can be added (subtracted).
public Polynominal Add(Polynominal poly)
{
int M = this.Degree();
int N = poly.Degree();
int K = Math.Max(M, N);
Polynominal Poly1Ext = new Polynominal(K);
Polynominal Poly2Ext = new Polynominal(K);
Polynominal Add = new Polynominal(K);
for (int m = 0; m < M + 1; m++)
{
Poly1Ext.SetElement(m, this.Element(m));
}
for (int n = 0; n < N + 1; n++)
{
Poly2Ext.SetElement(n, poly.Element(n));
}
for (int k = 0; k < K + 1; k++)
{
Add.SetElement(k, Poly1Ext.Element(k) + Poly2Ext.Element(k));
}
return Add;
}
Scaling
The polynominal scaling is the operation defined:
$\begin{aligned} c \cdot (a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0) = c \cdot a_n x^n + c \cdot a_{n-1} x^{n-1} + ... + c \cdot a_1 x + c \cdot a_0 \end{aligned}$
public Polynominal Scale(Complex Value)
{
Polynominal scale = new Polynominal(this.Degree());
for (int i = 0; i < this.Degree() + 1; i++)
{
scale.SetElement(i, this.Element(i) * Value);
}
return scale;
}
Polynominal by binominal division
Dividing polynominal by binominal is a popular method for degree reduction by reducing one known root of polynominal - we will use it when descibing algorithm for designating polynominal function roots.
To divide polynominal by binominal we will use Horner's method.
Denote by p(x) polynominal of nth degree:
$\begin{aligned} p(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 \end{aligned}$
and by b(x) binominal:
$\begin{aligned} d(x) = x - c \end{aligned}$
we can write that:
$\begin{aligned} p(x) = d(x) b(x) + r \end{aligned}$
where r is a reminder from division, and b(x) is a polynominal of degree n-1.
We have:
$\begin{aligned} a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 = (x - c)(b_{n-1} x^{n-1} + ... + b_1 x + b_0) + r \end{aligned}$
multipling right side of equation and comparing coefficients with left side we can see that:
$\begin{aligned} \begin{matrix} b_{n-1} = a_n \\ b_{n-2} = a_{n-1} + c b_{n-2} \\ \vdots \\ b_{n-k} = a_{n-k+1} + c b_{n-k} \\ \vdots \\ b_0 = a_1 + c b_1 \\ r = a_0 + c b_0 \end{matrix} \end{aligned}$
public Polynominal ByBinominalDivision(Complex BinominalRoot)
{
int M = this.Degree() + 1;
if (M > 1)
{
int N = M - 1;
Complex[] Quotient = new Complex[N];
Complex[] QuotientSorted = new Complex[N];
Complex[] CoeffsSorted = new Complex[M];
Complex Last;
for (int m = 0; m < M; m++)
{
CoeffsSorted[m] = this.Element(M - m - 1);
}
Last = CoeffsSorted[0];
Quotient[0] = Last;
for (int m = 1; m < N; m++)
{
Last = CoeffsSorted[m] + BinominalRoot * Last;
Quotient[m] = Last;
}
Complex Remainder = CoeffsSorted[M - 1] + BinominalRoot * Last;
for (int n = 0; n < N; n++)
{
QuotientSorted[n] = Quotient[N - n - 1];
}
return new Polynominal(QuotientSorted);
}
else
{
throw new Exception("Given coefficients number is not corresponding to correct polynominal structure.");
}
}
Derivative
The polynominal derivatice can be calculated as a sum of monominals derivative, which is a basic operation:
$\begin{aligned} \frac{d}{dx} a_n x^n = a_n \frac{d}{dx} x^n = a_n n x^{n-1} \end{aligned}$
public Polynominal Derivative()
{
int Degree = this.Degree();
int NumCoefs = Degree + 1;
Polynominal Derivative = new Polynominal(this.Degree()-1);
if (Degree > 0)
{
for (int i = 1; i < NumCoefs; i++)
{
Derivative.SetElement(i - 1, i * this.Element(i));
}
}
else
{
return new Polynominal(new Double[1] { 0 });
}
return Derivative;
}
Roots
For finding polynominal function roots we will use Laguerre's method, which is robust to disorder of multiple roots (in contrast to the method of Horner). The Laguerre's method is based on decreasing polynominal degree each time we find one root (using polynominal by binominal division). Roots found are also smoothed using original polynominal.
Recursively, we calculate next approximations of each root using formula:
$\begin{aligned} r_{i+1} = r_i - \frac{n p(r_i)}{ \frac{d}{dx}p(r_i) \pm \sqrt{(n-1)( (n-1)(\frac{d}{dx}p(r_i))^2 - n p(r_i) \frac{d^2}{dx^2}p(r_i) ) } } \end{aligned}$
where z0 is first approximation of root - we can use any random number.
We sign in denominator should be chosen to maximize its module.
public Complex[] Roots(Complex Seed, Double Accuracy)
{
int N = this.Degree();
Complex[] Roots = new Complex[N];
int degree = N;
Polynominal tmpoly = this;
for (int n = 0; n < N; n++)
{
while (true)
{
Complex _tmp0 = tmpoly.Derivative().Evaluate(Seed);
Complex _tmp1 = (degree - 1) * Complex.Pow(tmpoly.Derivative().Evaluate(Seed), 2);
Complex _tmp2 = degree * tmpoly.Evaluate(Seed) * tmpoly.Derivative().Derivative().Evaluate(Seed);
Complex _tmp3 = Complex.Pow((degree - 1) * (_tmp1 - _tmp2), 0.5);
Complex _tmp4 = MaximizeResultAbsolute(_tmp0, _tmp3);
Seed = Seed - (degree * tmpoly.Evaluate(Seed)) / _tmp4;
Complex result = tmpoly.Evaluate(Seed);
if (Complex.Abs(result) < Math.Abs(Accuracy))
{
Roots[n] = Seed;
tmpoly = tmpoly.ByBinominalDivision(Seed);
degree--;
Random _Random = new Random();
Seed = -10 + (_Random.NextDouble() * 15);
break;
}
}
}
return Roots;
}
private Complex MaximizeResultAbsolute(Complex Value1, Complex Value2)
{
if (Complex.Abs(Value1 + Value2) > Complex.Abs(Value1 - Value2))
{
return Value1 + Value2;
}
else
{
return Value1 - Value2;
}
}
For example, let us denote p(x) as:
$\begin{aligned} p(x) = x^3 + 3x^2 -3x -3 \end{aligned}$
We can calculate that p(x) has got three real roots: 1.2618022452599722, -3.601679131883154 and -0.660123113376818.
Using our algorithm with seed 2 and accuracy 0.0001 we receive following roots (complex):
x1 = 1.26180224809477 + 0 i
x2 = -3.60167913003098 - 1.46390615026496E-15 i
x3 = -0.660123118063785 + 5.44361994589775E-16 i
Using the code
All algorithms presented in this article are implemented in attached source code. To start working with library simply declare its namespace:
using MatrixPolynominal;
Creating new matrix and polynominal:
Matrix c = new Matrix(new Double[3, 3] { { 1, 2, 7 }, { 3, 4, 1 }, { -1, 3, 1 } });
Polynominal p = new Polynominal(new Double[4] {-3, -3, 3, 1});
Each time we want to print matrix or polynominal on console, we just call method Print:
c.Print();
p.Print();
Matrix operations
Matrix c = new Matrix(new Double[3, 3] { { 1, 2, 7 }, { 3, 4, 1 }, { -1, 3, 1 } });
Matrix c1 = new Matrix(3, 3, 10);
Matrix c2 = new Matrix(3, 1);
Matrix result;
result = c.Transpose();
result = c.Scale(10);
result = c.Multiply(new Matrix(new Double[3, 3] { { 12, 12, 17 }, { -3, -4, -1 }, { 0, 0, 1 } }));
result = c.Inverse();
result = c.Add(new Matrix(new Double[3, 3] { { 12, 12, 17 }, { -3, -4, -1 }, { 0, 0, 1 } }));
result = c.Sub(new Matrix(new Double[3, 3] { { 12, 12, 17 }, { -3, -4, -1 }, { 0, 0, 1 } }));
int rank = c.Rank();
Double det = c.Determinant();
Polynominal cp = c.CharacteristicPolynominal();
Double el = c.Element(1, 1);
c.SetElement(1,1, el);
Polynominal operations
Polynominal p = new Polynominal(new Double[4] {-3, -3, 3, 1});
Polynominal p1 = new Polynominal(new Double[4] { 1, 2, 3, 4 });
Polynominal result;
result = p.Add(p1);
result = p.Sub(p1);
result = p.Derivative();
result = p.ByBinominalDivision(10);
Complex value = p.Evaluate(10);
int degree = p.Degree();
result = p.Scale(4);
Complex el = p.Element(2);
p.SetElement(2, el);
Complex[] x = p.Roots(2, 0.0001);