Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / VB10

Modeling the Biochemical System Using VB

5.00/5 (9 votes)
6 Oct 2013CPOL9 min read 29.4K   656  
Mathematical method of S-system equation to simulate a biochemical network system

Introduction

The systems biology is a subject that studies the biology system and the behavior which comes from the interaction between those components in this system. The major work in the systems biology study is modeling the biological system and then using a program to simulate the model. Because I am at hard working on the regulation signal transduction network analysis in bacterial Xanthomonas campestris pathovar carnpestris 8004 of my Microbial Genetics research job, and I need a program to make the network simulation and data analysis. So I rewrite the program PLAS which was written by Antonio E.N.Fereira base on the research job of E.O.Voit for build up my own simulation system in the feature and post my coding job here hopes it can helps other biological researchers.

Although PLAS have no LINUX edition currently but the PLAS software works fine on LINUX platform using wine program.

PLAS software website: http://enzymology.fc.ul.pt/software/plas/

Background

A briefly introduction to the biochemical reaction system and S-System function

Consider that A biochemical system has N(N>=1) kinds substrate that involved in the reaction network, and the temperature and volume of the system won’t be changed. So that we can using the vector X(t) to represent that the state of this system in any time t:

X(t)=(x<sub>1</sub>(t), x<sub>2</sub>(t), ..., X<sub>N</sub>(t))

x<sub>i</sub>(t) represents the number or the concentration a kind of substrate(i=1,2,…,N).

According to the view from the work of E.O.Voit, a reaction procedure can be expressed as S-system function:

x<sub>i</sub><sup>'</sup>= α<sub>i</sub>∑(x<sub>j</sub>)<sup>gij</sup>-β<sub>i</sub>∑(x<sub>j</sub>)<sup>hij</sup>

x<sub>j</sub> represent the concentration of a kind of substrate, parameters α<sub>i</sub> , <code>β<sub>i</sub> , g<sub>ij</sub> and <span lang="EN-US">h<sub>ij</sub></span><span lang="EN-US"> </span>are represent the that associated with the reaction x<sub>i</sub>', they were all could achieved from the laboratory experiment work, and the x<sub>i</sub>' is also represent the generation ratio of the substrate x<sub>i</sub> .

So that a reaction system like

x<sub>1</sub> -> x<sub>2</sub> -> x<sub>3</sub>

Such a chain reaction can be express as s-system function:

Image 1

Because the left side of each equation x<sub>i</sub>' is represent as the substrate generation ratio, so that after we calculate the value of the right side of the equation, and then multiply it with the time unit, then we get the number or the concentration change value of that substrate. And then we get the new concentration of the substrate like the expression below:

x<sub>i</sub>=x<sub>i</sub>+x<sub>i</sub>'*△t

So that we could use the new concentration to calculate other substrate in the next iteration loop.

All of this knowledge and more detail information you can find out in the book <Computational Analysis of Biochemical Systems> that E.O.Voit wrote or his scientific research articles in the early year.

Using the code

Build up the biochemical network system simulator using VB

Elements in the biochemical system

Elements in the biochemical system can be divided into 3 types: Substrate, reaction, and disturbing of the system.

Substrate

A substrate you can treat it as a node in the biochemical system network. And it gets two main attribute to stand for this node: Name and concentration or amount value. The substrate class we define in the model just for the storage of the calculation result.

Its data structure definition just simple as you can see:

VB.NET
Public Class Var
 
    <Xml.Serialization.XmlAttribute> Public Name As String
    <Xml.Serialization.XmlAttribute> Public Value As Double
    <Xml.Serialization.XmlAttribute> Public Title As String
    <Xml.Serialization.XmlAttribute> Public Comment As String
 
     Public Overrides Function ToString() As String
        If String.IsNullOrEmpty(Comment) Then
            Return String.Format("{0}={1}", IIf(Len(Title) > 0, Title, Name), Value)
        Else
            Return String.Format("{0}={1}; //{2}", IIf(Len(Title) > 0, Title, Name), Value, Comment)
        End If
    End Function
End Class   

Equation

An equation stands for a biochemical reaction, and it also means the interaction between the substrates (or the system components). S-equation is a kind of chemical rate equation and uses it for the concentration change rate of a substrate. So that the class equation can basically define in two properties:

VB.NET
Public Class Equation
    <Xml.Serialization.XmlAttribute> Public Name As String
    <Xml.Serialization.XmlAttribute> Public Expression As String
End Class 
And it has a method to calculate the value of the equation expression:
VB.NET
Public Function Elapsed() As Boolean
    Call Me.sBuilder.Clear()
    sBuilder.Append(Expression)
    For Each e In Kernel.Vars 'Replace the name using the value
        Call sBuilder.Replace(e.Name, e.Value)
    Next
    Var.Value += (Microsoft.VisualBasic.Mathematical.Expression.Evaluate(sBuilder.ToString) * 0.1)
    Return True
End Function      

The statement

VB.NET
Var.Value += (Microsoft.VisualBasic.Mathematical.Expression.Evaluate(sBuilder.ToString)
* 0.1) 

in this function is just do the iteration calculation: calculate the change rate of the substrate using the equation expression an then multiply it with the elapsed time to get the concentration change value, finally using the change value to modify the current substrate concentration.

Disturb

The disturbing of a biochemical system is an important research method as it can let us get the important pathway or the reaction in the system network. And also we can check a pathway or reaction is exist or not through compare the calculation result or disturbing result in the model with the result that come from the laboratory experiment.

VB.NET
Public Class Disturb
    Public Enum Types
        Increase
        Decrease
        ChangeTo
    End Enum
    ''' <summary>
    ''' The name Id of the target.
    ''' (目标的名称)
    ''' </summary>
    ''' <remarks></remarks>
    <Xml.Serialization.XmlAttribute> Public Id As String
    ''' <summary>
    ''' The start time of this disturb.
    ''' (这个干扰动作的开始时间)
    ''' </summary>
    ''' <remarks></remarks>
    <Xml.Serialization.XmlAttribute> Public Start As Double
    ''' <summary>
    ''' The interval ticks between each kick.
    ''' (每次干扰动作执行的时间间隔)
    ''' </summary>
    ''' <remarks></remarks>
    <Xml.Serialization.XmlAttribute> Public Interval As Double
    ''' <summary>
    ''' The counts of the kicks.
    ''' (执行的次数)
    ''' </summary>
    ''' <remarks></remarks>
    <Xml.Serialization.XmlAttribute> Public Kicks As Integer
    <Xml.Serialization.XmlAttribute> Public DisturbType As Types
    <Xml.Serialization.XmlAttribute> Public Value As Double
End Class 
What a disturb object have done to thewhole system in the system run time? The disturbing modifies the substrateconcentration directly on a specific time as the substrate concentration can representthe state of the whole system. And the disturbing have 3 type of modifymethods:

VB.NET
Public Shared ReadOnly Methods As System.Func(Of Double, Double, Double)() = {
    Function(Var As Double, Delta As Double) Var + Delta,
    Function(Var As Double, Delta As Double) Var - Delta,
    Function(Var As Double, Delta As Double) Delta}

    Public Enum Types
        Increase
        Decrease
        ChangeTo
    End Enum   

The disturbing was managed by the Kicks object, it gets two list objects, one is for the disturbing is not running and another is for running disturbing object.

VB.NET
''' <summary>
''' Standing by
''' </summary>
''' <remarks></remarks>
<Xml.Serialization.XmlIgnore> Friend PendingKicks As List(Of Disturb)
''' <summary>
''' Active object.
''' </summary>
''' <remarks></remarks>
<Xml.Serialization.XmlIgnore> Friend RunningKicks As List(Of Disturb)


Built up the system module

Script define format: Declaration syntax are all write in the format like <Keyword> Expression, comment line are those line first word is not in keyword list:

Keyword

Information

Syntax

Example

RXN

Declare a s-equation

RXN <Substrate>=<Expression>

RXN X1=10*X5-5*X1^0.5

FUNC

Declare a function

FUNC <Name> <Expression>

FUNC f x+y^2

INIT

Set the initial concentration value of a substrate

INIT <Substrate>=<value>

INIT X1=2

CONST

Declare a constant

CONST <Name> <value>

CONST beta1 30

NAMED

Set the displaying title of a substrate, optionally

NAMED <Substrate> <Title>

NAMED X1 ATP

STIMULATE

Set up a disturb in the system

STIMULATE OBJECT <Substrate> START <Start Time> KICKS <Kicks count> INTERVAL <Kicks Interval> VALUE <[Type]value>

STIMULATE OBJECT X3 START 45 KICKS 30 INTERVAL 7 VALUE 5

FINALTIME

Set up the running time of the system

FINALTIME <value>

FINALTIME 10

TITLE, COMMENT

Model property define, optionally

<Keyword> <VALUE>

TITLE EXAMPLE SCRIPT

Here I define a Script class to compile the script file to an available model to the kernel:

Image 2

And here is an example script file:

C#
/* This is a example script */
/* S-Equations */
RXN X1=10*X5*X3^-0.8-5*X1^0.5
RXN X2=5*X1^0.5-10*X2^0.5
RXN X3=2*X2^0.5-1.25*X3^0.5
RXN X4=8*X2^0.5-5*X4^0.5
// FUNC f 0.2*x+y^0.8
/* Substrate initial value */
// ATP comments ......
INIT X1=2
INIT X2=0.25
INIT X3=0.64
// BLA BLA BLA COMMENTS ......
INIT X4=0.64
INIT X5=0.5
/* Stimulates */
/* Kick  specific
times with add a value */
// Example comment 1
STIMULATE OBJECT X1 START 4 KICKS 1 INTERVAL 1 VALUE 0
// Example comment 2
/* STIMULATE OBJECT X2 START 5 KICKS -1 INTERVAL 2.5
VALUE --2 */
// NULL
/* STIMULATE OBJECT X3 START 45 KICKS 30 INTERVAL 7 VALUE
5 */
NAMED X1 ATP
NAMED X2 G6P
TITLE EXAMPLE SCRIPT
COMMENT NO COMMENT
FINALTIME 10 

This script modeling such a branch and feedback pathway:

Image 3

Design the simulator kernel

The kernel class is tiny and easily to extend, an array to hold the system state and an expression array to alter the system state, a module to action the disturbing and a module to collecting the output result, those 4 component were almost consist a complete kernel.

VB.NET
''' <summary>
''' The simulation system kernel.
''' </summary>
''' <remarks></remarks>
Public Class Kernel
    ''' <summary>
    ''' The current ticks since from the start of running.
    ''' (从运行开始后到当前的时间中所流逝的内核循环次数)
    ''' </summary>
    ''' <remarks></remarks>
    Friend RtTime As Double
    Friend Model As Model
    ''' <summary>
    ''' Data collecting
    ''' </summary>
    ''' <remarks></remarks>
    Dim DataAcquisition As New DataAcquisition
    ''' <summary>
    ''' Object that action the disturbing
    ''' </summary>
    ''' <remarks></remarks>
    Public Kicks As Kicks
    ''' <summary>
    ''' Store the system state.
    ''' </summary>
    ''' <remarks></remarks>
    Public Vars As Var()
    ''' <summary>
    ''' Alter the system state.
    ''' </summary>
    ''' <remarks></remarks>
    Public Channels As Equation()   

The kernel gets a method named Run to action the biochemical system simulation: accumulate the system running time from the start zero to the final time and the iteration calculation at each loop. And yes, actually the state change of the whole system is the iteration between each loop.

VB.NET
Public Sub Run()
    For Me.RtTime = 0 To Model.FinalTime Step 0.1
        Call Tick()
    Next
End Sub

''' <summary>
''' The kernel loop.(内核循环)
''' </summary>
''' <remarks></remarks>
Public Sub Tick()
    Dim Query As Generic.IEnumerable(Of Boolean) = From e As Equation In Channels Select e.Elapsed '
    Call DataAcquisition.Tick()
    Call Kicks.Tick()
    Query = Query.ToArray
End Sub  

In fact, you also can get a parallel calculation edition of this kernel just needs simply modify the LINQ query statement as:

VB.NET
Dim Query As Generic.IEnumerable(Of Boolean) = From e As Equation In Channels.AsParallel Select e.Elapsed '  

But the parallel is not just always make the things more efficient, sometimes it may slow down the system running because the system needs to deal with more things. Or the parallel sometimes maybe make the calculation data not so good.

Output the simulation result

The kernel gets a data acquisition class object to collecting the simulation result and using the CSV file to store the data. And you can get the csv wrapper class in the namespace <span lang="EN-US">Microsoft.VisualBasic.DataVisualization.Csv.</span><span lang="EN-US">File</span>

VB.NET
Public Class DataAcquisition 
    Dim _dataPackage As New Microsoft.VisualBasic.DataVisualization.Csv.File
    Dim Kernel As Kernel

    Public Sub Tick()
        Dim Row As New Microsoft.VisualBasic.DataVisualization.Csv.File.Row
        Call Row.Add(Kernel.RtTime)
        For Each Var As Var In Kernel.Vars
            Row.Add(Var.Value)
        Next
        _dataPackage.AppendLine(Row)
    End Sub

    Public Sub Save(Path As String)
        Call _dataPackage.Save(Path)
    End Sub

    Public Sub [Set](Kernel As Kernel)
        Dim Row As New Microsoft.VisualBasic.DataVisualization.Csv.File.Row
        Me.Kernel = Kernel
        Call Row.Add("Elapsed Time")
        For Each Var As Var In Kernel.Vars
            If Not String.IsNullOrEmpty(Var.Title) Then
                Call Row.Add(String.Format("""{0}""", Var.Title))
            Else
                Call Row.Add(Var.Name)
            End If
        Next
        Call _dataPackage.AppendLine(Row)
    End Sub

    Public Shared Function [Get](e As DataAcquisition) As Microsoft.VisualBasic.DataVisualization.Csv.File
        Return e._dataPackage
    End Function
End Class 

Testing

Here I am using an example system to test the program: Hull et al benchmarks / nonstiff systems.

Here is how the system defines:

ASM
RXN z1=2*(z1-z1*z2)
RXN z2=-1*(z2-z1*z2)
INIT z1=2
INIT z2=3
FINALTIME 20 

Put this model into a text file, and then using the code to run this system:

VB.NET
Sub Main()
    Dim CSV = Kernel.Run(Script.Compile("./Hull.txt"))
    CSV.Save("/home/xieguigang/Documents/Hull.csv")
End Sub  

And then we open the CSV data file that we get from the calculation, open in KingSoft Office or OpenOffice, using the data to draw a scatter graph.

Image 4

From the simulation result that we can see, the concentration of two substrate z1 and z2 were change in period time, and it get a nice period.

Points of Interest

The s-system was using a linear equation collection to simulate a non-linear complexes system. Although the complexes system was simplified by us, but we can still find some interesting attribute through the simulation: The complexes system is sensitive to the initial value of each component.

As we can change the initial value of the component z1 in the hull system, and then run the modified model to see what we get:

Image 5

Z1=1

Image 6

Z1=0

Image 7

Z1=-1

From the change of the z1 initialize value we can see that the system start to lose the ability of change the component value in period time. And the most interesting thing is that when the z1=1, the system has the attribute of period at the first time, but the thing goes crazy at later: the continuing system calculation error accumulation finally change the whole system behavior. And we also know this phenomenon its famous name: The butterfly effect, the accumulated system error will finally change the whole system behavior.

Does the biochemical system do the same thing?

Although the living system is also a complexes system, but the life system gets the ability to restore itself from the disruption (We make the modifications of the system initial value that is a disruption to the normal state), if the disruption is not so deadly. The systems biology says that, the living system gets this property: Emergent, Robustness, Complexity and Modularity. So from these points of view, we could conclude that the biological living system gets a lot of system error if we doing the calculation, and the biological system is not sensitive to the precise values of biochemical parameters because of its robustness, it can adjust itself from the error, but the mathematical equation do not. Why the biological system not so sensitive, because it have found the way to restore itself from the system error or to avoid the system crash, the regulation network, which is really complexity. The regulation network is a real complexes system and the mathematical equation is a fake one, that is the reason why.

From this point of view that means the equations is not enough to simulate a complex system. Although the mathematical equation can gives the numeric result to us, but this numeric result just the calculation result of a simplified complexes system.

So, based on the foundation work of E.O.Voit and the concept & work mechanism of the S-system, we create a upgrade version of the PLAS system, call it Object-S, not only the mathematical equations in this system, and also the bio-macromolecule object, interaction relationship and cell event integrated in this model. And I call this new systems biology modeling project as the Genetic Clock Initiative project (GCI) as the MICROSFT also have a bioinformatics research project for .NET named Microsoft Biology Initiative (MBI) or Microsoft Biology Foundation (MBF).

This program was developing on Ubuntu 13.04(mono 2.1) and successfully debug and test on the Ubuntu LINUX and Windows 8 Enterprises.

Image 8

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)