Download Links
The entire source code of the Shoal Shell language can be downloaded from the SourceForge SVN server:
svn checkout svn://svn.code.sf.net/p/shoal/Source/ shoal-Source
Testing example source code in this article:
Related links about Shoal Shell and hybrids programming:
Introduction
Hybrid scripting between VB and R is painful when you need to read the calculation results of an R expression, so I wanted to develop a simple wrapper that does this data conversion automatically.
In my recent laboratory research, I wanted to analyse the gene expression regulation signal in real-time gene chip data from a virtual cell. The R wavelets library handles this analysis well, and the code in this article makes the hybrid programming around it simple.
The code example in this article uses VB/R hybrid programming to run a wavelets analysis that shows how the gene expression regulation signal changes in a bacterial genome.
Picture 1. Overview of the steps in VB/C#/R hybrid programming
Using the Code
Quote:
Overview of the steps in hybrid programming:
1. Create mapping between the .NET class object property and the S4Object attribute
2. R expression evaluation
3. Serialize the R symbolic expression into a .NET object instance.
That's it: just three simple steps for hybrid programming between VB/C# and R. Let's go through them one by one.
1. Create Mapping Between the .NET Class Object Property and the S4Object Attribute
This step creates the schema mapping between the R object and your .NET object, much as in XML serialization: before you can create an XML document with XML serialization, you must define a class that describes the document's XML format, and only after that type definition can you create the document.
This step works the same way. The only difference from XML serialization is that R object serialization uses a different custom attribute.
Before we create the mapping, let’s learn the types in R language:
In my opinion, R objects can be divided into three types:
- S4Object: the s4object is like a class object in a .NET language. A property of a .NET object corresponds to an attribute (or slot) of the s4object in R. The main purpose of this article's code is to implement the mapping between a .NET class object and an R s4object.
- Function: a function object in R is like a lambda expression or delegate in a .NET language, and a function declaration in R looks much like a lambda expression declaration in .NET.
- Generic vector: the generic vector is the most used object in R, because almost all objects in R are vectors. Like an array or list in .NET, a vector can be a property (or attribute) of an s4object, and it can also consist of a collection of s4object instances.
As you can see, a class object in .NET corresponds to an s4object in R, so the mapping created in this step is defined on the class properties. The mapping between an s4object attribute and a .NET class property uses the DataFrameColumnAttribute, which lives in the namespace Microsoft.VisualBasic.ComponentModel.DataSourceModel. As the class definition of the custom attribute DataFrameColumnAttribute shows, it can only be applied to a property or a field:
Namespace ComponentModel.DataSourceModel
<AttributeUsage(AttributeTargets.[Property] Or AttributeTargets.Field, Inherited:=True,
AllowMultiple:=False)> _
Public Class DataFrameColumnAttribute : Inherits Attribute
Here is example code that creates the mapping using this attribute:
Imports Microsoft.VisualBasic.ComponentModel.DataSourceModel
Public Class Filter
<DataFrameColumn> Public Property L As Integer
<DataFrameColumn("level")> Public Property level As Integer
<DataFrameColumn("h")> Public Property h As Double()
<DataFrameColumn("g")> Public Property g As Double()
<DataFrameColumn("wt.class")> Public Property wtclass As String
<DataFrameColumn("wt.name")> Public Property wtname As String
<DataFrameColumn("transform")> Public Property transform As String
<DataFrameColumn("class")> Public Property [class] As String
End Class
Look at the first property:
<DataFrameColumn> Public Property L As Integer
This mapping has no column name, so the serializer automatically uses the property name as the mapping name.
The mapping needs a name property because some attribute names in an R s4object are illegal in .NET languages. For example, wt.class is not allowed as a .NET property name, so you can use the DataFrameColumn mapping attribute to supply the R name.
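To make the naming rule concrete, here is a minimal sketch of how such a mapping could be collected by reflection. It does not show the real LoadMapping implementation: ColumnNameAttribute, FilterSketch and LoadMappingSketch are hypothetical stand-ins introduced only so that the example compiles on its own.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Reflection

' Hypothetical stand-in for DataFrameColumnAttribute (not the library type).
<AttributeUsage(AttributeTargets.[Property] Or AttributeTargets.Field, Inherited:=True, AllowMultiple:=False)>
Public Class ColumnNameAttribute : Inherits Attribute
    Public Property Name As String

    ' No explicit name: the property name will be used as the slot name.
    Public Sub New()
        Me.Name = ""
    End Sub

    ' Explicit name, for R slot names that are illegal as .NET identifiers.
    Public Sub New(columnName As String)
        Me.Name = columnName
    End Sub
End Class

' Hypothetical mapping class with one default and one explicit name.
Public Class FilterSketch
    <ColumnName> Public Property L As Integer
    <ColumnName("wt.class")> Public Property wtclass As String
End Class

Public Module MappingSketch
    ' Builds a table of R slot name -> .NET property for every marked property.
    Public Function LoadMappingSketch(type As Type) As Dictionary(Of String, PropertyInfo)
        Dim mappings As New Dictionary(Of String, PropertyInfo)
        For Each prop As PropertyInfo In type.GetProperties(BindingFlags.Public Or BindingFlags.Instance)
            Dim attr = TryCast(Attribute.GetCustomAttribute(prop, GetType(ColumnNameAttribute)), ColumnNameAttribute)
            If attr Is Nothing Then Continue For ' unmarked properties are ignored
            Dim slotName As String = If(String.IsNullOrEmpty(attr.Name), prop.Name, attr.Name)
            mappings(slotName) = prop
        Next
        Return mappings
    End Function
End Module
```

Calling LoadMappingSketch(GetType(FilterSketch)) yields the keys "L" and "wt.class", the second one pointing at the wtclass property.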
2. R Expression Evaluation
We get results from R using RDotNET, a library that lets us implement hybrid programming between VB/C# and R.
You can download the RDotNET library from its CodePlex home page:
Hybrid programming between a .NET language and R takes just two simple steps:
First, start the R engine services, for example:
If Not String.IsNullOrEmpty(R_HOME) Then
Wavelets.R = RDotNET.REngine.StartEngineServices(R_HOME)
Else
Wavelets.R = RDotNET.REngine.StartEngineServices
End If
Call Wavelets.R.Library(PackageName:="wavelets")
Starting the R engine services needs an R_HOME value, which is the directory where your R program is installed, such as the default location used by the R installer:
C:\Program Files\R\R-3.1.3\bin
If R is properly installed on your computer, RDotNET can find R_HOME automatically from the R registry value, and you can just use the parameterless version of RDotNET.REngine.StartEngineServices to create an instance. If not, use RDotNET.REngine.StartEngineServices(R_HOME) to set the R install location manually.
After you have created an R engine services instance using RDotNET, you can call R from your .NET program. Note that many analysis programs in R are not included in the base package, so before you run your program you should install the required R package from the R terminal. Once the package is installed successfully, you can use the Library function of the REngine to load it:
Call Wavelets.R.Library(PackageName:="wavelets")
Or you can put this step in the script itself:
Dim STDOUT = Wavelets.R <= "library(""wavelets"")"
Then you can invoke the R calculation using the R.Evaluate function, which returns an RDotNET symbolic expression object that exposes the R memory to your .NET program. The <= operator in RDotNET behaves differently: it returns the STDOUT string collection that was displayed on the terminal console.
3. Serialize the R Symbolic Expression into a .NET Object Instance
In this step, we serialize an RDotNET symbolic expression into a .NET object with just one statement, which keeps hybrid programming with R simple :-).
Assuming you have created the mapping class in your program and obtained a result value from the R evaluation, you can do the serialization as shown below:
Dim Result = RDotNET.Extensions.ShellScriptAPI.Serialization.LoadFromStream(Of Wavelets.Waveletmodwt)(TestResultRS4Object)
How Does This Code Work?
This serialization operation can be found in the namespace RDotNET.Extensions.ShellScriptAPI.Serialization, which exposes two interfaces for invoking it:
Imports RDotNET.SymbolicExpressionExtension
Public Module Serialization
Public Function LoadFromStream(Of T As Class)(RData As RDotNET.SymbolicExpression) As T
Dim value As Object = InternalLoadFromStream(RData, GetType(T))
Return DirectCast(value, T)
End Function
Public Function LoadRStream(RData As RDotNET.SymbolicExpression, Type As Type) As Object
Dim value As Object = InternalLoadFromStream(RData, Type)
Return value
End Function
Since an s4object in R may have a vector among its attributes, and the elements of that vector may themselves be s4object instances, serializing an s4object is a recursive operation. The recursion starts from this function:
Private Function InternalLoadFromStream(RData As RDotNET.SymbolicExpression, _
TypeInfo As System.Type) As Object
Select Case RData.Type
Case Internals.SymbolicExpressionType.S4
Return InternalLoadS4Object(RData, TypeInfo)
Case Internals.SymbolicExpressionType.LogicalVector
Return RData.AsLogical.ToArray
Case Internals.SymbolicExpressionType.CharacterVector
Return RData.AsCharacter.ToArray
Case Internals.SymbolicExpressionType.IntegerVector
Return RData.AsInteger.ToArray
Case Internals.SymbolicExpressionType.NumericVector
Return RData.AsNumeric.ToArray
Case Internals.SymbolicExpressionType.List
Return InternalCreateMatrix(RData, TypeInfo)
Case Else
Throw New NotImplementedException
End Select
End Function
As you can see in this function, if the R object is an s4object, the program continues the operation recursively; if the object is an elementary type, the function exits the recursion and returns the value. This serializer only reads the simple .NET data types Boolean, String, Integer, Double and Object(). Other data types, such as a function in R (a lambda expression in .NET), are not handled by this function (the Case Else branch throws a NotImplementedException), because we don't know how to save such data to the filesystem.
Next comes the recursive step, taken when the object we are mapping is an s4object in R:
Case Internals.SymbolicExpressionType.S4
Return InternalLoadS4Object(RData, TypeInfo)
Private Function InternalLoadS4Object(RData As RDotNET.SymbolicExpression, _
TypeInfo As System.Type) As Object
Dim Mappings = Microsoft.VisualBasic.ComponentModel.DataSourceModel.DataFrameColumnAttribute.LoadMapping(TypeInfo)
Dim obj As Object = Activator.CreateInstance(TypeInfo)
Call Console.WriteLine("[DEBUG] {0} ---> R.S4Object (""{1}"")", _
TypeInfo.FullName, String.Join("; ", RData.GetAttributeNames))
For Each Slot In Mappings
Dim RSlot As RDotNET.SymbolicExpression = RData.GetAttribute(Slot.Key.Name)
Dim value As Object = InternalLoadFromStream(RSlot, Slot.Value.PropertyType)
Call InternalValueMapping(value, Slot.Value, obj:=obj)
Next
Return obj
End Function
First, we load the mapping in this step using:
Dim Mappings = Microsoft.VisualBasic.ComponentModel.DataSourceModel.DataFrameColumnAttribute.LoadMapping(TypeInfo)
Then, we create an object instance of the target mapping type to contain the data.
Dim obj As Object = Activator.CreateInstance(TypeInfo)
Since an attribute in an S4Object corresponds to a .NET class property, once we have loaded the mapping from the metadata in the schema definition of the target type, we can load the data from the R expression for each property in our class. The For loop performs these steps:
- Get the specific attribute of the S4Object as the data source for the mapping:
Dim RSlot As RDotNET.SymbolicExpression = RData.GetAttribute(Slot.Key.Name)
- Then we are able to continue deserialization of the R expression recursively:
Dim value As Object = InternalLoadFromStream(RSlot, Slot.Value.PropertyType)
- At last, we get the value in .NET format, so that we can assign the value to the property using the reflection operation.
Call InternalValueMapping(value, Slot.Value, obj:=obj)
The matrix value cannot be directly assigned using reflection.
As you can see in the previous steps, the value obtained from the serialization mapping is not assigned directly to the property; a function does this job instead. The reason is that a matrix in R is mapped as an array of object arrays, so the matrix we get from R is in fact an Object array. (Any array can be treated as an Object, because every data type in .NET inherits from Object.) The .NET program therefore sees the R matrix as an Object array rather than an array of arrays of a specific type, and directly assigning the matrix value to the property makes the program crash!
Picture 2. How an R matrix is converted to an object array
In the end, we get an Object() whose elements are of type Double(), not the Double()() matrix type we want, and this causes the exception. So we use the following function:
Private Function InternalValueMapping(value As Object, _
pInfo As System.Reflection.PropertyInfo, ByRef obj As Object) As Boolean
Dim pTypeInfo As System.Type = pInfo.PropertyType
If pTypeInfo.HasElementType Then
Call InternalMappingCollectionType(value, pInfo, obj, pTypeInfo)
Else
Call InternalRVectorToNETProperty(pTypeInfo:=value.GetType, value:=value, obj:=obj, pInfo:=pInfo)
End If
Return True
End Function
This function helps us correctly convert the vector or matrix type into a proper .NET array type.
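The failure described above can be reproduced without R. The sketch below uses a hypothetical MatrixHolder class and builds the Object() by hand; it only illustrates why reflection rejects the value and is not taken from the article's project.

```vbnet
Imports System
Imports System.Reflection

' Hypothetical class with a matrix-typed property, standing in for a mapping class.
Public Class MatrixHolder
    Public Property W As Double()()
End Class

Public Module AssignFailureSketch
    ' Returns True when reflection refuses to assign an Object() to a Double()() property.
    Public Function DirectAssignFails() As Boolean
        Dim holder As New MatrixHolder
        Dim pInfo As PropertyInfo = GetType(MatrixHolder).GetProperty("W")
        ' Every element is a Double(), but the outer array itself is typed Object().
        Dim rows As Object() = {New Double() {1.0, 2.0}, New Double() {3.0, 4.0}}
        Try
            pInfo.SetValue(holder, rows, Nothing)
            Return False
        Catch ex As ArgumentException
            ' The runtime cannot convert System.Object[] to System.Double[][].
            Return True
        End Try
    End Function
End Module
```

The element values are all valid rows, yet the assignment is rejected because the check is made on the type of the outer array, not on its contents.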
Almost every R data type is a vector, even when the property in our .NET class is a single element such as String/Integer/Double rather than a vector String()/Integer()/Double(). When the reflected type of the property is a single element, we just convert the R data to an array and take the first element. When the property type is an array, we assign the converted R value to it directly. Both cases work fine!
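The scalar-versus-array rule can be sketched as follows. AssignVector and FilterScalarSketch are hypothetical names used only for this illustration, and the sketch tests IsArray where the article's own code tests HasElementType.

```vbnet
Imports System
Imports System.Reflection

' Hypothetical mapping class with one scalar and one vector property.
Public Class FilterScalarSketch
    Public Property level As Integer
    Public Property h As Double()
End Class

Public Module ScalarSketch
    ' R hands back vectors, so the incoming value is always an array.
    ' Array-typed properties receive the whole array; scalar properties
    ' receive only its first element.
    Public Sub AssignVector(target As Object, pInfo As PropertyInfo, vector As Array)
        If pInfo.PropertyType.IsArray Then
            pInfo.SetValue(target, vector, Nothing)
        ElseIf vector.Length > 0 Then
            pInfo.SetValue(target, vector.GetValue(0), Nothing)
        End If
    End Sub
End Module
```

For example, passing New Integer() {3} for the level property sets it to 3, while passing a Double() for h stores the array unchanged.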
Convert the object array into a specific type matrix using this function:
Private Sub InternalMappingCollectionType(value As Object, _
pInfo As System.Reflection.PropertyInfo, ByRef obj As Object, pTypeInfo As System.Type)
Dim EleTypeInfo As Type = pTypeInfo.GetElementType
Dim SourceList = (From val As Object In DirectCast(value, System.Collections.IEnumerable) _
Select val).ToArray
Dim List = Array.CreateInstance(EleTypeInfo, SourceList.Count)
For i As Integer = 0 To SourceList.Count - 1
Call List.SetValue(SourceList(i), i)
Next
Call pInfo.SetValue(obj, List)
End Sub
We can use Array.CreateInstance in this reflection function to create an array of a specific type. Before creating the array we need to know its element type, which can be obtained by reflection from the property type:
Dim EleTypeInfo As Type = pTypeInfo.GetElementType
Since we already know that the R converted data is a matrix, we directly convert it into an array data:
Dim SourceList = (From val As Object In DirectCast(value, System.Collections.IEnumerable) Select val).ToArray
Now we know the two things needed to create an array, its element type and its element count (the array size):
Dim List = Array.CreateInstance(EleTypeInfo, SourceList.Count)
After using List.SetValue to assign the element value at each position in the array, we have an array-of-arrays matrix in the .NET program. Finally, we can assign this converted matrix value to the target property:
Call pInfo.SetValue(obj, List)
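The conversion can also be tried in isolation. The following is a small sketch of the same idea using only standard .NET calls; ToTypedArray is a hypothetical helper name, not part of the article's library.

```vbnet
Imports System
Imports System.Collections
Imports System.Collections.Generic

Public Module TypedArraySketch
    ' Copies the items of any sequence into a new array whose element type
    ' is only known at run time (for example Double() for a matrix row).
    Public Function ToTypedArray(source As IEnumerable, elementType As Type) As Array
        Dim items As New List(Of Object)
        For Each item As Object In source
            items.Add(item)
        Next
        ' Element type plus element count is all Array.CreateInstance needs.
        Dim result As Array = Array.CreateInstance(elementType, items.Count)
        For i As Integer = 0 To items.Count - 1
            result.SetValue(items(i), i)
        Next
        Return result
    End Function
End Module
```

Given an Object() whose elements are Double() rows and the element type GetType(Double()), the returned array is a real Double()() and can be assigned to a matrix property through PropertyInfo.SetValue.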
A Simple Code Testing Example
The test project shows how to do this hybrid programming easily. There are two modules in the test project:
Quote:
Module Wavelets: defines the required R functions and the R object mapping types used to read the wavelets calculation result from the R invocation
Module Program: contains the example test code
Important Note
Before you run this code, R must be properly installed on your computer, and the required wavelets R library must be installed in your R system.
1. The Simplest VB/C#/R Hybrid Programming Example
Dim ChipData = (From row As Microsoft.VisualBasic.DataVisualization.DocumentFormat.Csv.File.RowObject
In Microsoft.VisualBasic.DataVisualization.DocumentFormat.Csv.File.FastLoad("../DM_1184.GeneChipDataSamples.csv")
Select ID = row.First, ExpressionData0 = (From s As String In row.Skip(1) _
Select Val(s)).ToArray).ToArray
Call Wavelets.Initialize()
Dim TestResultRS4Object = Wavelets.DWT_RInvoke(ChipData.First.ExpressionData0, filter:="haar")
Dim Result = RDotNET.Extensions.ShellScriptAPI.Serialization.LoadFromStream(Of Wavelets.Waveletmodwt)(TestResultRS4Object)
Call Result.GetXml.SaveTo("./Test.Result.xml")
The program code follows the typical steps of R hybrid programming:
- Initialize the R engine services and load the required library in function:
Call Wavelets.Initialize()
- Then invoke the R function to get an RDotNET symbolic expression:
Dim TestResultRS4Object = Wavelets.DWT_RInvoke(ChipData.First.ExpressionData0, filter:="haar")
- At last, we get the result in the .NET class format through the serialization:
Dim Result = RDotNET.Extensions.ShellScriptAPI.Serialization.LoadFromStream(Of Wavelets.Waveletmodwt)(TestResultRS4Object)
Invoking the wavelets signal analysis takes just three simple steps of coding, right? :)
2. Hybrid Scripting With the ShoalShell Language
The Shoal Shell language is a new embedded scripting language for .NET, originally developed for my virtual cell system. It supports hybrid scripting with R/Perl/SQL/LINQ; so far, I have only released the R hybrid scripting API for the Shoal Shell.
This example shows how to do hybrid scripting with Shoal/R and your .NET program:
Dim ShoalShell As Microsoft.VisualBasic.Scripting.ShoalShell.Runtime.Objects.ShellScript = _
New Scripting.ShoalShell.Runtime.Objects.ShellScript()
Call ShoalShell.InstallModules(GetType(RDotNET.Extensions.ShellScriptAPI.Serialization).Assembly.Location)
Call ShoalShell.InstallModules(GetType(Wavelets).Assembly.Location)
Call ShoalShell.InstallModules(GetType(ShoalShell.PlugIns.Plot_Devices.DataSource).Assembly.Location)
Call ShoalShell.TypeLibraryRegistry.Save()
Dim Script As String =
<ShoalShell-Script>
imports wavelets
imports r.net
imports io_device.csv
imports system
chipdata < (imports.csv) ../DM_1184.GeneChipDataSamples.csv
chipdata <- $chipdata -> as.datasource
chipdata <= $chipdata [0]
chipdata <- $chipdata -> get.X
s4obj <- $chipdata -> dwt.r.invoke filter haar n.levels 5
result.type <- wavelets result.type.schema
result <- ctype r.data $s4obj cast.type $result.type
call $result > ./Test.Result.ShoalInvoke.xml
return $result
</ShoalShell-Script>
Dim bResult = ShoalShell <= Script
MsgBox(DirectCast(bResult, Wavelets.Waveletmodwt).GetXml, MsgBoxStyle.Information)
First, we instantiate a shoal shell scripting host in our code and then install the required module DLL file:
Dim ShoalShell As Microsoft.VisualBasic.Scripting.ShoalShell.Runtime.Objects.ShellScript = _
New Scripting.ShoalShell.Runtime.Objects.ShellScript()
To install an external dynamic API module DLL file, you can use:
Call ShoalShell.InstallModules("<DLL_filepath>")
For example:
Call ShoalShell.InstallModules(GetType(RDotNET.Extensions.ShellScriptAPI.Serialization).Assembly.Location)
Then we start to script and get the return result from:
# Shoal shell statement
return $result
Dim bResult = ShoalShell <= Script
3. Dynamics Programming With Shoal Shell
The Shoal Shell also offers a dynamic programming feature (late-bound calls through a Dynamics object) for your .NET program:
Dim Dynamics As Object = New Microsoft.VisualBasic.Scripting.ShoalShell.Runtime.Objects.Dynamics(ShoalShell)
Dim ChipDataDy = Dynamics.Imports.Csv("../DM_1184.GeneChipDataSamples.csv")
ChipDataDy = Dynamics.As.DataSource(ChipDataDy)
ChipDataDy = ChipDataDy(0)
ChipDataDy = Dynamics.Get.X(ChipDataDy)
Dim s4obj = Dynamics.dwt.r.invoke(ChipDataDy)
Result = DirectCast(Dynamics.CType(s4obj, GetType(Wavelets.Waveletmodwt)), Wavelets.Waveletmodwt)
Result.GetXml.SaveTo("./Test.Result.ShoalInvoke.Dynamics.Programming.xml")
MsgBox(DirectCast(Result, Wavelets.Waveletmodwt).GetXml, MsgBoxStyle.Information)
As you can see, the dynamic code shown above is the VB translation of the Shoal Shell script! Things are amazing!