Sharp as C

George Shagov

13 Jul 2005

This article represents a general architectural and design approach to application development.

Introduction

1:18 For in much wisdom is much grief:
and he that increaseth knowledge increaseth sorrow.

KJV - Ecclesiastes

In the beginning was ... a word. And the word was ... an algorithm!? Or should I say al-khwarizm? What does Wikipedia say about the term algorithm?

Citing: "An algorithm (the word is derived from the name of the Persian mathematician Al-Khwarizmi), is a finite set of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a corresponding recognizable end-state".

Al-Khwarizmi? Citing: "Abu Abdullah Muhammad bin Musa al-Khwarizmi, was a Persian scientist, mathematician, astronomer/astrologer, and author. He was probably born in 780, or around 800; and probably died in 845, or around 840."

1200 years!

What is this article about?

And how should you read it? This article presents my own point of view on the general approach to software development and its architecture. Out of respect for your time, I am going to skip the thinking that led me here, however short or long the road was, and offer you just the final conclusions. Some of these conclusions might seem ... strange, but they are what I think, my personal opinion. In this article I am doing nothing but expressing my own point of view; whether it makes any sense to you, whether you see any useful ideas here, or whether you think it is all absolutely useless, is for you to decide.

The plan of the article is pretty simple. In the "How it should be" section, I describe the general idea. If you find it interesting, go on to the section "It is possible", where you will find the details of the implementation. Everything else is no more than the pros and cons of the approach proposed in "How it should be". If the idea described there seems pointless to you, you can save your time and read no further.

How it should be

"...since brevity is the soul of wit, And tediousness the limbs and outward flourishes, I will be brief: your noble son is mad:"

POLONIUS, Hamlet, Prince of Denmark, W. Shakespeare

Is it possible to draw the architecture of an application in general? Somebody might say it depends on the business. IMHO: it should look like what is shown in Figure 1.

Figure 1

Business logic is to be written in script in order to be as plain as possible. The business entities are whatever your business needs: collections (vectors, maps, sets, etc.), a logging system, DB vendors (MSSQL, Oracle, Sybase, etc.), IPC (DCOM, RPC, sockets, pipes), a threading system (POSIX threads might be a good example here) and so on. All these entities should expose some kind of plain interface, which is basically getters and setters. These entities should be as simple in logic as possible; in general I would say they should export either data or simple functionality. The business logic is to be written in some scripting language and to be absolute, which also means portable. If some entity changes (you are switching from MSSQL to Oracle, for instance), the logic should not change, in the perfect case. What I'm trying to say here is that business logic and entities are to be kept separate. Let us see a classical sample:

#include <stdio.h>

int main()
{
  printf("Hello world.\n");
  return 0;
}

In this sample, the business logic is represented by a C script (in general this is a script, since we have no idea how we are going to start it up). And there is only one business entity: the C library (libc or msvcrt, for instance), exposing plain "exported" C functions (printf in our case). See Figure 2.

Figure 2
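To make the separation a bit more tangible, here is a minimal sketch of one entity with a plain getter/setter interface and a piece of business logic that never learns which backend it is talking to. All names in it (CounterStore, make_mem_store, make_offset_store, increment_twice) are hypothetical and exist only for this illustration; the "offset" backend merely stands in for a second vendor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical business entity: exposes nothing but a getter and a setter. */
typedef struct CounterStore {
    int  (*get)(void *state);
    void (*set)(void *state, int value);
    void *state;
} CounterStore;

/* Backend 1: the value lives in a plain int. */
static int  mem_get(void *state)            { return *(int *)state; }
static void mem_set(void *state, int value) { *(int *)state = value; }

/* Backend 2: the value is stored with a fixed offset ("another vendor"). */
typedef struct OffsetState { int raw; int offset; } OffsetState;

static int off_get(void *state)
{
    OffsetState *s = (OffsetState *)state;
    return s->raw - s->offset;
}

static void off_set(void *state, int value)
{
    OffsetState *s = (OffsetState *)state;
    s->raw = value + s->offset;
}

CounterStore make_mem_store(int *cell)
{
    CounterStore s = { mem_get, mem_set, NULL };
    s.state = cell;
    return s;
}

CounterStore make_offset_store(OffsetState *st)
{
    CounterStore s = { off_get, off_set, NULL };
    s.state = st;
    return s;
}

/* Business logic: uses only the getter and the setter. */
int increment_twice(CounterStore *store)
{
    assert(store != NULL && store->get != NULL && store->set != NULL);
    store->set(store->state, store->get(store->state) + 1);
    store->set(store->state, store->get(store->state) + 1);
    return store->get(store->state);
}
```

Swapping make_mem_store for make_offset_store changes the entity but leaves increment_twice untouched, which is the property asked for above.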

This approach runs contrary to the traditional OOP approach to development. OOP puts an object and its functionality together; this approach does the opposite. It thereby keeps things clean and tries to save some time. I would dare to say that OOP has exhausted its resources and is ... dead, IMHO.

Now I'm saying (and these are also my IMHOs) that developing business entities is not as painful as developing business logic algorithms. Constructing business algorithms is a much more delicate, painful, and nerve-racking process, and it takes much more time and resources than anything else.

Therefore business logic is to be written in plain script, and this script is to be changeable at run-time, without any recompilation. My basic point is that business logic should not be "sacred ground", once-working-never-changed; that would be ridiculous. It should be "playable" whenever required, especially at the development/QA stages. This logic/script is to be changed at run-time, with no commits/check-ins, no rebuilding, no restarting, none of those annoying procedures; simply changing the script should immediately affect the running system. Let me guess: you say impossible, or, if it is possible, too complicated.

It is possible

And it is not so complicated. This section will show how it works. (The sample code is written for Microsoft Windows platform.)

This is an application tree:

├───c_dispatcher
├───Debug
├───frontend_app
├───include
├───my_script
├───my_script_c_proxy
└───my_script_d_proxy

The main (console) application is located inside the frontend_app folder. There is only one file there: frontend_app.cpp.

// frontend_app.cpp

// (c) George Shagov, 2005

#include <windows.h>

#include <string.h>

#include "..\\include\\my_structs.h"


typedef int (__cdecl *MYFARPROC)(int nArg, 
    char* pString, SMyStructure* pMyStruct);

int main(int argc, char* argv[])
{
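  HMODULE hMyScript = LoadLibrary("my_script_d_proxy.dll");
  if (hMyScript == NULL)
    return 1; /* the proxy library could not be loaded */

  MYFARPROC pProcSource = (MYFARPROC)GetProcAddress(hMyScript,
  "c__my_entry_point");
  if (pProcSource == NULL)
    return 1; /* the entry point is not exported */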

  SMyStructure myStruct;

  myStruct.m_nVal = 0;
  strcpy(myStruct.m_sString, "");
  /*
  * calling for entry point.
  * directly
  */
  char sMyString[32];
  strcpy(sMyString, "My string here.");
  pProcSource(argc, sMyString, &myStruct);

  return 0;
}

As you can see, it gets the address of the script's entry point and calls it. The script itself is in the my_script folder, in the file my_script.c_. There are some additional files there, my_script.gnrtd.c and my_script.gnrtd.h; these are generated from my_script.c_.

Here is the script:

// my_script.c_

// (c) George Shagov, 2005

/************************************************************************
*
* this file is automatically generated from my_script.c_
* do not modify it
*
************************************************************************/
#include <stdio.h>

#include <string.h>

#include "..\\include\\my_structs.h"

#include "my_script.gnrtd.h"


int c__get_value_1_impl(char* pString)
{
  return 1;
}

int c__get_value_2_impl(int nArg)
{
  return 2;
}

int c__call_in_case_varables_are_equal_impl(SMyStructure* pMyStruct)
{
  pMyStruct->m_nVal = 0;
  strcpy(pMyStruct->m_sString, "equal");
  return 0;
}

int c__call_in_case_varables_are_not_equal_impl(SMyStructure* pMyStruct)
{
  pMyStruct->m_nVal = 0;
  strcpy(pMyStruct->m_sString, "not equal");
  return 0;
}
int c__re_entry_impl(int nArg, char* pString, SMyStructure* pMyStruct)
{
  int nVar1 = c__get_value_1(pString);
  int nVar2 = c__get_value_2(nArg);

  if (nVar1 == nVar2)
  {
    c__call_in_case_varables_are_equal(pMyStruct);
  }
  else
  {
    c__call_in_case_varables_are_not_equal(pMyStruct);
  }

  return 11;
}

int c__my_entry_point_impl(int nArg, char* pString, 
                           SMyStructure* pMyStruct)
{
  int nRet;

  printf("-----------\nbefore:\n");
  printf("nArg: %d, string: %s\n", nArg, pString);
  printf("pMyStruct->m_nVal: %d, pMyStruct->m_sString: %s\n", 
          pMyStruct->m_nVal, pMyStruct->m_sString);

  nRet = c__re_entry(nArg, pString, pMyStruct);

  printf("++++++after:\n");
  printf("nArg: %d, string: %s\n", nArg, pString);
  printf("pMyStruct->m_nVal: %d, pMyStruct->m_sString: %s\n", 
          pMyStruct->m_nVal, pMyStruct->m_sString);
  printf("ret: %d\n-------------\n", nRet);

  return nRet;
}

c__my_entry_point_impl is the entry point to be called from frontend_app. my_script.gnrtd.c is a mere copy of the original script; my_script.gnrtd.h holds the declarations.

As you can see, frontend_app uses the my_script_d_proxy library in order to make a call to c__my_entry_point_impl. There are two files under the my_script_d_proxy folder, my_script_d_proxy.gnrtd.c and my_script_d_proxy.gnrtd.h; both are also generated from the original script (my_script.c_). my_script_d_proxy.gnrtd.c contains stubs for all the functions written in the script, like this:

int c__re_entry_stub(int nESP, int nArg, char* pString, SMyStructure*
pMyStruct)
{
  void* pArgs = 0;
  int nSize = 0;

  _asm
  {
     push eax; /* saving eax */
     mov eax, ebp; /* ebp points out at the parameters (as known) */
     add eax, 8; /* now eax points out at the first argument, which is nESP*/
     mov pArgs, eax;
     add pArgs, 4; /* since first argument is esp, but we need real argument here */
     mov eax, nESP;
     sub eax, pArgs; /* eax now has the physical size of the arguments on the stack, in bytes */
     shr eax, 2; /* eax/4 - eax now has the number of arguments put on the stack */
     mov nSize, eax; /* saving that size */
     pop eax; /* restoring eax */
  }

  return g_pDispatcherEntry("c__re_entry", pArgs, nSize);
}

int c__re_entry(int nArg, char* pString, SMyStructure* pMyStruct)
{
  int nESP;

  _asm
  {
    mov nESP, esp;
  }

  return c__re_entry_stub(nESP, nArg, pString, pMyStruct);
}

The assembler instructions record a pointer to the first argument on the stack and the number of arguments on the stack, and the stub then forwards the call to the c_dispatcher library, which exports the g__c_dispatcher_entry_point function (the stub above reaches it through the g_pDispatcherEntry function pointer).
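The inline assembler above depends on 32-bit x86, __cdecl and the Microsoft compiler. As a hedged alternative, and not what the demo actually does, the same "function name plus packed arguments" hand-off can be sketched in portable C by copying every argument into an array of pointer-sized slots. The names c__re_entry_packed, set_dispatcher and demo_dispatch are hypothetical, and the SMyStructure layout is an assumption, since my_structs.h is not shown in the article.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout; the real my_structs.h is not reproduced in the article. */
typedef struct SMyStructure {
    int  m_nVal;
    char m_sString[32];
} SMyStructure;

/* Hypothetical dispatcher signature: name, packed arguments, argument count. */
typedef int (*DispatchFn)(const char *name, const intptr_t *args, int count);

static DispatchFn g_dispatch = NULL;

void set_dispatcher(DispatchFn fn) { g_dispatch = fn; }

/* Portable counterpart of the stub: one pointer-sized slot per argument. */
int c__re_entry_packed(int nArg, char *pString, SMyStructure *pMyStruct)
{
    intptr_t args[3];

    assert(g_dispatch != NULL);
    args[0] = (intptr_t)nArg;
    args[1] = (intptr_t)pString;
    args[2] = (intptr_t)pMyStruct;
    return g_dispatch("c__re_entry", args, 3);
}

/* Demo receiver: unpacks the same three slots on the other side. */
int demo_dispatch(const char *name, const intptr_t *args, int count)
{
    SMyStructure *p;

    if (strcmp(name, "c__re_entry") != 0 || count != 3)
        return -1;
    p = (SMyStructure *)args[2];
    p->m_nVal = (int)args[0];
    snprintf(p->m_sString, sizeof p->m_sString, "%s", (const char *)args[1]);
    return 11;
}
```

The price of portability is that each stub has to know its own argument list, which is acceptable here because the stubs are generated from the script anyway.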

The code of c_dispatcher.cpp:

// c_dispatcher.cpp

// (c) George Shagov, 2005

#include <stdio.h>

#include <string.h>

#include <windows.h>

#include "c_dispatcher.h"


static HINSTANCE s_hCSource = NULL;
static HINSTANCE s_hProxy = NULL;
typedef int (__cdecl *MYFARPROC)();

MYFARPROC GetMyProcAddress(const char* pFunctionName)
{
  char pFile[128];
  char pFnName[128];

  sprintf(pFile, "my_script.%s_impl.c_", pFunctionName);
  sprintf(pFnName, "%s_impl", pFunctionName);

  FILE* f = fopen(pFile, "r");

  if (f)
  {
    fclose(f);
    return (MYFARPROC)GetProcAddress(s_hProxy, pFnName);
  }
  else
    return (MYFARPROC)GetProcAddress(s_hCSource, pFnName);
}

BOOL APIENTRY DllMain( HANDLE hModule, DWORD ul_reason_for_call, 
                       LPVOID lpReserved)
{
  switch (ul_reason_for_call)
  {
    case DLL_PROCESS_ATTACH:
      s_hCSource = LoadLibrary("my_script.dll");
      s_hProxy = LoadLibrary("my_script_c_proxy.dll");
      break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
      break;
    case DLL_PROCESS_DETACH:
      FreeLibrary(s_hCSource);
      FreeLibrary(s_hProxy);
      break;
  }
  return TRUE;
}

// This is an example of an exported function.

C_DISPATCHER_API int g__c_dispatcher_entry_point(const char* 
   pFunctionName, const void* pArguments, int nArgumentsCount)
{
  MYFARPROC pProc = GetMyProcAddress(pFunctionName);
  void* pStack = 0;

  if (nArgumentsCount)
  {
    _asm
    {
      mov ecx, nArgumentsCount;
      loop_start_01:
      push 0;
      loop loop_start_01;
      mov pStack, esp;
    }

    memcpy(pStack, pArguments, nArgumentsCount*4);

    int nRet = pProc();

    _asm
    {
      mov ecx, nArgumentsCount;
      loop_start_02:
      pop eax;
      loop loop_start_02;
    }

    return nRet;
  }
  else
    return pProc();
}

As you can see, if the dispatcher finds a file named my_script.<function_name>_impl.c_, it delegates the call to the my_script_c_proxy library; otherwise it delegates to my_script.dll, where the compiled script code is located. This is, in effect, a substitution. Before the call, the dispatcher reproduces the caller's stack, knowing the pointer to the original arguments and their count; after the call it simply unwinds. Simple, right?
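The routing decision itself does not need anything Windows-specific. Here is a minimal sketch of it, assuming the marker files live in a directory passed in by the caller (the demo uses the current directory); marker_path, choose_target and the DispatchTarget enum are hypothetical names, not part of the demo sources.

```c
#include <assert.h>
#include <stdio.h>

typedef enum {
    TARGET_COMPILED    = 0,  /* call goes to my_script.dll */
    TARGET_INTERPRETED = 1   /* call goes to my_script_c_proxy.dll */
} DispatchTarget;

/* Builds "<dir>/my_script.<function>_impl.c_" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
int marker_path(char *buf, size_t size, const char *dir,
                const char *function_name)
{
    int n;

    assert(buf != NULL && dir != NULL && function_name != NULL);
    n = snprintf(buf, size, "%s/my_script.%s_impl.c_", dir, function_name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* The mere existence of the marker file selects the interpreted version. */
DispatchTarget choose_target(const char *dir, const char *function_name)
{
    char  path[260];
    FILE *f;

    if (marker_path(path, sizeof path, dir, function_name) != 0)
        return TARGET_COMPILED;
    f = fopen(path, "r");
    if (f == NULL)
        return TARGET_COMPILED;
    fclose(f);
    return TARGET_INTERPRETED;
}
```

Unlike the sprintf calls in GetMyProcAddress, the sketch uses snprintf and falls back to the compiled version when a function name is too long for the buffer.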

The my_script_c_proxy library contains four files. Here I should say that, since we are going to change the code at run-time, we need some kind of C interpreter. I took Cint. Cint is a free C interpreter, powerful enough and very suitable for this demo, yet it has a couple of issues, which means that some disadvantages of this demo implementation are closely tied to this particular interpreter. G__clink.c and G__clink.h are generated by Cint from my_script_d_proxy.gnrtd.h (in the my_script_d_proxy folder). They are needed because Cint, during interpretation, should not call the script functions directly but the stubs implemented inside the my_script_d_proxy library, so that you are able to re-implement just the functions you need and not the whole script. The rest of the functions are called from my_script.dll. It's a little bit tricky. The file my_script_c_proxy.gnrtd.c contains stubs which look like this:

MY_SCRIPT_C_PROXY_API int 
   c__my_entry_point_impl(int nArg, char* pString, 
   SMyStructure* pMyStruct)
{
  char tmp[128];
  int nRet;

  s__setup_cint();

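  /* pointers are passed to the interpreter as 32-bit hexadecimal literals */
  sprintf(tmp, "c__my_entry_point_impl((int)%d, (void*)0x%08lx, (SMyStructure*)0x%08lx);",
     nArg, (unsigned long)pString, (unsigned long)pMyStruct);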
  nRet = G__calc(tmp).obj.i; /* Call Cint parser */

  return nRet;
}

G__calc is a Cint function, which makes a call to the script. Well, actually, that's it.
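The only delicate part of such a stub is turning the native arguments into the text of a call expression. A minimal sketch of just that step follows; it does not call Cint at all. It assumes a hypothetical helper build_entry_call and an assumed SMyStructure layout, and it formats pointers through uintptr_t so that the width is right on both 32-bit and 64-bit builds.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Assumed layout; the real my_structs.h is not reproduced in the article. */
typedef struct SMyStructure {
    int  m_nVal;
    char m_sString[32];
} SMyStructure;

/* Writes the call expression for the interpreter into buf.
 * Returns the expression length, or -1 if buf is too small. */
int build_entry_call(char *buf, size_t size, int nArg,
                     const char *pString, const SMyStructure *pMyStruct)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size,
                 "c__my_entry_point_impl((int)%d, (void*)0x%" PRIxPTR
                 ", (SMyStructure*)0x%" PRIxPTR ");",
                 nArg, (uintptr_t)pString, (uintptr_t)pMyStruct);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting string is what a stub like the one above would hand to G__calc; refusing to produce a truncated expression matters, because a truncated pointer literal would still parse and silently point somewhere else.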

Let us see how it works.

The contents of the c_\Debug folder (after the project is built) look like this:

C_dispatcher.dll
frontend_app.exe
my_script.dll
my_script_c_proxy.dll
my_script_d_proxy.dll

Starting the application we are getting:

-----------
before:
nArg: 1, string: My string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString:
++++++after:
nArg: 1, string: My string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString: not equal
ret: 11
-------------

This is what is produced by the compiled script, which is now located in the my_script.dll library.

Now, in the Debug folder, we create an empty file: my_script.c__get_value_1_impl.c_. The existence of this file signals to the dispatcher that there is a substitution for the c__get_value_1_impl function. We should also create a my_script.c_ file with the following content (the need for two files is the disadvantage I mentioned earlier, caused by Cint).

// my_script.cpp : Defines the entry point for the DLL application.

//

#include <stdio.h>

#include "..\\include\\my_structs.h"

int c__get_value_1_impl(char* pString)
{
  pString[1] = 'X';
  printf("c__get_value_1 ==>> str: %s\n", pString);
  return 2;
}

The contents of the c_\Debug folder now look like this:

C_dispatcher.dll
frontend_app.exe
my_script.c_
my_script.c__get_value_1_impl.c_
my_script.dll
my_script_c_proxy.dll
my_script_d_proxy.dll

Restarting the application, we get this result:

-----------
before:
nArg: 1, string: My string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString:
c__get_value_1 ==>> str: MX string here.
++++++after:
nArg: 1, string: MX string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString: equal
ret: 11
-------------

Now let us try to re-implement two functions. For this purpose, we create a second file, my_script.c__re_entry_impl.c_, to signal the dispatcher, and modify the script:

// my_script.cpp : Defines the entry point for the DLL application.

//

#include <stdio.h>

#include "..\\include\\my_structs.h"


int c__get_value_1_impl(char* pString)
{
  pString[1] = 'X';

  printf("c__get_value_1 ==>> str: %s\n", pString);

  return 2;
}

int c__re_entry_impl(int nArg, char* pString, SMyStructure* pMyStruct)
{
  printf("\"I'll not be juggled with.\n");
  printf("To hell, allegiance! Vows, to the blackest devil!\n");
  printf("Conscience and grace, to the profoundest pit!\n");
  printf("I dare damnation. To this point I stand,\"\n");
  printf("...for this is script\n");

  int nVar1 = c__get_value_1(pString);
  int nVar2 = c__get_value_2(nArg);

  if (nVar1 == nVar2)
  {
    c__call_in_case_varables_are_equal(pMyStruct);
  }
  else
  {
    c__call_in_case_varables_are_not_equal(pMyStruct);
  }

  return 11;
}

The result:

-----------
before:
nArg: 1, string: My string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString:
"I'll not be juggled with.
To hell, allegiance! Vows, to the blackest devil!
Conscience and grace, to the profoundest pit!
I dare damnation. To this point I stand,"
...for this is script
c__get_value_1 ==>> str: MX string here.
++++++after:
nArg: 1, string: MX string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString: equal
ret: 11
-------------

Now a little about parameters, or arguments to functions. You may have already noticed that the string "My string here." has been changed to "MX string here.". This was done by c__get_value_1_impl, re-implemented in the script. We can do the same with structures. We create a new file, my_script.c__call_in_case_varables_are_equal_impl.c_, and add the following function to the script:

int c__call_in_case_varables_are_equal_impl(SMyStructure* pMyStruct)
{
  pMyStruct->m_nVal = 0;
  strcpy(pMyStruct->m_sString, "-- EQUAL --");
  return 0;
}

The result:

-----------
before:
nArg: 1, string: My string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString:
"I'll not be juggled with.
To hell, allegiance! Vows, to the blackest devil!
Conscience and grace, to the profoundest pit!
I dare damnation. To this point I stand,"
...for this is script
c__get_value_1 ==>> str: MX string here.
++++++after:
nArg: 1, string: MX string here.
pMyStruct->m_nVal: 0, pMyStruct->m_sString: -- EQUAL --
ret: 11
-------------

By now the contents of the c_\Debug folder look like this:

c_dispatcher.dll
frontend_app.exe
my_script.c_
my_script.c__call_in_case_varables_are_equal_impl.c_
my_script.c__get_value_1_impl.c_
my_script.c__re_entry_impl.c_
my_script.dll
my_script_c_proxy.dll
my_script_d_proxy.dll

It works.

As you can see:

It is possible to change (or rather to say substitute) the code (script) at runtime, no recompilation is required.
It is not a hard task.

Performance

Yes, of course, using a script instead of native code means a significant loss of performance, yet there are two things to be said:

In systems where performance is key, such as real-time systems, no substitution should be allowed. That means there should be no dispatcher library; all calls are compiled as direct calls and linked at build time. With this approach there is no loss of performance. In development and QA, however, where the ability to substitute is highly valuable but performance does not play a significant role, this approach is applicable.
In general, performance is not the key point. In that case, if we need a substitution right in production, it is possible to do it without a significant loss of performance. To do that we should:
1. Create and compile a separate library; call it my_script_subst.dll. This library contains the re-implementations of the functions we need to substitute.
2. Create and compile an additional proxy library; call it my_script_s_proxy.dll. It should look like my_script_c_proxy.dll, except that all calls are delegated not to Cint but to my_script_subst.dll (see step 1).
3. Modify the dispatcher so that it knows about my_script_s_proxy.dll.

I didn't do that in order not to overload the code. If the basic idea is understandable, the rest is but technique.
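For what it is worth, the core of that technique can be sketched without any DLLs: a table that maps each function name to its compiled implementation and to an optional compiled substitute, with the dispatcher preferring the substitute when one is installed. Everything here (FnEntry, install_substitute, resolve and the two sample functions) is hypothetical and only stands in for what my_script_subst.dll and a modified dispatcher would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*ScriptFn)(int);

typedef struct FnEntry {
    const char *name;
    ScriptFn    compiled;    /* the version built into the main library */
    ScriptFn    substitute;  /* optional compiled replacement, or NULL */
} FnEntry;

/* Stand-ins for a script function and its re-implementation. */
static int get_value_2_compiled(int nArg) { (void)nArg; return 2; }
int get_value_2_subst(int nArg)           { return nArg + 40; }

static FnEntry g_table[] = {
    { "c__get_value_2", get_value_2_compiled, NULL }
};

static FnEntry *find_entry(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof g_table / sizeof g_table[0]; ++i)
        if (strcmp(g_table[i].name, name) == 0)
            return &g_table[i];
    return NULL;
}

/* Installs (or, with fn == NULL, removes) a compiled substitute.
 * Returns 0 on success, -1 if the function name is unknown. */
int install_substitute(const char *name, ScriptFn fn)
{
    FnEntry *e = find_entry(name);

    if (e == NULL)
        return -1;
    e->substitute = fn;
    return 0;
}

/* Dispatcher lookup: the substitute wins when present. */
ScriptFn resolve(const char *name)
{
    FnEntry *e = find_entry(name);

    if (e == NULL)
        return NULL;
    return e->substitute != NULL ? e->substitute : e->compiled;
}
```

Both paths end in a direct call through a function pointer, so the substituted function runs at native speed; only the lookup is added.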

Pros and Cons

Had I the patience and time, I would write a book here, or two. Yet, in brief:

Disadvantages

The build procedure becomes more complicated, additional parsing is required.
There should be an interpreter supplied.
Read the performance section.
Using C as a script might cause some problems, since C by default has direct access to memory and has no mechanism of automatic unwinding, which might potentially cause leaks. Yet the script should be a restricted "C script" rather than full C, which means that all functions that access memory should be exposed as entities.
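As an illustration of that last point, here is a minimal sketch of what "memory access exposed as an entity" could look like for the structure used in the demo: the script calls a bounded setter instead of strcpy. The accessor names are hypothetical and the SMyStructure layout is assumed, since my_structs.h is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout; the real my_structs.h is not reproduced in the article. */
typedef struct SMyStructure {
    int  m_nVal;
    char m_sString[32];
} SMyStructure;

/* Bounded setter: never writes past m_sString.
 * Returns 0 if the text fit, 1 if it had to be truncated. */
int my_struct_set_string(SMyStructure *p, const char *text)
{
    size_t cap, len;

    assert(p != NULL && text != NULL);
    cap = sizeof p->m_sString - 1;
    len = strlen(text);
    if (len > cap) {
        memcpy(p->m_sString, text, cap);
        p->m_sString[cap] = '\0';
        return 1;
    }
    memcpy(p->m_sString, text, len + 1);
    return 0;
}

/* Read-only getter, so the script never holds a writable pointer. */
const char *my_struct_get_string(const SMyStructure *p)
{
    assert(p != NULL);
    return p->m_sString;
}
```

With such accessors, a script line like strcpy(pMyStruct->m_sString, "equal") becomes my_struct_set_string(pMyStruct, "equal"), and an overlong string is reported rather than overflowing the buffer.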

Benefits

Clarity. OOP code is much less readable than plain script. And this is IMHO the main goal.
Ability to change the business logic at run-time.
Control. Just think what we are able to do having all entry-points in our hands.

TO-DO

A lot.

There should be a suitable C-interpreter.
Parsing procedure.
See the second clause described in the "Performance" section.
Dispatcher. Yes of course the way it is implemented is not applicable to a real system. There should be a map of functions which is to be updated in a separate thread, according to the timestamp of the modification.
And so on...
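The timestamp idea from the dispatcher item can be sketched in a few lines. This is only the change-detection part, assuming a POSIX-style stat (on Windows, _stat plays the same role); the WatchedScript type and script_changed function are hypothetical, and the background thread and the function map itself are left out.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* One watched script file and the modification time seen last. */
typedef struct WatchedScript {
    const char *path;
    time_t      last_mtime;
    int         loaded;      /* 0 until the file has been seen once */
} WatchedScript;

/* Returns 1 when the file exists and is new or modified since the
 * previous call, 0 otherwise (including when the file is missing). */
int script_changed(WatchedScript *w)
{
    struct stat st;

    assert(w != NULL && w->path != NULL);
    if (stat(w->path, &st) != 0) {
        w->loaded = 0;   /* file gone: forget it, fall back to compiled code */
        return 0;
    }
    if (!w->loaded || st.st_mtime != w->last_mtime) {
        w->last_mtime = st.st_mtime;
        w->loaded = 1;
        return 1;
    }
    return 0;
}
```

A polling thread would call script_changed for every entry of the map and re-read only the scripts for which it returns 1, so the dispatcher no longer has to probe the file system on every call.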

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
