Figure 1: Imagine such code being legal in VB.
The purpose of this investigation is to enable applications written in Visual Basic 6 to use function pointers. Other advantages may be discovered, such as embedding native code in Visual Basic applications, thus extending the world of possibilities without the need of external DLLs. For the sake of keeping this as brief and concise as possible, other variations of the techniques used have been ignored and the focus has been maintained on detailing the one methodology which handles more common situations. Before reading a comprehensive examination, I'm assuming you'd like to actually see a working sample project: NativeCode.zip.
Since pointers in general aren't Visual Basic 6 specific, it might sound crazy to talk about function pointers in Visual Basic 6. So, let's first see how we usually get function pointers in Visual Basic 6. The AddressOf operator? Yes, but not just that! There's also the GetProcAddress Win32 API from kernel32.dll, which can be used at runtime to retrieve the addresses of functions exported by other DLLs. Again, not just this... there are other scenarios! But why would one use runtime loading when one can simply use a Declare statement? For the same reasons a C/C++ programmer would: for example, to use different DLLs (or versions of the same DLL) depending on the environment in which the application is running.
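As a rough illustration of runtime loading, here is a minimal sketch for a standard module. The DLL and export names (user32.dll, MessageBoxA) are only examples chosen for this sketch, and ShowProcAddress is a hypothetical helper name:

```vb
' Standard module (.bas). Illustration only: the DLL and export names
' are examples, and ShowProcAddress is a hypothetical helper.
Option Explicit

Private Declare Function LoadLibrary Lib "kernel32" Alias "LoadLibraryA" _
    (ByVal lpLibFileName As String) As Long
Private Declare Function GetProcAddress Lib "kernel32" _
    (ByVal hModule As Long, ByVal lpProcName As String) As Long
Private Declare Function FreeLibrary Lib "kernel32" _
    (ByVal hLibModule As Long) As Long

Public Sub ShowProcAddress()
    Dim hModule As Long
    Dim pfn As Long

    hModule = LoadLibrary("user32.dll")
    If hModule = 0 Then
        Debug.Print "LoadLibrary failed, error " & Err.LastDllError
        Exit Sub
    End If

    ' The result is just a Long; VB6 has no typed function pointer.
    pfn = GetProcAddress(hModule, "MessageBoxA")
    If pfn = 0 Then
        Debug.Print "Export not found"
    Else
        Debug.Print "MessageBoxA is at 0x" & Hex$(pfn)
    End If

    FreeLibrary hModule
End Sub
```

The address obtained this way is only valid while the library stays loaded, so a real application would keep the module handle until the pointer is no longer needed.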
In fact, the whole concept of plug-ins is based on loading, probably on demand, external components which export the expected functionality. For example, some codecs are even downloaded on demand and then loaded by the application which uses them. Also, a function pointer might be given by an external module as a callback (aka delegate) and its value may depend on the state of some objects. In OOP, a well-known behavioral pattern called the "Template Method" is characterized by changing control flow at runtime. Using function pointers may reduce considerably the effort put in its implementation by reducing the number of classes that need to be defined.
If you have ever used the EnumWindows API in Visual Basic 6, you may have noticed that nothing forces you to pass, as the callback, the address of a function with the correct prototype. The code will compile without complaint and fail at runtime. Delegates in Visual Basic .NET overcome this issue, although their main purpose might have been to ensure availability of code, because the CLR may discard compiled code that is not referenced. While other languages have a way to declare type-safe pointers (such as typedef in C/C++), in Visual Basic 6 we can only treat them as signed 4-byte numbers. The developer is responsible for type checking, without any help from the compiler.
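To make the point concrete, here is a small sketch of a correctly typed EnumWindows callback in a standard module. The names CountWindowsProc, CountTopLevelWindows and m_Count are made up for this example:

```vb
' Standard module (.bas). Procedure and variable names are hypothetical.
Option Explicit

' The callback is declared as a plain Long: the compiler cannot check
' that the address passed here has the right prototype.
Private Declare Function EnumWindows Lib "user32" _
    (ByVal lpEnumFunc As Long, ByVal lParam As Long) As Long

Private m_Count As Long

' Expected prototype: two ByVal Long parameters and a Long result
' (nonzero to continue the enumeration).
Private Function CountWindowsProc(ByVal hWnd As Long, _
                                  ByVal lParam As Long) As Long
    m_Count = m_Count + 1
    CountWindowsProc = 1
End Function

Public Sub CountTopLevelWindows()
    m_Count = 0
    EnumWindows AddressOf CountWindowsProc, 0
    Debug.Print m_Count & " top-level windows"
End Sub
```

AddressOf would accept a procedure with any other parameter list just as readily, which is exactly the lack of type checking described above.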
The method of choice for calling addresses stored as Long is replacing entries in the "Virtual Function Table" (VFT) of a class, because it provides enough flexibility while still being easy to use. It behaves the same in the IDE, in native code and in P-Code, which helps debugging. A vftable is a mechanism used in programming language implementations to support dynamic polymorphism, i.e. runtime method binding. While some online resources describe how the Visual C++ 6.0 compiler creates vftables for classes, I couldn't find anything regarding the Visual Basic 6.0 compiler.
Typically, a compiler creates a separate vftable for each class and stores its pointer as a hidden member of each base class, often as the first member. The Visual C++ compiler builds the vftable for each class in a read-only page of the code section, in a similar way to string literals. In order to modify its contents, one needs to use the VirtualProtect API to temporarily change the access rights. The first address in the vftable created by Visual C++ points to the scalar destructor function of the class, and it is followed by the addresses of virtual functions in their order of declaration.
While Visual C++ supports multiple inheritance and handles pure virtual function calls, our objective can be achieved simply by identifying how Visual Basic retrieves the address of a public method, given the address of an instance of a class. A visual inspection of a class instance was required to establish the location and content of the vftable. By displaying the memory content at the addresses of two instances of the same class, we should be able to identify which value is the pointer to the vftable. Since both instances point to the same vftable, the pointer value must be the same and must belong to our Visual Basic module (by default, the starting address is 0x00400000). As you can see, the pointer to the vftable is stored as the first 4 bytes within the class instances.
Figure 2: Two objects sharing same VFT.
Figure 3: Addresses at offset &H1C belong to our module.
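The inspection shown in the figures can be reproduced with a few lines of code. The following is only a sketch: it assumes the project contains a class module named SampleClass (a hypothetical name; its contents do not matter), and CompareVFTPointers is likewise a made-up helper:

```vb
' Standard module (.bas). Assumes a class module named SampleClass exists
' in the project (hypothetical name).
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

Public Sub CompareVFTPointers()
    Dim first As SampleClass
    Dim second As SampleClass
    Dim pVFT1 As Long
    Dim pVFT2 As Long
    Dim entry As Long
    Dim idx As Long

    Set first = New SampleClass
    Set second = New SampleClass

    ' The first 4 bytes of each instance hold the pointer to the vftable.
    CopyMemory pVFT1, ByVal ObjPtr(first), 4
    CopyMemory pVFT2, ByVal ObjPtr(second), 4
    Debug.Print "Instance 1 at 0x" & Hex$(ObjPtr(first)) & ", VFT at 0x" & Hex$(pVFT1)
    Debug.Print "Instance 2 at 0x" & Hex$(ObjPtr(second)) & ", VFT at 0x" & Hex$(pVFT2)

    ' Dump the first eight entries; offset &H1C is the eighth one (idx = 7).
    For idx = 0 To 7
        CopyMemory entry, ByVal pVFT1 + idx * 4, 4
        Debug.Print "VFT+&H" & Hex$(idx * 4) & " -> 0x" & Hex$(entry)
    Next idx
End Sub
```

Both instances should report the same vftable pointer, and the entry at offset &H1C should be the first one whose address falls inside the Visual Basic module rather than msvbvm60.dll.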
The first seven addresses in the vftable point to code within msvbvm60.dll. Modifying the class definition by adding more public methods will change the vftable content starting with offset &H1C. In order to consolidate the theory, I wrote an external DLL for breaking into Visual Basic calls to methods of an object and reading the disassembly with the Visual C++ debugger.
This is much easier than it sounds. Have a global procedure create an instance of a Visual Basic class and call its first public method. At runtime, display the address of the global procedure, the address of the class instance and the pointer to the vftable. Load the process with the Visual C++ debugger by pressing F11. Write down the current instruction pointer (eip of the application's entry point) and change it to the address of the global procedure (type it in the Registers window and press Enter). Set a breakpoint at this address. Change the eip back to the old value and resume execution.
When the breakpoint is reached, step through the code and observe the register values until the method of the class is called. The screenshots show how the eax register gets the address of the object (0x14BE70), from which 4 bytes are copied into the ecx register, representing the address of the VFT (0x4033B8). The instruction at 0x402391 is the call to the first public method of that class and, as you can see, its offset in the vftable is &H1C.
Figure 4: Disassembly showing how to get addresses from vftable.
My investigation continued to see if private member data or methods affect the location or content of the vftable. The answer was no, but public member variables change the contents of the vftable by inserting accessors and modifiers, which will be called when the member data is used outside the class. For proof of concept, I wrote the following test:
VERSION 1.0 CLASS
Attribute VB_Name = "DynamicVFT"
Private Const OffsetToVFT = &H1C
Private Sub SwapPlayers()
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
Dim fnAddress As Long
CopyMemory fnAddress, ByVal pVFT + OffsetToVFT, 4
CopyMemory ByVal pVFT + OffsetToVFT, ByVal pVFT + OffsetToVFT + 4, 4
CopyMemory ByVal pVFT + OffsetToVFT + 4, fnAddress, 4
End Sub
Public Sub Play()
Debug.Print "Dynamic VFT plays: White move."
Call SwapPlayers
End Sub
Public Sub Replay()
Debug.Print "Dynamic VFT plays: Black move."
Call SwapPlayers
End Sub
Sub Main()
Dim dynObj As New DynamicVFT
Dim idx As Long
For idx = 0 To 9
dynObj.Play
Next idx
End Sub
Please note that these methods have the same prototype. Swapping the addresses in the vftable worked as expected, and the output of the above code is shown below:
Dynamic VFT plays: White move.
Dynamic VFT plays: Black move.
Dynamic VFT plays: White move.
Dynamic VFT plays: Black move.
...
So, changing the values in the VFT of a Visual Basic 6 class will replace the methods of that class. One more important thing to remember when modifying vftables is that they are shared by all instances of that class.
Assuming that we'll change the addresses in the VFT of a class, further examination is required on how these methods are called. This section describes the calling convention used, as well as the position of the parameters on the stack. By displaying, from the VFT, the address of a member procedure taking one Long parameter (4 bytes on the stack) and loading the process in the Visual C++ debugger, the assembly at that address can be observed. It belongs to a jump table:
00401451 jmp 004025A0
Each entry of the jump table points to the location of the compiled code for the member method defined:
004025A0 push ebp
004025A1 mov ebp,esp
004025A3 sub esp,0Ch
004025A6 push 401136h
004025AB mov eax,fs:[00000000]
004025B1 push eax
004025B2 mov dword ptr fs:[0],esp
004025B9 sub esp,8
004025BC push ebx
004025BD push esi
004025BE push edi
004025BF mov dword ptr [ebp-0Ch],esp
004025C2 mov dword ptr [ebp-8],401118h
004025C9 mov dword ptr [ebp-4],0
004025D0 mov eax,dword ptr [ebp+8]
004025D3 push eax
004025D4 mov ecx,dword ptr [eax]
004025D6 call dword ptr [ecx+4]
004025D9 mov eax,dword ptr [ebp+8]
004025DC push eax
004025DD mov edx,dword ptr [eax]
004025DF call dword ptr [edx+8]
004025E2 mov eax,dword ptr [ebp-4]
004025E5 mov ecx,dword ptr [ebp-14h]
004025E8 pop edi
004025E9 pop esi
004025EA mov dword ptr fs:[0],ecx
004025F1 pop ebx
004025F2 mov esp,ebp
004025F4 pop ebp
004025F5 ret 8
Figure 5: Stack differences.
For experienced assembly readers, the above listing is quite straightforward. At this point, the emphasis is on the instructions modifying the stack. In fact, the last instruction tells us almost all we need to know. First, the callee cleans the stack, not the caller, so the calling convention cannot be __cdecl. It cannot be __fastcall either, because that convention passes the first two DWORD-sized arguments in the ecx and edx registers, which would leave nothing for the callee to remove from the stack here. Since this method takes only one Long parameter (4 bytes in size), why is the method removing 8 bytes from the stack? Because there's an extra parameter pushed onto the stack before the call: the pointer to the object for which we're calling the procedure (aka the this pointer).
To confirm that the __thiscall calling convention isn't used, have another look at the assembly listing. You'll see that the first time the ecx register is used, at address 4025D4, it is written, not read. Thus, the pointer to the object isn't passed through the ecx register, nor through any other register. Without even looking at what the caller does, we can already consider the calling convention to be __stdcall, with the object pointer pushed onto the stack last, which makes it the first parameter. No surprise here, since Visual Basic 6 is known to use __stdcall extensively. Another test is in order, to confirm the value of this extra parameter, the last one pushed onto the stack. Remember that in the __stdcall calling convention, parameters are pushed from right to left, relative to the order in which they are declared.
VERSION 1.0 CLASS
Attribute VB_Name = "MemberVsGlobal"
Private Sub ReplaceMemberWithGlobal(ByVal fnAddress As Long)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
Static oldAddress As Long
If (oldAddress = 0) Then
CopyMemory oldAddress, ByVal pVFT + OffsetToVFT, 4
End If
If (fnAddress = 0) Then
CopyMemory ByVal pVFT + OffsetToVFT, oldAddress, 4
Else
CopyMemory ByVal pVFT + OffsetToVFT, _
fnAddress, 4
End If
End Sub
Private Sub Class_Terminate()
ReplaceMemberWithGlobal 0
End Sub
Public Sub MemberProcedure(ByVal fnAddress As Long)
Debug.Print Hex$(ObjPtr(Me)) & ".MemberProcedure(0x" & Hex$(fnAddress) & ")"
ReplaceMemberWithGlobal fnAddress
End Sub
Private Sub GlobalProcedure(ByVal objInstance As Long, ByVal parameter As Long)
Debug.Print Hex$(objInstance) & ".GlobalProcedure(0x" & Hex$(parameter) & ")"
End Sub
Sub Main()
Dim MvsG As New MemberVsGlobal
MvsG.MemberProcedure AddressOf GlobalProcedure
MvsG.MemberProcedure &HB0A
End Sub
Notice that the output shows the same value for ObjPtr(Me) in the class' method as for the objInstance parameter of the global procedure that replaced it, thus confirming our theory that a pointer to the object is pushed onto the stack before the call:
2499D8.MemberProcedure(0xAB16B4)
2499D8.GlobalProcedure(0xB0A)
An interesting behaviour was observed while running the last example under the IDE. If you remove the Class_Terminate implementation, running the code a second time will not call the original MemberProcedure unless you either remake the executable (Alt+F K) or reload the project. An examination of this behaviour under the IDE isn't required for our purpose, but it is good to know that, if you observe a corrupt object while debugging, you should remake before restarting execution.
At this point, we know that we can use a global procedure to replace a member procedure of a class if the global procedure has the same prototype, but with an extra parameter: the pointer to the object for which the method is being called. Hmm, but what we need is almost the opposite! We don't want to call global procedures that take a pointer to an object as their first parameter. How do we remove the extra parameter before the call reaches the global procedure? We will embed some native code written in assembly.
Such an implementation is known as a stub or proxy. In other situations, it can be used for logging or sub-classing. Our purpose is to adapt the call to what the target expects. When the call to the member procedure is made, the stack contains the return address, then the pointer to the object, followed by the parameters of the method called. Our stub is supposed to remove the pointer to the object, so that the return address is immediately followed by the parameters, and then jump to the desired forwarding address. Remember that we are assuming at this stage that the forwarding address points to a procedure, not a function (it does not return anything), and that its calling convention is __stdcall. The assembly that will do the job follows:
pop eax // remove return address from stack
pop ecx // remove pointer to object from stack
push eax // push return address onto stack
mov eax,XXXXXXXXh // XXXXXXXX can be replaced here with
// any forwarding address
jmp eax // jump to forwarding address
You can use any assembler to produce native code from the above. I wrote it as a Visual C++ __asm block and then copied the native code produced from the disassembly view. The native code can also be found in the associated *.cod file (assembly listing file) if you set the Visual C++ compiler option /FAc or /FAcs (listing file type: Assembly with Machine Code or Assembly, Machine Code, and Source). How do we embed the native code in Visual Basic 6? Copy the listing; remove the addresses at the beginning of each line and the assembly source, keeping only the machine code bytes as hex characters; remove any spacing between them; format it as a Visual Basic constant string and you should obtain something like this: "58" & "59" & "50" & "B8XXXXXXXX" & "FFE0".
The XXXXXXXX can have any value you want, since it will be replaced at runtime with the forwarding address that we need to call: our function pointer. Declare a hex string as a constant; convert it to a Byte array; allocate the same number of bytes with the GlobalAlloc API and copy the Byte array into it; use the memory handle as the address of our native code; discard the allocation with the GlobalFree API when the native code is no longer required. To demonstrate how embedding native code works, and that a stub can successfully replace a member procedure with a non-member procedure of the same prototype, I am providing a more general approach in the following class implementation and a sample test using the class:
VERSION 1.0 CLASS
Attribute VB_Name = "StubCallToSub"
Private Const hexStub4ProcedureCall = "58" & "59" & "50" & "B822114000" & _
"FFE0"
Private VFTable() As Long
Private VFArray() As Long
Private Sub Class_Initialize()
ReDim VFTable(0)
VFTable(0) = 0
ReDim VFArray(0)
VFArray(0) = 0
End Sub
Private Sub RemoveStub(ByVal index As Long)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
If (index < 1) Then Exit Sub
If (index > UBound(VFTable)) Then Exit Sub
If (VFTable(index) <> 0) Then
Dim oldAddress As Long
oldAddress = VFTable(index)
CopyMemory ByVal pVFT + OffsetToVFT + index * 4, oldAddress, 4
VFTable(index) = 0
GlobalFree VFArray(index)
VFArray(index) = 0
End If
End Sub
Public Sub ReplaceMemberSubWithStub(ByVal index As Long, _
ByVal fnAddress As Long)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
If (index < 1) Then
For index = 1 To UBound(VFTable)
RemoveStub index
Next index
Else
If (fnAddress = 0) Then
RemoveStub index
Else
If (index > UBound(VFTable)) Then
ReDim Preserve VFTable(index)
VFTable(index) = 0
ReDim Preserve VFArray(index)
VFArray(index) = 0
End If
RemoveStub index
Dim oldAddress As Long
CopyMemory oldAddress, ByVal pVFT + OffsetToVFT + index * 4, 4
VFTable(index) = oldAddress
Dim hexCode As String
hexCode = hexStub4ProcedureCall
Dim nBytes As Long
nBytes = Len(hexCode) \ 2
Dim Bytes() As Byte
ReDim Preserve Bytes(1 To nBytes)
Dim idx As Long
For idx = 1 To nBytes
Bytes(idx) = Val("&H" & Mid$(hexCode, idx * 2 - 1, 2))
Next idx
CopyMemory Bytes(5), fnAddress, 4
Dim addrStub As Long
addrStub = GlobalAlloc(GMEM_FIXED, nBytes)
CopyMemory ByVal addrStub, Bytes(1), nBytes
CopyMemory ByVal pVFT + OffsetToVFT + index * 4, _
addrStub, 4
VFArray(index) = addrStub
End If
End If
End Sub
Private Sub Class_Terminate()
ReplaceMemberSubWithStub 0, 0
End Sub
Public Sub PrintMessage(ByVal msg As String)
Debug.Print "PrintMessage says: " & msg
End Sub
Private Sub PrintFirstParameter(ByVal msg As String)
Debug.Print "PrintFirstParameter says: " & msg
End Sub
Sub Main()
Dim fwdSub As New StubCallToSub
fwdSub.PrintMessage "Hello!"
fwdSub.ReplaceMemberSubWithStub 1, AddressOf PrintFirstParameter
fwdSub.PrintMessage "A stub called me instead!"
fwdSub.ReplaceMemberSubWithStub 1, 0
fwdSub.PrintMessage "My address has been restored in VFT!"
End Sub
Please observe in the output which code prints the messages after each call to the ReplaceMemberSubWithStub method:
PrintMessage says: Hello!
PrintFirstParameter says: A stub called me instead!
PrintMessage says: My address has been restored in VFT!
Additional member procedures (not functions) with different prototypes may be declared in this class and, by calling the ReplaceMemberSubWithStub method, we can change their destination. The class cleans up after itself in Class_Terminate, so you only have to exercise caution when choosing the index of the method you're setting the pointer for. For larger projects, I would recommend using an Enum which correlates the function pointer names with their prototypes:
Public Enum FwdSubIdx
idxPrintMessage = 1
idxOtherSub
idxAnotherSub
End Enum
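One practical caveat about the stub memory: on systems where Data Execution Prevention is enforced for the process, memory obtained from GlobalAlloc may not be executable. The following is a sketch of an alternative allocation using VirtualAlloc with PAGE_EXECUTE_READWRITE. It is not part of the sample project, and the helper names AllocExecutable and FreeExecutable are hypothetical:

```vb
' Standard module (.bas). AllocExecutable / FreeExecutable are hypothetical
' helpers, shown as an assumption-laden alternative to GlobalAlloc.
Option Explicit

Private Declare Function VirtualAlloc Lib "kernel32" _
    (ByVal lpAddress As Long, ByVal dwSize As Long, _
     ByVal flAllocationType As Long, ByVal flProtect As Long) As Long
Private Declare Function VirtualFree Lib "kernel32" _
    (ByVal lpAddress As Long, ByVal dwSize As Long, _
     ByVal dwFreeType As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

Private Const MEM_COMMIT = &H1000&
Private Const MEM_RESERVE = &H2000&
Private Const MEM_RELEASE = &H8000&   ' trailing & keeps this a positive Long
Private Const PAGE_EXECUTE_READWRITE = &H40&

' Copies the code bytes into executable memory; returns 0 on failure.
Public Function AllocExecutable(Bytes() As Byte) As Long
    Dim nBytes As Long
    Dim addr As Long

    nBytes = UBound(Bytes) - LBound(Bytes) + 1
    addr = VirtualAlloc(0, nBytes, MEM_COMMIT Or MEM_RESERVE, _
                        PAGE_EXECUTE_READWRITE)
    If addr <> 0 Then
        CopyMemory ByVal addr, Bytes(LBound(Bytes)), nBytes
    End If
    AllocExecutable = addr
End Function

' Releases memory obtained from AllocExecutable.
Public Sub FreeExecutable(ByVal addr As Long)
    ' With MEM_RELEASE the size argument must be 0.
    If addr <> 0 Then VirtualFree addr, 0, MEM_RELEASE
End Sub
```

VirtualAlloc works at page granularity, so allocating one region per stub wastes address space; a real implementation would probably place several stubs in one region.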
This is where things get complicated. If you feel like taking a break, it might be a good time to reflect on what has been said and assimilate the ideas presented here. It may look like we have already found a general method to handle function pointers, but the truth is that much more investigation is required. Simply changing the member procedure from the previous example into a member function will not work as required. The return value will be lost, the stack will not be adjusted correctly and the control flow after the function call will be undefined. The reason behind it is that a member function is called in a different way than a global function. Since global functions can be used as callbacks to API calls, their behaviour is quite well-known. For example, if a global function is supposed to return a Long, it will use the eax register, as shown below:
Private Function GlobalFunction(ByVal param As Long) As Long
GlobalFunction = param
End Function
This becomes:
00402AA0 mov eax,dword ptr [esp+4] // GlobalFunction = param
00402AA4 ret 4 // removes param from stack on return
On the other hand, member functions have a different mechanism. They are being told by the caller where to copy the return value, and I will show you how. Once again, the delightful / painful process of finding, reading and understanding the disassembly of native code generated by the Visual Basic 6 compiler is required:
Public Function MemberFunction(ByVal param As Long) As Long
MemberFunction = param
End Function
This becomes:
00404230 push ebp
// 'ebp' register is saved on the stack below the return address
00404231 mov ebp,esp
// stack frame established (new ebp points to old ebp)
00404233 sub esp,0Ch
00404236 push 4011E6h // exception handler address
0040423B mov eax,fs:[00000000]
00404241 push eax
00404242 mov dword ptr fs:[0],esp
// register exception handler frame
00404249 sub esp,0Ch
0040424C push ebx
0040424D push esi // 'esi' register saved
0040424E push edi
0040424F mov dword ptr [ebp-0Ch],esp
00404252 mov dword ptr [ebp-8],4011D0h
00404259 xor esi,esi // esi set to zero
0040425B mov dword ptr [ebp-4],esi // local temp0 gets zero value
0040425E mov eax,dword ptr [ebp+8] // gets pointer to object
00404261 push eax
00404262 mov ecx,dword ptr [eax]
00404264 call dword ptr [ecx+4]
// the address called here belongs to 'msvbvm60.dll'
00404267 mov edx,dword ptr [ebp+0Ch]
// 'edx' register gets the value of param
0040426A mov dword ptr [ebp-18h],esi
0040426D mov dword ptr [ebp-18h],edx
// local temp2 stores value of param
00404270 mov eax,dword ptr [ebp+8] // gets pointer to object
00404273 push eax
00404274 mov ecx,dword ptr [eax]
00404276 call dword ptr [ecx+8]
// the address called here belongs to 'msvbvm60.dll'
00404279 mov edx,dword ptr [ebp+10h]
// given 'return value' address is copied in 'edx'!!!
0040427C mov eax,dword ptr [ebp-18h]
// 'eax' register gets value of param from local temp2
0040427F mov dword ptr [edx],eax
// param value is copied at given 'return value' address!!!
00404281 mov eax,dword ptr [ebp-4]
// 'eax' register gets zero value from local temp0
00404284 mov ecx,dword ptr [ebp-14h]
00404287 pop edi
00404288 pop esi // restores 'esi' register
00404289 mov dword ptr fs:[0],ecx
// restores previous exception handler frame
00404290 pop ebx
00404291 mov esp,ebp // removes stack frame
00404293 pop ebp // restores 'ebp' register
00404294 ret 0Ch // removes 12 bytes from the stack on return!!!
We've learned that a member procedure is being passed an extra parameter representing the pointer to the object for which the method is called. Now we find that a member function is given one more parameter representing the address where the return value needs to be stored. Unfortunately, it is given as the last parameter (first pushed onto the stack, before our defined parameters), which complicates the situation greatly. How can such an implementation of a member function be replaced while handling accurately the return value? See for yourself:
VERSION 1.0 CLASS
Attribute VB_Name = "StubFctCall"
Private Sub ReplaceMemberWithGlobal(ByVal fnAddress As Long)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
Static oldAddress As Long
If (oldAddress = 0) Then
CopyMemory oldAddress, ByVal pVFT + OffsetToVFT, 4
End If
If (fnAddress = 0) Then
CopyMemory ByVal pVFT + OffsetToVFT, oldAddress, 4
Else
CopyMemory ByVal pVFT + OffsetToVFT, _
fnAddress, 4
End If
End Sub
Public Function MemberFunction(ByVal fnAddress As Long) As Long
Debug.Print Hex$(ObjPtr(Me)) & ".MemberFunction(0x" & Hex$(fnAddress) & ")"
ReplaceMemberWithGlobal fnAddress
MemberFunction = fnAddress
End Function
Private Function GlobalFunction(ByVal objInstance As Long, _
ByVal parameter As Long, ByRef retVal As Long) As Long
Debug.Print Hex$(objInstance) & ".GlobalFunction(0x" & Hex$(parameter) & ")"
retVal = parameter
GlobalFunction = 0
End Function
Sub Main()
Dim FwdFct As New StubFctCall
Dim retVal As Long
retVal = FwdFct.MemberFunction(AddressOf GlobalFunction)
Debug.Print "StubFctCall.MemberFunction() returned value: " & Hex$(retVal)
retVal = FwdFct.MemberFunction(&HB0AB0A)
Debug.Print "GlobalFunction() returned value: " & Hex$(retVal)
End Sub
The output of the above test shows the correct behaviour on returning the same Long parameter:
1729D0.MemberFunction(0xAB1BD4)
StubFctCall.MemberFunction() returned value: AB1BD4
1729D0.GlobalFunction(0xB0AB0A)
GlobalFunction() returned value: B0AB0A
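But why do we set the eax register to zero (GlobalFunction = 0) before returning? Because there's a validation mechanism after a member function is called: the caller treats the value left in eax as a status code, much like a COM HRESULT, in which zero means success and a negative value means failure: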
Dim retVal As Long
retVal = FwdFct.MemberFunction(&HB0AB0A)
This becomes:
00402E52 lea edx,[ebp-0C8h]
// 'edx' register gets address of return value
00402E58 push edx // push address of return value
00402E59 push 0B0AB0Ah // push parameter
00402E5E push eax // push pointer to object 'FwdFct'
00402E5F mov ebx,eax // save pointer to object
00402E61 call dword ptr [ecx+1Ch]
// call first public member of the class
00402E64 cmp eax,esi
// here, 'esi' register is zero!!! success is returned as 0!!!
00402E66 fnclex
00402E68 jge 00402E79 // if success returned, jump after next call
00402E6A push 1Ch
00402E6C push 4022A8h
00402E71 push ebx
00402E72 push eax
00402E73 call dword ptr ds:[401024h]
00402E79 mov eax,dword ptr [ebp-0C8h]
// 'eax' register gets return value
00402E7F lea edx,[ebp-94h]
00402E85 lea ecx,[ebp-24h]
00402E88 push edx
00402E89 mov dword ptr [ebp-24h],eax
// retVal variable gets return value
The same behaviour can be extended to return types such as Double, Date, Currency and so on. An API returning a Currency type uses the eax and edx registers, which we'll have to copy to the given address, where the caller expects the return value. Double and Date types are returned through the floating point register. It is important to understand that we need to write different stubs depending on the return type of a function pointer. My plan was to have a generic solution for procedures, for functions that return 32-bit and 64-bit types, and for functions that return quad-words through the floating point register.
Strings are returned by reference, which means they can be handled as a 32-bit pointer returned in the eax register. For procedures that do not return anything, the forwarding stub already presented can be safely used. For functions, however, we need to write different stubs which copy one or two register values, or the floating point register, to the given location where the caller expects the return value. Moreover, since an extra, last parameter is given by the caller, our stubs should also remove the pointer to the return value, to properly adjust the stack pointer on return to the caller.
It is obvious that such a stub implementation cannot simply jump to the forwarding address as we were doing for procedure calls forwarding. Instead, the forwarding address must be called, in order to return back to our stub and not to the original caller. When the forwarding call is made, the return address to our stub is pushed on the stack, and that must happen immediately after the parameters expected by the function we're calling. However, that means that we have to remove the address of the Visual Basic caller from the stack and save it at some safe location where it will be available after the forwarding call.
One solution I found was to save it at the location given by the caller for the return value. Feasible, but it involves changing the stub implementation for each prototype of the functions we're forwarding calls to. This means that we'll have to provide the size in bytes of all the parameters that the function takes to the piece of code that replaces the address of a method in the VFT. Hmm, not an easy task to do...
Luckily, there is another way! If you're not already acquainted with it, let me introduce you to the "Thread Information Block" (TIB). The format of the TIB structure can be found in many online resources (for example, "Under The Hood - MSJ, May 1996" by Matt Pietrek) and won't be described here. For our benefit, the TIB structure holds an entry, pvArbitrary, that is available for application use. Very few applications make use of this location, so we are unlikely to affect other components by overwriting their data. Since the TIB structure is available on a per-thread basis, our implementation will be thread-safe.
Is it enough to store the return address in pvArbitrary of the TIB? I'm afraid the answer is no. It will fail in case of re-entrant calls, or whenever one stub overwrites the return address saved by another stub. A common scenario is that an API is called through our forwarding technique and is passed a callback which also calls an API through a stub. How do we keep two nested calls on the same thread from overwriting each other's return address? We create a linked list which acts as a LIFO (last in, first out) stack. The pvArbitrary entry will always point to the head of the list, which stores the return address of the last call.
Such a linked list requires allocation and deletion, for which I have chosen the GlobalAlloc and GlobalFree APIs from kernel32.dll because they are always available. The resulting assembly, which can be used to form stubs handling Double / Date (64-bit floating point register), Currency (eax and edx registers) and Long / Integer / Boolean / ByVal String (eax register) returns, is explained below:
push 8 // we need 8 bytes allocated
push 0 // we need fixed memory (GMEM_FIXED)
mov eax,XXXXXXXXh
// replace here XXXXXXXX with the address of GlobalAlloc
call eax // allocate new list node
pop ecx // remove return address from stack
mov dword ptr [eax],ecx // store return address in list node
pop ecx // remove pointer to object from stack
mov ecx,dword ptr fs:[18h] // get pointer to TIB structure
mov edx,dword ptr [ecx+14h]
// get pointer to previous list node from pvArbitrary
mov dword ptr [eax+4],edx // link list nodes
mov dword ptr [ecx+14h],eax
// store new head of list at pvArbitrary
mov eax,XXXXXXXXh
// XXXXXXXX can be replaced here with any forwarding address
call eax // call the forwarding address
pop ecx
// get the location where return value is expected by VB caller
#ifdef _RETURN64_DOUBLE_ // return 64-bit floating point value
fstp qword ptr [ecx] // return value gets double result
#else
mov dword ptr [ecx],eax // copy first 32-bit of the return value
#ifdef _RETURN64_ // return 64-bit value
mov dword ptr [ecx+4],edx
// copy second 32-bit of the return value
#endif
#endif
mov ecx,dword ptr fs:[18h] // get pointer to TIB structure
mov eax,dword ptr [ecx+14h]
// get pointer to head of list from pvArbitrary
mov edx,dword ptr [eax] // get return address from list node
push edx // restore return address onto stack
mov edx,dword ptr [eax+4] // get pointer to previous list node
mov dword ptr [ecx+14h],edx
// store pointer to previous node at pvArbitrary
push eax // we need to free the list node
mov eax,XXXXXXXXh
// replace here XXXXXXXX with the address of GlobalFree
call eax // free list node
ret // return to VB caller
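For readers less comfortable with assembly, the list handling performed by the stub above can be modelled in Visual Basic. This is only a sketch of the bookkeeping: the module-level variable m_Head stands in for the pvArbitrary slot (which the real stub reaches through fs:[18h]), and the names PushReturnAddress, PopReturnAddress and DemoNestedCalls are hypothetical:

```vb
' Standard module (.bas). A model of the stub's LIFO list, not the stub itself.
Option Explicit

Private Declare Function GlobalAlloc Lib "kernel32" _
    (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function GlobalFree Lib "kernel32" _
    (ByVal hMem As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

Private Const GMEM_FIXED = &H0

' Stand-in for pvArbitrary: always points to the head of the list.
Private m_Head As Long

Public Sub PushReturnAddress(ByVal retAddr As Long)
    Dim pNode As Long
    pNode = GlobalAlloc(GMEM_FIXED, 8)
    CopyMemory ByVal pNode, retAddr, 4       ' node+0: saved return address
    CopyMemory ByVal pNode + 4, m_Head, 4    ' node+4: previous head of list
    m_Head = pNode
End Sub

Public Function PopReturnAddress() As Long
    Dim pNode As Long
    Dim retAddr As Long
    pNode = m_Head
    CopyMemory retAddr, ByVal pNode, 4       ' read saved return address
    CopyMemory m_Head, ByVal pNode + 4, 4    ' unlink: previous node is head again
    GlobalFree pNode
    PopReturnAddress = retAddr
End Function

Public Sub DemoNestedCalls()
    PushReturnAddress &H401000               ' outer stub call
    PushReturnAddress &H402000               ' nested stub call (e.g. a callback)
    Debug.Print Hex$(PopReturnAddress())     ' prints 402000
    Debug.Print Hex$(PopReturnAddress())     ' prints 401000
End Sub
```

Each nested call gets its own node, so the inner call returns first and the outer return address is still intact when the outer call completes.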
What happens with the validation mechanism after a member function is called? Before returning to the Visual Basic caller, the eax register needs to be set to zero. The documentation of the GlobalFree API says that it returns zero when the function succeeds. I am counting on the successful discard of the allocated node, as I'm counting on its successful allocation. If you prefer, insert an xor eax,eax (native code is "33C0") above the ret instruction. Any other improvements to this stub are left as an exercise for the reader. Stub implementations can be written to handle even __stdcall to __cdecl forwarding. The world of possibilities is endless. The native code produced from the above assembly can be split into 3 parts, to ease the formation of a stub handling the required return type just by concatenating strings:
VERSION 1.0 CLASS
Attribute VB_Name = "StubFwdWithStackOnTIB"
Private Const hexStub4FunctionProlog = "6A08" & "6A00" & _
"B8XXXXXXXX" & "FFD0" & "59" & "8908" & _
"59" & "648B0D18000000" & "8B5114" & "895004" & "894114" & _
"B8XXXXXXXX" & "FFD0" & "59"
Private Const hexStub4ReturnDbl = "DD19"
Private Const hexStub4Return32bit = "8901"
' 64-bit return: copy both 32-bit halves of the return value
Private Const hexStub4Return64bit = "8901" & "895104"
Private Const hexStub4FunctionEpilog = "648B0D18000000" & _
"8B4114" & "8B10" & "52" & "8B5004" & _
"895114" & "50" & "B8XXXXXXXX" & "FFD0" & "C3"
Private Enum StubTypes
ret0bit
ret32bit
ret64bit
retDbl
End Enum
Private VFTable() As Long
Private VFArray() As Long
Private pGlobalAlloc As Long
Private pGlobalFree As Long
Private Sub Class_Initialize()
ReDim VFTable(0)
VFTable(0) = 0
ReDim VFArray(0)
VFArray(0) = 0
Dim hKernel32 As Long
hKernel32 = LoadLibrary("kernel32.dll")
pGlobalAlloc = GetProcAddress(hKernel32, "GlobalAlloc")
pGlobalFree = GetProcAddress(hKernel32, "GlobalFree")
End Sub
Private Sub Class_Terminate()
ReplaceMethodWithStub 0, 0, 0
End Sub
Private Sub RemoveStub(ByVal index As VFTidxs)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
If (index < 1) Then Exit Sub
If (index > UBound(VFTable)) Then Exit Sub
If (VFTable(index) <> 0) Then
Dim oldAddress As Long
oldAddress = VFTable(index)
CopyMemory ByVal pVFT + OffsetToVFT + index * 4, oldAddress, 4
VFTable(index) = 0
GlobalFree VFArray(index)
VFArray(index) = 0
End If
End Sub
Private Sub ReplaceMethodWithStub(ByVal index As VFTidxs, _
ByVal fnType As StubTypes, ByVal fnAddress As Long)
Dim pVFT As Long
CopyMemory pVFT, ByVal ObjPtr(Me), 4
If (index < 1) Then
For index = 1 To UBound(VFTable)
RemoveStub index
Next index
Else
If (fnAddress = 0) Then
RemoveStub index
Else
If (index > UBound(VFTable)) Then
ReDim Preserve VFTable(index)
VFTable(index) = 0
ReDim Preserve VFArray(index)
VFArray(index) = 0
End If
RemoveStub index
Dim oldAddress As Long
CopyMemory oldAddress, ByVal pVFT + OffsetToVFT + index * 4, 4
VFTable(index) = oldAddress
Dim hexCode As String
Select Case fnType
Case StubTypes.retDbl
hexCode = hexStub4FunctionProlog & hexStub4ReturnDbl & _
hexStub4FunctionEpilog
Case StubTypes.ret64bit
hexCode = hexStub4FunctionProlog & hexStub4Return64bit & _
hexStub4FunctionEpilog
Case StubTypes.ret32bit
hexCode = hexStub4FunctionProlog & hexStub4Return32bit & _
hexStub4FunctionEpilog
Case Else
hexCode = hexStub4ProcedureCall
End Select
Dim nBytes As Long
nBytes = Len(hexCode) \ 2
Dim Bytes() As Byte
ReDim Preserve Bytes(1 To nBytes)
Dim idx As Long
For idx = 1 To nBytes
Bytes(idx) = Val("&H" & Mid$(hexCode, idx * 2 - 1, 2))
Next idx
If (fnType = ret0bit) Then
CopyMemory Bytes(5), fnAddress, 4
Else
CopyMemory Bytes(6), pGlobalAlloc, 4
CopyMemory Bytes(33), fnAddress, 4
CopyMemory Bytes(nBytes - 6), pGlobalFree, 4
End If
Dim addrStub As Long
addrStub = GlobalAlloc(GMEM_FIXED, nBytes)
CopyMemory ByVal addrStub, Bytes(1), nBytes
CopyMemory ByVal pVFT + OffsetToVFT + index * 4, _
addrStub, 4
VFArray(index) = addrStub
End If
End If
End Sub
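If you add the optional xor eax,eax ("33C0") before the ret, the epilog grows by two bytes and the hard-coded patch position nBytes - 6 for the GlobalFree address no longer applies. The following is only a sketch under that assumption, meant for a standard module; hexStub4FunctionEpilogZeroEax and PlaceholderByteIndex are hypothetical names that are not part of the sample project. The helper looks up the last "B8XXXXXXXX" placeholder instead of relying on a fixed distance from the end.

```vb
' Hypothetical variant of the epilog with "33C0" (xor eax,eax)
' inserted before "C3" (ret). Not part of the sample project.
Public Const hexStub4FunctionEpilogZeroEax = "648B0D18000000" & _
    "8B4114" & "8B10" & "52" & "8B5004" & _
    "895114" & "50" & "B8XXXXXXXX" & "FFD0" & "33C0" & "C3"

' Returns the 1-based byte index of the 4 address bytes that follow the
' last "B8" (mov eax, imm32) placeholder in a hex string, or 0 if the
' placeholder is not present.
Public Function PlaceholderByteIndex(ByVal hexCode As String) As Long
    Dim pos As Long
    pos = InStrRev(hexCode, "B8XXXXXXXX")   ' 1-based character position
    If pos = 0 Then Exit Function
    ' Two hex characters per byte; skip the "B8" opcode byte itself.
    PlaceholderByteIndex = (pos + 1) \ 2 + 1
End Function
```

With such a helper, the last patch in ReplaceMethodWithStub could read CopyMemory Bytes(PlaceholderByteIndex(hexCode)), pGlobalFree, 4. For the variant above this is equivalent to nBytes - 8.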
The sample project provided shows a class implementation that tests all the discussed return types, as well as a callback which uses call forwarding. This is done to ensure that our return addresses are not overwritten when they are saved in a linked list whose head is stored in the pvArbitrary entry of the TIB.
While C++ has typedef for declaring function pointer prototypes, __asm blocks, the __asm __emit intrinsic and the __declspec(naked) function declaration specifier, the best I could write in Visual Basic 6 is the class described in this last section, which I believe can be used to achieve nearly everything you could with C++. For those who are very confident when writing code, type checking of the parameters passed to a function pointer can be removed entirely. In most situations, a developer knows at compile time the prototype of the function to which a pointer refers. Very few applications require a call to be constructed at runtime, with any number of parameters of any known types.
Some years ago, I was looking for an application that could read some sort of script in which I could define how to test some DLLs I had developed. It would have helped regression testing and could have been easily maintained by a non-developer tester. Well, such an application can be written as a single Visual Basic 6 module, with the help of the TypelessFwd class provided in the sample project.
I do not recommend using typeless forwarding when you know the prototype of the function at compile time. This technique merely helps in understanding function calls and parameter passing. It should be used rarely and with caution, as when writing assembly. Because it separates the pushing of the parameters from the call itself, there is no correlation that the compiler can check. If you can live with that and take full responsibility for checking the types of the parameters expected by a function called through a pointer, maximum flexibility will be your reward:
Dim pFn As New TypelessFwd
Dim sqValue As Double
sqValue = 5
Dim hVBVM As Long, pSquareRoot As Long
pSquareRoot = pFn.AddressOfHelper(hVBVM, "msvbvm60.dll", "rtcSqr")
pFn.ByRefPush VarPtr(sqValue) + 4
pFn.ByRefPush VarPtr(sqValue)
Debug.Print "Sqr(" & sqValue & ") = " & pFn.CallDblReturn(pSquareRoot)
pFn.AddressOfHelper hVBVM, vbNullString, vbNullString
Dim interval As String, number As Double, later As Date
interval = "h"
number = 10
pFn.ByValPush VarPtr(number)
pFn.ByValPush StrPtr(interval)
later = pFn.CallDblReturn(AddressOf DateFromNow)
Debug.Print "In " & number & " " & interval & " will be: " & later
Dim sSentence As String
sSentence = "The third character in this sentence is:"
pFn.ByValPush 1
pFn.ByValPush 3
pFn.ByValPush VarPtr(sSentence)
Dim retVal As String
Debug.Print sSentence & " '" & pFn.CallStrReturn(AddressOf SubString) & "'."
Dim sCpuVendorID As String
sCpuVendorID = Space$(12)
Dim pGet_CPU_VendorID As Long
pFn.NativeCodeHelper pGet_CPU_VendorID, hexGet_CPU_VendorID
pFn.ByValPush StrPtr(sCpuVendorID)
pFn.Call0bitReturn pGet_CPU_VendorID
pFn.NativeCodeHelper pGet_CPU_VendorID, vbNullString
Debug.Print "CPU Vendor ID: " & sCpuVendorID & "."
The class provides two helper methods for obtaining function pointers. AddressOfHelper can be used to load and unload DLLs at runtime and to retrieve pointers to exported functions by name or ordinal, as with the GetProcAddress API. NativeCodeHelper allocates and frees the memory that stores native code given as hex strings. Whenever you discover a bottleneck in your Visual Basic code, you can write some assembly that does the job faster and embed the native code through a hex string. One remark should be made about embedded native code: the values of the edi, esi and ebx registers should be preserved, since it seems that Visual Basic uses one of them to validate the return value. The sample project shows how to retrieve some CPU information (using the cpuid instruction) by calling embedded native code.
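The conversion from a hex string to bytes can be isolated in a small helper. This is a minimal sketch, not the code of NativeCodeHelper; HexToBytes and hexPreserveEbxDemo are hypothetical names. The demo constant only illustrates the save and restore pattern around cpuid (which overwrites ebx) and is not the hexGet_CPU_VendorID routine from the sample project.

```vb
' Illustrative native code only (assumed encodings, 9 bytes):
' 53 push ebx | 33C0 xor eax,eax | 0FA2 cpuid | 5B pop ebx | 33C0 xor eax,eax | C3 ret
Public Const hexPreserveEbxDemo = "53" & "33C0" & "0FA2" & "5B" & "33C0" & "C3"

' Converts a hex string such as "33C0C3" into a zero-based Byte array.
' Raises error 5 when the string is empty or has an odd length.
' Note: Val stops at the first non-hex character, so invalid pairs are
' not reported; they silently convert to a smaller value or to zero.
Public Function HexToBytes(ByVal hexCode As String) As Byte()
    If Len(hexCode) = 0 Or (Len(hexCode) And 1) <> 0 Then
        Err.Raise 5, "HexToBytes", "Hex string must have a non-zero, even length"
    End If
    Dim nBytes As Long
    nBytes = Len(hexCode) \ 2
    Dim result() As Byte
    ReDim result(0 To nBytes - 1)
    Dim idx As Long
    For idx = 0 To nBytes - 1
        result(idx) = Val("&H" & Mid$(hexCode, idx * 2 + 1, 2))
    Next idx
    HexToBytes = result
End Function
```

The resulting array can then be copied with CopyMemory into a buffer obtained from GlobalAlloc(GMEM_FIXED, ...), as the class listing above does.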
Two other methods of this class help you push parameters onto the stack before calling through a function pointer. ByValPush takes a Long value (32-bit) and pushes it onto the stack exactly as given. When a parameter is passed to a function ByRef, the pointer to the value should be pushed onto the stack. For example, ByRef param As Double means that the address of the Double variable should be pushed onto the stack before the call, which can be achieved using ByValPush VarPtr(param). It is like saying, "Push the 32-bit value of the address of the double parameter." How do you pass a Double ByVal instead of ByRef? Use the ByRefPush method to dereference any address, pushing onto the stack the 32-bit value it points to. Since Double is a 64-bit type, we must call ByRefPush twice.
First, we push the last 32 bits of the Double (the half at the higher address) and then the first 32 bits, because parameters and their contents get pushed from right to left. The reason is that the esp register (stack pointer) decreases by 4 bytes with each push instruction (the stack grows downwards), so the half pushed last ends up at the lower address, exactly as the value is laid out in memory. Don't be confused by the naming of these methods: they should not be associated with the ByVal / ByRef declaration of parameters in a function prototype. Calling 'ByRefPush VarPtr(param) + 4' is like saying, "Add 4 bytes to the address of the double parameter and use the resulting address to read a 32-bit value that must be pushed onto the stack."
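To make the two halves concrete, here is a small sketch that reads them the same way the two ByRefPush calls would. SplitDouble is a hypothetical helper, not a member of TypelessFwd; it only assumes the usual RtlMoveMemory declaration.

```vb
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Splits a Double into the two 32-bit values that would be pushed.
' highPart is read from address + 4 and would be pushed first;
' lowPart is read from the address itself and would be pushed second.
Public Sub SplitDouble(ByVal value As Double, _
                       ByRef lowPart As Long, ByRef highPart As Long)
    CopyMemory highPart, ByVal VarPtr(value) + 4, 4
    CopyMemory lowPart, ByVal VarPtr(value), 4
End Sub
```

For the value 5 used in the rtcSqr example, the IEEE 754 representation is &H4014000000000000, so highPart receives &H40140000 and lowPart receives 0.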
Usually, custom types are passed by reference. However, even when they are passed by value, the same policy can be applied, taking into account their size and byte alignment. Pushing each 32-bit value of a structure, by giving its address to the ByRefPush method, achieves ByVal parameter passing of a custom type. The String type maps directly to a BSTR (an Automation type representing a basic string, commonly used through COM), and you can get both the pointer to the encapsulated buffer and the pointer to the object.
When a parameter of type String is defined ByVal, the function needs the pointer to the actual buffer, which can be retrieved with StrPtr. When the String parameter is defined ByRef, the function needs the pointer to the object (a double pointer to the buffer), which can be retrieved with VarPtr. In both cases, use ByValPush with the appropriate address. In my tests you can find calls that receive strings and doubles passed both ByVal and ByRef.
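The relationship between the two pointers can be checked directly. In this sketch, BufferPointerFromVariable is a hypothetical helper: it reads the 32-bit value stored at VarPtr of a String variable, which should be the same address that StrPtr reports.

```vb
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Dereferences the address of a String variable (what a ByRef String
' parameter receives) to obtain the pointer to the character buffer
' (what a ByVal String parameter receives).
Public Function BufferPointerFromVariable(ByRef value As String) As Long
    CopyMemory BufferPointerFromVariable, ByVal VarPtr(value), 4
End Function
```

In other words, ByRefPush VarPtr(s) and ByValPush StrPtr(s) would push the same 32-bit value for a String variable s.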
The other five methods of the class, which are replaced with stubs like the previous two, make a call to a given address while handling different types of return values: none (a procedure with no return value), a 32-bit value, a pointer to a string (32-bit address), a 64-bit value (e.g. Currency) and a double (64-bit floating point value). Please note that a pointer-to-string return is declared specifically (Long cannot be used instead), because Visual Basic must see the return as ByVal String. Assigning a Long to a String would produce the textual form of the number representing the buffer's address, not the content of the buffer. Looking at the assembly of the stubs used for these methods, you'll observe that our list kept at pvArbitrary in the TIB structure also stores the location of the return value. This is because, at the moment of the call, it is pushed onto the stack after the parameters for the forwarding call. It must therefore be removed and saved, just like the return address to the Visual Basic caller.
Before going through the sample code and trying to adapt it to your needs, let me summarize the topics discussed:
- Public methods of a class are called through jump tables and the corresponding addresses are stored in a "Virtual Function Table" (
VFT
). The pointer to the vftable
is stored as the first 4 bytes (hidden member data) within each instance of the class.
- The compiler will generate accessors and modifiers for public variables (member data) of a class and their addresses will be inserted in the VFT. For classes having no public member data, the VFT will contain addresses of our defined methods starting with offset &H1C (28 bytes), in the order of their declaration.
- Replacing the addresses in the VFT will redirect calls to whatever code we need and, as long as the prototype and calling convention aren't changed, the call will return successfully.
- Each vftable is shared by all instances of the same class. Thus, changing it for one object affects even the objects created later during the lifetime of that code section (until the corresponding Visual Basic DLL is unloaded or, when the class resides in the main executable, until the process ends).
- Member procedures are given an extra parameter (pushed last onto the stack) representing the pointer to the object for which we make the call.
- Member functions are given two extra parameters: the object pointer (as the first parameter, pushed last onto the stack) and the location where the return value is expected by the caller (as the last parameter, pushed first onto the stack).
- There'll be no string conversion or other type conversion for parameters, as it is done for calls to functions defined using the
Declare
statement. The same can be assumed for return types. Strings passed or returned by value can be handled as a 32-bit pointer to the encapsulated buffer, as known from BSTR
type description.
- Native code can be easily embedded by converting an equivalent hex string into a fixed memory buffer dynamically allocated with GlobalAlloc. By embedding native code, stub functions can replace the methods of a class. These can forward our calls while adjusting the stack as expected and handling the return value, depending on its type. It is not shown here, but it is even possible to convert calls made with the __stdcall calling convention so that they are received by functions written with the __cdecl calling convention.
- When these stubs need control after the forwarding call returns, they must temporarily store the return address of the Visual Basic caller. Different designs may also need to store the pointer to the return value or, not shown here, even some context registers. In order to be thread-safe and re-entrant capable, these can be saved in a linked list dynamically managed as a LIFO stack. The pointer to its head node can be copied into the
pvArbitrary
member of the "Thread Information Block" (TIB
) structure.
- Visual Basic generates some validation code that tests the eax register against one of these registers: edi, esi, ebx. If the embedded native code uses these registers, they must be saved and restored, and eax must be set to zero before returning.
I hope that you didn't find it too obscure. Thank you very much for reading; I hope it proves useful in your future implementations. Any kind of feedback would be greatly appreciated.
- 06-06-2007: Original version.
Known issue: for lack of time, an unresolved issue noticed in the sample code was left for further investigation. I can describe the behaviour and its cause so that you are aware of the problem. Running the code in Visual Basic's IDE does not report any problem, because first-chance exceptions are not displayed to the user. The heap manager reports that the same invalid address was given to RtlSizeHeap and RtlFreeHeap. It seems that this happens when a method returning a String type has been replaced by a function that does not return a discardable BSTR. It all works perfectly if the method is replaced by another Visual Basic implementation. However, when testing the GetCommandLine API, which returns a const buffer and not a discardable BSTR, Visual Basic tried to free the returned memory, and the heap manager complained about an invalid address. A workaround can be used for APIs returning constant string buffers: declare them as returning a Long and copy the memory from the returned pointer up to the terminating null character.
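The workaround might look like the following sketch. It assumes the Unicode variant GetCommandLineW and the lstrlenW API from kernel32.dll; StringFromPointerW is a hypothetical helper name that does not appear in the sample project.

```vb
Private Declare Function GetCommandLineW Lib "kernel32" () As Long
Private Declare Function lstrlenW Lib "kernel32" (ByVal lpString As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Copies a null-terminated Unicode buffer into a new VB String.
' The source buffer is only read, never freed, so constant buffers
' owned by the system are left untouched.
Public Function StringFromPointerW(ByVal pText As Long) As String
    If pText = 0 Then Exit Function
    Dim nChars As Long
    nChars = lstrlenW(pText)            ' characters before the null
    If nChars = 0 Then Exit Function
    Dim result As String
    result = Space$(nChars)             ' VB-owned BSTR of the right size
    CopyMemory ByVal StrPtr(result), ByVal pText, nChars * 2
    StringFromPointerW = result
End Function
```

Usage: Debug.Print StringFromPointerW(GetCommandLineW()). Because the String returned to Visual Basic is a BSTR that Visual Basic itself allocated, freeing it later does not upset the heap manager.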