This article is the natural culmination of the earlier work on the Rebel.NET application and the first in a two-part series about the .NET Framework internals and the protections available for .NET assemblies. The next article will be about .NET native compiling. As the inner workings of the JIT haven't been analyzed yet, .NET protections are still quite naïve. This situation will change rapidly as soon as the reverse engineering community focuses its attention on this technology. These two articles aim to raise awareness of the current state of .NET protections and of what is possible to achieve but hasn't been done yet. In particular, the current article about .NET code injection represents, let's say, the present, whereas the next one about .NET native compiling represents the future. What I'm presenting in these two articles is new at the time of writing, but I expect it to become obsolete in less than a year. That is to be expected, as I'm taking the first steps away from current .NET protections towards better ones. But this article isn't really about protections: exploring the .NET Framework internals can be useful for many purposes. So, talking about protections is just a means to an end.
.NET code injection is the "strong" brother of .NET packers (which unpack the entire assembly in memory). .NET code injectors hook the JIT and, when the MSIL code of a method is requested, they filter the request and provide the real MSIL instead of the MSIL contained in the assembly, which, most of the time, is just a ret. Because only one method (or nearly one) is injected at a time, the MSIL code remains concealed. Even if one manages to dump the code, one can't expect the protection to have left the necessary space for the real MSIL code in the .NET assembly, although many commercial protections do so. Rebuilding the assembly from scratch is the universally valid way to proceed. This, of course, is not a problem with Rebel.NET.
It should be obvious to the reader that .NET code injectors aren't reliable protections. It's just playing hide and seek with the reverser. But, as many software producers are putting their intellectual property in the hands of such protections, it is necessary to analyze them thoroughly.
One thing should be clear from the beginning: there isn't only one method to inject MSIL. Thus, to remove this kind of protection you have to evaluate the specific case.
A very clean approach, though not yet used, would be to inject the MSIL through the .NET profiling API. There's a very in-depth article about the .NET profiling API by Aleksandr Mikunov on MSDN. As I said, .NET protections don't use this approach. I call it clean simply because it uses the API provided by the framework itself, so it will work on every .NET Framework no matter what. .NET protections usually hook the JIT instead, and although this may work just as well, it is not guaranteed to do so.
The .NET Framework's JIT is contained in the mscorjit.dll module. To identify the part of the JIT being hooked by the .NET protection, there's a very simple and effective way: dumping the mscorjit.dll module from the protection's process and comparing it to the original module on disk. I wrote a little CFF Explorer Script that compares a PE section while excluding the IAT from the comparison. This is especially useful when comparing the .text section of two executables.
It was a ten minute job and it is extremely useful to identify the type of hook applied to the JIT. Let's look at a possible output of this script:
Comparison between section 0 of ...
C:\...\mscorjit.dll
... and
C:\...\mscorjit_dumped.dll
Differences found at:
RVA1 RVA2
000460A0 000460A0
000460A1 000460A1
000460A2 000460A2
000460A3 000460A3
Number of differences found: 4.
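The comparison that produces an output like the one above can be sketched in C++. This is not the CFF Explorer script itself, only a minimal illustration under assumptions: both copies of the section are already loaded into equally sized buffers, and the `RvaRange` type and `DiffSection` function are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open RVA range [begin, end) to leave out of the comparison,
// for example the import address table, which the loader rewrites at load time.
struct RvaRange {
    std::uint32_t begin;
    std::uint32_t end;
};

// Returns the RVA of every byte that differs between two copies of a section.
// sectionRva is the RVA at which the section starts in both images.
std::vector<std::uint32_t> DiffSection(const std::vector<std::uint8_t>& onDisk,
                                       const std::vector<std::uint8_t>& dumped,
                                       std::uint32_t sectionRva,
                                       RvaRange skip)
{
    assert(onDisk.size() == dumped.size());  // both copies must cover the same bytes
    std::vector<std::uint32_t> diffs;
    for (std::size_t i = 0; i < onDisk.size(); ++i) {
        const std::uint32_t rva = sectionRva + static_cast<std::uint32_t>(i);
        if (rva >= skip.begin && rva < skip.end)
            continue;                        // inside the excluded range
        if (onDisk[i] != dumped[i])
            diffs.push_back(rva);
    }
    return diffs;
}
```

Four consecutive RVAs in the result, as in the output above, point to a single patched dword, which is what a replaced vftable entry looks like on x86.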
By looking at the patched dword through a disassembler, it is possible to understand the kind of hook:
.text:790A60A0 ??_7CILJit@@6B@ dd offset ?compileMethod@CILJit@@EAG?AW4CorJitResult@@PAVICorJitInfo@@PAUCORINFO_METHOD_INFO@@IPAPAEPAK@Z
.text:790A60A0                 ; CILJit::compileMethod(ICorJitInfo *,CORINFO_METHOD_INFO *,uint,uchar * *,ulong *)
.text:790A60A4                 dd offset ?clearCache@CILJit@@EAGXXZ
.text:790A60A8                 dd offset ?isCacheCleanupRequired@CILJit@@EAGHXZ ; CILJit::isCacheCleanupRequired(void)
The patched dword is the offset of the compileMethod method of the CILJit class. This brings us to the next paragraph.
In this paragraph, I'll present the compileMethod function as a way to gain complete control of the JIT. Code injectors don't need that much control, and what they do is actually very simple.
As mscorjit's debug symbols already indicated, the compileMethod function takes a pointer to the CORINFO_METHOD_INFO structure among other parameters. To explore the JIT internals in depth, I'll rely upon the Microsoft Rotor Project, which is basically a smaller open source version of the .NET Framework. I think almost everyone knows of this project, but only a limited number of people know how much one can take from its internals and use in the context of the official framework. And I'm not talking about those who create code injectors, because they don't need that much knowledge of the JIT's inner workings to achieve what they do.
Let's take a look at the CORINFO_METHOD_INFO structure:
struct CORINFO_METHOD_INFO
{
CORINFO_METHOD_HANDLE ftn;
CORINFO_MODULE_HANDLE scope;
BYTE * ILCode;
unsigned ILCodeSize;
unsigned short maxStack;
unsigned short EHcount;
CorInfoOptions options;
CORINFO_SIG_INFO args;
CORINFO_SIG_INFO locals;
};
The only thing code injectors have to do is to provide a valid MSIL pointer and size through two members of this structure: ILCode and ILCodeSize. Code injectors rely on the ILCode pointer to know which method is being requested. In fact, this pointer addresses the original MSIL code inside the .NET assembly. Many code injectors don't even need to know which method is being requested, as ILCode points to data which only needs to be decrypted.
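What such a filter boils down to can be sketched as follows. This is a minimal model under assumptions, not the code of any particular protection: `MethodInfoView` is a stand-in for the two relevant members of CORINFO_METHOD_INFO, and `RealBodies` and `SubstituteRealIL` are hypothetical names. The lookup is keyed by the placeholder's address, mirroring the idea that the ILCode pointer identifies the requested method.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Stand-in for the two CORINFO_METHOD_INFO members an injector cares about.
struct MethodInfoView {
    const std::uint8_t* ILCode;
    unsigned            ILCodeSize;
};

// Hypothetical store: address of the placeholder body in the assembly -> real MSIL.
using RealBodies = std::map<const std::uint8_t*, std::vector<std::uint8_t>>;

// Swaps in the real MSIL if the requested body is a known placeholder.
// Returns true when a substitution took place, false when the request is passed on as is.
bool SubstituteRealIL(MethodInfoView& info, const RealBodies& bodies)
{
    assert(info.ILCode != nullptr);
    const auto it = bodies.find(info.ILCode);
    if (it == bodies.end())
        return false;                        // not one of the protected methods
    info.ILCode     = it->second.data();
    info.ILCodeSize = static_cast<unsigned>(it->second.size());
    return true;
}
```

A hook would run this step on the structure it receives and then forward the call to the original compileMethod. An injector that decrypts in place would not need the map at all.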
The pointer to the vftable which contains the address of compileMethod is retrieved through the only API exported by mscorjit: getJit.
extern "C"
ICorJitCompiler* __stdcall getJit()
{
static char FJitBuff[sizeof(FJitCompiler)];
if (ILJitter == 0)
{
ILJitter = new(FJitBuff) FJitCompiler();
_ASSERTE(ILJitter != NULL);
}
return(ILJitter);
}
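The relationship between the object returned by getJit, its vftable and the compileMethod slot can be modeled without any framework code. The sketch below makes the layout explicit instead of relying on a compiler's vtable: every type and function in it (`FakeJit`, `FakeVTable`, `HookSlot0`, `RealCompile`, `LoggingCompile`) is a hypothetical stand-in, and the "compilation" is a placeholder computation.

```cpp
#include <cassert>

struct FakeJit;
using CompileFn = int (*)(FakeJit* self, int methodId);

// Explicit model of a vftable whose first slot is compileMethod.
struct FakeVTable {
    CompileFn compileMethod;
};

// Explicit model of the object: its first pointer-sized field addresses the table.
struct FakeJit {
    FakeVTable* vtbl;
};

// Replaces slot 0 and returns the previous entry so the hook can forward to it.
CompileFn HookSlot0(FakeJit* jit, CompileFn replacement)
{
    assert(jit != nullptr && jit->vtbl != nullptr);
    CompileFn original = jit->vtbl->compileMethod;
    jit->vtbl->compileMethod = replacement;
    return original;
}

static CompileFn g_original  = nullptr;  // saved by whoever installs the hook
static int       g_hookCalls = 0;

// Placeholder for the real work.
int RealCompile(FakeJit*, int methodId) { return methodId * 2; }

// The hook observes the request and then forwards it unchanged.
int LoggingCompile(FakeJit* self, int methodId)
{
    ++g_hookCalls;
    return g_original(self, methodId);
}
```

After `g_original = HookSlot0(&jit, &LoggingCompile);` every call made through the table reaches `LoggingCompile` first. In the real module the table sits in a read-only section, so the page protection has to be changed around the write.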
And this is about all that code injectors need to know to do their job. But we'll go further. The FJitCompiler class is this:
class FJitCompiler : public ICorJitCompiler
{
public:
CorJitResult __stdcall compileMethod (
ICorJitInfo* comp,
CORINFO_METHOD_INFO* info,
unsigned flags,
BYTE ** nativeEntry,
ULONG * nativeSizeOfCode
);
void __stdcall clearCache();
BOOL __stdcall isCacheCleanupRequired();
static BOOL Init();
static void Terminate();
private:
BOOL GetJitHelpers(ICorJitInfo* jitInfo);
};
ICorJitCompiler is only an interface, so we don't have to discuss it. compileMethod is, of course, the first member of the class. The idea that I could gain complete control of the JIT hit me pretty fast. It's a bit difficult to explain in words, but in about ten minutes I noticed that the correspondences between the disassembled mscorjit code and the Rotor code were just too many. So, I decided to include the header files necessary to use the ICorJitInfo class from the Rotor project. Actually, to use this class, only two header files were necessary: corinfo.h and corjit.h. All include files can be found in the path sscli20\clr\src\inc of the Rotor project, while the JIT code is in sscli20\clr\src\fjit. Here's a brief summary of what the main files in the JIT path contain:
fjit.cpp | The actual JIT. It's a huge file of code since it contains all the code to convert MSIL to native code, although the native code is defined externally. |
fjitcompiler.cpp | Contains the getJit and compileMethod functions. It's the interface provided by mscorjit to communicate with the JIT. |
Basically, ICorJitCompiler is the interface accessed by the Execution Engine to convert MSIL to native code. One of the arguments of compileMethod is ICorJitInfo, which is a class used by the JIT to call back to the Execution Engine in order to retrieve the information it needs. Having complete access to the ICorJitInfo class doesn't open up the whole framework for us, but it's a very good start. Let's have a look at what this class can do. Here's the base declaration:
class ICorJitInfo : public virtual ICorDynamicInfo
{
public:
virtual IEEMemoryManager* __stdcall getMemoryManager() = 0;
virtual void __stdcall allocMem (
ULONG hotCodeSize,
ULONG coldCodeSize,
ULONG roDataSize,
ULONG rwDataSize,
ULONG xcptnsCount,
CorJitAllocMemFlag flag,
void ** hotCodeBlock,
void ** coldCodeBlock,
void ** roDataBlock,
void ** rwDataBlock
) = 0;
virtual void * __stdcall allocGCInfo (
ULONG size
) = 0;
virtual void * __stdcall getEHInfo(
) = 0;
virtual void __stdcall yieldExecution() = 0;
virtual void __stdcall setEHcount (
unsigned cEH
) = 0;
virtual void __stdcall setEHinfo (
unsigned EHnumber,
const CORINFO_EH_CLAUSE *clause
) = 0;
virtual BOOL __cdecl logMsg(unsigned level, const char* fmt, va_list args) = 0;
virtual int __stdcall doAssert
(const char* szFile, int iLine, const char* szExpr) = 0;
struct ProfileBuffer
{
ULONG bbOffset;
ULONG bbCount;
};
virtual HRESULT __stdcall allocBBProfileBuffer (
ULONG size,
ProfileBuffer ** profileBuffer
) = 0;
virtual HRESULT __stdcall getBBProfileData(
CORINFO_METHOD_HANDLE ftnHnd,
ULONG * size,
ProfileBuffer ** profileBuffer,
ULONG * numRuns
) = 0;
};
This doesn't seem like much, but the ICorJitInfo class inherits from ICorDynamicInfo.
class ICorDynamicInfo : public virtual ICorStaticInfo
{
public:
virtual DWORD __stdcall getThreadTLSIndex(
void **ppIndirection = NULL
) = 0;
virtual const void * __stdcall getInlinedCallFrameVptr(
void **ppIndirection = NULL
) = 0;
virtual LONG * __stdcall getAddrOfCaptureThreadGlobal(
void **ppIndirection = NULL
) = 0;
virtual SIZE_T* __stdcall
getAddrModuleDomainID(CORINFO_MODULE_HANDLE module) = 0;
virtual void* __stdcall getHelperFtn (
CorInfoHelpFunc ftnNum,
void **ppIndirection = NULL,
InfoAccessModule *pAccessModule = NULL
) = 0;
virtual void __stdcall getFunctionEntryPoint(
CORINFO_METHOD_HANDLE ftn,
InfoAccessType requestedAccessType,
CORINFO_CONST_LOOKUP * pResult,
CORINFO_ACCESS_FLAGS accessFlags = CORINFO_ACCESS_ANY) = 0;
virtual void __stdcall getFunctionFixedEntryPointInfo(
CORINFO_MODULE_HANDLE scopeHnd,
unsigned metaTOK,
CORINFO_CONTEXT_HANDLE context,
CORINFO_LOOKUP * pResult) = 0;
virtual void* __stdcall getMethodSync(
CORINFO_METHOD_HANDLE ftn,
void **ppIndirection = NULL
) = 0;
virtual bool __stdcall canEmbedModuleHandleForHelper(
CORINFO_MODULE_HANDLE handle
) = 0;
virtual CORINFO_MODULE_HANDLE __stdcall embedModuleHandle(
CORINFO_MODULE_HANDLE handle,
void **ppIndirection = NULL
) = 0;
virtual CORINFO_CLASS_HANDLE __stdcall embedClassHandle(
CORINFO_CLASS_HANDLE handle,
void **ppIndirection = NULL
) = 0;
virtual CORINFO_METHOD_HANDLE __stdcall embedMethodHandle(
CORINFO_METHOD_HANDLE handle,
void **ppIndirection = NULL
) = 0;
virtual CORINFO_FIELD_HANDLE __stdcall embedFieldHandle(
CORINFO_FIELD_HANDLE handle,
void **ppIndirection = NULL
) = 0;
virtual void __stdcall embedGenericHandle(
CORINFO_MODULE_HANDLE module,
unsigned metaTOK,
CORINFO_CONTEXT_HANDLE context,
CorInfoTokenKind tokenKind,
CORINFO_GENERICHANDLE_RESULT *pResult) = 0;
virtual CORINFO_LOOKUP_KIND __stdcall getLocationOfThisType(
CORINFO_METHOD_HANDLE context
) = 0;
virtual void* __stdcall getPInvokeUnmanagedTarget(
CORINFO_METHOD_HANDLE method,
void **ppIndirection = NULL
) = 0;
virtual void* __stdcall getAddressOfPInvokeFixup(
CORINFO_METHOD_HANDLE method,
void **ppIndirection = NULL
) = 0;
virtual LPVOID GetCookieForPInvokeCalliSig(
CORINFO_SIG_INFO* szMetaSig,
void ** ppIndirection = NULL
) = 0;
virtual CORINFO_JUST_MY_CODE_HANDLE __stdcall getJustMyCodeHandle(
CORINFO_METHOD_HANDLE method,
CORINFO_JUST_MY_CODE_HANDLE**ppIndirection = NULL
) = 0;
virtual void __stdcall GetProfilingHandle(
CORINFO_METHOD_HANDLE method,
BOOL *pbHookFunction,
void **pEEHandle,
void **pProfilerHandle,
BOOL *pbIndirectedHandles
) = 0;
virtual unsigned __stdcall getInterfaceTableOffset (
CORINFO_CLASS_HANDLE cls,
void **ppIndirection = NULL
) = 0;
virtual void __stdcall getCallInfo(
CORINFO_METHOD_HANDLE methodBeingCompiledHnd,
CORINFO_MODULE_HANDLE tokenScope,
unsigned methodToken,
unsigned constraintToken,
CORINFO_CONTEXT_HANDLE tokenContext,
CORINFO_CALLINFO_FLAGS flags,
CORINFO_CALL_INFO *pResult) = 0;
virtual BOOL __stdcall isRIDClassDomainID(CORINFO_CLASS_HANDLE cls) = 0;
virtual unsigned __stdcall getClassDomainID (
CORINFO_CLASS_HANDLE cls,
void **ppIndirection = NULL
) = 0;
virtual size_t __stdcall getModuleDomainID (
CORINFO_MODULE_HANDLE module,
void **ppIndirection = NULL
) = 0;
virtual void* __stdcall getFieldAddress(
CORINFO_FIELD_HANDLE field,
void **ppIndirection = NULL
) = 0;
virtual CORINFO_VARARGS_HANDLE __stdcall getVarArgsHandle(
CORINFO_SIG_INFO *pSig,
void **ppIndirection = NULL
) = 0;
virtual InfoAccessType __stdcall constructStringLiteral(
CORINFO_MODULE_HANDLE module,
mdToken metaTok,
void **ppInfo
) = 0;
virtual DWORD __stdcall getFieldThreadLocalStoreID (
CORINFO_FIELD_HANDLE field,
void **ppIndirection = NULL
) = 0;
virtual CORINFO_CLASS_HANDLE __stdcall findMethodClass(
CORINFO_MODULE_HANDLE module,
mdToken methodTok,
CORINFO_METHOD_HANDLE context
) = 0;
virtual void __stdcall setOverride(
ICorDynamicInfo *pOverride
) = 0;
virtual void __stdcall addActiveDependency(
CORINFO_MODULE_HANDLE moduleFrom,
CORINFO_MODULE_HANDLE moduleTo
) = 0;
virtual CORINFO_METHOD_HANDLE __stdcall GetDelegateCtor(
CORINFO_METHOD_HANDLE methHnd,
CORINFO_CLASS_HANDLE clsHnd,
CORINFO_METHOD_HANDLE targetMethodHnd,
DelegateCtorArgs * pCtorData
) = 0;
virtual void __stdcall MethodCompileComplete(
CORINFO_METHOD_HANDLE methHnd
) = 0;
};
Now this already looks much more interesting. But we aren't done yet, as the ICorDynamicInfo class inherits from ICorStaticInfo. The ICorStaticInfo class in turn inherits from many classes:
class ICorStaticInfo : public virtual ICorMethodInfo, public virtual ICorModuleInfo,
public virtual ICorClassInfo, public virtual ICorFieldInfo,
public virtual ICorDebugInfo, public virtual ICorArgInfo,
public virtual ICorLinkInfo, public virtual ICorErrorInfo
{
public:
virtual void __stdcall getEEInfo(
CORINFO_EE_INFO *pEEInfoOut
) = 0;
};
Let's look at just one of them (ICorMethodInfo):
class ICorMethodInfo
{
public:
virtual const char* __stdcall getMethodName (
CORINFO_METHOD_HANDLE ftn,
const char **moduleName
) = 0;
virtual unsigned __stdcall getMethodHash (
CORINFO_METHOD_HANDLE ftn
) = 0;
virtual DWORD __stdcall getMethodAttribs (
CORINFO_METHOD_HANDLE calleeHnd,
CORINFO_METHOD_HANDLE callerHnd
) = 0;
virtual void __stdcall setMethodAttribs (
CORINFO_METHOD_HANDLE ftn,
CorInfoMethodRuntimeFlags attribs
) = 0;
virtual void __stdcall getMethodSig (
CORINFO_METHOD_HANDLE ftn,
CORINFO_SIG_INFO *sig,
CORINFO_CLASS_HANDLE memberParent = NULL
) = 0;
virtual bool __stdcall getMethodInfo (
CORINFO_METHOD_HANDLE ftn,
CORINFO_METHOD_INFO* info
) = 0;
virtual CorInfoInline __stdcall canInline (
CORINFO_METHOD_HANDLE callerHnd,
CORINFO_METHOD_HANDLE calleeHnd,
DWORD* pRestrictions
) = 0;
virtual bool __stdcall canTailCall (
CORINFO_METHOD_HANDLE callerHnd,
CORINFO_METHOD_HANDLE calleeHnd,
bool fIsTailPrefix
) = 0;
virtual bool __stdcall canSkipMethodPreparation (
CORINFO_METHOD_HANDLE callerHnd,
CORINFO_METHOD_HANDLE calleeHnd,
bool fCheckCode,
CorInfoIndirectCallReason *pReason = NULL,
CORINFO_ACCESS_FLAGS accessFlags = CORINFO_ACCESS_ANY) = 0;
virtual bool __stdcall canCallDirectViaEntryPointThunk (
CORINFO_METHOD_HANDLE calleeHnd,
void ** pEntryPoint
) = 0;
virtual void __stdcall getEHinfo(
CORINFO_METHOD_HANDLE ftn,
unsigned EHnumber,
CORINFO_EH_CLAUSE* clause
) = 0;
virtual CORINFO_CLASS_HANDLE __stdcall getMethodClass (
CORINFO_METHOD_HANDLE method
) = 0;
virtual CORINFO_MODULE_HANDLE __stdcall getMethodModule (
CORINFO_METHOD_HANDLE method
) = 0;
virtual unsigned __stdcall getMethodVTableOffset (
CORINFO_METHOD_HANDLE method
) = 0;
virtual CorInfoIntrinsics __stdcall getIntrinsicID(
CORINFO_METHOD_HANDLE method
) = 0;
virtual CorInfoUnmanagedCallConv __stdcall getUnmanagedCallConv(
CORINFO_METHOD_HANDLE method
) = 0;
virtual BOOL __stdcall pInvokeMarshalingRequired(
CORINFO_METHOD_HANDLE method,
CORINFO_SIG_INFO* callSiteSig
) = 0;
virtual BOOL __stdcall canAccessMethod(
CORINFO_METHOD_HANDLE context,
CORINFO_CLASS_HANDLE parent,
CORINFO_METHOD_HANDLE target,
CORINFO_CLASS_HANDLE instance
) = 0;
virtual BOOL __stdcall satisfiesMethodConstraints(
CORINFO_CLASS_HANDLE parent, CORINFO_METHOD_HANDLE method
) = 0;
virtual BOOL __stdcall isCompatibleDelegate(
CORINFO_CLASS_HANDLE objCls,
CORINFO_CLASS_HANDLE methodParentCls,
CORINFO_METHOD_HANDLE method,
CORINFO_CLASS_HANDLE delegateCls,
CORINFO_MODULE_HANDLE moduleHnd,
unsigned methodMemberRef,
unsigned delegateConstructorMemberRef
) = 0;
virtual CorInfoInstantiationVerification __stdcall isInstantiationOfVerifiedGeneric (
CORINFO_METHOD_HANDLE method
) = 0;
virtual void __stdcall initConstraintsForVerification(
CORINFO_METHOD_HANDLE method,
BOOL *pfHasCircularClassConstraints,
BOOL *pfHasCircularMethodConstraint
) = 0;
virtual CorInfoCanSkipVerificationResult __stdcall canSkipMethodVerification (
CORINFO_METHOD_HANDLE ftnHandle,
BOOL fQuickCheckOnly
) = 0;
virtual CorInfoIsCallAllowedResult __stdcall isCallAllowed (
CORINFO_METHOD_HANDLE callerHnd, CORINFO_METHOD_HANDLE calleeHnd, CORINFO_CALL_ALLOWED_INFO * CallAllowedInfo ) = 0;
virtual void __stdcall methodMustBeLoadedBeforeCodeIsRun(
CORINFO_METHOD_HANDLE method
) = 0;
virtual CORINFO_METHOD_HANDLE __stdcall mapMethodDeclToMethodImpl(
CORINFO_METHOD_HANDLE method
) = 0;
virtual void __stdcall getGSCookie(
GSCookie * pCookieVal, GSCookie ** ppCookieVal ) = 0;
};
As you can see, the ICorMethodInfo class contains many methods which take one or more CORINFO_METHOD_HANDLE parameters. ICorModuleInfo has methods which take CORINFO_MODULE_HANDLE parameters. These handles should be discussed, because all the JIT inner workings rely on them. Here's their declaration:
typedef struct CORINFO_ASSEMBLY_STRUCT_* CORINFO_ASSEMBLY_HANDLE;
typedef struct CORINFO_MODULE_STRUCT_* CORINFO_MODULE_HANDLE;
typedef struct CORINFO_DEPENDENCY_STRUCT_* CORINFO_DEPENDENCY_HANDLE;
typedef struct CORINFO_CLASS_STRUCT_* CORINFO_CLASS_HANDLE;
typedef struct CORINFO_METHOD_STRUCT_* CORINFO_METHOD_HANDLE;
typedef struct CORINFO_FIELD_STRUCT_* CORINFO_FIELD_HANDLE;
typedef struct CORINFO_ARG_LIST_STRUCT_* CORINFO_ARG_LIST_HANDLE;
typedef struct CORINFO_SIG_STRUCT_* CORINFO_SIG_HANDLE;
typedef struct CORINFO_JUST_MY_CODE_HANDLE_*CORINFO_JUST_MY_CODE_HANDLE;
typedef struct CORINFO_PROFILING_STRUCT_* CORINFO_PROFILING_HANDLE;
typedef DWORD* CORINFO_SHAREDMODULEID_HANDLE;
typedef struct CORINFO_GENERIC_STRUCT_* CORINFO_GENERIC_HANDLE;
The structures are not defined in the code: as the comment states, they're "opaque". Actually, these handles are just pointers, as we'll see later. What should be understood now is that the methods of the JIT use them to identify things. I pasted all the declarations above to give the reader an idea of the kind of power over the JIT given by the two includes I mentioned earlier. Let's have a look, for instance, at the first method of ICorMethodInfo:
virtual const char* __stdcall getMethodName (
CORINFO_METHOD_HANDLE ftn,
const char **moduleName
) = 0;
This function retrieves the method's name and class. And I'll use it to give a basic example of how to hook the JIT and retrieve basic information from it. But first, I have to introduce another thing: the .NET assembly loader.
As we need to hook the JIT before the victim assembly is jitted, the best way to do this is to load the victim assembly from another assembly which in the meantime has already hooked the JIT. I call this the loader and this method works only when the protection isn't wrapping the assembly into a native executable. In that case, you might consider adding the hook DLL to the import table of the native EXE or creating the victim process in a suspended state, injecting the DLL and then resuming the execution.
There are several ways to load an assembly into the current address space. Unfortunately, the most used ones often do not work in all cases. For example, a common way to do this is to use the Assembly.Load/LoadFrom functions. This approach will cause the application to crash if the hosted assembly needs its own appdomain.
This is the very simple approach I'm using in this article:
namespace rbloader
{
static class Program
{
[DllImport("rbcoree.dll", CallingConvention=CallingConvention.Cdecl)]
static extern void HookJIT();
[STAThread]
static void Main(string[] args)
{
Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
OpenFileDialog openFileDialog = new OpenFileDialog();
openFileDialog.Filter = "Exe Files (*.exe)|*.exe|Dll Files (*.dll)|*.dll|All Files (*.*)|*.*";
if (openFileDialog.ShowDialog() == DialogResult.OK)
{
HookJIT();
AppDomain ad = AppDomain.CreateDomain("subAppDomain");
ad.ExecuteAssembly(openFileDialog.FileName);
}
}
}
}
Keep in mind that with certain protections this code loading method won't work. So, you have to evaluate each case (since it depends on how the injector is implemented) and find a way to hook the JIT before the protection does.
Also, the first executed assembly decides the .NET Framework platform, so the loader should be compiled accordingly. The platform on which an assembly has to run can be chosen from the Visual Studio options. If you choose a 64-bit platform, the PE of the assembly will be a 64-bit PE which can run only on the platform it was meant for, whereas 32-bit PEs can run on every platform. To force a 32-bit PE to use the x86 framework, a flag has to be set in the .NET Directory's Flags field:
The code I wrote for this article is 64-bit compatible, but I only tested it on x86. Thus, the loader has the 32-bit flag set. To use the loader in a 64-bit context, you have to unset this flag.
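For readers who prefer to toggle the flag programmatically rather than in a PE editor, here is a minimal sketch. It assumes the 72-byte .NET Directory (IMAGE_COR20_HEADER) has already been located and copied into a buffer; locating it through the PE data directories is not shown. The helper names are hypothetical, while the flag value is the one CorHdr.h defines for COMIMAGE_FLAGS_32BITREQUIRED.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Value of COMIMAGE_FLAGS_32BITREQUIRED in CorHdr.h.
constexpr std::uint32_t kCorFlag32BitRequired = 0x00000002;

// Byte offset of the Flags member inside IMAGE_COR20_HEADER
// (cb: 4 bytes, runtime version: 2 + 2 bytes, MetaData directory: 8 bytes).
constexpr std::size_t kCorFlagsOffset = 16;

// Reads the little-endian Flags dword from a raw copy of the header.
std::uint32_t ReadCorFlags(const std::uint8_t* cor20, std::size_t size)
{
    assert(size >= kCorFlagsOffset + 4);
    const std::uint8_t* p = cor20 + kCorFlagsOffset;
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Writes the Flags dword back in little-endian order.
void WriteCorFlags(std::uint8_t* cor20, std::size_t size, std::uint32_t flags)
{
    assert(size >= kCorFlagsOffset + 4);
    for (int i = 0; i < 4; ++i)
        cor20[kCorFlagsOffset + i] = static_cast<std::uint8_t>(flags >> (8 * i));
}

constexpr bool Requires32Bit(std::uint32_t flags)
{
    return (flags & kCorFlag32BitRequired) != 0;
}

// Returns the flags with the 32-bit requirement set or cleared.
constexpr std::uint32_t With32BitRequired(std::uint32_t flags, bool required)
{
    return required ? (flags | kCorFlag32BitRequired)
                    : (flags & ~kCorFlag32BitRequired);
}
```

Clearing the bit with `With32BitRequired(flags, false)` corresponds to unsetting the flag as described above.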
What I'm going to present here is a little example of how to hook the JIT and retrieve information from it. It shows the class and method name of each jitted method and of each method that this method calls. This is a good time to introduce the fact that the compileMethod function not only jits the method it's supposed to, but also all the methods called by that method and all the methods called by those methods etc. This is obviously so, as a call can't be jitted if the called method hasn't already been jitted along with its submethods etc. However, even if this is pretty obvious, it is worth keeping that in mind. To disassemble the method's MSIL, I use my DisasMSIL engine.
#include "stdafx.h"
#include <tchar.h>
#include <CorHdr.h>
#include "corinfo.h"
#include "corjit.h"
#include "DisasMSIL.h"
HINSTANCE hInstance;
extern "C" __declspec(dllexport) void HookJIT();
VOID DisplayMethodAndCalls(ICorJitInfo *comp, CORINFO_METHOD_INFO *info);
BOOL APIENTRY DllMain( HMODULE hModule,
DWORD dwReason,
LPVOID lpReserved
)
{
hInstance = (HINSTANCE) hModule;
HookJIT();
return TRUE;
}
BOOL bHooked = FALSE;
ULONG_PTR *(__stdcall *p_getJit)();
typedef int (__stdcall *compileMethod_def)(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode);
struct JIT
{
compileMethod_def compileMethod;
};
compileMethod_def compileMethod;
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode);
extern "C" __declspec(dllexport) void HookJIT()
{
if (bHooked) return;
LoadLibrary(_T("mscoree.dll"));
HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));
if (!hJitMod)
return;
p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");
if (p_getJit)
{
JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());
if (pJit)
{
DWORD OldProtect;
// The vftable lives in .text, so keep the page executable while it is writable.
VirtualProtect(pJit, sizeof (ULONG_PTR), PAGE_EXECUTE_READWRITE, &OldProtect);
compileMethod = pJit->compileMethod;
pJit->compileMethod = &my_compileMethod;
VirtualProtect(pJit, sizeof (ULONG_PTR), OldProtect, &OldProtect);
bHooked = TRUE;
}
}
}
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode)
{
#ifdef _M_IX86
__asm
{
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
}
#endif
int nRet =
compileMethod(classthis, comp, info, flags, nativeEntry, nativeSizeOfCode);
DisplayMethodAndCalls(comp, info);
return nRet;
}
VOID DisplayMethodAndCalls(ICorJitInfo *comp,
CORINFO_METHOD_INFO *info)
{
const char *szMethodName = NULL;
const char *szClassName = NULL;
szMethodName = comp->getMethodName(info->ftn, &szClassName);
char CurMethod[200];
sprintf_s(CurMethod, 200, "%s::%s", szClassName, szMethodName);
char Calls[0x1000];
strcpy_s(Calls, 0x1000, "Methods called:\r\n\r\n");
#define MAX_INSTR 100
ILOPCODE_STRUCT ilopar[MAX_INSTR];
DISASMSIL_OFFSET CodeBase = 0;
BYTE *pCur = info->ILCode;
UINT nSize = info->ILCodeSize;
UINT nDisasmedInstr;
while (DisasMSIL(pCur, nSize, CodeBase, ilopar, MAX_INSTR,
&nDisasmedInstr))
{
for (UINT x = 0; x < nDisasmedInstr; x++)
{
if (info->ILCode[ilopar[x].Offset] == ILOPCODE_CALL)
{
DWORD dwToken = *((DWORD *) &info->ILCode[ilopar[x].Offset + 1]);
CORINFO_METHOD_HANDLE hCallHandle =
comp->findMethod(info->scope, dwToken, info->ftn);
szMethodName = comp->getMethodName(hCallHandle, &szClassName);
strcat_s(Calls, 0x1000, szClassName);
strcat_s(Calls, 0x1000, "::");
strcat_s(Calls, 0x1000, szMethodName);
strcat_s(Calls, 0x1000, "\r\n");
}
}
if (nDisasmedInstr < MAX_INSTR) break;
DISASMSIL_OFFSET next = ilopar[nDisasmedInstr - 1].Offset - CodeBase;
next += ilopar[nDisasmedInstr - 1].Size;
pCur += next;
nSize -= next;
CodeBase += next;
}
MessageBoxA(0, Calls, CurMethod, MB_ICONINFORMATION);
}
The findMethod function has some limitations which I will talk about later. The output of this little example is a series of message boxes informing the user about the method currently being jitted and its calls.
As you can see, I didn't include the callvirt and calli opcodes in the search. This was only for the sake of simplicity; there's no other reason. If you want to create a complete logger, you have to consider those opcodes as well.
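Extending the search to those opcodes mostly means widening the opcode test. The sketch below is a minimal illustration with hypothetical helper names. The byte values are the single-byte encodings of call, calli and callvirt, each followed by a 4-byte little-endian metadata token.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Single-byte MSIL opcodes that transfer control to another method.
constexpr std::uint8_t kIlCall     = 0x28;
constexpr std::uint8_t kIlCalli    = 0x29;
constexpr std::uint8_t kIlCallvirt = 0x6F;

constexpr bool IsCallOpcode(std::uint8_t opcode)
{
    return opcode == kIlCall || opcode == kIlCalli || opcode == kIlCallvirt;
}

// Reads the 4-byte little-endian token that follows the one-byte opcode at `offset`.
std::uint32_t ReadInlineToken(const std::uint8_t* il, std::size_t size, std::size_t offset)
{
    assert(offset + 5 <= size);              // opcode byte plus four token bytes
    const std::uint8_t* p = il + offset + 1;
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

One caveat: the token after calli refers to a stand-alone signature rather than to a method, so it cannot be resolved to a method name the way call and callvirt tokens can. A complete logger would have to treat it separately.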
When writing a dumper, or better a code ejector, for a code injecting protection, one has to choose how to proceed in order to collect the original MSIL code. There are two ways to proceed: either "stealth" or "brute". By stealth I mean dumping the MSIL only of those methods which have been jitted during execution. If you proceed that way, you'll need to retrieve the token of the method along with the MSIL code, in order to rebuild the assembly afterwards with Rebel.NET. There's a very useful function to retrieve the token from a CORINFO_METHOD_HANDLE:
comp->getMethodDefFromMethod(info->ftn)
The stealth method will always work 100%, but it has the disadvantage that one can't be sure that all the methods in an assembly will be jitted. However, it is worth mentioning, since in some cases the goal to achieve is to dump just a few methods in order to decompile and analyze them.
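The bookkeeping the stealth approach needs is small: one copy of the MSIL per method token. The sketch below is a minimal illustration under assumptions. `JittedBodyLog` is a hypothetical class, and the token and bytes would come from getMethodDefFromMethod and from ILCode/ILCodeSize inside the hook.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Collects the MSIL of every method seen by the hook, keyed by method token.
class JittedBodyLog {
public:
    // Copies the bytes, since the buffer may not stay valid after compileMethod returns.
    void Record(std::uint32_t methodToken, const std::uint8_t* il, unsigned size)
    {
        assert(il != nullptr);
        bodies_[methodToken].assign(il, il + size);  // a repeated token overwrites
    }

    // Returns the stored body, or nullptr if the method was never jitted.
    const std::vector<std::uint8_t>* Find(std::uint32_t methodToken) const
    {
        const auto it = bodies_.find(methodToken);
        return it == bodies_.end() ? nullptr : &it->second;
    }

    std::size_t Count() const { return bodies_.size(); }

private:
    std::map<std::uint32_t, std::vector<std::uint8_t>> bodies_;
};
```

Methods that were never jitted simply have no entry, which is the limitation described above.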
The other way to eject the MSIL code I called "brute". By brute I mean forcing the protection to decrypt every method in a .NET assembly at once. What is necessary is to collect the CORINFO_METHOD_INFO data (or at least ILCode and ILCodeSize), then retrieve compileMethod through the getJit function. By now compileMethod has already been hooked by the protection. Thus, calling it means decrypting the MSIL data. When the protection's compileMethod has decrypted the MSIL code, it will call the code ejector's compileMethod, which will notice by checking the parameters that this is the code ejection process and won't call the real compileMethod function.
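The chain of calls in the brute approach can be modeled in a few lines. Everything in this sketch is a stand-in: the types are hypothetical, the "protection" is a toy that XORs each byte with 0x5A, and a null callback pointer is used as the marker for an ejection request. A real protection may touch the callback object before forwarding, in which case a different marker would be needed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct MethodInfoView {                      // stand-in for CORINFO_METHOD_INFO
    const std::uint8_t* ILCode;
    unsigned            ILCodeSize;
};
struct JitCallbacks {};                      // stand-in for ICorJitInfo

static std::vector<std::vector<std::uint8_t>> g_ejected;
static int g_realCompiles = 0;

// Placeholder for the original compileMethod.
int RealCompile(JitCallbacks*, MethodInfoView*) { ++g_realCompiles; return 0; }

// The ejector's hook: installed before the protection's, so it sits between
// the protection and the real JIT.
int EjectorCompile(JitCallbacks* comp, MethodInfoView* info)
{
    assert(info != nullptr);
    if (comp == nullptr) {                   // marker: this is an ejection request
        g_ejected.emplace_back(info->ILCode, info->ILCode + info->ILCodeSize);
        return 0;                            // the real JIT is never reached
    }
    return RealCompile(comp, info);
}

// Toy protection: decrypts into a temporary buffer and forwards to the next hook.
int ProtectionCompile(JitCallbacks* comp, MethodInfoView* info)
{
    std::vector<std::uint8_t> plain(info->ILCode, info->ILCode + info->ILCodeSize);
    for (std::uint8_t& b : plain)
        b ^= 0x5A;
    MethodInfoView decrypted{ plain.data(), static_cast<unsigned>(plain.size()) };
    return EjectorCompile(comp, &decrypted);
}
```

Calling `ProtectionCompile(nullptr, &info)` for every method makes the protection do the decryption while the ejector keeps the result. Ordinary requests with a valid callback pointer still reach the real JIT.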
The code ejector I wrote also features a little dumper of the jitted assemblies. This is quite useful, since most times you have to dump the protected assembly. This can also be achieved by a generic .NET unpacker.
As you can see from the image, the dialog contains a "Generate Rebel File" button. When this button is pressed, a Rebel.NET report file is requested as input. Basically, one first needs to dump (if necessary) the assembly to rebuild. Then, use Rebel.NET to create a report file from that assembly. Only the methods should be included in the report file:
This report file will be used during the code ejection process to calculate the number of methods and to retrieve the MSIL code address and size. Actually, the JIT can be used to retrieve this information, but there's a problem. As you may have noticed, findMethod takes three arguments, the last of which is a CORINFO_METHOD_HANDLE called hContext. This handle is used to check the context of the request. If a method tries to access another method without being entitled to, the framework will show an error and terminate the process. The main goal is to obtain a valid CORINFO_METHOD_HANDLE for every method token in the assembly. As I said earlier, these kinds of handles are just pointers. Let's have a look at the memory pointed to by a CORINFO_METHOD_HANDLE:
03B8E1F0 01 00 00 3B 74 01 00 00 ...;t...
03B8E200 02 00 01 08 0D 00 00 00 03 00 02 08 75 01 00 00 ............u...
03B8E210 04 00 03 39 76 01 20 00 05 00 04 08 77 01 00 00 ...9v. .....w...
The first word seems to represent the method's number, and the next method follows after 8 bytes. So, in theory, it might even be possible to calculate the right CORINFO_METHOD_HANDLE given a method's token. After having retrieved the CORINFO_METHOD_HANDLE, it would only be a matter of calling getMethodInfo to get the CORINFO_METHOD_INFO structure. Note: I'll discuss later what the CORINFO_METHOD_HANDLE really represents: this topic needs a paragraph of its own.
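The observation about the dump can be checked mechanically. The sketch below reads the first 16-bit word of every 8-byte record in the bytes shown above. The 8-byte stride and the meaning of the first word are assumptions drawn from this single dump, not a documented layout, and `LeadingWords` and `RecordOffset` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Record stride observed in the dump (an assumption, not a documented layout).
constexpr std::size_t kRecordSize = 8;

// The 40 bytes visible in the dump above, in order.
const std::uint8_t kDumpFromArticle[] = {
    0x01, 0x00, 0x00, 0x3B, 0x74, 0x01, 0x00, 0x00,
    0x02, 0x00, 0x01, 0x08, 0x0D, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x02, 0x08, 0x75, 0x01, 0x00, 0x00,
    0x04, 0x00, 0x03, 0x39, 0x76, 0x01, 0x20, 0x00,
    0x05, 0x00, 0x04, 0x08, 0x77, 0x01, 0x00, 0x00,
};

// Returns the first little-endian 16-bit word of each record.
std::vector<std::uint16_t> LeadingWords(const std::uint8_t* mem, std::size_t size)
{
    assert(size % kRecordSize == 0);
    std::vector<std::uint16_t> words;
    for (std::size_t off = 0; off < size; off += kRecordSize)
        words.push_back(static_cast<std::uint16_t>(mem[off] | (mem[off + 1] << 8)));
    return words;
}

// Hypothetical: byte offset of the record for a 1-based method number,
// relative to the record of method 1.
constexpr std::size_t RecordOffset(std::uint32_t methodNumber)
{
    return (methodNumber - 1) * kRecordSize;
}
```

On these bytes the leading words come out as 1 through 5, which fits the reading that consecutive records belong to consecutive methods. Whether that holds across classes and modules is exactly what would need to be verified before computing handles this way.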
But, as said, this is not the way I proceeded. I used a Rebel.NET report file to retrieve the necessary data. This approach could obviously be changed. What follows is the code of the .NET code ejector.
#include "stdafx.h"
#include <CommCtrl.h>
#include <CommDlg.h>
#include <tlhelp32.h>
#include <tchar.h>
#include <CorHdr.h>
#include "corinfo.h"
#include "corjit.h"
#include "RebelDotNET.h"
#include "resource.h"
#ifndef PAGE_SIZE
#define PAGE_SIZE 0x1000
#endif
#define IS_FLAG(Value, Flag) ((Value & Flag) == Flag)
HINSTANCE hInstance;
extern "C" __declspec(dllexport) void HookJIT();
VOID ListThread();
BOOL APIENTRY DllMain( HMODULE hModule,
DWORD dwReason,
LPVOID lpReserved
)
{
hInstance = (HINSTANCE) hModule;
HookJIT();
if (dwReason == DLL_PROCESS_ATTACH)
{
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE) ListThread,
NULL, 0, NULL);
}
return TRUE;
}
extern "C" __declspec(dllexport) int __stdcall _CorExeMain(void)
{
return 0;
}
BOOL bHooked = FALSE;
ULONG_PTR *(__stdcall *p_getJit)();
typedef int (__stdcall *compileMethod_def)(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode);
struct JIT
{
compileMethod_def compileMethod;
};
compileMethod_def compileMethod;
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode);
extern "C" __declspec(dllexport) void HookJIT()
{
if (bHooked) return;
LoadLibrary(_T("mscoree.dll"));
HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));
if (!hJitMod)
return;
p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");
if (p_getJit)
{
JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());
if (pJit)
{
DWORD OldProtect;
// The vftable lives in .text, so keep the page executable while it is writable.
VirtualProtect(pJit, sizeof (ULONG_PTR), PAGE_EXECUTE_READWRITE, &OldProtect);
compileMethod = pJit->compileMethod;
pJit->compileMethod = &my_compileMethod;
VirtualProtect(pJit, sizeof (ULONG_PTR), OldProtect, &OldProtect);
bHooked = TRUE;
}
}
}
struct AssemblyInfo
{
CORINFO_MODULE_HANDLE hCorModule;
WCHAR AssemblyName[MAX_PATH];
VOID *ImgBase;
UINT ImgSize;
BOOL bIdentified;
HANDLE hRebReport;
BOOL bDump;
TCHAR DumpFileName[MAX_PATH];
} LoggedAssemblies[100];
UINT NumberOfLoggedAssemblies = 0;
VOID LogAssembly(ICorJitInfo *comp, CORINFO_METHOD_INFO *info);
DWORD GetTokenFromMethodHandle(ICorJitInfo *comp, CORINFO_METHOD_INFO *info);
VOID AddMethod(CORINFO_METHOD_INFO *mi);
BOOL CreateRebFile(AssemblyInfo *ai);
int __stdcall my_compileMethod(ULONG_PTR classthis, ICorJitInfo *comp,
CORINFO_METHOD_INFO *info, unsigned flags,
BYTE **nativeEntry, ULONG *nativeSizeOfCode)
{
#ifdef _M_IX86
__asm
{
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
}
#endif
if (comp == NULL)
{
AddMethod(info);
return 0;
}
LogAssembly(comp, info);
int nRet =
compileMethod(classthis, comp, info, flags, nativeEntry, nativeSizeOfCode);
return nRet;
}
VOID AddressToModuleInfo(VOID *pAddress, WCHAR *AssemblyName,
VOID **pImgBase, UINT *pImgSize,
BOOL *pbIdentified)
{
DWORD dwPID = GetCurrentProcessId();
HANDLE hModuleSnap = INVALID_HANDLE_VALUE;
MODULEENTRY32 me32;
static BOOL bFirstUnkAsm = TRUE;
hModuleSnap = CreateToolhelp32Snapshot( TH32CS_SNAPMODULE, dwPID );
if (hModuleSnap == INVALID_HANDLE_VALUE)
return;
me32.dwSize = sizeof (MODULEENTRY32);
if (!Module32First(hModuleSnap, &me32 ))
{
CloseHandle(hModuleSnap);
return;
}
do
{
if (((ULONG_PTR) pAddress) > ((ULONG_PTR) me32.modBaseAddr) &&
((ULONG_PTR) pAddress) < (((ULONG_PTR) me32.modBaseAddr) +
me32.modBaseSize))
{
if (pImgBase) *pImgBase = (VOID *) me32.modBaseAddr;
if (pImgSize) *pImgSize = me32.modBaseSize;
wcscpy_s(AssemblyName, MAX_PATH, me32.szExePath);
if (pbIdentified) *pbIdentified = TRUE;
// release the snapshot handle before leaving the function
CloseHandle(hModuleSnap);
return;
}
} while (Module32Next(hModuleSnap, &me32));
CloseHandle(hModuleSnap);
if (pbIdentified) *pbIdentified = FALSE;
MEMORY_BASIC_INFORMATION mbi = { 0 };
VirtualQuery(pAddress, &mbi, sizeof (MEMORY_BASIC_INFORMATION));
if (pImgBase) *pImgBase = mbi.AllocationBase;
DWORD ImgSize = 0;
__try
{
IMAGE_DOS_HEADER *pDosHeader = (IMAGE_DOS_HEADER *)
mbi.AllocationBase;
if (pDosHeader->e_magic == IMAGE_DOS_SIGNATURE)
{
IMAGE_NT_HEADERS *pNtHeaders = (IMAGE_NT_HEADERS *)
(pDosHeader->e_lfanew + (ULONG_PTR) pDosHeader);
if (pNtHeaders->Signature == IMAGE_NT_SIGNATURE)
{
ImgSize = pNtHeaders->OptionalHeader.SizeOfImage;
}
}
}
__except (EXCEPTION_EXECUTE_HANDLER)
{
goto endinfo;
}
endinfo:
if (pImgSize) *pImgSize = ImgSize;
if (bFirstUnkAsm)
{
wsprintfW(AssemblyName, L"Base: %p - Size: %08X - Primary Assembly",
mbi.AllocationBase, ImgSize);
bFirstUnkAsm = FALSE;
}
else
{
wsprintfW(AssemblyName, L"Base: %p - Size: %08X - unidentfied",
mbi.AllocationBase, ImgSize);
}
}
VOID LogAssembly(ICorJitInfo *comp, CORINFO_METHOD_INFO *info)
{
for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
{
if (LoggedAssemblies[x].hCorModule == info->scope)
return;
}
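// the table has a fixed size: stop logging once it is full
if (NumberOfLoggedAssemblies >= sizeof (LoggedAssemblies) / sizeof (LoggedAssemblies[0]))
return;
AddressToModuleInfo(info->ILCode,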
LoggedAssemblies[NumberOfLoggedAssemblies].AssemblyName,
&LoggedAssemblies[NumberOfLoggedAssemblies].ImgBase,
&LoggedAssemblies[NumberOfLoggedAssemblies].ImgSize,
&LoggedAssemblies[NumberOfLoggedAssemblies].bIdentified);
LoggedAssemblies[NumberOfLoggedAssemblies].hCorModule = info->scope;
LoggedAssemblies[NumberOfLoggedAssemblies].bDump = FALSE;
NumberOfLoggedAssemblies++;
}
LRESULT CALLBACK ListDlgProc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam);
VOID ListThread()
{
Sleep(2000);
InitCommonControls();
DialogBox(hInstance, MAKEINTRESOURCE(IDD_ASMLIST), NULL, (DLGPROC) ListDlgProc);
}
LRESULT CALLBACK ListDlgProc(HWND hDlg, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
switch (uMsg)
{
case WM_INITDIALOG:
{
HWND hList = GetDlgItem(hDlg, LST_ASMS);
LV_COLUMN lvc;
ZeroMemory(&lvc, sizeof (LV_COLUMN));
lvc.mask = LVCF_FMT | LVCF_WIDTH | LVCF_TEXT | LVCF_SUBITEM;
lvc.fmt = LVCFMT_LEFT;
lvc.cx = 500;
lvc.pszText = _T("Assembly Path");
ListView_InsertColumn(hList, 0, &lvc);
SendMessage(hList, LVM_SETEXTENDEDLISTVIEWSTYLE,
LVS_EX_FULLROWSELECT | LVS_EX_INFOTIP,
LVS_EX_FULLROWSELECT | LVS_EX_INFOTIP);
SendMessage(hDlg, WM_COMMAND, IDC_REFRESH, 0);
break;
}
case WM_CLOSE:
{
EndDialog(hDlg, 0);
break;
}
case WM_COMMAND:
{
switch (LOWORD(wParam))
{
case IDC_REFRESH:
{
HWND hList = GetDlgItem(hDlg, LST_ASMS);
ListView_DeleteAllItems(hList);
LV_ITEM lvi;
ZeroMemory(&lvi, sizeof (LV_ITEM));
lvi.mask = LVIF_TEXT | LVIF_STATE | LVIF_PARAM;
for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
{
lvi.lParam = (LPARAM) LoggedAssemblies[x].hCorModule;
lvi.pszText = LoggedAssemblies[x].AssemblyName;
ListView_InsertItem(hList, &lvi);
}
break;
}
case IDC_DUMPASM:
{
HWND hList = GetDlgItem(hDlg, LST_ASMS);
int nSel = ListView_GetNextItem(hList, -1, LVNI_SELECTED);
if (nSel == -1) break;
OPENFILENAME SaveFileName;
TCHAR DumpFileName[MAX_PATH];
ZeroMemory(DumpFileName, MAX_PATH * sizeof (TCHAR));
ZeroMemory(&SaveFileName, sizeof (OPENFILENAME));
SaveFileName.lStructSize = sizeof (OPENFILENAME);
SaveFileName.hwndOwner = hDlg;
SaveFileName.lpstrFilter = _T("All Files (*.*)\0*.*\0");
SaveFileName.lpstrFile = DumpFileName;
SaveFileName.nMaxFile = MAX_PATH;
SaveFileName.lpstrTitle = _T("Save Assembly As...");
if (!GetSaveFileName(&SaveFileName))
break;
LV_ITEM lvi;
ZeroMemory(&lvi, sizeof (LV_ITEM));
lvi.mask = LVIF_PARAM;
lvi.iItem = nSel;
ListView_GetItem(hList, &lvi);
for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
{
if (LoggedAssemblies[x].hCorModule ==
(CORINFO_MODULE_HANDLE) lvi.lParam)
{
HANDLE hFile = CreateFile(DumpFileName, GENERIC_WRITE,
FILE_SHARE_READ, NULL, CREATE_ALWAYS, 0, NULL);
if (hFile == INVALID_HANDLE_VALUE)
break;
DWORD dwOldProtect;
VirtualProtect(LoggedAssemblies[x].ImgBase,
LoggedAssemblies[x].ImgSize, PAGE_EXECUTE_READ,
&dwOldProtect);
for (UINT nPage = 0;
nPage < (LoggedAssemblies[x].ImgSize / PAGE_SIZE);
nPage++)
{
DWORD BW;
__try
{
VOID *pPage = (VOID *) ((nPage * PAGE_SIZE) +
(ULONG_PTR) LoggedAssemblies[x].ImgBase);
WriteFile(hFile, pPage, PAGE_SIZE, &BW, NULL);
}
__except (EXCEPTION_EXECUTE_HANDLER)
{
SetFilePointer(hFile, PAGE_SIZE, NULL, FILE_CURRENT);
SetEndOfFile(hFile);
}
}
CloseHandle(hFile);
MessageBox(hDlg, _T("Assembly successfully dumped."),
_T("Dumped"), MB_ICONINFORMATION);
}
}
break;
}
case IDC_REBFILE:
{
HWND hList = GetDlgItem(hDlg, LST_ASMS);
int nSel = ListView_GetNextItem(hList, -1, LVNI_SELECTED);
if (nSel == -1) break;
LV_ITEM lvi;
ZeroMemory(&lvi, sizeof (LV_ITEM));
lvi.mask = LVIF_PARAM;
lvi.iItem = nSel;
ListView_GetItem(hList, &lvi);
for (UINT x = 0; x < NumberOfLoggedAssemblies; x++)
{
if (LoggedAssemblies[x].hCorModule ==
(CORINFO_MODULE_HANDLE) lvi.lParam)
{
OPENFILENAME OpenFileName;
TCHAR ReportFileName[MAX_PATH];
ZeroMemory(ReportFileName, MAX_PATH * sizeof (TCHAR));
ZeroMemory(&OpenFileName, sizeof (OPENFILENAME));
OpenFileName.lStructSize = sizeof (OPENFILENAME);
OpenFileName.hwndOwner = hDlg;
OpenFileName.lpstrFilter =
_T("Report Rebel File (*.rebel)\0*.rebel\0");
OpenFileName.lpstrFile = ReportFileName;
OpenFileName.nMaxFile = MAX_PATH;
OpenFileName.lpstrTitle = _T("Select a Report Rebel File...");
if (!GetOpenFileName(&OpenFileName))
break;
LoggedAssemblies[x].hRebReport = CreateFile(ReportFileName,
GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
if (LoggedAssemblies[x].hRebReport == INVALID_HANDLE_VALUE)
break;
OPENFILENAME SaveFileName;
ZeroMemory(LoggedAssemblies[x].DumpFileName,
MAX_PATH * sizeof (TCHAR));
ZeroMemory(&SaveFileName, sizeof (OPENFILENAME));
SaveFileName.lStructSize = sizeof (OPENFILENAME);
SaveFileName.hwndOwner = hDlg;
SaveFileName.lpstrFilter = _T("Rebel File (*.rebel)\0*.rebel\0");
SaveFileName.lpstrFile = LoggedAssemblies[x].DumpFileName;
SaveFileName.nMaxFile = MAX_PATH;
SaveFileName.lpstrTitle = _T("Save Rebel File As...");
SaveFileName.lpstrDefExt = _T("rebel");
if (!GetSaveFileName(&SaveFileName))
{
CloseHandle(LoggedAssemblies[x].hRebReport);
break;
}
CreateRebFile(&LoggedAssemblies[x]);
break;
}
}
break;
}
}
break;
}
}
return FALSE;
}
DWORD RvaToOffset(VOID *pBase, DWORD Rva)
{
__try
{
DWORD Offset = Rva, Limit;
IMAGE_DOS_HEADER *pDosHeader = (IMAGE_DOS_HEADER *) pBase;
if (pDosHeader->e_magic != IMAGE_DOS_SIGNATURE)
return 0;
IMAGE_NT_HEADERS *pNtHeaders = (IMAGE_NT_HEADERS *) (
pDosHeader->e_lfanew + (ULONG_PTR) pDosHeader);
if (pNtHeaders->Signature != IMAGE_NT_SIGNATURE)
return 0;
IMAGE_SECTION_HEADER *Img = IMAGE_FIRST_SECTION(pNtHeaders);
if (Rva < Img->PointerToRawData)
return Rva;
for (WORD i = 0; i < pNtHeaders->FileHeader.NumberOfSections; i++)
{
if (Img[i].SizeOfRawData)
Limit = Img[i].SizeOfRawData;
else
Limit = Img[i].Misc.VirtualSize;
if (Rva >= Img[i].VirtualAddress &&
Rva < (Img[i].VirtualAddress + Limit))
{
if (Img[i].PointerToRawData != 0)
{
Offset -= Img[i].VirtualAddress;
Offset += Img[i].PointerToRawData;
}
return Offset;
}
}
}
__except (EXCEPTION_EXECUTE_HANDLER)
{
return 0;
}
return 0;
}
UINT GetMethodSize(REBEL_METHOD *rbMethod)
{
UINT nMethodSize = sizeof (REBEL_METHOD);
if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_NAMEOFFSET))
nMethodSize += rbMethod->NameOffsetOrSize;
if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_SIGOFFSET))
nMethodSize += rbMethod->SignatureOffsetOrSize;
if (!IS_FLAG(rbMethod->Mask, REBEL_METHOD_MASK_LOCVARSIGOFFSET))
nMethodSize += rbMethod->LocalVarSigOffsetOrSize;
nMethodSize += rbMethod->CodeSize;
nMethodSize += rbMethod->ExtraSectionsSize;
return nMethodSize;
}
static HANDLE hRebuildDump = INVALID_HANDLE_VALUE;
VOID AddMethod(CORINFO_METHOD_INFO *mi)
{
REBEL_METHOD rbMethod;
ZeroMemory(&rbMethod, sizeof (REBEL_METHOD));
rbMethod.Token = mi->locals.token;
rbMethod.CodeSize = mi->ILCodeSize;
DWORD BRW;
SetFilePointer(hRebuildDump, 0, NULL, FILE_END);
WriteFile(hRebuildDump, &rbMethod, sizeof (REBEL_METHOD), &BRW, NULL);
WriteFile(hRebuildDump, mi->ILCode, mi->ILCodeSize, &BRW, NULL);
SetFilePointer(hRebuildDump, 0, NULL, FILE_BEGIN);
REBEL_NET_BASE rbBase;
ReadFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);
rbBase.NumberOfMethods++;
SetFilePointer(hRebuildDump, 0, NULL, FILE_BEGIN);
WriteFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);
}
BOOL CreateRebFile(AssemblyInfo *ai)
{
DWORD BRW;
hRebuildDump = CreateFile(ai->DumpFileName, GENERIC_READ |
GENERIC_WRITE, FILE_SHARE_READ, NULL,
CREATE_ALWAYS, 0, NULL);
if (hRebuildDump == INVALID_HANDLE_VALUE)
return FALSE;
REBEL_NET_BASE rbBase;
ZeroMemory(&rbBase, sizeof (REBEL_NET_BASE));
rbBase.Signature = REBEL_NET_SIGNATURE;
rbBase.MethodsOffset = sizeof (REBEL_NET_BASE);
WriteFile(hRebuildDump, &rbBase, sizeof (REBEL_NET_BASE), &BRW, NULL);
HMODULE hJitMod = LoadLibrary(_T("mscorjit.dll"));
p_getJit = (ULONG_PTR *(__stdcall *)()) GetProcAddress(hJitMod, "getJit");
if (!p_getJit)
{
CloseHandle(hRebuildDump);
hRebuildDump = INVALID_HANDLE_VALUE;
return FALSE;
}
JIT *pJit = (JIT *) *((ULONG_PTR *) p_getJit());
REBEL_NET_BASE repBase;
if (!ReadFile(ai->hRebReport, &repBase, sizeof (REBEL_NET_BASE), &BRW, NULL))
{
CloseHandle(hRebuildDump);
hRebuildDump = INVALID_HANDLE_VALUE;
return FALSE;
}
UINT CurMethodOffset = repBase.MethodsOffset;
for (UINT x = 0; x < repBase.NumberOfMethods; x++)
{
SetFilePointer(ai->hRebReport, CurMethodOffset, NULL, FILE_BEGIN);
REBEL_METHOD rbRepMethod;
ReadFile(ai->hRebReport, &rbRepMethod, sizeof (REBEL_METHOD), &BRW, NULL);
BYTE *pMethodCode;
if (ai->bIdentified)
{
pMethodCode = (BYTE *) (rbRepMethod.RVA + (ULONG_PTR) ai->ImgBase);
}
else
{
DWORD Offset = RvaToOffset(ai->ImgBase, rbRepMethod.RVA);
if (Offset == 0) continue;
pMethodCode = (BYTE *) (Offset + (ULONG_PTR) ai->ImgBase);
}
BYTE HeaderFormat = *pMethodCode;
HeaderFormat &= 3;
if (HeaderFormat == 2) pMethodCode++;
else pMethodCode += (sizeof (DWORD) * 3);
CORINFO_METHOD_INFO mi = { 0 };
mi.ILCode = pMethodCode;
mi.ILCodeSize = rbRepMethod.CodeSize;
mi.scope = ai->hCorModule;
mi.locals.token = rbRepMethod.Token;
pJit->compileMethod((ULONG_PTR) pJit, NULL, &mi, 0, NULL, NULL);
CurMethodOffset += GetMethodSize(&rbRepMethod);
}
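CloseHandle(hRebuildDump);
hRebuildDump = INVALID_HANDLE_VALUE;
// the report file is no longer needed once every method has been ejected
CloseHandle(ai->hRebReport);
ai->hRebReport = INVALID_HANDLE_VALUE;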
MessageBox(0, _T("Assembly code successfully dumped."), _T("JIT Dumper"),
MB_ICONINFORMATION);
return TRUE;
}
I apologize if the code is a bit messy: I didn't take much care in writing it, and no time at all was put into designing it in the first place. I also had to rewrite several parts of the code, as I changed my approach three times. I hope you're able to understand it nonetheless.
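To isolate the one idea HookJIT relies on, here is a minimal, portable sketch of swapping the first slot of a function table while keeping the original pointer. FakeJit, FakeJitVtbl, RealCompile and MyCompile are hypothetical stand-ins invented for this illustration; they are not the real ICorJitCompiler types, and the real table additionally sits in memory that has to be made writable first (which is what the VirtualProtect calls in the listing are for).

```cpp
#include <cassert>

// Hypothetical stand-in for the JIT object: its first member points to a
// table of function pointers, much like a C++ vtable.
typedef int (*CompileFn)(int methodToken);

struct FakeJitVtbl { CompileFn compileMethod; };
struct FakeJit     { FakeJitVtbl *vtbl; };

// Plays the role of the original compileMethod.
static int RealCompile(int methodToken) { return methodToken * 2; }

static CompileFn g_original = nullptr;  // saved original slot
static int g_seen = 0;                  // number of intercepted calls

// The hook: observe the request, then forward it to the original.
static int MyCompile(int methodToken)
{
    ++g_seen;
    return g_original(methodToken);
}

// Swaps the first slot exactly once and remembers the old pointer.
// Returns false if the hook is already installed.
static bool HookFirstSlot(FakeJit *jit)
{
    assert(jit != nullptr && jit->vtbl != nullptr);
    if (g_original != nullptr)
        return false;
    g_original = jit->vtbl->compileMethod;
    jit->vtbl->compileMethod = &MyCompile;
    return true;
}
```

After HookFirstSlot has run, every call made through the table lands in MyCompile first, in the same way my_compileMethod sees each request before the real JIT does.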
Of course, a demonstration is necessary. As I do not intend to violate the license of commercial products, the victim of this paragraph will be rendari's "cryxenet 2 unpackme", a little .NET crackme which uses code injection by hooking the compileMethod function. The crackme comes with 3 files: an unpackme.exe, a native.dll and a cryxed.dll. When started, it shows a form which asks for a name / serial and then checks them when the user presses the "Check" button. The crackme is to be considered solved when the serial check displays the message box for a valid name / serial. Thanks to rendari for making this demonstration possible.
Actually, it's not necessary to analyze the crackme in order to solve it. But I'll do it anyway, as it might give an idea of how a basic code injector may work.
If one tries to decompile unpackme.exe's main function with the Reflector, it'll show an error. So, let's see the MSIL code:
.method public static void main() cil managed
{
.entrypoint
.custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 )
.maxstack 4
.locals init (int64 V_0,
class [mscorlib]System.IO.FileInfo V_1,
class [mscorlib]System.Reflection.Assembly V_2,
object[] V_3,
string[] V_4,
int64 V_5,
int64 V_6,
int64 V_7,
int64 V_8,
int64 V_9,
class [mscorlib]System.IO.FileStream V_10,
native int V_11,
uint8[] V_12,
class Project1.Program/obfuscation8 V_13,
uint8[] V_14,
int64 V_15,
int64 V_16,
int64 V_17)
IL_0000: nop
IL_0001: ldc.i4.s 99
IL_0003: conv.i8
IL_0004: stloc.s V_5
IL_0006: ldc.i4.s 75
IL_0008: conv.i8
IL_0009: stloc.s V_6
IL_000b: ldc.i4 0x38d
IL_0010: conv.i8
IL_0011: stloc.s V_7
IL_0013: call int64
Project1.Program::IsDebuggerPresent()
IL_0018: pop
IL_0019: ldloc.s V_6
IL_001b: br.s IL_003d
IL_001d: add.ovf
IL_001e: stloc.s V_5
IL_0020: ldloc.s V_7
IL_0022: ldc.i4.7
IL_0023: conv.i8
IL_0024: sub.ovf
IL_0025: stloc.s V_7
IL_0027: ldloc.s V_7
IL_0029: ldloc.s V_5
IL_002b: add.ovf
IL_002c: stloc.s V_6
IL_002e: ldloc.s V_7
IL_0030: ldc.i4.7
IL_0031: conv.i8
IL_0032: add.ovf
IL_0033: stloc.s V_7
IL_0035: call int64
Project1.Program::IsDebuggerPresent()
IL_003a: pop
IL_003b: ldloc.s V_6
IL_003d: ldloc.s V_7
IL_003f: add.ovf
IL_0040: stloc.s V_5
IL_0042: ldloc.s V_7
IL_0044: ldc.i4.7
IL_0045: conv.i8
IL_0046: sub.ovf
IL_0047: stloc.s V_7
IL_0049: ldloc.s V_7
IL_004b: ldloc.s V_5
IL_004d: add.ovf
IL_004e: stloc.s V_6
IL_0050: ldloc.s V_7
IL_0052: ldc.i4.7
IL_0053: conv.i8
I'm not going to paste all of the code, since there is a huge amount of it. As we can notice, there are plenty of nops in the code, but that doesn't matter: a decompiler simply ignores them. Since I have developed an obfuscator in the past, I know what makes a decompiler crash. Thus, I started looking for jumps. I noticed one at the beginning of the code:
IL_001b: br.s IL_003d
IL_001d: add.ovf
IL_001e: stloc.s V_5
IL_0020: ldloc.s V_7
IL_0022: ldc.i4.7
IL_0023: conv.i8
IL_0024: sub.ovf
IL_0025: stloc.s V_7
IL_0027: ldloc.s V_7
IL_0029: ldloc.s V_5
IL_002b: add.ovf
IL_002c: stloc.s V_6
IL_002e: ldloc.s V_7
IL_0030: ldc.i4.7
IL_0031: conv.i8
IL_0032: add.ovf
IL_0033: stloc.s V_7
IL_0035: call int64
Project1.Program::IsDebuggerPresent()
IL_003a: pop
IL_003b: ldloc.s V_6
IL_003d: ldloc.s V_7
The jump is unconditional. The opcodes between offset 0x1B and 0x3D will never be executed: I checked all the code after that, and there's absolutely no reference to these opcodes. So, I nopped them (jump included) with the CFF Explorer and then decompiled again with the Reflector:
[STAThread]
public static void main()
{
long num6;
long num2 = 0x63L;
long num3 = 0x4bL;
long num4 = 0x38dL;
IsDebuggerPresent();
num2 = num3 + num4;
num4 -= 7L;
num3 = num4 + num2;
num4 += 7L;
IsDebuggerPresent();
num2 = num3 + num4;
num4 -= 7L;
num3 = num4 + num2;
num4 += 7L;
num2 = num3 + num4;
num4 -= 7L;
It works, but now we're facing a code jungle. I only pasted a few instructions, since the jungle pattern is very simple. As you can see, it just repeats:
x = y + z;
z -= 7; / z += 7
I could have just removed all the jungle with Notepad, but a little CFF Explorer script to do the job seemed a much more elegant solution to me. You can identify the pattern from these two jungle examples:
IL_003d: ldloc.s V_7
IL_003f: add.ovf
IL_0040: stloc.s V_5
IL_0042: ldloc.s V_7
IL_0044: ldc.i4.7
IL_0045: conv.i8
IL_0046: sub.ovf
IL_0047: stloc.s V_7
IL_0049: ldloc.s V_7
IL_004b: ldloc.s V_5
IL_004d: add.ovf
IL_004e: stloc.s V_6
IL_0050: ldloc.s V_7
IL_0052: ldc.i4.7
IL_0053: conv.i8
IL_0054: add.ovf
IL_0055: stloc.s V_7
The instructions can change a bit, but it's still very simple to find a pattern. Here's the CFF Explorer script to de-jungle the code:
filename = GetOpenFile()
if filename == null then
    return
end
hFile = OpenFile(filename)
if hFile == null then
    return
end
-- nop the initial assignment of the variables
-- and also the jump that causes the decompiler
-- to crash
FillBytes(hFile, 0x105C, 0x49, 0)
jungle = { 0x11, ND, ND, ND, ND, ND, ND, 0x11, ND, 0x1D, 0x6A, ND, 0x13, 0x07 }
Offset = SearchBytes(hFile, 0x105D, jungle)
while Offset != null do
    -- check if it exceeds the method function
    if Offset > 0x1050 + 1483 then
        break
    end
    -- nop jungle
    FillBytes(hFile, Offset, #jungle, 0)
    Offset = SearchBytes(hFile, Offset + 1, jungle)
end
if SaveFile(hFile) == true then
    MsgBox("dejungled")
end
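For readers who don't use the CFF Explorer, the same wildcard search-and-fill idea can be sketched in C++ on an in-memory buffer. This is only an illustration under my own assumptions: SearchBytes and NopAll are hypothetical helpers that borrow the script's naming, ND is modeled as -1, and a match is only patched when it fits entirely below the limit (slightly stricter than the script's check).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

const int ND = -1;   // "not defined": matches any byte

// Returns the offset of the first match at or after 'start', or -1.
long SearchBytes(const std::vector<uint8_t> &data, size_t start,
                 const std::vector<int> &pattern)
{
    if (pattern.empty() || data.size() < pattern.size())
        return -1;
    for (size_t pos = start; pos + pattern.size() <= data.size(); ++pos)
    {
        size_t i = 0;
        while (i < pattern.size() &&
               (pattern[i] == ND || data[pos + i] == (uint8_t) pattern[i]))
            ++i;
        if (i == pattern.size())
            return (long) pos;
    }
    return -1;
}

// Overwrites every match that ends at or before 'limit' with MSIL nops
// (opcode 0x00) and returns the number of patched matches.
int NopAll(std::vector<uint8_t> &data, size_t start, size_t limit,
           const std::vector<int> &pattern)
{
    int count = 0;
    long off = SearchBytes(data, start, pattern);
    while (off != -1 && (size_t) off + pattern.size() <= limit)
    {
        for (size_t i = 0; i < pattern.size(); ++i)
            data[(size_t) off + i] = 0x00;
        ++count;
        off = SearchBytes(data, (size_t) off + 1, pattern);
    }
    return count;
}
```

Passing the method's start offset and its end as start and limit keeps the patching inside the method body, which is what the script's bounds check is there for.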
And now it's possible to decompile (and read) the code:
public static void main()
{
long num6;
IsDebuggerPresent();
FileStream stream = new FileInfo("native.dll").OpenRead();
long length = stream.Length;
byte[] array = new byte[((int) length) + 1];
stream.Read(array, 0, (int) length);
stream.Close();
long num7 = length;
for (num6 = 0L; num6 <= num7; num6 += 1L)
{
array[(int) num6] = (byte) (array[(int) num6] ^ 0x37);
}
IntPtr destination = new IntPtr();
destination = Marshal.AllocCoTaskMem((int) length);
Marshal.Copy(array, 0, destination, (int) length);
obfuscation8 delegateForFunctionPointer = (obfuscation8)
Marshal.GetDelegateForFunctionPointer(destination, typeof(obfuscation8));
IsDebuggerPresent();
long num = new long();
num = delegateForFunctionPointer();
IsDebuggerPresent();
stream = new FileInfo("cryxed.dll").OpenRead();
length = stream.Length;
byte[] buffer2 = new byte[((int) length) + 1];
IsDebuggerPresent();
stream.Read(buffer2, 0, (int) length);
stream.Close();
IsDebuggerPresent();
long num8 = length;
for (num6 = 0L; num6 <= num8; num6 += 1L)
{
buffer2[(int) num6] = (byte) (buffer2[(int) num6] ^ 0x37);
}
IsDebuggerPresent();
Assembly assembly = Assembly.Load(buffer2);
IsDebuggerPresent();
object[] parameters = new object[1];
string[] strArray = new string[] { "" };
parameters[0] = strArray;
assembly.EntryPoint.Invoke(null, parameters);
IsDebuggerPresent();
}
This part of the code decrypts (with an xor) native.dll and transforms it into a native function:
FileStream stream = new FileInfo("native.dll").OpenRead();
long length = stream.Length;
byte[] array = new byte[((int) length) + 1];
stream.Read(array, 0, (int) length);
stream.Close();
long num7 = length;
for (num6 = 0L; num6 <= num7; num6 += 1L)
{
array[(int) num6] = (byte) (array[(int) num6] ^ 0x37);
}
IntPtr destination = new IntPtr();
destination = Marshal.AllocCoTaskMem((int) length);
Marshal.Copy(array, 0, destination, (int) length);
obfuscation8 delegateForFunctionPointer = (obfuscation8)
Marshal.GetDelegateForFunctionPointer(destination, typeof(obfuscation8));
GetDelegateForFunctionPointer makes it possible to call a native function by passing the function's pointer. To view the code of the function, just open the CFF Explorer, go to Hex Editor, right click on the hex view and press Select All. Then right click again and click on Modify. Put the byte 0x37 in the value box and press OK. You now have the decrypted file, which can be disassembled.
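The decryption loop shown in the decompiled main is nothing more than a single-byte xor, so it can also be reproduced outside the CFF Explorer. The following is a minimal sketch; XorBuffer is a hypothetical helper name, and reading and writing the file is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// XORs every byte with a one-byte key. Applying it twice with the same
// key restores the original content.
void XorBuffer(std::vector<uint8_t> &buf, uint8_t key)
{
    for (size_t i = 0; i < buf.size(); ++i)
        buf[i] ^= key;
}
```

Calling XorBuffer(bytes, 0x37) on the content of native.dll or cryxed.dll should yield the same result as the hex editor procedure. For a PE file such as cryxed.dll, a quick sanity check is that the first two decrypted bytes read "MZ".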
Exactly the same approach is used to decrypt cryxed.dll, which is the protected .NET assembly. So, in order to obtain the assembly to rebuild, it's not necessary to dump it from memory: it can be obtained by following the simple decryption approach just explained. It should be noted that the crackme uses the Assembly.Load approach which I have addressed earlier in the .NET loader paragraph. To sum up, the main function of the crackme hooks the JIT, then loads the protected assembly and invokes its entry point.
The first thing to do is to create a Rebel.NET report file out of either the decrypted cryxed.dll or the dumped assembly. The protected assembly is the unidentified one:
If the rebel report file has already been created, then it is possible to generate the rebuilding rebel file by clicking on "Generate Rebel File". If the ejection process succeeds, a message box informs the user about the success of the operation.
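During the ejection, the ejector has to step over each method header to reach the MSIL bytes (the HeaderFormat check in CreateRebFile, which assumes that a non-tiny header is always 12 bytes long). A slightly more general sketch, based on the tiny / fat method header layout described in ECMA-335, could read the header size from the header itself. MethodBody and ParseMethodHeader are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MethodBody
{
    size_t   HeaderSize;  // bytes to skip to reach the IL code
    uint32_t CodeSize;    // size of the IL code in bytes
};

// Parses a tiny or fat method header. Returns false if the two format
// bits are neither tiny (2) nor fat (3).
bool ParseMethodHeader(const uint8_t *p, MethodBody *out)
{
    assert(p != nullptr && out != nullptr);
    uint8_t format = p[0] & 3;
    if (format == 2)
    {
        // tiny: a single byte, code size in the upper 6 bits
        out->HeaderSize = 1;
        out->CodeSize = p[0] >> 2;
        return true;
    }
    if (format == 3)
    {
        // fat: header size in dwords is stored in the upper 4 bits of
        // the second byte; the code size is the little-endian dword at
        // offset 4
        out->HeaderSize = (size_t) (p[1] >> 4) * 4;
        out->CodeSize = (uint32_t) p[4] | ((uint32_t) p[5] << 8) |
                        ((uint32_t) p[6] << 16) | ((uint32_t) p[7] << 24);
        return true;
    }
    return false;
}
```

With a usual fat header, HeaderSize comes out as 12, which matches the sizeof (DWORD) * 3 used in the listing.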
After the rebuilding rebel file has been successfully created, a simple rebuild with Rebel.NET will generate a fully decompilable / runnable assembly:
Now that we have the virgin assembly, we can disassemble it. The UnpackMe.Form1 class contains three button event handlers. Here's the first one:
.method private instance void
Button1_Click(object sender,
class [mscorlib]System.EventArgs e)
cil managed
{
.maxstack 3
.locals init (string V_0,
string V_1,
string V_2,
string V_3,
bool V_4)
IL_0000: nop
IL_0001: ldarg.0
IL_0002: callvirt instance class
[System.Windows.Forms]System.Windows.Forms.TextBox
UnpackMe.Form1::get_TextBox1()
IL_0007: callvirt instance string
[System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
IL_000c: stloc.1
IL_000d: ldarg.0
IL_000e: callvirt instance class
[System.Windows.Forms]System.Windows.Forms.TextBox
UnpackMe.Form1::get_TextBox2()
IL_0013: callvirt instance string
[System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
IL_0018: stloc.3
IL_0019: ldstr "Such a naiive serial routine :D"
IL_001e: stloc.0
IL_001f: ldarg.0
IL_0020: ldloc.1
IL_0021: ldloc.0
IL_0022: callvirt instance string UnpackMe.Form1::TripleDESEncode(string,
string)
IL_0027: stloc.2
IL_0028: ldarg.0
IL_0029: callvirt instance class
[System.Windows.Forms]System.Windows.Forms.TextBox
UnpackMe.Form1::get_TextBox2()
IL_002e: callvirt instance string
[System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
IL_0033: ldloc.2
IL_0034: ldc.i4.0
IL_0035: call int32
[Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.Operators::
CompareString(string, string, bool)
IL_003a: ldc.i4.0
IL_003b: ceq
IL_003d: stloc.s V_4
IL_003f: ldloc.s V_4
IL_0041: brfalse.s IL_0050
IL_0043: ldstr "Good work! Now go and post a solution or suggestion"
+ "ns so that I can improve the protector =)"
IL_0048: call valuetype
[System.Windows.Forms]System.Windows.Forms.DialogResult
[System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
IL_004d: pop
IL_004e: br.s IL_005c
IL_0050: nop
IL_0051: ldstr "Invalid Serial. Pls don't hack me :'("
IL_0056: call valuetype
[System.Windows.Forms]System.Windows.Forms.DialogResult
[System.Windows.Forms]System.Windows.Forms.MessageBox::Show(string)
IL_005b: pop
IL_005c: nop
IL_005d: nop
IL_005e: ret
}
This obviously is the serial check routine. To overcome it and display the valid serial message box, it is only necessary to invert the conditional branch in the code (the brfalse.s at IL_0041). This can easily be accomplished with the CFF Explorer:
Now, the crackme can be considered solved, as it always shows the right message (except of course if you insert the valid name and serial, but solving a 3DES encryption just by taking a guess shouldn't be expected). Of course, the solved crackme, along with its original files, is available for download.
The crackme, of course, could have been solved in a different manner. But this goes beyond the scope of this paragraph which was a code ejection demonstration.
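For completeness, inverting a boolean branch amounts to changing one opcode byte. The sketch below relies on the standard MSIL encodings (brfalse.s = 0x2C, brtrue.s = 0x2D, brfalse = 0x39, brtrue = 0x3A); InvertBoolBranch is a hypothetical helper, and locating the opcode inside the file is left to the reader.

```cpp
#include <cstdint>

// Flips brfalse <-> brtrue (short or long form) in place and leaves the
// branch target untouched. Returns false for any other opcode.
bool InvertBoolBranch(uint8_t *opcode)
{
    switch (*opcode)
    {
    case 0x2C: *opcode = 0x2D; return true;  // brfalse.s -> brtrue.s
    case 0x2D: *opcode = 0x2C; return true;  // brtrue.s  -> brfalse.s
    case 0x39: *opcode = 0x3A; return true;  // brfalse   -> brtrue
    case 0x3A: *opcode = 0x39; return true;  // brtrue    -> brfalse
    default:   return false;
    }
}
```

Applied to the brfalse.s at IL_0041 of Button1_Click, this turns the "invalid serial" path into the one taken only when the serial actually matches.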
I said earlier that I was going to address the real meaning of the CORINFO_METHOD_HANDLE. So, that's what I'm doing in this paragraph.
I first became aware of the meaning of this pointer when I came across this code in jitinterface.cpp:
CHECK CheckContext(CORINFO_MODULE_HANDLE scopeHnd, CORINFO_CONTEXT_HANDLE context)
{
CHECK_MSG(scopeHnd != NULL, "Illegal null scope");
CHECK_MSG(((size_t) context & ~CORINFO_CONTEXTFLAGS_MASK) !=
NULL, "Illegal null context");
if (((size_t) context & CORINFO_CONTEXTFLAGS_MASK) == CORINFO_CONTEXTFLAGS_CLASS)
{
TypeHandle handle((CORINFO_CLASS_HANDLE) ((size_t) context
& ~CORINFO_CONTEXTFLAGS_MASK));
CHECK_MSG(handle.GetModule() == GetModule(scopeHnd),
"Inconsistent scope and context");
}
else
{
MethodDesc* handle = (MethodDesc*) ((size_t) context
& ~CORINFO_CONTEXTFLAGS_MASK);
CHECK_MSG(handle->GetModule() == GetModule(scopeHnd),
"Inconsistent scope and context");
}
CHECK_OK;
}
Never mind the fact that a CORINFO_CONTEXT_HANDLE is the second argument of the function. The code which calls CheckContext passes a CORINFO_METHOD_HANDLE as context.
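The masking in CheckContext is the classic trick of storing a tag in the low bit of an aligned pointer. Here is a small sketch of that decoding. The constant values are my assumption of what corinfo.h defines (method = 0, class = 1, mask = 1) and should be checked against the header you actually build with; FakeMethodDesc is a hypothetical placeholder type.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values: verify them against your own corinfo.h.
const size_t kContextFlagsMethod = 0x00;
const size_t kContextFlagsClass  = 0x01;
const size_t kContextFlagsMask   = 0x01;

struct FakeMethodDesc { int dummy; };   // hypothetical placeholder type

// Returns the method pointer carried by a context handle, or nullptr
// when the handle is tagged as a class.
FakeMethodDesc *ContextToMethod(void *context)
{
    size_t raw = (size_t) context;
    if ((raw & kContextFlagsMask) != kContextFlagsMethod)
        return nullptr;
    return (FakeMethodDesc *) (raw & ~kContextFlagsMask);
}

// Tags an aligned pointer as a class context by setting the low bit.
void *MakeClassContext(void *p)
{
    assert(((size_t) p & kContextFlagsMask) == 0);  // low bit must be free
    return (void *) ((size_t) p | kContextFlagsClass);
}
```

Under these assumptions, a plain method handle carries no tag at all, which is why it can be passed as a context without any conversion.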
What can be concluded is that a CORINFO_METHOD_HANDLE is just a pointer to a MethodDesc class. The MethodDesc class is one of the most important parts of the framework, as it provides an incredible amount of information. The declaration of this class is inside the clr\src\vm\method.hpp file.
class MethodDesc
{
friend class EEClass;
friend class MethodTableBuilder;
friend class ArrayClass;
friend class NDirect;
friend class InstantiatedMethodDesc;
friend class MDEnums;
friend class MethodImpl;
friend class CheckAsmOffsets;
friend class ClrDataAccess;
friend class ZapMonitor;
friend class MethodDescCallSite;
public:
[...]
inline BOOL HasStableEntryPoint()
{
LEAF_CONTRACT;
return (m_bFlags2 & enum_flag2_HasStableEntryPoint) != 0;
}
inline TADDR GetStableEntryPoint()
{
WRAPPER_CONTRACT;
_ASSERTE(HasStableEntryPoint());
return *GetAddrOfSlotUnchecked();
}
BOOL SetStableEntryPointInterlocked(TADDR addr);
BOOL HasTemporaryEntryPoint();
TADDR GetTemporaryEntryPoint();
void SetTemporaryEntryPoint(BaseDomain *pDomain, AllocMemTracker *pamTracker);
inline BOOL HasPrecode()
{
LEAF_CONTRACT;
return (m_bFlags2 & enum_flag2_HasPrecode) != 0;
}
inline void SetHasPrecode()
{
LEAF_CONTRACT;
m_bFlags2 |= (enum_flag2_HasPrecode | enum_flag2_HasStableEntryPoint);
}
inline void ResetHasPrecode()
{
LEAF_CONTRACT;
m_bFlags2 &= ~enum_flag2_HasPrecode;
m_bFlags2 |= enum_flag2_HasStableEntryPoint;
}
inline Precode* GetPrecode()
{
LEAF_CONTRACT;
PRECONDITION(HasPrecode());
Precode* pPrecode = Precode::GetPrecodeFromEntryPoint(GetStableEntryPoint());
PREFIX_ASSUME(pPrecode != NULL);
return pPrecode;
}
inline BOOL MayHavePrecode()
{
WRAPPER_CONTRACT;
return !MayHaveNativeCode() || PrestubMayInsertStub() || RequiresPrestub();
}
void InterlockedUpdateFlags2(BYTE bMask, BOOL fSet);
Precode* GetOrCreatePrecode();
inline BYTE* GetCallablePreStubAddr()
{
WRAPPER_CONTRACT;
return HasStableEntryPoint() ? (BYTE*)GetStableEntryPoint() :
(BYTE*)GetTemporaryEntryPoint();
}
static inline MethodDesc* GetMethodDescFromStubAddr
(TADDR addr, BOOL fSpeculative = FALSE);
DWORD GetAttrs();
DWORD GetImplAttrs();
LPCUTF8 GetName();
LPCUTF8 GetName(USHORT slot);
FORCEINLINE LPCUTF8 GetNameOnNonArrayClass()
{
WRAPPER_CONTRACT;
return (GetMDImport()->GetNameOfMethodDef(GetMemberDef()));
}
COUNT_T GetStableHash();
DWORD GetNumGenericMethodArgs();
DWORD GetNumGenericClassArgs()
{
WRAPPER_CONTRACT;
return GetMethodTable()->GetNumGenericArgs();
}
BOOL IsGenericMethodDefinition();
BOOL ContainsGenericVariables();
Module* GetDefiningModuleForOpenMethod();
BOOL HasNonObjectClassOrMethodInstantiation();
BOOL IsTypicalMethodDefinition();
The size of this class is impressive because of the methods it contains. I could only paste a very small part, just to give the reader an idea. The comments above the class declaration are reminiscent of the data pointed to by a CORINFO_METHOD_HANDLE, which was also 8-byte aligned.
This is what can be found at the end of the MethodDesc class declaration:
protected:
UINT16 m_wTokenRemainder;
BYTE m_chunkIndex;
enum {
enum_flag2_HasStableEntryPoint = 0x01,
enum_flag2_HasPrecode = 0x02,
enum_flag2_IsUnboxingStub = 0x04,
enum_flag2_MayHaveNativeCode = 0x08,
};
BYTE m_bFlags2;
WORD m_wSlotNumber;
WORD m_wFlags;
And this data exactly matches my previous intuition. We can now use every CORINFO_METHOD_HANDLE as a MethodDesc class. Of course, including the whole MethodDesc class would be rather painful given its complexity. But one could write one's own simplified version of the MethodDesc class: all that is necessary is to include the members I pasted above, which add up to a class size that is a multiple of 8 bytes.
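Such a simplified layout might look like the following sketch. It copies only the data members and the two flag values listed above; SimpleMethodDesc and the helper names are hypothetical, and the layout should be treated as specific to the framework build the listing was taken from (a checked build, for instance, may carry additional fields).

```cpp
#include <cstdint>

// Hypothetical simplified view of MethodDesc: data members only, in the
// order listed above.
struct SimpleMethodDesc
{
    uint16_t m_wTokenRemainder;
    uint8_t  m_chunkIndex;
    uint8_t  m_bFlags2;
    uint16_t m_wSlotNumber;
    uint16_t m_wFlags;
};

// 2 + 1 + 1 + 2 + 2 bytes: the members add up to exactly 8 bytes.
static_assert(sizeof(SimpleMethodDesc) == 8, "expected an 8-byte layout");

const uint8_t kFlag2HasStableEntryPoint = 0x01;
const uint8_t kFlag2HasPrecode          = 0x02;

// Mirrors MethodDesc::HasStableEntryPoint on the simplified layout.
inline bool HasStableEntryPoint(const SimpleMethodDesc *md)
{
    return (md->m_bFlags2 & kFlag2HasStableEntryPoint) != 0;
}

// Mirrors MethodDesc::HasPrecode on the simplified layout.
inline bool HasPrecode(const SimpleMethodDesc *md)
{
    return (md->m_bFlags2 & kFlag2HasPrecode) != 0;
}
```

Inside a compileMethod hook, one would then cast the method handle stored in the CORINFO_METHOD_INFO structure (its ftn member) to a const SimpleMethodDesc pointer and query the flags, keeping in mind that this is an unofficial view of an internal structure.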
The MethodDesc class is useful for many purposes and its use is rather safe, since it is not supposed to change any time soon. And even if it does, its members (excluding the methods) are rather few, so I guess it won't be difficult to keep a working simplified MethodDesc class.
I would have provided an example of how to use the MethodDesc class myself, but the article is already rather big as I'm writing this and, although it's too late to keep it short, I'm still hoping to keep it readable. In fact, the journey into the .NET Framework internals is not yet concluded and some things still have to be discussed.
There are other very interesting parts, apart from the JIT, of the .NET Framework which should be discussed. Of course, I can't discuss them all in these two articles; I'm just trying to give the reader an idea of how easily they can be explored.
Some things have to be said about the execution engine, even though the interface which can be easily retrieved from mscorwks is not very interesting. But in this paragraph, I'm also addressing things which seem to be useful but really aren't.
The mscorwks.dll module exports a function named IEE which could intrigue a reverser. However, the internals of this API are rather disappointing:
static BYTE g_CEEInstance[sizeof(CExecutionEngine)];
static IExecutionEngine * g_pCEE = NULL;
PTLS_CALLBACK_FUNCTION CExecutionEngine::Callbacks[MAX_PREDEFINED_TLS_SLOT];
extern "C" IExecutionEngine * __stdcall IEE()
{
LEAF_CONTRACT;
if ( !g_pCEE )
{
CExecutionEngine local;
memcpy(&g_CEEInstance, &local, sizeof(CExecutionEngine));
g_pCEE = (IExecutionEngine *)(CExecutionEngine*)&g_CEEInstance;
}
return g_pCEE;
}
As can be seen from the comments, this function only offers an interface for memory allocation and process synchronization. In fact, this is the declaration of the return class:
class CExecutionEngine : public IExecutionEngine, public IEEMemoryManager
{
public:
static void ThreadDetaching(void **pTlsData);
static void DeleteTLS(void **pTlsData);
static void SwitchIn();
static void SwitchOut();
static void **CheckThreadState(DWORD slot, BOOL force = TRUE);
static void **CheckThreadStateNoCreate(DWORD slot);
static void SetupTLSForThread(Thread *pThread);
static DWORD GetTlsIndex () {return TlsIndex;}
static BOOL HasDetachedTlsInfo();
static void CleanupDetachedTlsInfo();
static void DetachTlsInfo(void **pTlsData);
private:
friend class EEDbgInterfaceImpl;
SVAL_DECL (DWORD, TlsIndex);
static PTLS_CALLBACK_FUNCTION Callbacks[MAX_PREDEFINED_TLS_SLOT];
HRESULT STDMETHODCALLTYPE QueryInterface(
REFIID id,
void **pInterface);
ULONG STDMETHODCALLTYPE AddRef();
ULONG STDMETHODCALLTYPE Release();
VOID STDMETHODCALLTYPE TLS_AssociateCallback(
DWORD slot,
PTLS_CALLBACK_FUNCTION callback);
DWORD STDMETHODCALLTYPE TLS_GetMasterSlotIndex();
LPVOID STDMETHODCALLTYPE TLS_GetValue(DWORD slot);
BOOL STDMETHODCALLTYPE TLS_CheckValue(DWORD slot, LPVOID * pValue);
VOID STDMETHODCALLTYPE TLS_SetValue(DWORD slot, LPVOID pData);
VOID STDMETHODCALLTYPE TLS_ThreadDetaching();
CRITSEC_COOKIE STDMETHODCALLTYPE CreateLock(LPCSTR szTag, LPCSTR level,
CrstFlags flags);
void STDMETHODCALLTYPE DestroyLock(CRITSEC_COOKIE lock);
void STDMETHODCALLTYPE AcquireLock(CRITSEC_COOKIE lock);
void STDMETHODCALLTYPE ReleaseLock(CRITSEC_COOKIE lock);
EVENT_COOKIE STDMETHODCALLTYPE CreateAutoEvent(BOOL bInitialState);
EVENT_COOKIE STDMETHODCALLTYPE CreateManualEvent(BOOL bInitialState);
void STDMETHODCALLTYPE CloseEvent(EVENT_COOKIE event);
BOOL STDMETHODCALLTYPE ClrSetEvent(EVENT_COOKIE event);
BOOL STDMETHODCALLTYPE ClrResetEvent(EVENT_COOKIE event);
DWORD STDMETHODCALLTYPE WaitForEvent
(EVENT_COOKIE event, DWORD dwMilliseconds, BOOL bAlertable);
DWORD STDMETHODCALLTYPE WaitForSingleObject(HANDLE handle, DWORD dwMilliseconds);
SEMAPHORE_COOKIE STDMETHODCALLTYPE ClrCreateSemaphore(DWORD dwInitial, DWORD dwMax);
void STDMETHODCALLTYPE ClrCloseSemaphore(SEMAPHORE_COOKIE semaphore);
DWORD STDMETHODCALLTYPE ClrWaitForSemaphore
(SEMAPHORE_COOKIE semaphore, DWORD dwMilliseconds, BOOL bAlertable);
BOOL STDMETHODCALLTYPE ClrReleaseSemaphore
(SEMAPHORE_COOKIE semaphore, LONG lReleaseCount, LONG *lpPreviousCount);
MUTEX_COOKIE STDMETHODCALLTYPE ClrCreateMutex
(LPSECURITY_ATTRIBUTES lpMutexAttributes,
BOOL bInitialOwner,
LPCTSTR lpName);
void STDMETHODCALLTYPE ClrCloseMutex(MUTEX_COOKIE mutex);
BOOL STDMETHODCALLTYPE ClrReleaseMutex(MUTEX_COOKIE mutex);
DWORD STDMETHODCALLTYPE ClrWaitForMutex(MUTEX_COOKIE mutex,
DWORD dwMilliseconds,
BOOL bAlertable);
DWORD STDMETHODCALLTYPE ClrSleepEx(DWORD dwMilliseconds, BOOL bAlertable);
BOOL STDMETHODCALLTYPE ClrAllocationDisallowed();
void STDMETHODCALLTYPE GetLastThrownObjectExceptionFromThread(void **ppvException);
LPVOID STDMETHODCALLTYPE ClrVirtualAlloc(LPVOID lpAddress, SIZE_T dwSize,
DWORD flAllocationType, DWORD flProtect);
BOOL STDMETHODCALLTYPE ClrVirtualFree(LPVOID lpAddress, SIZE_T dwSize,
DWORD dwFreeType);
SIZE_T STDMETHODCALLTYPE ClrVirtualQuery(LPCVOID lpAddress,
PMEMORY_BASIC_INFORMATION lpBuffer, SIZE_T dwLength);
BOOL STDMETHODCALLTYPE ClrVirtualProtect(LPVOID lpAddress,
SIZE_T dwSize, DWORD flNewProtect, PDWORD lpflOldProtect);
HANDLE STDMETHODCALLTYPE ClrGetProcessHeap();
HANDLE STDMETHODCALLTYPE ClrHeapCreate(DWORD flOptions,
SIZE_T dwInitialSize, SIZE_T dwMaximumSize);
BOOL STDMETHODCALLTYPE ClrHeapDestroy(HANDLE hHeap);
LPVOID STDMETHODCALLTYPE ClrHeapAlloc(HANDLE hHeap, DWORD dwFlags, SIZE_T dwBytes);
BOOL STDMETHODCALLTYPE ClrHeapFree(HANDLE hHeap, DWORD dwFlags, LPVOID lpMem);
BOOL STDMETHODCALLTYPE ClrHeapValidate(HANDLE hHeap, DWORD dwFlags, LPCVOID lpMem);
HANDLE STDMETHODCALLTYPE ClrGetProcessExecutableHeap();
};
IExecutionEngine and IEEMemoryManager are just interfaces. Thus, no additional functionality is provided by the IEE interface.
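To make the shape of IEE more concrete, here is a minimal, portable sketch of the same idea: a single instance kept in static storage, created on first use and handed out only through an interface pointer. All names (IEngineSketch, EngineSketch, GetEngineSketch) are hypothetical and not part of the CLR. Note also that the sketch constructs the object with placement new instead of the memcpy used in the listing above, and, as an assumption of the sketch, performs no synchronization.

```cpp
#include <cassert>
#include <new>

// Hypothetical interface standing in for IExecutionEngine.
struct IEngineSketch {
    virtual int AllocateSlot() = 0;
    virtual ~IEngineSketch() = default;
};

// Hypothetical implementation standing in for CExecutionEngine.
class EngineSketch : public IEngineSketch {
public:
    int AllocateSlot() override { return nextSlot_++; }
private:
    int nextSlot_ = 0;
};

// Raw static storage for the instance, analogous to g_CEEInstance.
alignas(EngineSketch) static unsigned char g_engineStorage[sizeof(EngineSketch)];
// Cached interface pointer, analogous to g_pCEE.
static IEngineSketch* g_engine = nullptr;

// Returns the single instance, constructing it in the static buffer on first use.
// No locking is done here; that is an assumption of this sketch.
IEngineSketch* GetEngineSketch() {
    if (!g_engine) {
        g_engine = new (g_engineStorage) EngineSketch();
    }
    assert(g_engine != nullptr);
    return g_engine;
}
```

Every caller receives the same pointer, and all it can do with it is call the interface methods, which is why such an export gives a reverser nothing beyond the services the interface declares.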
The framework also exports two functions which may catch the reverser's attention: GetRealProcAddress and GetCLRFunction. Unfortunately, they're both useless. GetRealProcAddress is only a call to LoadLibrary("mscorwks.dll") followed by a GetProcAddress:
extern "C"
STDAPI GetRealProcAddress(LPCSTR pwszProcName, VOID** ppv)
{
if(!ppv)
{
return E_POINTER;
}
HMODULE hLib = GetLibrary(LIB_mscorwks);
if(hLib == NULL)
{
return HRESULT_FROM_GetLastError();
}
*ppv = (void*) GetProcAddress(hLib,pwszProcName);
if(*ppv == NULL)
{
return HRESULT_FROM_GetLastError();
}
return S_OK;
}
And GetCLRFunction can only retrieve the address of three functions and won't accept any others.
.text:79EA0C1B
.text:79EA0C1B public ?GetCLRFunction@@YGPAXPBD@Z
.text:79EA0C1B ?GetCLRFunction@@YGPAXPBD@Z proc near
.text:79EA0C1B
.text:79EA0C1B
[...]
.text:79EA0C1B
.text:79EA0C44 mov esi, [ebp+arg_0]
.text:79EA0C47 push offset aClrloadlibrary
.text:79EA0C4C push esi
.text:79EA0C4D call _strcmp
.text:79EA0C52 test eax, eax
.text:79EA0C54 pop ecx
.text:79EA0C55 pop ecx
.text:79EA0C56 jz loc_79EEF97B
.text:79EA0C5C push offset aClrfreelibrary
.text:79EA0C61 push esi
.text:79EA0C62 call _strcmp
.text:79EA0C67 test eax, eax
.text:79EA0C69 pop ecx
.text:79EA0C6A pop ecx
.text:79EA0C6B jz loc_7A0D7B8F
.text:79EA0C71 push offset aEeheapallocinp
.text:79EA0C76 push esi
.text:79EA0C77 call _strcmp
.text:79EA0C7C test eax, eax
.text:79EA0C7E pop ecx
.text:79EA0C7F pop ecx
.text:79EA0C80 jnz loc_79ED7512
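Read as C++, the chain of strcmp calls above amounts to a fixed whitelist lookup. The following is only a sketch of that pattern: the listing shows IDA's abbreviated string labels rather than the full strings, so the three names below are placeholders, the stub functions are hypothetical, and returning a null pointer for an unknown name is an assumption about what happens at the final jnz target.

```cpp
#include <cassert>
#include <cstring>

typedef void (*NativeFn)();

// Hypothetical stand-ins for the three native routines.
static void StubLoadLibrary() {}
static void StubFreeLibrary() {}
static void StubHeapAlloc() {}

// Compares the requested name against three fixed strings, one after the other,
// exactly like the strcmp chain in the disassembly. The names are placeholders.
NativeFn GetAllowedFunctionSketch(const char* name) {
    assert(name != nullptr);
    if (std::strcmp(name, "PlaceholderLoadLibrary") == 0) {
        return &StubLoadLibrary;
    }
    if (std::strcmp(name, "PlaceholderFreeLibrary") == 0) {
        return &StubFreeLibrary;
    }
    if (std::strcmp(name, "PlaceholderHeapAlloc") == 0) {
        return &StubHeapAlloc;
    }
    // Assumption: any other name is simply rejected.
    return nullptr;
}
```

With such a structure there is no way to ask for an arbitrary internal function: a name that is not hard-coded in the comparison chain never reaches any lookup at all.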
I had to disassemble the function, because GetCLRFunction is not available in the Rotor project. Now that I got those two out of the way, I can talk about an interesting topic: internal calls.
Internal calls are methods implemented natively by the framework which can be called from managed code, although only in a very limited way, as we'll see later.
Such functions are defined in clr\src\vm\ecall.cpp in this way:
FCFuncStart(gExceptionFuncs)
FCFuncElement("GetClassName", ExceptionNative::GetClassName)
FCFuncElement("IsImmutableAgileException", ExceptionNative::IsImmutableAgileException)
FCFuncElement("_InternalGetMethod", SystemNative::CaptureStackTraceMethod)
FCFuncElement("nIsTransient", ExceptionNative::IsTransient)
FCFuncElement("GetMessageFromNativeResources",
ExceptionNative::GetMessageFromNativeResources)
FCFuncEnd()
FCFuncStart(gSafeHandleFuncs)
FCFuncElement("InternalDispose", SafeHandle::DisposeNative)
FCFuncElement("InternalFinalize", SafeHandle::Finalize)
FCFuncElement("SetHandleAsInvalid", SafeHandle::SetHandleAsInvalid)
FCFuncElement("DangerousAddRef", SafeHandle::DangerousAddRef)
FCFuncElement("DangerousRelease", SafeHandle::DangerousRelease)
FCFuncEnd()
FCFuncStart(gCriticalHandleFuncs)
FCFuncElement("FireCustomerDebugProbe", CriticalHandle::FireCustomerDebugProbe)
FCFuncEnd()
FCFuncStart(gPathFuncs)
FCFuncEnd()
FCFuncStart(gFusionWrapFuncs)
FCFuncElement("GetNextAssembly", FusionWrap::GetNextAssembly)
FCFuncElement("GetDisplayName", FusionWrap::GetDisplayName)
FCFuncElement("ReleaseFusionHandle", FusionWrap::ReleaseFusionHandle)
FCFuncEnd()
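Conceptually, tables like the ones above pair a name visible from managed code with a native entry point. The sketch below mimics that idea with deliberately simplified, hypothetical macros (SKETCH_FUNC_START, SKETCH_FUNC_ELEMENT, SKETCH_FUNC_END) and a hypothetical lookup helper; the real FCFuncStart/FCFuncElement/FCFuncEnd definitions in the CLR sources are different and carry more information.

```cpp
#include <cassert>
#include <cstring>

typedef int (*NativeEntry)();

// One row of the table: managed-side name plus native implementation.
struct ECallEntrySketch {
    const char* managedName;
    NativeEntry nativeImpl;
};

// Hypothetical, simplified macros that expand to a null-terminated array.
#define SKETCH_FUNC_START(table) static const ECallEntrySketch table[] = {
#define SKETCH_FUNC_ELEMENT(name, impl) { name, &impl },
#define SKETCH_FUNC_END() { nullptr, nullptr } };

// Hypothetical native implementations.
static int NativeDispose() { return 1; }
static int NativeFinalize() { return 2; }

SKETCH_FUNC_START(gSketchSafeHandleFuncs)
    SKETCH_FUNC_ELEMENT("InternalDispose", NativeDispose)
    SKETCH_FUNC_ELEMENT("InternalFinalize", NativeFinalize)
SKETCH_FUNC_END()

// Walks the table until the terminating entry and returns the native
// implementation registered under the given managed name, or nullptr.
NativeEntry FindECallSketch(const ECallEntrySketch* table, const char* managedName) {
    assert(table != nullptr && managedName != nullptr);
    for (; table->managedName != nullptr; ++table) {
        if (std::strcmp(table->managedName, managedName) == 0) {
            return table->nativeImpl;
        }
    }
    return nullptr;
}
```

In this sketch, looking up "InternalDispose" in gSketchSafeHandleFuncs yields NativeDispose, while a name that was never registered yields a null pointer.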
The first argument of FCFuncElement specifies the name of the function in the managed context, whereas the second one specifies the location of the function. The syntax to access one of these ecalls (I suppose it stands for engine calls) is the following:
[MethodImpl(MethodImplOptions.InternalCall)]
internal extern type ECallMethodName();
In order to use MethodImpl, one has to include the System.Runtime.CompilerServices namespace. The problem is, even though you can declare such a call in your project, when you try to actually call one of these internal calls, the framework rejects the call with an error message.
These functions are, in fact, wrapped by the framework. Of course, I didn't introduce internal calls just to take note of that. The interesting part is the interaction between managed code and internal calls. Let's take for instance this ecall:
FCIMPL2(MethodBody *, RuntimeMethodHandle::GetMethodBody, MethodDesc **ppMethod,
EnregisteredTypeHandle enregDeclaringTypeHandle)
(MethodDesc **, EnregisteredTypeHandle)
The _GetMethodBody internal call takes as its first parameter a pointer to a MethodDesc pointer. The first managed wrapping of this function happens in mscorlib ("clr\src\bcl\system\runtimehandles.cs").
[MethodImpl(MethodImplOptions.InternalCall)]
internal extern MethodBody _GetMethodBody(IntPtr declaringType);
internal MethodBody GetMethodBody(RuntimeTypeHandle declaringType)
{
return _GetMethodBody(declaringType.Value);
}
The first parameter disappears and becomes implicit. The class which contains this method also defines the implicit parameter at the beginning:
[Serializable()]
[System.Runtime.InteropServices.ComVisible(true)]
public unsafe struct RuntimeMethodHandle : ISerializable
{
internal static RuntimeMethodHandle EmptyHandle { get
{ return new RuntimeMethodHandle(null); } }
private IntPtr m_ptr;
The m_ptr parameter is private, so it can't be accessed normally from the outside. But maybe there's another way to obtain an equivalent value...
private RuntimeMethodHandle(SerializationInfo info, StreamingContext context)
{
if(info == null)
throw new ArgumentNullException("info");
MethodInfo m =(RuntimeMethodInfo)info.GetValue("MethodObj",
typeof(RuntimeMethodInfo));
m_ptr = m.MethodHandle.Value;
if(m_ptr.ToPointer() == null)
throw new SerializationException(Environment.GetResourceString
("Serialization_InsufficientState"));
}
MethodHandle.Value is a public property. Thus, we can obtain the same value contained in m_ptr through the MethodInfo class. And m_ptr is just a pointer to a MethodDesc class, also known as CORINFO_METHOD_HANDLE. So, in order to obtain a MethodDesc pointer through managed code, one can write this kind of code:
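// Event handlers such as button1_Click are usually private, so request non-public methods too
MethodInfo mi = typeof(Form1).GetMethod("button1_Click",
BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
MessageBox.Show(mi.MethodHandle.Value.ToString("X"));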
The point I wanted to make is that it's possible to access part of the .NET internals from managed code as well. Looking at the interaction between managed code and ecalls is one good way to discover some interesting things.
Digging into the .NET Framework internals opens up many new possibilities. For instance, hooking one of the MSIL related methods in the MethodDesc class could be an alternative way of code injection. The truth is that there isn't just one way, just as there isn't only one way to eject MSIL code. In fact, code ejection can go much further than code injection. In this article I presented a very simple, non-intrusive solution to retrieve the original MSIL of an assembly, but if one wants to become serious about code ejection, one could consider using a modified version of the Rotor (or Mono) project to retrieve the original MSIL. Or, to keep it simpler, modifying the official .NET framework, though not legal, might be a valid option. In either case, a code injector simply can't protect the original MSIL when the code ejection process is brought that far. There's nothing such a protection can do when the code ejector is the framework itself. That's why I said from the beginning that code injection protections are weak: they can hide the code only as long as the reverser doesn't decide to become serious about retrieving the MSIL code.
As I've never read a book nor an article about the CLR infrastructure, what has been presented in this article are the .NET internals from the perspective of a reverser. Having the (almost complete) source code of the .NET Framework made things very easy and the days of research (development included) spent to write this article can be counted on a hand with only two fingers. It has been a much bigger effort writing the article. An effort which can only be compared to the pain one endures from actually reading it. The next article of this kind will be about .NET native compiling. It'll surely be less boring as I don't have to re-explain the basics of .NET internals already covered in this article.
History
- 14th May, 2008: Initial post