Hook Interrupts and Call Kernel Routines in User Mode

20 May 2009, CPOL
Inject user mode routine into kernel space and execute

Introduction

This is an enhanced version of my phymem driver; see Access Physical Memory, Port and PCI Configuration Space.

This article shows a kernel code injection method: we inject our user mode code into kernel space and let the OS kernel execute it. I don't think this is actually useful; it’s just an idea I had in mind and finally made work. The source code may be helpful to driver and kernel developers. The reader is assumed to have some experience in kernel mode driver coding.

My IDT management code is based on the excellent article Interrupt Hooking and Retrieving Device Information on Windows NT/2000/XP by Alexander M. Thanks very much.

Hook ISR in User Mode

Of course, it’s impossible to hook an ISR from user mode. What I really mean is that we develop a normal application and, with the help of the phymem driver, map a specific routine into the kernel address space and hook it onto an ISR. This is essentially a callback function, with the difference that our function is called by the interrupt service routine at device IRQL (DIRQL). It looks as if the OS calls our application’s function whenever the corresponding interrupt occurs.

To make this meaningful, we need some way to communicate with the injected code. Normal argument passing is not available, so I use a context pointer that points to a user-defined structure, just like the DeviceExtension used by device drivers.

There are several obstacles to overcome:

1. How to Get Function Size

To map a piece of code into kernel space, we must first know its size (how many bytes the code occupies). At first I thought that putting an empty function after it and subtracting the two function addresses would work. But I soon found that the compiler helps us too much: it adds a jump table in debug mode, it occasionally changes the function order, and more. After searching the web and testing for a long time, I finally found this MAYBE reliable implementation in VC6.0 SP6.

C++
#pragma comment(linker, "/incremental:no")
#pragma code_seg("systemcode")
static void __stdcall isr(PISRCONTEXT context)
{
  context->a++;
}
static void __stdcall isrend()
{
}
#pragma code_seg()
DWORD sizeofcode=(DWORD)isrend-(DWORD)isr; // bytes from the start of isr to the start of isrend
  • Disable incremental linking or declare the functions as static to prevent the compiler from creating a jump table, which would make the function pointer differ from the function’s actual address.
  • Declare the two functions in a single section to prevent the compiler from changing their order. It seems that if you define more than two functions, their order may change even when optimization is disabled.

2. How to Map the Code and Context into Kernel Space

In my previous article, Access Physical Memory, Port and PCI Configuration Space, I showed how to map a kernel address to user space. Mapping a user address to kernel space is also very simple.

  • Use IoAllocateMdl to allocate an MDL for this user address
  • Use MmProbeAndLockPages with the second parameter “UserMode” to lock the memory
  • Use MmGetSystemAddressForMdlSafe to get the mapped kernel address

3. How to Hook ISR

I won't talk too much about ISR hooking as Alexander and his excellent article, Interrupt Hooking and Retrieving Device Information on Windows NT/2000/XP, have explained it clearly. My code is also based on Alexander’s source code.

What I want to say is that, unlike Alexander’s sample where the new ISR is statically linked, our ISR is handed to us at runtime, so we have no knowledge of its address at link time. We could set up a jump table and hook every ISR so that it checks whether a new routine is installed, but that is not efficient. An ISR should run as quickly as possible to keep the whole OS responsive.

My approach is to dynamically create the new ISR code in a buffer, and fill the corresponding IDT entry with this buffer address. What’s actually located in the buffer is code, not data. You should be familiar with this technique if you know what “buffer overflow” means.

The ISR C prototype and disassembled code (thanks, Alexander):

C++
: __declspec(naked) void Isr()
: {
:   __asm { PUSHAD };
00403030 60 pushad
:   __asm { PUSH context };
00403031 68 3C AC 42 00 push context (0042ac3c)
:   __asm { CALL newIsr };
00403036 E8 C5 DF FF FF call newIsr (00401000) //0x401000-0x403036-5
:   __asm { POPAD };
0040303B 61 popad
:   __asm { JMP originalIsr };
0040303C E9 0F E0 FF FF jmp originalIsr (00401050) //0x401050-0x40303c-5

The following code creates the ISR stub dynamically:

C++
//pushad
isrCode[0]=0x60; //pushad
//push context
isrCode[1]=0x68; //push
RtlCopyMemory(&isrCode[2], &context, 4); //context address
//call isrAddr
isrCode[6]=0xE8; //call
relativeAddr=(LONG)isrAddr-(LONG)&isrCode[6]-5; //relative address
RtlCopyMemory(&isrCode[7], &relativeAddr, 4);
//popad
isrCode[11]=0x61; //popad
//jmp originalAddr
isrCode[12]=0xE9; //jmp
relativeAddr=(LONG)originalIsr-(LONG)&isrCode[12]-5; //relative address
RtlCopyMemory(&isrCode[13], &relativeAddr, 4);

The whole stub is only 17 bytes, whereas a jump table would occupy 256*4 bytes (I'm too lazy to calculate the product; the compiler will do it for us :).
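The relative-offset arithmetic can be checked without touching the kernel. The following is a minimal user-mode sketch, not the driver code: buildIsrStub, storeLe32 and checkStubAgainstListing are hypothetical names introduced here, and addresses are plain 32-bit integers. It assembles the same 17 bytes and compares them with the disassembly listing shown earlier.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kStubSize = 17;

// Store a 32-bit value in little-endian byte order, as x86 expects.
static void storeLe32(std::uint8_t* dst, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        dst[i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Hypothetical helper (not from the phymem source): builds the stub for a
// 32-bit x86 target. stubBase is the address the stub will execute from.
std::array<std::uint8_t, kStubSize> buildIsrStub(std::uint32_t stubBase,
                                                 std::uint32_t context,
                                                 std::uint32_t newIsr,
                                                 std::uint32_t originalIsr)
{
    std::array<std::uint8_t, kStubSize> code{};
    code[0] = 0x60;                                    // pushad
    code[1] = 0x68;                                    // push imm32
    storeLe32(&code[2], context);                      // context address
    code[6] = 0xE8;                                    // call rel32
    // rel32 = target - address of the call instruction - 5 (its length)
    storeLe32(&code[7], newIsr - (stubBase + 6) - 5);
    code[11] = 0x61;                                   // popad
    code[12] = 0xE9;                                   // jmp rel32
    storeLe32(&code[13], originalIsr - (stubBase + 12) - 5);
    return code;
}

// Self-check using the addresses from the disassembly listing.
void checkStubAgainstListing()
{
    const auto stub = buildIsrStub(0x00403030u, 0x0042AC3Cu,
                                   0x00401000u, 0x00401050u);
    const std::array<std::uint8_t, kStubSize> expected = {{
        0x60,
        0x68, 0x3C, 0xAC, 0x42, 0x00,
        0xE8, 0xC5, 0xDF, 0xFF, 0xFF,
        0x61,
        0xE9, 0x0F, 0xE0, 0xFF, 0xFF}};
    assert(stub == expected);
}
```

Unsigned 32-bit arithmetic wraps around, so a backward jump comes out as a large value such as 0xFFFFDFC5, which is exactly the two's complement offset the CPU expects.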

4. Things to Take Care Of

Keep in mind that an ISR runs at the lowest level of the OS. Although your ISR source code lives in your application, most of the time it will be executed in another thread’s context. That means that when this code runs, the application containing it may well be sleeping. It seems weird, but it’s true.

Don't touch anything except what’s in the context structure, and don't call any subroutines. The compiler will not complain if you refer to global variables or call Windows library routines, but their user mode virtual addresses are not valid when the kernel runs your code in another process context, so you will definitely meet the Blue Screen of Death. It’s sick enough to call MessageBox from an interrupt service routine.

Call Kernel Routines in User Mode Application

With this knowledge, we can dig further and find a way to call kernel routines directly. In our application we write a routine which calls kernel routines such as READ_PORT_ULONG, then we insert the routine into the kernel, execute it, and check the result, just as if we were writing a *.sys, not a *.exe.

1. Get Kernel Routine Address

To call a kernel routine, we must first find its address. After I had spent a whole afternoon writing and debugging code and finally succeeded, I accidentally found a kernel routine named “MmGetSystemRoutineAddress” which does almost the same thing. I keep using my own code, not only because it cost me half a day of brain and hand work, but also because it can be extended to find any kernel routine, whereas MmGetSystemRoutineAddress can only find NTOSKRNL.EXE and HAL.DLL exports. Moreover, my code runs mainly in user mode.

I always think a driver developer is luckier than an application developer: he owns the whole virtual address space above 2G, and that region is not divided among processes. You can freely access the whole region without worrying about whether the OS allows it; you are the OS itself. A particular kernel routine’s address is the same in every thread context.

Understanding this makes finding kernel routines much easier and more direct.

  • Get the address of one kernel routine in the phymem driver as a reference and return it to the user application.
    e.g. DWORD refFuncAddr=(DWORD)IoAllocateIrp
  • In the user application, analyze the related file (NTOSKRNL.EXE, NTDLL.DLL, …) and find the entry points and names of all its exported functions.
  • Calculate the difference between the entry points of the searched function and the reference function, and add that value to refFuncAddr. The result is the searched function’s virtual address.
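The last step is plain arithmetic. Here is a minimal sketch, assuming 32-bit addresses and that both entry points are RVAs read from the same file; resolveKernelAddress and its parameter names are hypothetical and do not come from the phymem source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only.
// refFuncAddr - run-time address of the reference routine, reported by the driver
// refRva      - entry point (RVA) of the reference routine in the file on disk
// targetRva   - entry point (RVA) of the routine we are looking for, same file
std::uint32_t resolveKernelAddress(std::uint32_t refFuncAddr,
                                   std::uint32_t refRva,
                                   std::uint32_t targetRva)
{
    // The reference routine cannot be loaded below its own RVA.
    assert(refFuncAddr >= refRva);
    // Same as refFuncAddr + (targetRva - refRva); the load address of the
    // image cancels out, so we never need to know it explicitly.
    const std::uint32_t loadBase = refFuncAddr - refRva;
    return loadBase + targetRva;
}
```

For example, with made-up numbers, a reference routine at run-time address 0x804E4000 whose RVA is 0x1000 puts a routine with RVA 0x2500 at 0x804E5500.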

See the sample code in phymem\pmdll\peexports.cpp for how to enumerate a DLL’s exported functions. It uses the ImageHlp library. I first copied the code from an MSDN sample, and later found that the sample gives wrong results under some conditions. ImageHlp helps us get the export table, but the function names are ordered by “Hint” while their entry points are ordered by “Ordinal”. I was lucky enough to find the solution before ImageHlp drove me mad.

Note that many kernel routines are macros rather than actual functions. So we cannot find the address of “IoCopyCurrentIrpStackLocationToNext”, the longest named routine I have ever met.
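One way to read the Hint/Ordinal remark: in a PE export directory the name array and the function array are not parallel. The i-th name is paired with the i-th entry of a name-ordinal array, and that entry is the index into the function array. The sketch below models the three arrays with plain vectors; ExportTable and findExportRva are hypothetical names, and this is not the code in peexports.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the three arrays of a PE export directory.
struct ExportTable {
    std::vector<std::string>   names;         // AddressOfNames, sorted by name
    std::vector<std::uint16_t> nameOrdinals;  // AddressOfNameOrdinals, parallel to names
    std::vector<std::uint32_t> functions;     // AddressOfFunctions, indexed by ordinal index
};

// Returns the RVA of the named export, or 0 if the name is not exported.
std::uint32_t findExportRva(const ExportTable& table, const std::string& name)
{
    assert(table.names.size() == table.nameOrdinals.size());
    for (std::size_t hint = 0; hint < table.names.size(); ++hint) {
        if (table.names[hint] != name)
            continue;
        // The position in the name array (the "hint") is NOT the index into
        // the function array; the name-ordinal array supplies that index.
        const std::uint16_t index = table.nameOrdinals[hint];
        return index < table.functions.size() ? table.functions[index] : 0;
    }
    return 0;
}
```

Indexing the function array directly with the name position gives the right answer only when the two orders happen to coincide, which may be why a naive sample fails only under some conditions.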

2. Declare Kernel Routines in our Application

We cannot directly copy the function prototype from WDM.H or NTDLL.H because what we need is a function pointer. I never really understood the grammar of declaring a function pointer, but I know how to do it: put “typedef” at the very beginning, wrap the function name in parentheses, and finally add a “*” before the function name.

C++
// Kernel function declaration
PVOID MmMapIoSpace(
  PHYSICAL_ADDRESS PhysicalAddress,
  ULONG NumberOfBytes,
  MEMORY_CACHING_TYPE CacheEnable
);

// User function pointer declaration
typedef PVOID (__stdcall *MmMapIoSpace)(
  LARGE_INTEGER PhysicalAddress,
  ULONG NumberOfBytes,
  int CacheEnable
);

We have to change the prototype slightly because WINDOWS.H lacks the kernel data types. As the example shows, I change PHYSICAL_ADDRESS to LARGE_INTEGER and MEMORY_CACHING_TYPE to int. Add __stdcall to match the kernel routines’ argument pushing order and stack cleanup.
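Calling through such a typedef looks like this. The sketch runs entirely in user mode: fakeMmMapIoSpace is a stand-in for the real routine, KCALL, MmMapIoSpaceFn and mapThroughPointer are hypothetical names, and a 64-bit integer replaces LARGE_INTEGER so that the example does not need WINDOWS.H.

```cpp
#include <cassert>
#include <cstdint>

// __stdcall only exists on 32-bit x86 Microsoft compilers; elsewhere the
// macro expands to nothing so the sketch still builds.
#if defined(_MSC_VER) && defined(_M_IX86)
#define KCALL __stdcall
#else
#define KCALL
#endif

// Function pointer type: typedef first, name in parentheses, '*' before it.
typedef void* (KCALL *MmMapIoSpaceFn)(std::int64_t physicalAddress,
                                      std::uint32_t numberOfBytes,
                                      int cacheEnable);

// Stand-in for the kernel routine so the example can run in user mode.
// It "maps" only physical address 0 and returns a small static buffer.
void* KCALL fakeMmMapIoSpace(std::int64_t physicalAddress, std::uint32_t, int)
{
    static unsigned char window[16];
    return physicalAddress == 0 ? window : nullptr;
}

// Turn a raw routine address (as computed from the export table) into a
// callable pointer and invoke it.
void* mapThroughPointer(std::uintptr_t routineAddress,
                        std::int64_t physicalAddress,
                        std::uint32_t numberOfBytes)
{
    const MmMapIoSpaceFn fn = reinterpret_cast<MmMapIoSpaceFn>(routineAddress);
    assert(fn != nullptr);
    return fn(physicalAddress, numberOfBytes, 0); // 0: non-cached
}
```

In the real scenario the address would come from the export-table lookup and the call would be made from the injected routine running in kernel mode, never directly from the application.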

3. About chkesp()

If we call subroutines in a function, the compiler appends a call to chkesp() to the function in debug builds to catch possible errors. That may be good for a common application, yet it is very bad for us, because chkesp() has a user mode address and cannot be touched by the kernel. The /GZ compiler option controls whether chkesp() is appended automatically. Delete this option and everything will be OK.

Microsoft tells us that turning on /GZ lets us catch release-build errors in the debug build. But my experience was just the opposite. I once wrote a program which was fine in the debug build but failed in the release build. I finally found that I had forgotten to initialize a local variable. /GZ filled the variable with 0xCC, which happened to be a legal value, but in the release build it held a random value and the program failed. Microsoft always tends to help us too much. I'm very afraid of the new Visual Studio, which is so huge, so I keep using VC6. Another reason is that my notebook has only 256M of memory and a 30G hard disk, while my home PC with 2G of memory and a 300G hard disk is always occupied by my wife and daughter playing Tetris.

4. About the Test Program

In the test sample, I construct two routines: one maps a physical address to user space, and the other unmaps it. The phymem driver contains the standard kernel mode implementation of both.

I call the map procedure, get a virtual address pointing to physical address 0, and search the first 1M of physical memory for the ACPI RSDP signature “RSD PTR”. Finally, I unmap the memory.
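The search itself is easy to sketch over an ordinary buffer. The ACPI specification defines the signature as eight bytes, “RSD PTR ” with a trailing space, located on a 16-byte boundary, and requires the first 20 bytes of the structure to sum to zero. The function below follows those rules; findRsdp is a hypothetical name and the test program may do it differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scans buf for an ACPI RSDP. Returns its offset, or -1 if none is found.
// In the real test, buf would be the mapped first 1M of physical memory.
std::ptrdiff_t findRsdp(const std::uint8_t* buf, std::size_t size)
{
    assert(buf != nullptr || size == 0);
    static const char kSignature[8] = {'R', 'S', 'D', ' ', 'P', 'T', 'R', ' '};
    const std::size_t kChecksumLength = 20;  // ACPI 1.0 part of the structure

    // The structure always starts on a 16-byte boundary.
    for (std::size_t off = 0; off + kChecksumLength <= size; off += 16) {
        if (std::memcmp(buf + off, kSignature, sizeof kSignature) != 0)
            continue;
        // A valid RSDP has its first 20 bytes summing to zero (mod 256).
        std::uint8_t sum = 0;
        for (std::size_t i = 0; i < kChecksumLength; ++i)
            sum = static_cast<std::uint8_t>(sum + buf[off + i]);
        if (sum == 0)
            return static_cast<std::ptrdiff_t>(off);
    }
    return -1;
}
```

Checking the checksum as well as the signature avoids false hits on stray copies of the string in BIOS data.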

Using the Code

As I said at the very beginning, I don't think what I implemented here is actually useful, so I won't waste time explaining how to use the code. Just look at the source code. The useful part is described in my previous article, Access Physical Memory, Port and PCI Configuration Space.

DISCLAIMER

This code may crash your system, so take care to back up your work first. If you are still using VC6 like me, save your source code somewhere else before running it. VC6 has a horrible bug which may overwrite your source file with binary data when the system suddenly crashes or shuts down. I have met this many times, as my eight-year-old notebook has the habit of shutting itself down randomly without telling me. I really dislike making backups, but Microsoft has taught me it’s a must, thanks.

History

  • 20th May, 2009: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)