Introduction
Recently I came across the description of a quite interesting security product, called Sanctuary. This product prevents execution of any program that does not appear on the list of software that is allowed to run on a particular machine. As a result, the PC user is protected against various add-on spyware, worms and trojans - even if some piece of malware finds its way to his/her computer, it has no chance of being executed, and, hence, has no chance of causing any damage to the machine. Certainly, I found this feature interesting, and, after a bit of thinking, came up with my own implementation of it. Therefore, this article describes how process creation can be programmatically monitored and controlled on a system-wide basis by means of hooking the native API.
This article makes the "bold" assumption that the target process is created by user-mode code (shell functions, CreateProcess(), manual process creation as a sequence of native API calls, etc). Although, theoretically, a process may be launched by kernel-mode code, such a possibility is, for practical purposes, negligible, so we don't have to worry about it. Why? Try to think logically - in order to launch a process from kernel mode, one has to load a driver, which, in turn, implies execution of some user-mode code in the first place. Therefore, in order to prevent execution of unauthorized programs, we can safely limit ourselves to controlling process creation by user-mode code on a system-wide basis.
Defining our strategy
First of all, let's decide what we have to do in order to monitor and control process creation on a system-wide basis.
Process creation is a fairly complex thing, which involves quite a lot of work (if you don't believe me, you can disassemble CreateProcess() and see it with your own eyes). In order to launch a process, the following steps have to be taken:
- Executable file has to be opened for FILE_EXECUTE access.
- Executable image has to be loaded into RAM.
- Process Executive Object (EPROCESS, KPROCESS and PEB structures) has to be set up.
- Address space for the newly created process has to be allocated.
- Thread Executive Object for the primary thread of the process (ETHREAD, KTHREAD and TEB structures) has to be set up.
- Stack for the primary thread has to be allocated.
- Execution context for the primary thread of the process has to be set up.
- Win32 subsystem has to be informed about the new process.
In order for any of these steps to be successful, all previous steps have to be accomplished successfully (you cannot set up an Executive Process Object without a handle to the executable section; you cannot map an executable section without a file handle, etc). Therefore, if we decide to abort any of these steps, all subsequent ones will fail as well, so that process creation will get aborted. All the above steps are taken by means of calling certain native API functions. Therefore, in order to monitor and control process creation, all we have to do is to hook those API functions that cannot be bypassed by the code that is about to launch a new process.
Which native API functions should we hook? Although NtCreateProcess() seems to be the most obvious answer to the question, this answer is wrong - it is possible to create a process without calling this function. For example, CreateProcess() sets up process-related kernel-mode structures without calling NtCreateProcess() (on Windows XP it calls NtCreateProcessEx() instead). Therefore, hooking NtCreateProcess() is of no help to us.
In order to monitor process creation, we have to hook either NtCreateFile() and NtOpenFile(), or NtCreateSection() - there is absolutely no way to run any executable file without making these API calls. If we decide to monitor calls to NtCreateFile() and NtOpenFile(), we have to make a distinction between process creation and regular file IO operations. This task is not always easy. For example, what should we do if some executable file is being opened for FILE_ALL_ACCESS? Is it just an IO operation, or is it a part of process creation? It is hard to make any judgment at this point - we need to see what the calling thread is about to do next. Therefore, hooking NtCreateFile() and NtOpenFile() is not the best possible option.
Hooking NtCreateSection() is a much more reasonable thing to do - if we intercept a call to NtCreateSection() with the request of mapping the executable file as an image (SEC_IMAGE attribute), combined with the request of page protection that allows execution, we can be sure that the process is about to be launched. At this point we are able to make a decision, and, if we don't want the process to be created, make NtCreateSection() return STATUS_ACCESS_DENIED. Therefore, in order to gain full control over process creation on the target machine, all we have to do is to hook NtCreateSection() on a system-wide basis.
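The decision rule described above can be sketched in portable C. This is only an illustration: the constant values are the ones the Windows SDK headers define, while the helper name looks_like_image_launch() is hypothetical and does not appear in the sample.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the Windows SDK headers. */
#define PAGE_EXECUTE            0x10u
#define PAGE_EXECUTE_READ       0x20u
#define PAGE_EXECUTE_READWRITE  0x40u
#define PAGE_EXECUTE_WRITECOPY  0x80u
#define SEC_IMAGE               0x01000000u

#define PAGE_ANY_EXECUTE (PAGE_EXECUTE | PAGE_EXECUTE_READ | \
                          PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)

/* Hypothetical helper: returns 1 when a section request asks for an image
   mapping (SEC_IMAGE) together with page protection that allows execution,
   and 0 for any other request. */
int looks_like_image_launch(uint32_t page_protection,
                            uint32_t allocation_attributes)
{
    /* The four execute bits together form the mask 0xF0. */
    assert(PAGE_ANY_EXECUTE == 0xF0u);
    if ((allocation_attributes & SEC_IMAGE) == 0)
        return 0;   /* not mapped as an executable image */
    if ((page_protection & PAGE_ANY_EXECUTE) == 0)
        return 0;   /* requested protection does not allow execution */
    return 1;
}
```

A request with PAGE_EXECUTE and SEC_IMAGE passes this test, while a plain data mapping (for example PAGE_READONLY with SEC_COMMIT) does not.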
Like any other stub from ntdll.dll, NtCreateSection() loads EAX with the service index, makes EDX point to function parameters, and transfers execution to the kernel-mode system service dispatcher, KiSystemService() (this is done by the INT 0x2E instruction under Windows NT/2000 and the SYSENTER instruction under Windows XP). After validating function parameters, the dispatcher transfers execution to the actual implementation of the service, the address of which is available from the Service Descriptor Table (a pointer to this table is exported by ntoskrnl.exe as the KeServiceDescriptorTable variable, so it is available to kernel-mode drivers). The Service Descriptor Table is described by the following structure:
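struct SYS_SERVICE_TABLE {
    void **ServiceTable;          /* addresses of the service implementations */
    unsigned long CounterTable;   /* per-service call counters */
    unsigned long ServiceLimit;   /* number of entries in ServiceTable */
    void **ArgumentsTable;        /* size of the arguments of each service */
};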
The ServiceTable field of this structure points to the array that holds addresses of all the functions that implement system services. Therefore, all we have to do in order to hook any native API function on a system-wide basis is to write the address of our proxy function to the i-th entry (i is the service index) of the array, pointed to by the ServiceTable field of KeServiceDescriptorTable.
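To make the idea of overwriting a table entry more tangible, here is a user-mode simulation in portable C. It is a sketch under stated assumptions: struct service_table is a simplified stand-in for the real table, and hook_service(), real_service() and proxy_service() are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*service_fn)(void);

/* Simplified stand-in for the Service Descriptor Table. */
struct service_table {
    service_fn *entries;    /* plays the role of ServiceTable */
    unsigned long limit;    /* plays the role of ServiceLimit */
};

/* Hypothetical demo services, used only to exercise the hook. */
long real_service(void)  { return 0; }
long proxy_service(void) { return 1; }

/* Writes proxy into the slot with the given index and returns the previous
   occupant, so that the proxy can forward to it later. Returns NULL and
   leaves the table untouched when the index is out of range. */
service_fn hook_service(struct service_table *table, unsigned long index,
                        service_fn proxy)
{
    service_fn previous;
    assert(table != NULL && proxy != NULL);
    if (index >= table->limit)
        return NULL;
    previous = table->entries[index];
    table->entries[index] = proxy;
    return previous;
}
```

The value returned by hook_service() corresponds to what the driver keeps in RealCallee. Checking the index against the limit is an addition of this sketch.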
Looks like now we know everything we need to know in order to monitor and control process creation on a system-wide basis. Let's proceed to the actual work.
Controlling process creation
Our solution consists of a kernel-mode driver and a user-mode application. In order to start monitoring process creation, our application passes the service index, corresponding to NtCreateSection(), plus the address of the exchange buffer, to our driver. This is done by the following code:
/* open the device object created by our driver */
device=CreateFile("\\\\.\\PROTECTOR",GENERIC_READ|GENERIC_WRITE,
                  0,0,OPEN_EXISTING,FILE_ATTRIBUTE_SYSTEM,0);
/* the stub starts with MOV EAX, ServiceIndex - skip the 1-byte opcode */
DWORD * addr=(DWORD *)
    (1+(DWORD)GetProcAddress(GetModuleHandle("ntdll.dll"),
                             "NtCreateSection"));
ZeroMemory(outputbuff,256);
controlbuff[0]=addr[0];                 /* service index */
controlbuff[1]=(DWORD)&outputbuff[0];   /* address of the exchange buffer */
DeviceIoControl(device,1000,controlbuff,256,controlbuff,256,&dw,0);
The code is almost self-explanatory - the only thing that deserves a bit of attention is the way we get the service index. All stubs from ntdll.dll start with the line MOV EAX, ServiceIndex, which applies to any version and flavour of 32-bit Windows NT. This is a 5-byte instruction, with the MOV EAX opcode (0xB8) as the first byte and the service index as the remaining 4 bytes. Therefore, in order to get the service index that corresponds to some particular native API function, all you have to do is to read 4 bytes from the address, located 1 byte away from the beginning of the stub.
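The same extraction can be written as a small, slightly more defensive routine that checks the opcode before trusting the next 4 bytes. This is a portable sketch that works on a plain byte array; the name read_service_index() is hypothetical and is not part of the sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_MOV_EAX_IMM32 0xB8u

/* Hypothetical helper: if the stub begins with MOV EAX, imm32, stores the
   immediate (the service index) in *index and returns 1; otherwise
   returns 0 and leaves *index untouched. */
int read_service_index(const unsigned char *stub, size_t length,
                       uint32_t *index)
{
    assert(stub != NULL && index != NULL);
    if (length < 5 || stub[0] != OPCODE_MOV_EAX_IMM32)
        return 0;
    /* x86 immediates are stored in little-endian byte order */
    *index = (uint32_t)stub[1]
           | ((uint32_t)stub[2] << 8)
           | ((uint32_t)stub[3] << 16)
           | ((uint32_t)stub[4] << 24);
    return 1;
}
```

Verifying the first byte costs nothing and protects against reading garbage if the stub has already been patched by somebody else.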
Now let's look at what our driver does when it receives IOCTL from our application:
NTSTATUS DrvDispatch(IN PDEVICE_OBJECT device,IN PIRP Irp)
{
    UCHAR*buff=0; ULONG a,base;
    PIO_STACK_LOCATION loc=IoGetCurrentIrpStackLocation(Irp);
    if(loc->Parameters.DeviceIoControl.IoControlCode==1000)
    {
        buff=(UCHAR*)Irp->AssociatedIrp.SystemBuffer;
        /* buff[4..7]: user-mode address of the exchange buffer; map it
           first, so that it is ready before the hook becomes active */
        memmove(&a,&buff[4],4);
        output=(char*)MmMapIoSpace(MmGetPhysicalAddress((void*)a),256,0);
        /* buff[0..3]: service index of NtCreateSection() */
        memmove(&Index,buff,4);
        a=4*Index+(ULONG)KeServiceDescriptorTable->ServiceTable;
        /* map the table entry, so that page protection does not matter */
        base=(ULONG)MmMapIoSpace(MmGetPhysicalAddress((void*)a),4,0);
        a=(ULONG)&Proxy;
        _asm
        {
            mov eax,base
            mov ebx,dword ptr[eax]
            mov RealCallee,ebx      ; save the original service address
            mov ebx,a
            mov dword ptr[eax],ebx  ; install the proxy
        }
        MmUnmapIoSpace((void*)base,4);
    }
    Irp->IoStatus.Status=0;
    IoCompleteRequest(Irp,IO_NO_INCREMENT);
    return 0;
}
As you can see, there is nothing special here either - we just map the exchange buffer into the kernel address space by MmMapIoSpace(), plus write the address of our proxy function to the Service Table (certainly, we do it after having saved the address of the actual service implementation in the RealCallee global variable). In order to overwrite the appropriate entry of the Service Table, we map the target address with MmMapIoSpace(). Why do we do it? After all, we already have access to the Service Table, don't we? The problem is that the Service Table may reside in read-only memory. Therefore, we would have to check whether we have write access to the target page, and if we don't, we would have to change page protection before overwriting the Service Table. Too much work, don't you think? Therefore, we just map our target address with MmMapIoSpace(), so we don't have to worry about page protection any more - from now on we can take write access to the target page for granted. Now let's look at our proxy function:
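ULONG __stdcall check(PULONG arg)
{
    HANDLE hand=0;PFILE_OBJECT file=0;
    ULONG a;char*buff;
    ANSI_STRING str; LARGE_INTEGER li;li.QuadPart=-10000;   /* 1 ms, relative */
    /* arg[4] is SectionPageProtection: 0xf0 covers all PAGE_EXECUTE_* bits */
    if((arg[4]&0xf0)==0)return 1;
    /* arg[5] is AllocationAttributes: 0x01000000 is SEC_IMAGE */
    if((arg[5]&0x01000000)==0)return 1;
    /* arg[6] is FileHandle; accept it only if it refers to a file object */
    hand=(HANDLE)arg[6];
    ObReferenceObjectByHandle(hand,0,*IoFileObjectType,KernelMode,(PVOID*)&file,0);
    if(!file)return 1;
    if(RtlUnicodeStringToAnsiString(&str,&file->FileName,1)<0)
    {ObDereferenceObject(file);return 1;}
    ObDereferenceObject(file);
    /* find the last dot; a becomes 0 if the name has no extension */
    a=str.Length;buff=str.Buffer;
    while(a>0&&buff[a-1]!='.')a--;
    if(a==0||_stricmp(&buff[a],"exe")){RtlFreeAnsiString(&str);return 1;}
    /* one exchange buffer for the whole system - serialise the requests */
    KeWaitForSingleObject(&event,Executive,KernelMode,0,0);
    /* the buffer holds 256 bytes, 8 of them taken by the two flags */
    strncpy(&output[8],buff,247);output[255]=0;
    RtlFreeAnsiString(&str);
    a=1;
    memmove(&output[0],&a,4);       /* post the request */
    while(1)
    {
        KeDelayExecutionThread(KernelMode,0,&li);
        memmove(&a,&output[0],4);
        if(!a)break;                /* the application has answered */
    }
    memmove(&a,&output[4],4);       /* 1 = allow, 0 = block */
    KeSetEvent(&event,0,0);
    return a;
}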
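void _declspec(naked) Proxy(void)
{
    _asm{
        pushfd                      ; 4 bytes
        pushad                      ; 32 bytes
        mov ebx,esp
        add ebx,40                  ; skip saved state (36) and return address (4)
        push ebx                    ; pointer to the first service parameter
        call check
        cmp eax,1
        jne block
        popad
        popfd
        jmp RealCallee              ; allowed - run the real NtCreateSection
    block:popad
        mov ebx, dword ptr[esp+8]   ; first parameter - pointer to SectionHandle
        mov dword ptr[ebx],0        ; give the caller a null section handle
        mov eax,0xC0000022L         ; STATUS_ACCESS_DENIED
        popfd
        ret 28                      ; NtCreateSection takes 7 parameters, 4 bytes each
    }
}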
Proxy() saves registers and flags, pushes a pointer to the service parameters on the stack, and calls check(). The rest depends on the value check() returns. If check() returns TRUE (i.e. we want to proceed with the request), Proxy() restores registers and flags, and transfers control to the service implementation. Otherwise, Proxy() zeroes the section handle that the caller expects to receive, writes STATUS_ACCESS_DENIED to EAX, restores ESP and returns - from the caller's perspective it looks as if the NtCreateSection() call had failed with the STATUS_ACCESS_DENIED error status.
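Stripped of the register juggling, the control flow of Proxy() boils down to a few lines. The following portable sketch models it with ordinary function pointers; all the names in it (proxy_create_section(), real_create_section(), check_allow(), check_deny()) are hypothetical, and the parameter list is cut down to what the illustration needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t status_t;
#define STATUS_SUCCESS        0x00000000u
#define STATUS_ACCESS_DENIED  0xC0000022u

typedef status_t (*create_section_fn)(void **handle, uint32_t protection,
                                      uint32_t attributes);
typedef int (*check_fn)(uint32_t protection, uint32_t attributes);

/* Hypothetical stand-in for the real service: always succeeds. */
status_t real_create_section(void **handle, uint32_t protection,
                             uint32_t attributes)
{
    static int dummy_section;
    (void)protection; (void)attributes;
    *handle = &dummy_section;
    return STATUS_SUCCESS;
}

/* Hypothetical policies: 1 means "proceed", anything else means "block". */
int check_allow(uint32_t p, uint32_t a) { (void)p; (void)a; return 1; }
int check_deny(uint32_t p, uint32_t a)  { (void)p; (void)a; return 0; }

/* Forwards to the real service when the policy agrees; otherwise clears
   the output handle and fails the call, as the block: branch does. */
status_t proxy_create_section(create_section_fn real, check_fn check,
                              void **handle, uint32_t protection,
                              uint32_t attributes)
{
    assert(real != NULL && check != NULL && handle != NULL);
    if (check(protection, attributes) == 1)
        return real(handle, protection, attributes);
    *handle = NULL;
    return STATUS_ACCESS_DENIED;
}
```

The real Proxy() has to be written in assembly only because it must leave the stack and the registers exactly as it found them before jumping to the original service.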
How does check() make its decision? Once it receives a pointer to the service parameters as an argument, it can examine these parameters. First of all, it checks flags and attributes - if a section is not requested to be mapped as an executable image, or if the requested page protection does not allow execution, we can be sure that the NtCreateSection() call has nothing to do with process creation. In such a case check() returns TRUE straight away. Otherwise, it checks the extension of the underlying file - after all, the SEC_IMAGE attribute and the page protection that allows execution may be requested for mapping some DLL file. If the underlying file is not a .exe file, check() returns TRUE. Otherwise, it gives the user-mode code a chance to make the decision. Therefore, it just writes the file name and the path to the exchange buffer, and polls it until it gets the response.
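The extension test is the part where it is easiest to run off the end of a string, so here is one possible way to write it in portable C. This is a sketch, not the code the driver uses: has_exe_extension() is a hypothetical name, and the comparison is done by hand because the case-insensitive compare routines are not part of standard C.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the file name at the end of path has
   the extension "exe" (in any letter case), and 0 otherwise. */
int has_exe_extension(const char *path)
{
    const char *ext = "exe";
    const char *dot;
    assert(path != NULL);
    dot = strrchr(path, '.');
    if (dot == NULL)
        return 0;                       /* no extension at all */
    if (strchr(dot, '\\') != NULL)
        return 0;                       /* the dot belongs to a directory name */
    dot++;
    while (*dot != '\0' && *ext != '\0') {
        if (tolower((unsigned char)*dot) != *ext)
            return 0;
        dot++;
        ext++;
    }
    /* both strings must end together, so "ex" and "exec" are rejected */
    return *dot == '\0' && *ext == '\0';
}
```

Names without any dot, and paths where the only dot sits in a directory name, are treated as "not a .exe file".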
Before opening our driver, our application creates a thread that runs the following function:
void thread()
{
    DWORD a,x; char msgbuff[512];
    while(1)
    {
        /* outputbuff[0..3]: non-zero while a request is pending */
        memmove(&a,&outputbuff[0],4);
        if(!a){Sleep(10);continue;}
        /* outputbuff[8..]: name and path of the executable */
        char*name=(char*)&outputbuff[8];
        for(x=0;x<stringcount;x++)
        {
            if(!_stricmp(name,strings[x])){a=1;goto skip;}   /* already allowed */
        }
        strcpy(msgbuff, "Do you want to run ");
        strcat(msgbuff,&outputbuff[8]);
        /* 0x00200000L is MB_SERVICE_NOTIFICATION */
        if(IDYES==MessageBox(0, msgbuff,"WARNING",
            MB_YESNO|MB_ICONQUESTION|0x00200000L))
        {a=1; strings[stringcount]=_strdup(name);stringcount++;}
        else a=0;
        /* outputbuff[4..7]: our verdict; clearing the flag releases the driver */
        skip:memmove(&outputbuff[4],&a,4);
        a=0;
        memmove(&outputbuff[0],&a,4);
    }
}
This code is self-explanatory - our thread polls the exchange buffer every 10 ms. If it discovers that our driver has posted its request to the buffer, it checks the file name and path against the list of programs that are allowed to run on the machine. If a match is found, it gives an OK response straight away. Otherwise, it displays a message box, asking the user whether the program in question may be executed. If the response is positive, we add the program in question to the list of software that is allowed to run on the machine. Finally, we write the user response to the buffer, i.e., pass it to our driver. Therefore, the user gets full control of process creation on the PC - as long as our program runs, no .exe file can be launched on the PC without asking for the user's permission.
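The layout of the exchange buffer is easy to lose track of, so here is the handshake written out as three small portable functions. It is a simplified sketch: post_request(), answer_request() and read_verdict() are hypothetical names, the lookup is an exact string comparison, and anything that is not on the list is simply refused instead of being put to the user in a message box.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXCHANGE_SIZE 256
#define PATH_OFFSET   8    /* bytes 0..3: request flag, bytes 4..7: verdict */

/* Driver side: stores the path and raises the request flag.
   Returns 0 if the path does not fit into the buffer. */
int post_request(unsigned char *buf, const char *path)
{
    uint32_t pending = 1;
    size_t n;
    assert(buf != NULL && path != NULL);
    n = strlen(path);
    if (n >= EXCHANGE_SIZE - PATH_OFFSET)
        return 0;
    memcpy(buf + PATH_OFFSET, path, n + 1);
    memcpy(buf, &pending, sizeof pending);
    return 1;
}

/* Application side: if a request is pending, writes the verdict (1 when
   the path is on the allowed list, 0 otherwise), clears the request flag
   and returns 1. Returns 0 when there is nothing to answer. */
int answer_request(unsigned char *buf, const char **allowed, size_t count)
{
    uint32_t pending, verdict = 0;
    size_t i;
    assert(buf != NULL);
    memcpy(&pending, buf, sizeof pending);
    if (!pending)
        return 0;
    for (i = 0; i < count; i++) {
        if (strcmp((const char *)buf + PATH_OFFSET, allowed[i]) == 0) {
            verdict = 1;
            break;
        }
    }
    memcpy(buf + 4, &verdict, sizeof verdict);
    pending = 0;
    memcpy(buf, &pending, sizeof pending);
    return 1;
}

/* Driver side: reads the verdict once the request flag has been cleared. */
uint32_t read_verdict(const unsigned char *buf)
{
    uint32_t verdict;
    assert(buf != NULL);
    memcpy(&verdict, buf, sizeof verdict - 0 + 0);
    memcpy(&verdict, buf + 4, sizeof verdict);
    return verdict;
}
```

Note the order of the writes on the application side: the verdict goes in first and the request flag is cleared last, because the driver reads the verdict as soon as it sees the flag drop.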
As you can see, we make the kernel-mode code wait for the user response. Is it really a wise thing to do? In order to answer this question, you have to ask yourself whether you are blocking any critical system resources - everything depends on the situation. In our case everything happens at IRQL PASSIVE_LEVEL, dealing with IRPs is not involved, and the thread that has to wait for the user response is not of critical importance. Therefore, in our case everything works fine. However, this sample is written for demonstration purposes only. In order to make any practical use of it, it makes sense to rewrite our application as an auto-start service. In such a case, I suggest we should make an exemption for the LocalSystem account, and, if NtCreateSection() is called in the context of a thread with LocalSystem account privileges, proceed to the actual service implementation without performing any checks - after all, the LocalSystem account runs only those executables that are specified in the Registry. Therefore, such an exemption is not going to compromise our security.
Conclusion
In conclusion I must say that hooking the native API is definitely one of the most powerful programming techniques that ever existed. This article gives you just one example of what can be achieved by hooking the native API - as you can see, we managed to prevent execution of unauthorized programs by hooking a single native API function. You can extend this approach further, and gain full control over hardware devices, file IO operations, network traffic, etc. However, our current solution is not going to work for kernel-mode API callers - since kernel-mode code is allowed to call ntoskrnl.exe's exports directly, these calls don't need to go via the system service dispatcher. Therefore, in my next article we are going to hook ntoskrnl.exe itself.
This sample has been successfully tested on several machines that run Windows XP SP2. Although I haven't yet tested it under any other environment, I believe that it should work fine on other 32-bit versions of Windows NT - after all, it does not use any structure that may be system-specific. In order to run the sample, all you have to do is to place protector.exe and protector.sys in the same directory, and run protector.exe. Until protector.exe's application window is closed, you will be prompted every time you attempt to run any executable.
I would highly appreciate it if you sent me an e-mail with your comments and suggestions.