Introduction
This article introduces live patching of a running process on Linux. By the end, readers will be able to inject an SO file into a remote process running on Linux (x86-64 process, tested on Ubuntu 16.04, 4.4.0-22-generic), provided they have the required access rights. Along the way, we will revisit debugging on the Linux platform and step around Linux's version of ASLR (https://en.wikipedia.org/wiki/Address_space_layout_randomization).
This is most useful when you don't have the source code but want the process to load your SO file. In the constructor of a global object in the SO file, you can perform all sorts of operations, including API hooking.
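To make the "constructor of a global object" idea concrete, here is a minimal sketch of what a lib.cpp could contain. The type name LoadNotifier and the message are hypothetical and are not taken from the attached sample; the only point is that the constructor runs inside whichever process loads the library.

```cpp
#include <cstdio>
#include <unistd.h>

// Hypothetical marker type; the name is illustrative only.
struct LoadNotifier {
    LoadNotifier()
    {
        // Runs in the process that loads the library, before dlopen returns.
        std::printf("library loaded into pid %d\n", static_cast<int>(getpid()));
        loaded = true;
    }
    static bool loaded;
};

// Constant-initialized, so it is already false before any constructor runs.
bool LoadNotifier::loaded = false;

// The global object whose constructor acts as the entry point of the library.
static LoadNotifier g_notifier;
```

Anything you would otherwise do from an injected thread (installing hooks, for example) can be started from that constructor.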
Background
I recommend that readers refer to my previous article: http://www.codeproject.com/Articles/1073879/Write-Your-Own-Linux-Debugger. On Windows, developers can use the API CreateRemoteThread, as mentioned in http://www.codeproject.com/Articles/535677/Memory-Analyzer-x-bit-a-Free-Detour.
Unfortunately, Linux developers have no direct way to do this, which is where this article comes in :-). Since we are actually writing a debugger of sorts (using ptrace), you could also achieve this using gdb.
Readers will need Qt Creator: sudo apt-get install -y qtcreator qt5-default
Using the Code
As with all my previous articles, the code should be referred to at all times.
We start by spawning the target process into which the SO file will be injected (as in the attached sample). We could also work with a PID provided for an already running process. Fork always does the trick.
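switch (rpid = fork()) {
case 0: {
    // Child: replace this process image with the target application.
    const char *target = "../build-QTUI_App-Desktop-Debug/QTUI_App";
    // By convention argv[0] is the program name, and the list must end with a null pointer.
    execlp(target, target, (char *)NULL);
    perror("execlp"); // only reached if exec failed
    _exit(1);
}
case -1:
    printf("error spawning process\n");
    exit(-1);
}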
The next step is to attach your process to this target process. Your process is responsible for manipulating the target process so that it loads the required SO file.
ptrace is the single most crucial API provided by the OS to assist in debugging.
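int status = 0;
if (ptrace(PTRACE_ATTACH, rpid, NULL, NULL) == -1)
    printf("error %d\n", errno); // errno is only meaningful when the call failed
pid_t tid = waitpid(rpid, &status, 0); // wait for the stop caused by the attach
ptrace(PTRACE_SETOPTIONS, tid, NULL, PTRACE_O_TRACEFORK |
       PTRACE_O_TRACEVFORK | PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXIT);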
Now comes the best part...
We have to manipulate the target process by injecting opcodes that trick it into loading our module (lib.so). The opcodes used here are for 64-bit x86 only; I may port them to 32-bit x86.
When the debugger attaches to the process, a SIGSTOP is sent to the debuggee. When this happens, we get busy. We want the tracee to stop with SIGTRAP, so that it is halted for good and we can query its register state, which is why we need the following additional code.
ptrace(PTRACE_SINGLESTEP, tid, 0, 0);
tid=waitpid(-1, &status, __WALL);
This will cause the processor to execute the next instruction and then raise SIGTRAP. This is what a debugger uses to single step every instruction when debugging with disassembly.
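Before touching registers, it is worth confirming that the status returned by waitpid really reports a SIGTRAP stop. The helpers below are a minimal sketch; the names StoppedBySignal and PtraceEvent are hypothetical and do not appear in the attached source. The event decoding follows the layout described in the ptrace man page for the PTRACE_O_TRACE* options.

```cpp
#include <csignal>
#include <sys/wait.h>

// True when a status from waitpid() says the tracee is stopped by `sig`
// (SIGTRAP after a single step or after an int 3).
bool StoppedBySignal(int status, int sig)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == sig;
}

// With PTRACE_O_TRACE* options set, an event stop carries the event number
// above the signal: status >> 8 == (SIGTRAP | (event << 8)).
// Returns 0 for a plain signal stop.
int PtraceEvent(int status)
{
    return (status >> 16) & 0xffff;
}
```

A tracer loop would typically call StoppedBySignal(status, SIGTRAP) right after the waitpid shown above and only then read the registers.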
For x86, calling ptrace(PTRACE_SINGLESTEP ...) causes the trace flag to be set (https://en.wikipedia.org/wiki/FLAGS_register). While the trace flag is set, a debug exception is raised after every instruction executed. The exception handler (ISR) itself is not single stepped, because the processor clears the trace flag on entry to the handler.
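For reference, the trace flag is bit 8 of the FLAGS register. The snippet below is only an illustration of that bit position; the helper names are hypothetical, and in practice you let ptrace(PTRACE_SINGLESTEP ...) manage the flag rather than setting it yourself.

```cpp
#include <cstdint>

// Bit 8 of FLAGS/RFLAGS is the trace (trap) flag, TF.
constexpr std::uint64_t kTraceFlag = 1ULL << 8; // 0x100

// Reports whether TF is set in a saved flags value.
bool TraceFlagSet(std::uint64_t rflags)
{
    return (rflags & kTraceFlag) != 0;
}

// Returns the flags value with TF set, leaving every other bit untouched.
std::uint64_t WithTraceFlag(std::uint64_t rflags)
{
    return rflags | kTraceFlag;
}
```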
We must save the state of the process before changing it:
The API process_vm_readv(rpid,&Originaliovec,1,&remote_iov,1,0) is used to copy the target process's memory so that it can be restored later. We must also save the state of the registers; ptrace to the rescue.
ptrace(PTRACE_GETREGS,tid,NULL,&uregs);
Now that we have enough to restore the process to its original state, we can change values in the address space of the target process.
WriteProcessMemory is used for just this.
We move the required parameters to RDI and RSI and then call dlopen. All of this is done by injecting the required opcodes into the target process's address space; the function is well commented using numbering (1),(2),...
ptrace(PTRACE_POKETEXT...) is used to write memory to the target process.
void WriteProcessMemory(const unsigned int rpid, user uregs = {})
{
    const char *str = libName;
    unsigned char MovtoRax[2 + 8] = { 0x48, 0xb8 };                 // mov rax, imm64
    unsigned char MovRaxtoRDI[] = { 0x48, 0x8B, 0xf8 };             // mov rdi, rax
    unsigned char Mov1toRBX[] = { 0x48, 0xc7, 0xc3, 01, 0, 0, 0 };  // mov rbx, 1
    unsigned char MovRBXtoRSI[] = { 0x48, 0x8B, 0xF3 };             // mov rsi, rbx
    unsigned char CallRax[] = { 0xff, 0xd0, 0xcc };                 // call rax; int 3
    const size_t codeLen = sizeof(MovtoRax) + sizeof(MovRaxtoRDI) + sizeof(Mov1toRBX) +
                           sizeof(MovRBXtoRSI) + sizeof(MovtoRax) + sizeof(CallRax);
    unsigned char opcodes[50];
    size_t off = 0;

    // (1) RAX = address of the library name, which is stored right after the opcodes
    unsigned long long p = uregs.regs.rip + codeLen;
    memcpy(&MovtoRax[2], &p, 8);
    memcpy(opcodes + off, MovtoRax, sizeof(MovtoRax));
    off += sizeof(MovtoRax);
    // (2) RDI = RAX: first argument of dlopen (the file name)
    memcpy(opcodes + off, MovRaxtoRDI, sizeof(MovRaxtoRDI));
    off += sizeof(MovRaxtoRDI);
    // (3) RSI = 1 (via RBX): second argument of dlopen, 1 is RTLD_LAZY
    memcpy(opcodes + off, Mov1toRBX, sizeof(Mov1toRBX));
    off += sizeof(Mov1toRBX);
    memcpy(opcodes + off, MovRBXtoRSI, sizeof(MovRBXtoRSI));
    off += sizeof(MovRBXtoRSI);
    // (4) RAX = address of dlopen in the target process
    p = (unsigned long long)FindFuncAddr("libdl", (const void *)dlopen, rpid);
    memcpy(&MovtoRax[2], &p, 8);
    memcpy(opcodes + off, MovtoRax, sizeof(MovtoRax));
    off += sizeof(MovtoRax);
    // (5) call RAX, then int 3 so that the tracee stops with SIGTRAP
    memcpy(opcodes + off, CallRax, sizeof(CallRax));
    // (6) opcodes first, then the NUL-terminated library name
    memcpy(data_opcodes, opcodes, codeLen);
    memcpy(data_opcodes + codeLen, str, strlen(str) + 1);
    // (7) PTRACE_POKETEXT writes one machine word per call, so copy word by word
    for (size_t i = 0; i < sizeof(data_opcodes); i += sizeof(long)) {
        long word = 0;
        size_t n = sizeof(data_opcodes) - i;
        memcpy(&word, data_opcodes + i, n < sizeof(long) ? n : sizeof(long));
        ptrace(PTRACE_POKETEXT, rpid, uregs.regs.rip + i, word);
    }
}
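The layout of the injected bytes is easier to check in isolation. The sketch below rebuilds the same 36-byte opcode sequence followed by the library path, using a small append helper instead of running sizeof sums. The names Emit, EmitMovRaxImm64 and BuildDlopenStub are hypothetical and are not part of the attached source; it assumes an x86-64 (little-endian) host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Appends raw bytes to the stub.
static void Emit(std::vector<unsigned char> &out, const void *bytes, std::size_t n)
{
    const unsigned char *p = static_cast<const unsigned char *>(bytes);
    out.insert(out.end(), p, p + n);
}

// Emits "mov rax, imm64": 48 B8 followed by the 8-byte immediate.
static void EmitMovRaxImm64(std::vector<unsigned char> &out, std::uint64_t imm)
{
    const unsigned char op[] = { 0x48, 0xB8 };
    Emit(out, op, sizeof(op));
    Emit(out, &imm, sizeof(imm)); // little-endian host, so the bytes are already in order
}

// Builds the opcodes for dlopen(path, 1) followed by int 3, with the path appended.
// `loadAddr` is the address in the target where the stub will be written (the saved RIP).
std::vector<unsigned char> BuildDlopenStub(std::uint64_t loadAddr, std::uint64_t dlopenAddr,
                                           const char *path)
{
    const std::size_t kCodeLen = 10 + 3 + 7 + 3 + 10 + 3;            // 36 bytes of code
    const unsigned char movRdiRax[] = { 0x48, 0x8B, 0xF8 };          // mov rdi, rax
    const unsigned char movRbx1[] = { 0x48, 0xC7, 0xC3, 1, 0, 0, 0 }; // mov rbx, 1 (RTLD_LAZY)
    const unsigned char movRsiRbx[] = { 0x48, 0x8B, 0xF3 };          // mov rsi, rbx
    const unsigned char callRaxInt3[] = { 0xFF, 0xD0, 0xCC };        // call rax; int 3

    std::vector<unsigned char> out;
    EmitMovRaxImm64(out, loadAddr + kCodeLen); // path lives right after the code
    Emit(out, movRdiRax, sizeof(movRdiRax));
    Emit(out, movRbx1, sizeof(movRbx1));
    Emit(out, movRsiRbx, sizeof(movRsiRbx));
    EmitMovRaxImm64(out, dlopenAddr);
    Emit(out, callRaxInt3, sizeof(callRaxInt3));
    assert(out.size() == kCodeLen);
    Emit(out, path, std::strlen(path) + 1);    // include the terminating NUL
    return out;
}
```

Keeping the builder free of ptrace calls means you can dump and disassemble the bytes before ever writing them into a live process.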
As mentioned earlier, to call dlopen we need to know its address in the target process. This is done by referring to the file /proc/<PID>/maps (since you have access to the target process, you can open it). From here, we get the address at which the module that holds dlopen is loaded. To get the function's address, note that you already know the function's address in your current process and its offset from the base address of the same module loaded in your process; apply the same offset to the module's base address in the remote process. This is how we step around ASLR: the base address differs from process to process, but the offset within the module does not, as long as both processes load the same library file.
The function below finds out where libdl is loaded.
void *FindSoAddress(const char *strLibName,pid_t pid)
The function below finds the address of dlopen in the target process.
void *FindFuncAddr(const char *strLibName,const void *pLocalFuncAddr,pid_t pid)
Now that the opcodes to load the SO file are in place (along with the desired breakpoint after call RAX), let the target process continue.
ptrace(PTRACE_CONT, tid, NULL,0);
This will cause the target process to load the SO file and then break (with int 3: 0xcc); refer to unsigned char CallRax[] = {0xff, 0xd0, 0xcc }; //Call RAX and then break (int 3) in function WriteProcessMemory.
Once this is done, commence restoration and then exit your process.
if (siginfo.si_signo == SIGTRAP && bFirst) // SIGTRAP is signal 5
{
    bFirst = false;
    RestoreMemory(rpid, originalRegs);
    ptrace(PTRACE_DETACH, tid, 0, 0);
    exit(0);
}
RestoreMemory makes use of ptrace(PTRACE_POKETEXT...) to write the saved bytes back to the target process and of ptrace(PTRACE_SETREGS,rpid,NULL,&originalRegs) to set the original register context. The injector then detaches itself and exits.
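One detail to keep in mind when writing bytes back: PTRACE_POKETEXT transfers one machine word per call, so a saved byte buffer has to be packed into words first. The following is a minimal sketch of that step and is not the RestoreMemory from the attached source; the names PackWords and PokeBuffer are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <vector>

// Packs a byte buffer into machine words; unused bytes of the last word stay zero.
std::vector<long> PackWords(const unsigned char *data, std::size_t len)
{
    std::vector<long> words((len + sizeof(long) - 1) / sizeof(long), 0);
    if (len != 0)
        std::memcpy(words.data(), data, len);
    return words;
}

// Writes `len` bytes to `addr` in the stopped tracee `pid`, one word per ptrace call.
// Returns false as soon as a write fails.
bool PokeBuffer(pid_t pid, std::uintptr_t addr, const unsigned char *data, std::size_t len)
{
    const std::vector<long> words = PackWords(data, len);
    for (std::size_t i = 0; i < words.size(); ++i) {
        void *dst = reinterpret_cast<void *>(addr + i * sizeof(long));
        void *val = reinterpret_cast<void *>(words[i]);
        if (ptrace(PTRACE_POKETEXT, pid, dst, val) == -1)
            return false;
    }
    return true;
}
```

Because the last word is zero padded, it is simplest to save and restore a length that is a multiple of sizeof(long), so that no neighbouring bytes in the target are overwritten.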
Don't forget to build the lib.so file: gcc lib.cpp -shared -fpic -o lib.so
Points of Interest
Armed with this, readers can now implement an equivalent of CreateRemoteThread on Linux systems, as well as API hooks for remote processes.
History
- 4th June, 2016: Initial version