Table of Contents
In the previous article, we examined the mechanism of dynamic linking of functions in Mach-O. Now let’s move to practice.
We have a program under Mac OS X that is used by a number of third-party dynamically linked libraries, which, in their turn, also use functions of each other.
The task is as follows: we need to intercept the call of a certain function from one library to another and call the original in the handler.
Test Example
Let’s take an imaginary example. Supposing we have a program called «test» written in C language (test.c file) and a shared library (libtest.c file) with constant contents and compiled beforehand. This library implements one libtest()
function. In their implementation, both the program and the library use the puts()
function from a standard library of C language (it is provided together with Mac OS and contains in libSystem.B.dylib
). Let’s look at the schematic view of the described situation:
The task is the following:
- We need to replace the call of the
puts()
function for the libtest.dylib
library with the call of the hooked_puts()
function that is implemented in the main program (test.c file). The last, in its turn, can use the original puts()
function:
- We need to cancel the performed changes, i.e., to make so that the repeated call of
libtest()
lead to the call of the original puts()
.
It is not allowed to change the code or recompile the libraries, only the main program. The call redirection itself should be performed only for a specific library and on the fly, without the program restart.
Redirection Algorithm
Let’s describe all actions in words because the code can turn out not so clear despite the number of comments:
- Find the symbol table and table of
string
s using data from the LC_SYMTAB
loader command. - From the
LC_DYSYMTAB
loader command, find out, from which element of the symbol table a subset of undefined symbols (iundefsym
field) begins. - Find the target symbol by its name among the subset of undefined symbols in the symbol table.
- Remember the index of the target symbol from the beginning of the symbol table.
- Find the table of indirect symbols by data from the
LC_DYSYMTAB
loader command (indirectsymoff
field). - Find out the index, starting from which mapping of the import table (contents of the
__DATA, __la_symbol_ptr
section; or __IMPORT, __jump_table
— there will be one of these) to the table of indirect symbols (reserved1
field) begins. - Starting from this index, we look through the table of indirect symbols and search for the value that corresponds to the index of the target symbol in the symbol table.
- Remember the number of the target symbol from the beginning of the mapping of the import table to the table of indirect symbols. The saved value is the index of the required element in the import table.
- Find the import table (offset field) using data from the
__la_symbol_ptr
section (or __jump_table
). - Having the index of the target element in it, rewrite the address (for
__la_symbol_ptr
) to a required value (or just change the CALL/JMP
instruction to JMP
with an operand — address of the required function (for __jump_table
)).
I will note that you should work with tables of symbols, string
s, and indirect symbols only after loading them from the file. Also, you should read the contents of sections that describe import tables as well as perform the redirection itself in memory. It is connected with the fact that tables of symbols and tables of string
s can be absent or cannot display the real state in the target Mach-O. It is because the dynamic loader worked there before us and it successfully saved all necessary data about symbols without allocating the tables themselves.
Redirection Implementation
It’s time to turn our thoughts into the code. Let’s divide all operation into three stages for the optimization of search of the required Mach-O elements:
1. void *mach_hook_init(char const *library_filename, void const *library_address);
Basing on the Mach-O file and its displaying in memory, this function returns some non-clear descriptor. Behind this descriptor, offsets of the import table, symbol table, table of string
s, and the mapping of indirect symbols from the table of dynamic symbols as well as a number of useful indexes for this module stand. The descriptor is the following:
struct mach_hook_handle
{
void const *library_address; char const *string_table; struct nlist const *symbol_table; uint32_t const *indirect_table; uint32_t undefined_symbols_count; uint32_t undefined_symbols_index; uint32_t indirect_symbols_count; uint32_t indirect_symbols_index; uint32_t import_table_offset; uint32_t jump_table_present; };
2. mach_substitution mach_hook(void const *handle,
char const *function_name, mach_substitution substitution);
This function performs the redirection by the algorithm described above using the existing library descriptor, name of the target symbol, and address of the interceptor.
3. void mach_hook_free(void *handle);
In this way, cleanup of any descriptor returned by mach_hook_init()
is performed.
Taking into account these prototypes, we need to rewrite the test program:
#include <stdio.h>
#include <dlfcn.h>
#include "mach_hook.h"
#define LIBTEST_PATH "libtest.dylib"
void libtest();
int hooked_puts(char const *s)
{
puts(s);
return puts("HOOKED!");
}
int main()
{
void *handle = 0; mach_substitution original; Dl_info info;
if (!dladdr((void const *)libtest, &info)) {
fprintf(stderr, "Failed to get the base address of a library!\n",
LIBTEST_PATH);
goto end;
}
handle = mach_hook_init(LIBTEST_PATH, info.dli_fbase);
if (!handle)
{
fprintf(stderr, "Redirection init failed!\n");
goto end;
}
libtest();
puts("-----------------------------");
original = mach_hook(handle, "puts", (mach_substitution)hooked_puts);
if (!original)
{
fprintf(stderr, "Redirection failed!\n");
goto end;
}
libtest();
puts("-----------------------------");
original = mach_hook(handle, "puts", original);
if (!original)
{
fprintf(stderr, "Restoration failed!\n");
goto end;
}
libtest();
end:
mach_hook_free(handle);
handle = 0;
return 0;
}
Test Start
We can test it in the following way:
user@mac$ arch -i386 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()
user@mac$ arch -x86_64 ./test
libtest: calls the original puts()
-----------------------------
libtest: calls the original puts()
HOOKED!
-----------------------------
libtest: calls the original puts()
The program output indicates the full execution of the task that was formulated in the beginning.
The full implementation of the test example together with the redirection algorithm and the project file are attached to the article.
History
- 26th April, 2011: Initial post