Static Linking

Static linking is not really a program protection method, but applying it mildly obfuscates the reverse engineering process.

An executable is either statically or dynamically linked. A statically linked executable has all used library functions bundled into the executable. An executable that is dynamically linked only contains user code, with all used library functions residing in shared libraries.
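One way to tell the two kinds of executable apart can be sketched as follows. The sketch assumes a 32-bit ELF file on a Linux system and a file byte order that matches the host. It relies on the usual convention that a dynamically linked executable carries a PT_INTERP program header naming the dynamic loader, while a statically linked one normally does not. The function name elf32_has_interp is made up for this illustration.

```c
#include <elf.h>      /* Elf32_Ehdr, Elf32_Phdr, PT_INTERP (Linux headers) */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the 32-bit ELF image has a PT_INTERP program header
 * (the usual sign of dynamic linking), 0 if it has none, and -1 if the
 * buffer does not look like a usable 32-bit ELF image.
 * Assumption: the file's byte order matches the host's.
 */
int elf32_has_interp(const unsigned char *image, size_t len)
{
    Elf32_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);

    /* Check the ELF magic number and the 32-bit class byte. */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;
    if (eh.e_phnum != 0 && eh.e_phentsize != sizeof(Elf32_Phdr))
        return -1;

    /* Walk the program header table looking for PT_INTERP. */
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        unsigned long long off =
            (unsigned long long)eh.e_phoff + (unsigned long long)i * sizeof ph;

        if (off + sizeof ph > len)
            return -1;              /* table runs past the end of the buffer */
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

To use it, read the whole executable into memory and pass the buffer and its length; a result of 0 suggests a statically linked file.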

From this description we can see that with dynamic linking, listening at the interface between the executable and the shared libraries allows us to observe which library functions the executable calls. This is what the ltrace program does.

With a statically linked executable, library code is bundled inside the executable along with user code. If there is no symbol table, nothing in the file directly distinguishes user code from library code. This makes program understanding much more difficult, because library code must be analysed as well as user code.

The solution to analysing a statically linked executable with no symbol table is to regenerate the symbol table. This can be done with the rsymtab package. This tool takes a statically linked executable plus the libraries that it was statically linked against and regenerates a symbol table for the linked-in library code.

$ cat test.c
#include <stdio.h>

int plus(int a, int b) {
    return a + b;
}
int main() {
    printf("1 + 2 = %d\n", plus(1, 2));
    return 0;
}

$ gcc -static -o hello test.c
$ strip ./hello
$ ./hello
1 + 2 = 3

Now if we examine the disassembly of the main function, we see that nothing really distinguishes the call to printf from the call to the application's own plus function. The missing symbol table really hinders program understanding:

$ objdump --no-show-raw-insn --disassemble ./hello
[snip]
 80481d4:       push   %ebp
 80481d5:       mov    %esp,%ebp
 80481d7:       sub    $0x8,%esp
 80481da:       add    $0xfffffff8,%esp
 80481dd:       add    $0xfffffff8,%esp
 80481e0:       push   $0x2
 80481e2:       push   $0x1
 80481e4:       call   80481c0
 80481e9:       add    $0x10,%esp
 80481ec:       mov    %eax,%eax
 80481ee:       push   %eax
 80481ef:       push   $0x808a5c8
 80481f4:       call   8048580
 80481f9:       add    $0x10,%esp
 80481fc:       xor    %eax,%eax
 80481fe:       jmp    8048200
 8048200:       leave
 8048201:       ret
[snip]

Regenerating the symbol table and reexamining the main function, we see that the call to printf is now easily identified:

$ objgrep ./hello /usr/lib/libc.a | gensymtab
$ objdump --no-show-raw-insn --disassemble ./hello
[snip]
 80481d4:       push   %ebp
 80481d5:       mov    %esp,%ebp
 80481d7:       sub    $0x8,%esp
 80481da:       add    $0xfffffff8,%esp
 80481dd:       add    $0xfffffff8,%esp
 80481e0:       push   $0x2
 80481e2:       push   $0x1
 80481e4:       call   80481c0
 80481e9:       add    $0x10,%esp
 80481ec:       mov    %eax,%eax
 80481ee:       push   %eax
 80481ef:       push   $0x808a5c8
 80481f4:       call   8048580 <_IO_printf>
 80481f9:       add    $0x10,%esp
 80481fc:       xor    %eax,%eax
 80481fe:       jmp    8048200
 8048200:       leave
 8048201:       ret
[snip]
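
The general idea of matching library code against a stripped executable can be sketched as a masked byte search. This illustrates the technique in general and is an assumption, not a description of how objgrep or gensymtab are actually implemented. The name find_signature and its parameters are hypothetical. The mask marks the bytes that the linker may have patched (for example relocated call targets), so those bytes are skipped during comparison.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from rsymtab.
 * Searches a stripped image for the bytes of one library function.
 * mask[i] == 0 marks a byte the linker may have patched (such as a
 * relocated address), so that byte is not compared.
 * Returns the offset of the first match, or -1 if there is none.
 */
long find_signature(const uint8_t *image, size_t image_len,
                    const uint8_t *sig, const uint8_t *mask, size_t sig_len)
{
    if (sig_len == 0 || sig_len > image_len)
        return -1;

    for (size_t off = 0; off + sig_len <= image_len; off++) {
        size_t i = 0;

        /* Advance while each byte is either masked out or equal. */
        while (i < sig_len && (mask[i] == 0 || image[off + i] == sig[i]))
            i++;
        if (i == sig_len)
            return (long)off;
    }
    return -1;
}
```

A match at a given offset, added to the load address of the section being searched, gives an address to which the library function's name can be attached.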