mirror of
https://github.com/fdiskyou/Zines.git
synced 2025-03-09 00:00:00 +01:00
2849 lines
97 KiB
Text
2849 lines
97 KiB
Text
==Phrack Inc.==
|
|
|
|
Volume 0x0d, Issue 0x42, Phile #0x0E of 0x11
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----------------=[ manual binary mangling with radare ]=--------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=--------------------=[ by pancake <at> nopcode.org> ]=-----------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
1 - Introduction
|
|
1.1 - The framework
|
|
1.2 - First steps
|
|
1.3 - Base conversions
|
|
1.4 - The target
|
|
|
|
2 - Injecting code in ELF
|
|
2.1 - Resolving register based branches
|
|
2.2 - Resizing data section
|
|
2.3 - Basics on code injection
|
|
2.4 - Mmap trampoline
|
|
2.4.1 - Call trampoline
|
|
2.4.2 - Extending trampolines
|
|
|
|
3 - Protections and manipulations
|
|
3.1 - Trashing the ELF header
|
|
3.2 - Source level watermarks
|
|
3.3 - Ciphering .data section
|
|
3.4 - Finding differences in binaries
|
|
3.5 - Removing library dependencies
|
|
3.6 - Syscall obfuscation
|
|
3.7 - Replacing library symbols
|
|
3.8 - Checksumming
|
|
|
|
4 - Playing with code references
|
|
4.1 - Finding xrefs
|
|
4.2 - Blind code references
|
|
4.3 - Graphing xrefs
|
|
4.4 - Randomizing xrefs
|
|
|
|
5 - Conclusion
|
|
6 - Future work
|
|
7 - References
|
|
8 - Greetings
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
Reverse engineering is something usually related to w32 environments where
|
|
there is lot of non-free software and where the use of protections is more
|
|
extended to enforce evaluation time periods or protect intellectual (?)
|
|
property, using binary packing and code obfuscation techniques.
|
|
|
|
These kind of protections are also used by viruses and worms to evade
|
|
anti-virus engines in order to detect sandboxes. This makes reverse
|
|
engineering a double-edged sword.
|
|
|
|
The way to do low level analysis does usually involve the development and
|
|
use of a lot of small specific utilities to gather information about the
|
|
contents of firmware or a disk image to carve for keywords and dump its
|
|
blocks.
|
|
|
|
Actually we have a wide variety of software and hardware platforms and
|
|
reverse engineering tools should reflect this.
|
|
|
|
Obviously, reverse engineering, cracking and such are not only related to
|
|
legal tasks. Crackmes, hacking challenges and learning for the heck of it
|
|
are also good reasons to play with binaries. For us, there is a simple
|
|
reason behind them all: fun.
|
|
|
|
|
|
--[ 1.1 - The framework
|
|
|
|
At this point, we can imagine a framework that combines some of the basics
|
|
of *nix, offering a set of actions over an abstracted IO layer.
|
|
|
|
Following these premises I started to write a block based hexadecimal
|
|
editor. It allows to seamlessly open a disc device, file, socket, serial
|
|
or process. You can then both automatize processing actions by scripts and
|
|
perform manual interactive analysis.
|
|
|
|
The framework is composed of various small utilities that can be easily
|
|
integrated between them by using pipes, files or an API:
|
|
|
|
radare: the entrypoint for everything :)
|
|
rahash: block based hashing utility
|
|
radiff: multiple binary diffing algorithms
|
|
rabin: extract information from binaries (ELF,MZ,CLASS,MACH0..)
|
|
rasc: shellcode construction helper
|
|
rasm: commandline assembler/disassembler
|
|
rax: inline multiple base converter
|
|
xrefs: blind search for relative code references
|
|
|
|
The magic of radare resides in the ortogonality of commands and metadata
|
|
which makes easy to implement automatizations.
|
|
|
|
Radare comes with a native debugger that works on many os/arches and offers
|
|
many interesting commands to ease the analysis of binaries without having
|
|
the source.
|
|
|
|
Another interesting feature is supporting high level scripting languages
|
|
like python, ruby or lua. Thanks to a remote protocol that abstracts the
|
|
IO, it is allowed the use of immunity debugger, IDA, gdb, bochs, vmware and
|
|
many other debugging backends only writing a few lines of script or even
|
|
using the remote GDB protocol.
|
|
|
|
In this article I will introduce various topics. It is impossible to
|
|
explain everything in a single article.
|
|
|
|
If you want to learn more about it I recommend you to pull the latest hg
|
|
tip and read the online documentation at:
|
|
|
|
http://www.radare.org [0]
|
|
|
|
I have also written a book [1] in pdf and html available at:
|
|
|
|
http://www.radare.org/get/radare.pdf
|
|
|
|
|
|
--[ 1.2 - First steps
|
|
|
|
To ease the reading of the article I will first introduce the use of radare
|
|
with some of its basics.
|
|
|
|
Before running radare I recommend to setup the ~/.radarerc with this:
|
|
|
|
e scr.color=1 ; enable color screen
|
|
e file.id=1 ; identify files when opened
|
|
e file.flag=1 ; automatically add bookmarks (flags) to syms, strings..
|
|
e file.analyze=1 ; perform basic code analysis
|
|
|
|
Here's a list of points that will help us to better understand how radare
|
|
commands are built.
|
|
|
|
- each command is identified by a single letter
|
|
- subcommands are just concatenated characters
|
|
- e command stands for eval and is used to define configuration variables
|
|
- comments are defined with a ";" char
|
|
|
|
This command syntax offers a flexible input CLI that allows to perform
|
|
temporal seeks, repeat commands multiple times, iterate over file flags,
|
|
use real system pipes and much more.
|
|
|
|
Here are some examples of commands:
|
|
|
|
pdf @@ sym. ; run "pdf" (print disassembled function) over all
|
|
; flags matching "sym."
|
|
wx cc @@.file ; write 0xCC at every offset specified in file
|
|
. script ; interpret file as radare commands
|
|
b 128 ; set block size to 128 bytes
|
|
b 1M ; any math expression is valid here (hex, dec, flags, ..)
|
|
x @ eip ; dump "block size" bytes (see b command) at eip
|
|
pd | grep call ; disassemble blocksize bytes and pipe to system's grep
|
|
pd~call ; same as previous but using internal grep
|
|
pd~call[0] ; get column 0 (offset) after grepping calls in disasm
|
|
3!step ; run 3 steps (system is handled by the wrapped IO)
|
|
!!ls ; run system command
|
|
f* > flags.txt ; dump all flags as radare commands into a file
|
|
|
|
--[ 1.3 - Base conversions
|
|
|
|
rax is an utility that converts values between bases. Given an hexadecimal
|
|
value it returns the decimal conversion and viceversa.
|
|
|
|
It is also possible to convert raw streams or hexpair strings into
|
|
printable C-like strings using the -s parameter:
|
|
|
|
$ rax -h
|
|
Usage: rax [-] | [-s] [-e] [int|0x|Fx|.f|.o] [...]
|
|
int -> hex ; rax 10
|
|
hex -> int ; rax 0xa
|
|
-int -> hex ; rax -77
|
|
-hex -> int ; rax 0xffffffb3
|
|
float -> hex ; rax 3.33f
|
|
hex -> float ; rax Fx40551ed8
|
|
oct -> hex ; rax 035
|
|
hex -> oct ; rax Ox12 (O is a letter)
|
|
bin -> hex ; rax 1100011b
|
|
hex -> bin ; rax Bx63
|
|
-e swap endianness ; rax -e 0x33
|
|
-s swap hex to bin ; rax -s 43 4a 50
|
|
- read data from stdin until eof
|
|
|
|
With it we can convert a raw file into a list of hexpairs:
|
|
|
|
$ cat /bin/true | rax -
|
|
|
|
There is an inverse operation for every input format, so we can do
|
|
things like this:
|
|
|
|
$ cat /bin/true | rax - | rax -s - > /tmp/true
|
|
$ md5sum /bin/true /tmp/true
|
|
20c1598b2f8a9dc44c07de2f21bcc5a6 /bin/true
|
|
20c1598b2f8a9dc44c07de2f21bcc5a6 /tmp/true
|
|
|
|
rax(1) can be also used in interactive mode by reading lines from stdin.
|
|
|
|
|
|
--[ 1.4 - The target
|
|
|
|
This article aims to explain some practical cases where an scriptable
|
|
hexadecimal editor can trivially solve and provide new perspectives to
|
|
solve reverse engineering quests.
|
|
|
|
We will take a x86 [2] GNU/Linux ELF [3] binary as a main target. Radare
|
|
supports many architectures (ppc, arm, mips, java, msil..) and operating
|
|
systems (osx, gnu/ linux, solaris, w32, wine, bsd), You should take in mind
|
|
that commands and debugger features explained in this article can be
|
|
applied to any other platform than gnu/linux-x86.
|
|
|
|
In this article we will focus on the most widely-known platform
|
|
(gnu/linux-x86) to ease the reading.
|
|
|
|
Our victim will suffer various manipulation stages to make reverse
|
|
engineering harder keeping functionality intact.
|
|
|
|
The assembly snippets are written in AT&T syntax [4], which I think it's
|
|
more coherent than Intel so you will have to use GAS, which is a nice multi
|
|
architecture assembler.
|
|
|
|
Radare2 has a full API with pluggable backends to assemble and disassemble
|
|
for many architectures supporting labels and some basic assembler
|
|
directives. This is done in 'rasm2' which is a reimplementation of rasm
|
|
using libr.
|
|
|
|
However, this article covers the more stable radare1 and will only use the
|
|
'rasm' command to assemble simple instructions instead of complete
|
|
snippets.
|
|
|
|
The article exposes multiple situations where low level analysis and
|
|
metadata processing will help us to transform part of the program or simply
|
|
obtain information from a binary blob.
|
|
|
|
Usually the best program to start learning radare is the GNU 'true', which
|
|
can be completely replaced with two asm instructions, leaving the rest of
|
|
the code to play with. However it is not a valid target for this article,
|
|
because we are interested in looking for complex constructions to hide
|
|
control flow with call trampolines, checksumming artifacts, ciphered code,
|
|
relocations, syscall obfuscation, etc.
|
|
|
|
|
|
--[ 2 - Injecting code in ELF
|
|
|
|
The first question that pops up in our minds is: How the hell we will add
|
|
more code in an already compiled program?
|
|
|
|
What real packers do is to load the entire program structures in memory for
|
|
in-memory manipulation, handling all the code relocations and generating a
|
|
completely brand new executable.
|
|
|
|
Program transformation is not an easy task. It requires a complete analysis
|
|
which sometimes is impossible to do automatically, due to the need to mix
|
|
the results of static, dynamic and manual code analysis.
|
|
|
|
Instead of this approach, we will do it in a bottom to top way, taking low
|
|
level code structures as units and doing transformations without the need
|
|
to understand the whole program.
|
|
|
|
This enables ensuring that transformations won't break the program in any
|
|
way because their local and internal dependencies will be easy identified.
|
|
|
|
Because the target program is generated by a compiler we can make some
|
|
assumptions. For instance, there will be a place to inject code, the
|
|
functions will be sequentially defined, access to variables will be
|
|
easier to track and calling conventions will be the same along all the
|
|
program.
|
|
|
|
Our first action will be to find and identify all the parts of the text
|
|
section with unused or noppable code.
|
|
|
|
'noppable' code is defined as code that is never executed, or that can be
|
|
replaced by a smaller implementation, taking benefit of the non-overwritten
|
|
bytes.
|
|
|
|
Compiler-related stubs can be sometimes noppable or replaced, and more of
|
|
this is found in non-C native-compiled programs (haskell, C++, ...)
|
|
|
|
The spaces between functions where no code is executed are called 'function
|
|
paddings'. They exist because the compiler tried to optimize the position
|
|
of the functions at aligned addresses to ease to job on the cpu.
|
|
|
|
Some other compiler-related optimizations will inline the same piece of
|
|
code multiple times. By identifying such patterns, we can patch the
|
|
spaghetti code into a single loop with an end branch and use the rest for
|
|
our own purposes.
|
|
|
|
Following paragraphs will verbosely explain those situations.
|
|
|
|
Never executed code:
|
|
|
|
We can find this code by doing many execution traces of the program and
|
|
finding the uncovered ranges. This option will probably require human
|
|
interaction.
|
|
|
|
Using static code analysis we would not be able to follow runtime pointer
|
|
calculations or call tables. Therefore, this method will also require some
|
|
human interaction and, in the event we get to identify a pattern, we can
|
|
write a script to automatize this task and add N code xrefs from a single
|
|
source address.
|
|
|
|
We can also emulate some parts to resolve register values before reaching
|
|
a register based call/branch instruction.
|
|
|
|
Inlined code and unrolled loops:
|
|
|
|
Programmers and compilers usually prefer to inline (duplicate the code of
|
|
a external function in the middle of another one) to reduce the execution
|
|
cost of a 'call' instruction.
|
|
|
|
Loops are also unrolled and this produces longer, but more optimal code
|
|
because the cpu doesn't need to compute unnecessary branches all the
|
|
time.
|
|
|
|
These are common speedup practices, and they are a good target to place
|
|
our code. This process is done in four steps:
|
|
|
|
- Find/identify inline code or unrolled loops (repeated similar blocks)
|
|
- Patch to uninline or re-roll the loop
|
|
- ...
|
|
- Profit!
|
|
|
|
Radare provides different commands to help to find/identify such kind of
|
|
code blocks. The related commands are not going to do all the job
|
|
automatically. They require some automatization, scripting or manual
|
|
interaction.
|
|
|
|
The basic approach to identify such kind of code is by finding repeated
|
|
sequences of bytes allowing a percentage of similarity. Most inline code
|
|
will come from small functions and unrolled loops will be probably small
|
|
too.
|
|
|
|
There are two basic search commands that can help on this:
|
|
|
|
[0xB7F13810]> /?
|
|
...
|
|
/p len ; search pattern of length 'len'
|
|
/P count ; search patterns of size $$b matching at >N bytes of curblock
|
|
...
|
|
|
|
The first command (/p) will find repeated bytes of a specified length.
|
|
The other one (/P) will find a block that matches at least N bytes of the
|
|
current blocksize compared to the current one.
|
|
|
|
These pattern search must be restricted to certain ranges, like the .text
|
|
section or the function boundaries:
|
|
|
|
The search should be restricted to the text section:
|
|
> e search.from = section._text
|
|
> e search.to = section._text_end
|
|
|
|
All search commands (/) can be configured by the 'search.' eval
|
|
variables.
|
|
|
|
The 'section.' flag names are defined by 'rabin'. Calling rabin with
|
|
the -r flag will dump the binary information to stdout as radare commands.
|
|
|
|
This is done at radare startup when file.id and file.flag are enabled,
|
|
but this information can be reimported from the shell by calling:
|
|
|
|
> .!!rabin -rS ${FILE}
|
|
|
|
This command can be understood as a dot command '.', which interprets the
|
|
output of the following command '!' as radare commands. In the example
|
|
below there are two admiration marks (!!). This is because the single '!'
|
|
will run the command through the io subsystem of radare, falling in the
|
|
debugger layer if trying to run a program in the shell which has a name
|
|
that equals a debugger command's.
|
|
|
|
The double '!!' is used to directly escape from the io subsystem and run
|
|
the program directly in the system.
|
|
|
|
This command can be used to externalize some scripting facilities by
|
|
running other programs, processing data, and feeding radare's core back
|
|
with dynamically generated commands.
|
|
|
|
Once all the inlined code is identified, search hits should be analyzed,
|
|
discarding false positives and making all those blocks work as subroutines
|
|
using a call/ret pattern, nopping the rest of the bytes. Unused bytes can
|
|
now be filled with any code.
|
|
|
|
Another way to identify inlined code can be done by splitting a function
|
|
code analysis into basic blocks without splitting nodes (e
|
|
graph.split=false) and finding similar basic blocks or repeated.
|
|
|
|
Unrolled loops are based on repeated sequences of code with small
|
|
variations based on the iterator variable which must change along the
|
|
repeated blocks in a progressive way (incremental, decremental, etc).
|
|
|
|
Language bindings are recommended for implementing such kind of tasks,
|
|
not only because the higher language representation, but also thanks to
|
|
the high level APIs for code analysis that are distributed with radare
|
|
for python and other scripting languages (perl, ruby, lua).
|
|
|
|
The '/' command is used to perform searches. A radare command can be
|
|
defined to be executed every time a hit is found, or we can later use
|
|
the '@@' foreach iterator can be used over all the flags prefixed with
|
|
'hit0' later.
|
|
|
|
See '/?' for help on search-related commands but, in short, there are
|
|
commands to search in plain text, hexadecimal, apply binary masks to
|
|
keywords, launch multiple keywords at a time, define keywords in a file,
|
|
search+replace, and more.
|
|
|
|
One of the search algorithms accessible through the '/' command is called
|
|
'/p', which performs a pattern search. The command accepts a numeric value
|
|
which determines the minimum size for a pattern to be resolved.
|
|
|
|
Patterns are identified by a series of consecutive bytes, with a minimum
|
|
length, that appear more than once. The output will return a pattern id,
|
|
the pattern bytes and a list of offsets where the pattern is found.
|
|
|
|
This kind of search algorithm can be useful for analyzing huge binary
|
|
files like firmwares, disc images or uncompressed pictures.
|
|
|
|
In order to get a global overview of a big file the 'pO' command can be
|
|
used, which will display the full file represented in 'blocksize' bytes
|
|
in hexadecimal. Each byte will represent the number of printable
|
|
characters in the block, the entropy calculation, the number of
|
|
functions identified there, the number of executed opcodes, or whatever
|
|
is set at the 'zoom.byte' eval variable.
|
|
|
|
zoom view real file
|
|
+-----------+ +----------+
|
|
| 0 |--------------| 0 |
|
|
| 1 | '-.__ | ... |
|
|
| 2 | '-.__ | |
|
|
| 3 | '-.| 128 | (filesize/blocksize)
|
|
| ... | +----------+
|
|
|
|
The number of bytes of the real file represented in a byte of the
|
|
zoom view is the quotient between the filesize and the blocksize.
|
|
|
|
Function padding and preludes:
|
|
|
|
Compilers usually pad functions with nop instructions to align the
|
|
following one. They are usually small, but sometimes they are big enough
|
|
to store trampolines that will help obfuscate the execution flow.
|
|
|
|
In the 'extending trampolines' chapter I will explain a technique to hide
|
|
function preludes inside a 'call trampoline' which runs the initial code
|
|
of each function and then jumps back to the next instruction.
|
|
|
|
By using call tables loaded in .data or memory it will be harder to follow
|
|
for the reverser, because the program can lose all function preludes and
|
|
footers so that it'll be read as a single blob.
|
|
|
|
Compiler stubs:
|
|
|
|
Function preludes and calling conventions can be used as watermarks to
|
|
identify the compiler used. But when compiler optimizations are enabled
|
|
the generated code will be heavily transformed and those preludes will
|
|
probably be lost.
|
|
|
|
Some new versions of gcc are using a new function prelude aligning the
|
|
stack before entering the code to make memory access by the cpu on local
|
|
variables. But we can replace them with a shorter hand made one adapted
|
|
to the concrete function needs and this way ripping some more bytes for
|
|
our own fun.
|
|
|
|
The entrypoint embeds a system dependent loader that acts as a launcher
|
|
for the constructor/main/destructor. We can analyze the constructor and
|
|
the destructor to check if they are void.
|
|
|
|
Then, we can drop all this code by just adding a branch to the main
|
|
pointer, getting about 190 bytes for a C program. You will probably find
|
|
much more unused code in programs written in D, C++, haskell or any other
|
|
high level language.
|
|
|
|
This is how a common gcc's entrypoint looks like:
|
|
|
|
0x08049a00, / xor ebp, ebp
|
|
0x08049a02 | pop esi
|
|
0x08049a03 | mov ecx, esp
|
|
0x08049a05 | and esp, 0xf0
|
|
0x08049a08, | push eax
|
|
0x08049a09 | push esp
|
|
0x08049a0a | push edx
|
|
0x08049a0b | push dword 0x805b630 ; sym.fini
|
|
0x08049a10, | push dword 0x805b640 ; sym.init
|
|
0x08049a15 | push ecx
|
|
0x08049a16 | push esi
|
|
0x08049a17 | push dword 0x804f4e0 ; sym.main
|
|
0x08049a1c, | call 0x80495b4 ; 1 = imp.__libc_start_main
|
|
0x08049a21 \ hlt
|
|
|
|
By pushing 0's instead of fini/init pointers we can ignore the
|
|
constructor and destructor functions of the program, most of the programs
|
|
will work with no problems without them.
|
|
|
|
By analyzing code in both (sym.init and sym.fini) pointers we can
|
|
determine the number of bytes:
|
|
|
|
[0x080482C0]> af @ sym.__libc_csu_fini~size
|
|
size = 4
|
|
[0x080482C0]> af @ sym.__libc_csu_init~size
|
|
size = 89
|
|
|
|
Tracing
|
|
|
|
Execution traces are wonderful utilities to easily cleanup the view
|
|
of a binary by identifying parts of it as tasks or areas. We can for
|
|
example make a fast identification of the GUI code by starting a
|
|
trace and graphically clicking on all the menus and visual options
|
|
of the target application.
|
|
|
|
The most common problem to tracing is the low performance of a !stepall
|
|
operation, which mainly gets stupidly slow on loops, reps and tracing
|
|
already traced code blocks. To speedup this process, radare implements
|
|
a "touchtrace" method (available under the !tt command) which implements
|
|
an idea of MrGadix (thanks!)
|
|
|
|
This method consist in keeping a swapped copy of the targeted process
|
|
memory space on the debugger side and replacing it with
|
|
software breakpoint instructions. On intel we would use CC (aka int3).
|
|
|
|
Once the program runs, it raises an exception because of the trap
|
|
instruction, which is handled by the debugger. Then the debugger checks
|
|
if the instruction at %eip has been swapped and replaces the userspace
|
|
one, storing information about this step (like timestamp, number of times
|
|
it has been executed and step counter).
|
|
|
|
Once the instruction has been replaced, the debugger instructs the
|
|
process to continue execution. Each time an unswapped instruction is
|
|
executed the debugger replaces it.
|
|
|
|
At the end (giving an end-breakpoint) the debugger have a buffer with
|
|
enough information to determine the executed ranges. Using this easy
|
|
method we can easily skip all the reps, loops and already analyzed
|
|
code blocks, speeding up the binary analysis.
|
|
|
|
Using the "at" command (which stands for analyze traces) it is possible
|
|
to take statistics or information about the traces and, for example..
|
|
serialize the program and unroll loops.
|
|
|
|
We can also use !trace, giving a certain debug level as numeric argument,
|
|
to log register changes, executed instructions, buffers, stack changes,
|
|
etc..
|
|
|
|
After these steps we can subtract the resulting ranges against the .text
|
|
section and get how many bytes we can use to inject code.
|
|
|
|
If the resulting space is not enough for our purposes we will have to face
|
|
section resizing to put our code. This is explained later.
|
|
|
|
Radare offers the "ar" (analyze ranges) command to manage range lists.
|
|
It can perform addition, subtraction and boolean operations.
|
|
|
|
> ar+ 0x1000 0x1040 ; manual range addition
|
|
> ar+ 0x1020 0x1080 ; manual add with overlap
|
|
> ari ; import traces from other places
|
|
.....
|
|
> .atr* ; import execution traces information
|
|
> .CF* ; import function ranges
|
|
> .gr* ; import code analysis graph ranges
|
|
.....
|
|
> ar ; display non overlapped ranges
|
|
> ; show booleaned ranges inside text section
|
|
> arb section._text section._text_end
|
|
|
|
We can also import range information from external files or programs. One
|
|
of the nicer characteristics of *nix is the ability to communicate
|
|
applications using pipes by just making one program 'talk' in the other
|
|
program language. So if we write a perl script that parses an IDA database
|
|
(or IDC), we can extract the information we need and print radare commands
|
|
('ar' in this case) to stdout, and then parsing the results with the '.'
|
|
command.
|
|
|
|
This time ar gives us information about the number of bytes that are
|
|
theoretically not going to be executed and its ranges.
|
|
|
|
We can now place some breakpoints on these locations to ensure we are not
|
|
missing anything. Manual revision of the automated code analysis and
|
|
tracing results is recommended.
|
|
|
|
|
|
--[ 2.1 - Resolving register based branchs
|
|
|
|
The 'av' command provides subcommands to analyze opcodes inside
|
|
the native virtual machine.
|
|
|
|
The basic commands are: 'av-' to restart the vm state, 'avr' to manage
|
|
registers, 'ave' to evaluate VM expressions and 'avx' that will execute N
|
|
instructions from the current seek (will set the eip VM register to the
|
|
offset value).
|
|
|
|
Internally, the virtual machine engine translates instructions into
|
|
evaluable strings that are used to manipulate memory and change registers.
|
|
|
|
This simple concept allows to easily implement support for new
|
|
instructions or architectures.
|
|
|
|
Here is an example. The code analysis has reached a call instruction
|
|
addressing with a register. The code looks like:
|
|
|
|
/* --- */
|
|
$ cat callreg.S
|
|
.global main
|
|
.data
|
|
.long foo
|
|
.text
|
|
|
|
foo:
|
|
ret
|
|
main:
|
|
xor %eax, %eax
|
|
mov $foo, %ebx
|
|
add %eax, %ebx
|
|
call *%ebx
|
|
ret
|
|
|
|
$ gcc callreg.S
|
|
$ radare -d ./a.out
|
|
( ... )
|
|
[0xB7FC6810]> !cont sym.main
|
|
Continue until (sym.main) = 0x08048375
|
|
[0x08048375]> pd 5 @ sym.main
|
|
0x08048375 eip,sym.main:
|
|
0x08048375 / xor eax, eax
|
|
0x08048377 | mov ebx, 0x8048374 ; sym.foo
|
|
0x0804837c,| add ebx, eax
|
|
0x0804837e | call ebx
|
|
0x08048380, \ ret
|
|
|
|
[0x08048375]> avx 4
|
|
Emulating 4 opcodes
|
|
MMU: cached
|
|
Importing register values
|
|
0x08048375 eax ^= eax
|
|
; eax ^= eax
|
|
0x08048377 ebx = 0x8048374 ; sym.foo
|
|
; ebx = 0x8048374
|
|
0x0804837c, ebx += eax
|
|
; ebx += eax
|
|
0x0804837e call ebx
|
|
; esp=esp-4
|
|
; [esp]=eip+2
|
|
;==> [0xbf9be388] = 8048382 ((esp]))
|
|
; write 8048382 @ 0xbf9be388
|
|
; eip=ebx
|
|
|
|
[0x08048375]> avr eip
|
|
eip = 0x08048374
|
|
|
|
At this point we can continue the code analysis where the 'eip'
|
|
register of the virtual machine points.
|
|
|
|
[0x08048375]> .afr @ `avr eip~[2]`
|
|
|
|
|
|
--[ 2.2 - Resizing data section
|
|
|
|
There is no simple way to resize a section in radare1. This process is
|
|
completely manual and tricky.
|
|
|
|
In order to solve this limitation, radare2 provides a set of libraries
|
|
(named libr) that reimplement all the functionalities of radare 1.x
|
|
following a modular design instead of the old monolithic approach.
|
|
|
|
Both branches of development are being done in parallel.
|
|
|
|
Here is where radare2 api (libr) comes in action:
|
|
|
|
$ cat rs.c
|
|
#include <stdio.h>
|
|
#include <r_bin.h>
|
|
|
|
int main(int argc, char *argv[])
|
|
{
|
|
struct r_bin_t bin;
|
|
if (argc != 4) {
|
|
printf("Usage: %s <ELF file> <section> <size>\n", argv[0]);
|
|
} else {
|
|
r_bin_init(&bin);
|
|
if (!r_bin_open(&bin, argv[1], 1, NULL)) {
|
|
fprintf(stderr, "Cannot open '%s'.\n", argv[1]);
|
|
} else
|
|
if (!r_bin_resize_section(&bin, argv[2], r_num_math(NULL,argv[3]))) {
|
|
fprintf(stderr, "Cannot resize section.\n");
|
|
} else {
|
|
r_bin_close(&bin);
|
|
return 0;
|
|
}
|
|
}
|
|
return 1;
|
|
}
|
|
|
|
$ gcc -I/usr/include/libr -lr_bin rs.c -o rs
|
|
|
|
$ rabin -S my-target-elf | grep .data
|
|
idx=24 address=0x0805d0c0 offset=0x000150c0 size=00000484 \
|
|
align=0x00000020 privileges=-rw- name=.data
|
|
|
|
$ ./rs my-target-elf .data 0x9400
|
|
|
|
$ rabin -S my-target-elf | grep .data
|
|
idx=24 address=0x0805d0c0 offset=0x000150c0 size=00009400 \
|
|
align=0x00000020 privileges=-rw- name=.data
|
|
|
|
Our 'rs' program will open a binary file (ELF 32/64, PE32/32+, ...) it will
|
|
resolve a section named '.data' to change its size and will shift the rest
|
|
of sections to keep the elf information as is.
|
|
|
|
The reason for resizing the .data section is because this one is commonly
|
|
linked after the text section. There will be no need to reanalyze the
|
|
program code to rebase all the data accesses to the new address.
|
|
|
|
The target program should keep working after the section resizing and we
|
|
already know which places in the program can be used to inject our
|
|
trampoline.
|
|
|
|
+-------+ +-------+
|
|
| .text | | .text |
|
|
| | | | <-- inject trampoline
|
|
|-------| -> |-------|
|
|
| .data | | .data |
|
|
+-------+ | | <-- inject new code or data
|
|
| |
|
|
+-------+
|
|
|
|
Next chapters will explain multiple trampoline methods that will use our
|
|
resized data section to store pointer tables or new code.
|
|
|
|
|
|
--[ 2.3 - Basics on code injection
|
|
|
|
Self modifying code is a nice thing, but usually a dangerous task too. Also
|
|
patches/tools like SElinux will not let us to write to an already
|
|
executable section.
|
|
|
|
To avoid this problem we can change in the ELF headers the .text perms to
|
|
writable but not executable, and include our decryption function in a new
|
|
executable section. When the decryption process is done, it will have to
|
|
change text perms to executable and then transfer control to it.
|
|
|
|
In order to add multiple nested encryption layers the task will be harder
|
|
mostly because of the relocation problems, and if the program is not
|
|
PIC-ed, rebasing an entire application on the fly by allocating and copying
|
|
the program multiple times in memory can be quite complicated.
|
|
|
|
The simplest implementation consists in these steps:
|
|
|
|
- Define from/to ranges
|
|
- Define cipher key and algorithm
|
|
- Inject deciphering code in the entrypoint
|
|
- We will need to use mprotect before looping
|
|
- Statically cipher the section
|
|
|
|
The from/to addresses can be hardcoded in the assembly snippet or written
|
|
in .data or somewhere for later. We can also ask the user for a key and
|
|
make the binary work only with a password.
|
|
|
|
If the encryption key is not found in the binary, the unpacking process
|
|
becomes harder, and we will have to implement a dynamic method to find the
|
|
correct one. This will be explained later.
|
|
|
|
Another way to define ranges is by using watermarks and let our injected
|
|
code walk over the memory looking for these marks.
|
|
|
|
The last step is probably the easier one. We already know the key and the
|
|
algorithm, its time to rewrite the inverse one using radare commands and
|
|
let the script run over these ranges.
|
|
|
|
This is an example:
|
|
|
|
The 'wo' command is a subcommand of the 'w' (write) which offers arithmetic
|
|
write operations over the current block.
|
|
|
|
[0x00000000]> wo?
|
|
Usage: wo[xrlasmd] [hexpairs]
|
|
Example: wox 90 ; xor cur block with 90
|
|
Example: woa 02 03 ; add 2, 3 to all bytes of cur block
|
|
Supported operations:
|
|
woa addition +=
|
|
wos subtraction -=
|
|
wom multiply *=
|
|
wod divide /=
|
|
wox xor ^=
|
|
woo or |=
|
|
woA and &=
|
|
wor shift right >>=
|
|
wol shift left <<=
|
|
|
|
It uses cyclic hexpair byte arrays as arguments. Just by seeking on the
|
|
"from" address and setting blocksize as "to-from" we will be ready to type
|
|
the 'wo' ones.
|
|
|
|
Implementing this process in macros will make the code more maintainable
|
|
and easy to manage. The understanding of functional programming is good
|
|
for writing radare macros, split the code in small functions doing
|
|
minimal tasks and make them interact with each other.
|
|
|
|
Macros are defined like in lisp, but syntax is really closer to a mix
|
|
between perl and brainfuck. Don't get scary at first instance..it doesn't
|
|
hurt as much as you can think and the resulting code is funnier than
|
|
expected ;)
|
|
|
|
(cipher from to key
|
|
s $0
|
|
b $1-$0
|
|
woa $2
|
|
wox $2)
|
|
|
|
We can put it in a single line (by replacing newlines with commas ',')
|
|
and put it in our ~/.radarerc:
|
|
|
|
(cipher from to key,s $0,b $1-$0,woa $2,wox $2,)
|
|
|
|
Now we can call this macro at any time by just prepending a dot before the
|
|
parenthesis and feeding it with the arguments.
|
|
|
|
f from @ 0x1200
|
|
f to @ 0x9000
|
|
.(cipher from to 2970)
|
|
|
|
To ease the code injection we can write a macro like this:
|
|
|
|
(inject hookaddr file addr
|
|
wa call $2 @ $0
|
|
wf $1 @ $2)
|
|
|
|
Feeding this macro with the proper values will assemble a call opcode
|
|
branching to the place where we inject our code (addr or $2) stored in the
|
|
'file' or $1.
|
|
|
|
Radare has the expanded AES key searching algorithm developed by Victor
|
|
Mu~noz. It can be helpful while reversing a program that uses this kind of
|
|
ciphering algorithm. As the IO is abstracted it is possible to launch this
|
|
search against files, program memory, ram dumps, etc..
|
|
|
|
For example:
|
|
|
|
$ radare -u /dev/mem
|
|
[0x00000000]> /a
|
|
...
|
|
|
|
WARNING: Most current GNU/Linux distributions comes with the kernel
|
|
compiled with the CONFIG_STRICT_DEVMEM option which restricts the access
|
|
of /dev/mem to the only first 1.1M.
|
|
|
|
All write operations are stored in a linked list of backups with the
|
|
removed contents. This make possible to undo and redo all changes done
|
|
directly on file. This allows you to try different modifications on the
|
|
binary without having to restore manually the original one.
|
|
|
|
These changes can be diffed later with the "radiff -r" command. The command
|
|
performs a deltified binary diff between two files, The '-r' displays the
|
|
result of the operation as radare commands that transforming the first
|
|
binary into the second one.
|
|
|
|
The output can be piped to a file and used later from radare to patch it
|
|
at runtime or statically.
|
|
|
|
For example:
|
|
|
|
$ cp /bin/true true
|
|
$ radare -w true
|
|
[0x08048AE0]> wa xor eax,eax;inc eax;xor ebx,ebx;int 0x80 @ entrypoint
|
|
|
|
[0x08048AE0]> u
|
|
03 + 2 00000ae0: 31 ed => 31 c0
|
|
02 + 1 00000ae2: 5e => 40
|
|
01 + 2 00000ae3: 89 e1 => 31 db
|
|
00 + 2 00000ae5: 83 e4 => cd 80
|
|
|
|
[0x08048AE0]> u* | tee true.patch
|
|
wx 31 c0 @ 0x00000ae0
|
|
wx 40 @ 0x00000ae2
|
|
wx 31 db @ 0x00000ae3
|
|
wx cd 80 @ 0x00000ae5
|
|
|
|
The 'u' command is used to 'undo' write operations. Appending the '*' we
|
|
force the 'u' command to list the write changes as radare commands, and we
|
|
pipe the output to the 'tee' unix program.
|
|
|
|
Let's generate the diff:
|
|
|
|
$ radiff -r /bin/true true
|
|
e file.insertblock=false
|
|
e file.insert=false
|
|
wx c0 @ 0xae1
|
|
wx 40 @ 0xae2
|
|
wx 31 @ 0xae3
|
|
wx db @ 0xae4
|
|
wx cd @ 0xae5
|
|
wx 80 @ 0xae6
|
|
|
|
This output can be piped into radare through stdin or using the '.'
|
|
command to interpret the contents of a file or the output of a command.
|
|
|
|
[0x08048AE0]> . radiff.patch
|
|
or
|
|
[0x08048AE0]> .!!radiff -r /bin/true $FILE
|
|
|
|
The '$FILE' environment is exported to the shell from inside radare as well
|
|
as other variables like $BLOCK, $BYTES, $ENDIAN, $SIZE, $ARCH, $BADDR, ...
|
|
|
|
Note that in radare there's not really much difference between memory
|
|
addresses and on-disk ones because the virtual addresses are emulated by
|
|
the IO layer thanks to the io.vaddr and io.paddr variables which are used
|
|
to define the virtual and physical addresses for a section or for the whole
|
|
file.
|
|
|
|
The 'S' command is used to configure a fine-grained setup of virtual and
|
|
physical addressing rules for ranged sections of offsets.
|
|
|
|
Do not care too much about it because 'rabin' will configure these
|
|
variables at startup when the file.id eval variable is true.
|
|
|
|
|
|
--[ 2.4 - Mmap trampoline
|
|
|
|
The problem faced when resizing the .text section is that .data section is
|
|
shifted, and the program will try to access it absolutely or relatively.
|
|
|
|
This situation forces us to rebase all the text section and adapt the rest of
|
|
sections.
|
|
|
|
The problem is that we need a deep code analysis to rebase and this is
|
|
something that we will probably do wrong if we try to do only a static code
|
|
analysis. This situation usually require a complex dynamic, and sometimes
|
|
manual, analysis to fix all the possible pointers.
|
|
|
|
To skip this issue we can just resize the .data section which is after the
|
|
.text one and just write a trampoline in the .text section loading code
|
|
from .data and putting it in memory using mmap.
|
|
|
|
The decision to use mmap is because is the only way to get executable pages
|
|
with write permissions in memory even with SELinux enabled.
|
|
|
|
The C code looks like this:
|
|
|
|
---------------------8<--------------------------
|
|
#include <stdio.h>
|
|
#include <string.h>
|
|
#include <sys/mman.h>
|
|
|
|
int run_code(const char *buf, int len)
|
|
{
|
|
unsigned char *ptr;
|
|
int (*fun)();
|
|
ptr = mmap(NULL, len,
|
|
PROT_EXEC | PROT_READ | PROT_WRITE,
|
|
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
|
|
if (ptr == NULL)
|
|
return -1;
|
|
fun = (int(*)(void))ptr;
|
|
memcpy(ptr, buf, len);
|
|
mprotect(ptr, len, PROT_READ | PROT_EXEC);
|
|
return fun();
|
|
}
|
|
|
|
int main()
|
|
{
|
|
unsigned char trap = 0xcc;
|
|
return run_code(&trap, 1);
|
|
}
|
|
---------------------8<--------------------------
|
|
|
|
The manual rewrite in assembly become:
|
|
|
|
---------------------8<--------------------------
|
|
.global main
|
|
.global run_code
|
|
.data
|
|
|
|
retaddr: .long 0
|
|
shellcode: .byte 0xcc
|
|
hello: .string "Hello World\n"
|
|
|
|
.text
|
|
tailcall:
|
|
push %ebp
|
|
mov %esp, %ebp
|
|
push $hello
|
|
call printf
|
|
addl $4, %esp
|
|
pop %ebp
|
|
ret
|
|
|
|
run_code:
|
|
pop (retaddr)
|
|
/* lcall *0 */
|
|
call tailcall
|
|
push (retaddr)
|
|
|
|
/* our snippet */
|
|
push %ebp
|
|
mov %esp, %ebp
|
|
pusha
|
|
|
|
push %ebp
|
|
xor %ebp, %ebp /* 0 */
|
|
mov $-1, %edi /* -1 */
|
|
mov $0x22, %esi /* MAP_ANONYMOUS | MAP_PRIVATE */
|
|
mov $7, %edx /* PROT_EXEC | PROT_READ | PROT_WRITE */
|
|
mov $4096, %ecx /* len = 4096 */
|
|
xor %ebx, %ebx /* 0 */
|
|
mov $192, %eax /* mmap2 */
|
|
int $0x80
|
|
pop %ebp
|
|
|
|
mov %eax, %edi /* dest pointer */
|
|
mov 0x8(%ebp), %esi /* get buf (arg0) */
|
|
mov $128, %ecx
|
|
cld
|
|
rep movsb
|
|
mov %eax, %edi
|
|
|
|
mov $5, %edx /* R-X */
|
|
mov $4096, %ecx /* len */
|
|
mov %edi, %ebx /* addr */
|
|
mov $125, %eax /* mprotect */
|
|
int $0x80
|
|
|
|
call *%edi
|
|
popa
|
|
pop %ebp
|
|
ret
|
|
|
|
main:
|
|
push $shellcode
|
|
call run_code
|
|
add $4, %esp
|
|
ret
|
|
---------------------8<--------------------------
|
|
|
|
Running this little program will result on a trap instruction executed on the
|
|
mmaped area. I decided to do not checks on return values to make the code
|
|
smaller.
|
|
|
|
$ gcc rc.S
|
|
$ ./a.out
|
|
Hello World
|
|
Trace/breakpoint trap
|
|
|
|
The size in bytes of this code can be measured with this line in the shell:
|
|
|
|
$ echo $((`rabin -o d/s a.out |grep run_code|cut -d ' ' -f 2 |wc -c`/2))
|
|
76
|
|
|
|
This graph explains the layout of the patch required
|
|
|
|
+----+ / push dword csu_fini
|
|
.---------------| EP | < push dword csu_init
|
|
| +----+ \ push dword main
|
|
V v
|
|
+------------+ +--------+
|
|
| csu_init | | main | / push 0x80483490
|
|
|------------| |--------| < call getopt
|
|
| mmap | .-| ... | \ mov [esp+30], eax
|
|
| trampoline | | | |
|
|
+------------+ | +--------+ .-----------------.
|
|
||'|| | | this code needs |
|
|
| |- - - - |- - - - - - - - - - - - - -. < the mmap tramp |
|
|
: __|___ +-----------+ . | to be executed |
|
|
. / hook \ ---> | mmap code | . `-----------------'
|
|
. \getopt/ +-----------+ .
|
|
- - - - - - - - - - - - - -|- - - - - - '
|
|
|
|
|
+--------+ / fall to the getopt flow
|
|
| getopt | <-----'
|
|
+--------+
|
|
|
|
|
(ret)
|
|
|
|
|
|
The injection process consists on the following steps:
|
|
|
|
* Resize the .data section to get enought space for our code
|
|
|
|
f newsize @ section._data_end - section._data + 4096
|
|
!rabin2 -o r/.data/`?v newsize` $FILE
|
|
q!
|
|
|
|
* Write the code at the end of the unresized .data section
|
|
|
|
!!gcc our-code.c
|
|
"!!rabin a.out -o d/s a.out | grep ^main | cut -d ' ' -f 2 > inj.bytes"
|
|
wF inj.bytes @ section._data_end
|
|
|
|
* Null the __libc_csu_init pointer at entrypoint (2nd push dword)
|
|
|
|
e asm.profile=simple
|
|
s entrypoint
|
|
wa push 0 @ `pd 20~push dword[0]#1`
|
|
|
|
* Inject the mmap loader at symbol __libc_csu_init
|
|
|
|
# generate mmap.bytes with the mmap trampoline
|
|
!!gcc mmap-tramp.S
|
|
"!!rabin a.out -o d/s a.out | grep ^main | cut -d ' ' -f 2 > mmap.bytes"
|
|
# write hexpair bytes contained in mmap.bytes file at csu_init
|
|
wF mmap.bytes @ sym.__libc_csu_init
|
|
|
|
* Hook a call instruction to bridge the execution of our injected code
|
|
and continue the original call workflow.
|
|
|
|
- Determine the address of the call to the target symbol to be hooked.
|
|
For example. Having 'ping' program, this script will determine an
|
|
xref to the getopt import PLT entry.
|
|
|
|
s entrypoint
|
|
# seek to the third push dword in the entry point (main)
|
|
s `pd 20~push dword[3]#2`
|
|
|
|
# Analyze function: the '.' will interpret the output of the command
|
|
# 'af*' that prints radare commands registering xrefs, variable
|
|
# accesses, stack frame changes, etc.
|
|
.af*
|
|
|
|
# generate a file named 'xrefs' containing the code references
|
|
# to the getopt symbol.
|
|
Cx~`?v imp.getopt`[1] > xrefs
|
|
|
|
# create an eNumerate flag at every offset specified in 'xrefs' file
|
|
fN calladdr @@.xrefs
|
|
|
|
- Create a proxy hook to bridge execution without destroying the stack.
|
|
|
|
This is explained in the following lines.
|
|
|
|
- Change the call of the hook to our proxy function
|
|
|
|
# write assembly code 'call <addr>'
|
|
# quotes will replace the inner string with the result of the
|
|
# command ?v which prints the hexadecimal value of the given expr.
|
|
wa call `?v dstaddr`
|
|
|
|
- Profit
|
|
|
|
Here is an example assembly code that demonstrates the functionality of the
|
|
trampoline for the patched call. This snippet of code can be injected after
|
|
the mmap trampoline.
|
|
|
|
----------8<-----------
|
|
.global main
|
|
|
|
trampoline:
|
|
push $hookstr
|
|
call puts
|
|
add $4, %esp
|
|
jmp puts
|
|
|
|
main:
|
|
push $hello
|
|
call trampoline
|
|
add $4, %esp
|
|
ret
|
|
|
|
.data
|
|
|
|
hello:
|
|
.string "hello world"
|
|
hookstr:
|
|
.string "hook!"
|
|
----------8<-----------
|
|
|
|
The previous code cannot be injected directly in the binary, and one of the
|
|
main conceptual reasons is that the destination address of the trampoline
|
|
will be defined at runtime, when the mmapped region of the data section is
|
|
ready with executable permissions, the destination address must be stored
|
|
somewhere in the data section.
|
|
|
|
|
|
--[ 2.4.1 - Call trampoline
|
|
|
|
Once the holes have been identified we can now modify some parts of the
|
|
program to obfuscate the control flow, symbol access, function calls and
|
|
inlined syscalls in different ways.
|
|
|
|
To walk through the program code blocks we can use iterators or nested
|
|
iterators to run from simple to complex scripted code analysis engines and
|
|
retrieve or manipulate the program by changing instructions or moving code.
|
|
|
|
The control flow of the program can be obfuscated by adding a calltable
|
|
trampoline for calling functions. We will have to patch the calls and store
|
|
the jump addresses in a table in the data section.
|
|
|
|
The same hack can be done for long jumps by placing a different trampoline
|
|
to drop the stacked eip and use jumps instead of calls.
|
|
|
|
Iterators in radare can be implemented as macros appending the macro call
|
|
after the @@ foreach mark.
|
|
|
|
For example: "x @ 0x200"
|
|
will temporally seek to 0x200 and run the "x" command.
|
|
|
|
Using: "x @@ .(iterate-functions)"
|
|
will run the iterator macro at every address returned by the macro.
|
|
|
|
To return values from a macro we can use the null macro body and append the
|
|
resulting offset. If no value is given, null is returned and the iterator
|
|
will stop.
|
|
|
|
Heres the implementation for the function-iterator:
|
|
|
|
"(iterate-functions,()CF~#$@,)"
|
|
|
|
We can also use the search engine with cmd.hit eval variable to run a
|
|
command when a search hit is found. Obviously, the command can be a macro
|
|
or a script file in any language.
|
|
|
|
e cmd.hit=ar- $$ $$+5
|
|
/x xx yy zz
|
|
|
|
But we can also do it after processing the search by just removing the
|
|
false positive hits and running a foreach command.
|
|
|
|
# configure and perform search
|
|
e search.from=section._text
|
|
e search.to=section._text_end
|
|
e cmd.search=
|
|
/x xx yy zz
|
|
|
|
# filtering hit results
|
|
fs search ; enter into the search flagspace
|
|
f ; list flags in search space
|
|
f -hit0_3 ; remove hit #3
|
|
# write on every filtered hit
|
|
wx 909090 @@ hit*
|
|
|
|
Flagspaces are groups of flags. Some of them are automatically created by
|
|
rabin while identifying strings, symbols, sections, etc, and others are
|
|
updated at runtime like by commands like 'regs' (registers) or 'search'
|
|
(search results).
|
|
|
|
You can switch to or create another flagspace using the 'fs' command. When
|
|
a flagspace is selected the 'f' command will only show the flags in the
|
|
current flagspace. 'fs *' is used to enable all flagspaces at the same time
|
|
(global view). See fs? for extended help.
|
|
|
|
The internal metadata database can be managed with the C command and stores
|
|
information about code and data xrefs, comments, function definitions,
|
|
type, data conversions and structure definitions to be used by the
|
|
disassembler engine, the code analysis module and user defined scripts.
|
|
|
|
Data and code xrefs are very useful reference points while reverse
|
|
engineering. It is interesting to obfuscate them all as much as possible
|
|
to fool code analysis engines, forcing the reverser to use runtime or
|
|
emulated environments to determine when a string or a buffer is used.
|
|
|
|
To analyze a function in radare is preferably to go into Visual mode with
|
|
command 'V', and then press the keys "d" and "f".
|
|
|
|
This action will make a code analysis from the current seek or where the
|
|
cursor points (if in cursor mode, toggleable with the lowercase "c" key in
|
|
Visual mode). This code analysis will define multiple things like argument,
|
|
local and global variable accesses, feeding the database with xref
|
|
information that can be later used to determine which points of the program
|
|
are using a certain pointer.
|
|
|
|
By defining few eval vars in your ~/.radarerc most of this basic code
|
|
analysis will be done by default.
|
|
|
|
$ cat ~/.radarerc
|
|
e file.id=true ; identify file type and set arch/baddr...
|
|
e file.analyze=true ; perform basic code analysis at startup
|
|
e file.flag=true ; add basic flags
|
|
e scr.color=true ; use color screen
|
|
|
|
NOTE: By passing the '-n' parameter to radare the startup process will skip
|
|
the interpretation of .radarerc.
|
|
|
|
All this information can be cached, manipulated and restored using project
|
|
files (with the -P program flag at startup or enabling this feature in
|
|
runtime with the 'P' command).
|
|
|
|
After this parenthesis we continue obfuscating the xrefs to make reversers
|
|
life funnier.
|
|
|
|
One simple way to do this would be to create an initialization code,
|
|
copying some pointers required by the program to run at random mmaped
|
|
addresses and defining a way to transform these accesses into a sequence
|
|
jigsaw. This can be achieved turning each pointer "random" based on some
|
|
rules like a shifted array accessed by a mod.
|
|
|
|
# calltable initializer
|
|
(ct-init tramp ctable ctable_end
|
|
f trampoline @ $0
|
|
f calltable @ $1
|
|
f calltable_ptr @ calltable
|
|
wf trampoline.bin @ trampoline
|
|
b $2-$1
|
|
wb 00 ; fill calltable with nops
|
|
)
|
|
|
|
; calltable adder
|
|
(ct-add
|
|
; check if addr is a long call
|
|
? [1:$$]==0xeb
|
|
?!()
|
|
; check if already patched
|
|
? [4:$$+1]==trampoline
|
|
??()
|
|
wv $$+5 @ calltable_ptr
|
|
yt 4 calltable_ptr+4 @ $$+1
|
|
f calltable_ptr @ calltable_ptr+8
|
|
wv trampoline @ $$+1
|
|
)
|
|
|
|
We can either create a third helper macro that automatically iterates over
|
|
every long call instruction in .text and running ct-add on it.
|
|
|
|
(ct-run from to tramp ctable ctable_end
|
|
.(ct-init tramp ctable ctable_end)
|
|
s from
|
|
loop:
|
|
? [1:$$]==0xeb
|
|
?? .(ct-add)
|
|
s +$$$
|
|
? $$ < to
|
|
??.loop:
|
|
)
|
|
|
|
Or we can also write a more fine-grained macro to walk over the function
|
|
|
|
e asm.profile=simple
|
|
pD~call[0] > jmps
|
|
.(ct-init 0x8048000 0x8048200)
|
|
.(ct-add)@@.jmps
|
|
|
|
The trampoline.bin file will be something like this:
|
|
|
|
---------------------------------------------------------------
|
|
/*
|
|
* Calltable trampoline example // --pancake
|
|
* ---------------------------------------------------
|
|
* $ gcc calltrampoline.S -DCTT=0x8048000 -DCTT_SZ 100
|
|
*/
|
|
|
|
#ifndef CTT_SZ
|
|
#define CTT_SZ 100
|
|
#endif
|
|
#ifndef CTT
|
|
#define CTT call_trampoline_table
|
|
#endif
|
|
|
|
.text
|
|
.global call_trampoline
|
|
.global main
|
|
|
|
main:
|
|
/* */
|
|
call ct_init
|
|
call ct_test
|
|
/* */
|
|
pushl $hello
|
|
call puts
|
|
add $4, %esp
|
|
ret
|
|
|
|
call_trampoline:
|
|
pop %esi
|
|
leal CTT, %edi
|
|
movl %esi, 0(%edi) /* store ret addr in ctable[0] */
|
|
0:
|
|
add $8, %edi
|
|
cmpl 0(%edi), %esi /* check ret addr against ctable*/
|
|
jnz 0b
|
|
movl 4(%edi), %edi /* get real pointer */
|
|
call *%edi /* call real pointer */
|
|
jmp *CTT /* back to caller */
|
|
|
|
/* initialize calltable */
|
|
ct_init:
|
|
leal CTT+8, %edi
|
|
movl $retaddr, 0(%edi)
|
|
movl $puts, 4(%edi)
|
|
ret
|
|
|
|
/* hello world using the calltable */
|
|
ct_test:
|
|
push $hello
|
|
call call_trampoline
|
|
retaddr:
|
|
add $4, %esp
|
|
ret
|
|
|
|
.data
|
|
hello:
|
|
.string "Hello World"
|
|
|
|
call_trampoline_table:
|
|
.fill 8*(CTT_SZ+1), 1, 0x00
|
|
---------------------------------------------------------------
|
|
|
|
Note that this trampoline implementation is not thread safe. If two threads
|
|
are use the same trampoline call table at the same time to temporally store
|
|
the return address, the application will crash.
|
|
|
|
To fix this issue you should use the %gs segment explained in the 'syscall
|
|
obfuscation' chapter. But to avoid complexity no thread support will be
|
|
added at this moment.
|
|
|
|
To inject the shellcode into the binary we will have to make some space in
|
|
the .data segment for the calltable. We can do this with gcc -D:
|
|
|
|
[0x8048400]> !!gcc calltrampoline.S -DCTT=0x8048000 -DCTT_SZ=100
|
|
> !! rabin -o d/s a.out|grep call_trampoline|cut -d : -f 2 > tramp.hex
|
|
> wF tramp.hex @ trampoline_addr
|
|
|
|
|
|
--[ 2.4.2 - Extending trampolines
|
|
|
|
We can take some real code from the target program and move it into the
|
|
.data section encrypted with a random key. Then we can extend the call
|
|
trampoline to run an initialization function before running the function.
|
|
|
|
Our call trampoline extension will cover three points:
|
|
|
|
- Function prelude obfuscation - Function mmaping (loading code from data)
|
|
- Unciphering and checksumming
|
|
|
|
At the same time we can re-encrypt the function again after calling the end
|
|
point address. This will be good for our protector because the reverser
|
|
will be forced to manually step into the code because no software
|
|
breakpoints will be able to be used on this code because they will be
|
|
re-encrypted and the checksumming will fail.
|
|
|
|
To do this we need to extend our call trampoline table or just add another
|
|
field in the call trampoline table specifying the status of the destination
|
|
code (if it is encrypted or not) and optionally the checksum.
|
|
|
|
The article aims to explain guidelines and possibilities when playing with
|
|
binaries. But we are not going to implement such kind of extensions to
|
|
avoid enlarge too much this article explaining these advanced techniques.
|
|
|
|
The 'p' command is used to print the bytes in multiple formats, the more
|
|
powerful print format is 'pm' which permits to describe memory contents
|
|
using a format-string like string and name each of the fields.
|
|
|
|
When the format-string starts with a numeric value it is handled as an
|
|
array of structures. The 'am' command manages a list of named 'pm' commands
|
|
to ease the re-use of structure signatures along the execution of radare.
|
|
|
|
We will start introducing this command describing the format of an entry of
|
|
our trampoline call table:
|
|
|
|
[0x08049AD0]> pm XX
|
|
0x00001ad0 = 0x895eed31
|
|
0x00001ad4 = 0xf0e483e1
|
|
|
|
Now adding the name of fields:
|
|
|
|
[0x08049AD0]> pm XX from to
|
|
from : 0x00001ad0 = 0x895eed31
|
|
to : 0x00001ad4 = 0xf0e483e1
|
|
|
|
And now printing a readable array of structures:
|
|
|
|
[0x08049AD0]> pm 3XX from to
|
|
0x00001ad0 [0] {
|
|
from : 0x00001ad0 = 0x895eed31
|
|
to : 0x00001ad4 = 0xf0e483e1
|
|
}
|
|
0x00001ad8 [1] {
|
|
from : 0x00001ad8 = 0x68525450
|
|
to : 0x00001adc = 0x0805b6a0
|
|
}
|
|
0x00001ae0 [2] {
|
|
from : 0x00001ae0 = 0x05b6b068
|
|
to : 0x00001ae4 = 0x68565108
|
|
}
|
|
|
|
There is another command named 'ad' that stands for 'analyze data'. It
|
|
automatically analyzes the contents of the memory (starting from the
|
|
current seek) looking for recognized pointers (following the configured
|
|
endianness), strings, pointers to strings, and more.
|
|
|
|
The easiest way to see how this command works is by starting a program in
|
|
debugger mode (f.ex: radare -d ls) and running 'ad@esp'.
|
|
|
|
The command will walk through the stack contents identifying pointers to
|
|
environment variables, function pointers, etc..
|
|
|
|
Internal grep '~' syntax sugar can be used to retrieve specific fields of
|
|
structures for scripting, displaying or analysis.
|
|
|
|
# register a 'pm' in the 'am' command
|
|
[0x08048000]> am foo 3XX from to
|
|
|
|
# grep for the rows matching 'from'
|
|
[0x08048000]> am foo~from
|
|
from : 0x00001ad0 = 0x895eed31
|
|
from : 0x00001ad8 = 0x68525450
|
|
from : 0x00001ae0 = 0x05b6b068
|
|
|
|
# grep for the 4th column
|
|
[0x08048000]> am foo~from[4]
|
|
0x895eed31
|
|
0x68525450
|
|
0x05b6b068
|
|
|
|
# grep for the first row
|
|
[0x08048000]> am foo~from[4]#0
|
|
0x895eed31
|
|
|
|
|
|
--[ 3 - Protections and manipulations
|
|
|
|
This chapter will face different manipulations that can be done on binaries
|
|
to add some layers of protections to make reverse engineering on the target
|
|
binaries a more complicated.
|
|
|
|
|
|
--[ 3.1 - Trashing the ELF header
|
|
|
|
Corrupting the ELF header is another possible target of a packer for making
|
|
the reverser's life little harder.
|
|
|
|
Just modifying one byte in the ELF header becomes a problem for tools like
|
|
gdb, ltrace, objdump, ... The reason is that they depend on section headers
|
|
to get information about sections, symbols, library dependencies, etc...
|
|
and if they are not able to parse them they just stop with an error. At the
|
|
moment of writing, only IDA and radare are able to bypass this issue.
|
|
|
|
This can be easily done in this way:
|
|
|
|
$ echo wx 99 @ 0x21 | radare -nw a.out
|
|
|
|
A reverser can use the fix-shoff.rs (included in the scripts) directory in
|
|
the radare sources, to reconstruct the 'e_sh_off' dword in the ELF header.
|
|
|
|
$ cat scripts/fix-shoff.rs
|
|
; radare script to fix elf.shoff
|
|
(fix-shoff
|
|
s 0 ; seek to offset 0
|
|
s/ .text ; seek to first occurrence of ".text" string
|
|
loop:
|
|
s/x 00 ; go next word in array of strings
|
|
? [1:$$+1] ; check if this is the last one
|
|
?!.loop: ; loop if buf[1] is == 0
|
|
s +4-$$%4 ; align seek pointer to 4
|
|
f nsht ; store current seek as 'nsht'
|
|
wv nsht @ 0x20 ; write the new section header table at 0x20 (elf.sh_off)
|
|
)
|
|
|
|
The script is used in this way:
|
|
|
|
$ echo ".(fix-shoff) && q" | radare -i scripts/fix-shoff.rs -nw "target-elf"
|
|
|
|
This script works on any generic bin with a trashed shoff. It will not work
|
|
on binaries with stripped section headers (after 'sstrip' f.ex.)
|
|
|
|
This is possible because GCC stores the section header table immediately
|
|
after the string table index, and this section is easily identified because
|
|
the .text section name will appear on any standard binary file.
|
|
|
|
It is not hard to think on different ways to bypass this patch. By just
|
|
stripping all the rest of section after setting shoff to 0 will work pretty
|
|
well and the reverser will have to regenerate the section headers
|
|
completely. This can be done by using sstrip(1) (double 's' is not a typo)
|
|
from the elf-kickers package. This way we eliminate all unnecessary
|
|
sections to run the program.
|
|
|
|
The nice thing of this header bug is that the e_shoff value is directly
|
|
passed to lseek(2) as second argument, and we can either try with values
|
|
like 0 or -1 to get different errors instead of just giving an invalid
|
|
address. This way it is possible to bypass wrong ELF parsers in other ways.
|
|
|
|
Similar bugs are found in MACH-O parsers from GNU software like setting
|
|
number of sections to -1 in the SEGMENT_COMMAND header. These kind of bugs
|
|
don't hit kernels because they usually use minimum input from the program
|
|
headers to map it in memory, and the dynamic linker at user space do the
|
|
rest. But debuggers and file analyzers usually fail when try to extract
|
|
corrupted information from headers.
|
|
|
|
|
|
--[ 3.2 - Source level watermarks
|
|
|
|
If we have the source of the target application there will be more ways to
|
|
play with it. For instance, using defines as volatile asm inline macros,
|
|
the developer can add marks or paddings usable for the binary packer.
|
|
|
|
Some commercial packers offer an SDK which is nothing more than a set of
|
|
include files offering helper functions and watermarks to make the packers
|
|
life easier. And giving to the developer more tools to protect their code.
|
|
|
|
$ cat watermark.h
|
|
#define WATERMARK __asm__ __volatile__ (".byte 3,1,3,3,7,3,1,3,3,7");
|
|
|
|
We will use this CPP macro to watermark the generated binary for later
|
|
post-processing with radare. We can specify multiple different watermarks
|
|
for various purposes. We must be sure that these watermarks are not already
|
|
in the generated binary.
|
|
|
|
We can even have larger watermarks to pad our program with noisy bytes in
|
|
the .text section, so we can later replace these watermarks with our
|
|
assembly snippets.
|
|
|
|
Radare can ease the manual process of finding the watermarks and injecting
|
|
code, by using the following command. It will generate a file named
|
|
'water.list' containing one offset per line.
|
|
|
|
[0x00000000]> /x 03010303070301030307
|
|
[0x00000000]> f~hit[0] > water.list
|
|
[0x00000000]> !!cat water.list
|
|
0x102
|
|
0x13c
|
|
0x1a2
|
|
0x219
|
|
|
|
If we want to run a command at every offset specified in the file:
|
|
|
|
[0x00000000]> .(fill-water-marks) @@. water.list
|
|
|
|
Inside every fill-water-mark macro we can analyze the watermark type
|
|
depending on the 'hit' flag number (we can search more than one keyword at
|
|
a time) or by checking the bytes there and perform the proper action.
|
|
|
|
If those watermarks have a fixed size (or the size is embedded into the
|
|
watermark) we can then write a radare script that write a branch
|
|
instruction at to skip the padding. (This will avoid program crash because
|
|
it will not execute the dummy bytes of our watermark).
|
|
|
|
Here's an ascii-art explanation:
|
|
|
|
|<-- watermark ------------------------------------------>|
|
|
|<-- unused bytes -->|
|
|
+---------------------------------------------------------+
|
|
| jmp $$+size | .. .. .. .. .. .. .. .. .. .. .. .. .. .. | program
|
|
+---------------------------------------------------------+
|
|
| ^
|
|
`------------------------------------------------------'
|
|
|
|
At this point we have some more controlled holes to put our code.
|
|
|
|
Our variable-sized watermark will be defined in this way:
|
|
|
|
.long 0,1,2,3,4,5,6,7,4*20,9,10,11,12,13,14,15,16,17,18,19,20
|
|
|
|
And the watermark filling macro:
|
|
|
|
(fill-water-marks,wa jmp $${io.vaddr}+$$+[$$+8],)
|
|
|
|
# run the macro to patch the watermarks with jumps
|
|
.(fill-water-marks) @@. water.list
|
|
|
|
The io.vaddr specified the virtual base address of the program in memory.
|
|
|
|
|
|
--[ 3.3 - Ciphering .data section
|
|
|
|
A really common protection in binaries consists on ciphering the data
|
|
section of a program to hide the strings and make the static code analysis
|
|
harder.
|
|
|
|
The technique presented here aims to protect the data section. This section
|
|
generally does not contain program strings (because they are usually
|
|
statically defined in the rodata section), but will help to understand some
|
|
characteristics of the ELF loader and how programs work with the rw
|
|
sections.
|
|
|
|
The PT_LOAD program header type specifies where, which and how part of the
|
|
binary should be mapped in memory. By default, the kernel maps the program
|
|
at a base address. More over it will map aligned size (asize) of the
|
|
program starting at 'physical' on 'virtual' address with 'flags'
|
|
permissions.
|
|
|
|
$ rabin -I /bin/ls | grep baddr
|
|
baddr=0x08048000
|
|
|
|
$ rabin -H /bin/ls | grep LOAD
|
|
virtual=0x08060000 physical=0x00018000 asize=0x1000 size=0x0fe0 \
|
|
flags=0x06 type=LOAD name=phdr_0
|
|
|
|
In this case, 4K is the aligned size of 0xfe0 bytes of the binary will be
|
|
loaded at address 0x8060000 starting at address 0x18000 of the file.
|
|
|
|
By running '.!rabin -rIH $FILE' from radare all this information will be
|
|
imported into radare and it will emulate such situation.
|
|
|
|
The problem we face here, is that the size and address of the mapped area
|
|
doesn't match with the offset of the .data section.
|
|
|
|
Some operations are required to get the correct offsets to statically
|
|
modify the program to cipher such range of bytes and inject a snippet
|
|
of assembly to dynamically decipher such bytes.
|
|
|
|
The script looks like:
|
|
|
|
-------8<------------
|
|
# flag from virtual address (the base address where the binary is mapped)
|
|
# this will make all new flags be created at current seek + <addr>
|
|
# we need to do this because we are calculating virtual addresses.
|
|
ff $${io.vaddr}
|
|
|
|
# Display the ranges of the .data section
|
|
?e Data section ranges : `?v section._data`- `?v section._data_end`
|
|
|
|
# set flag space to 'segments'. This way 'f' command will only display the
|
|
# flags contained in this namespace.
|
|
fs segments
|
|
|
|
# Create flags for 'virtual from' and 'virtual to' addresses.
|
|
# We grep for the first row '#0' and the first word '[0]' to retrieve
|
|
# the offset where the PT_LOAD segment is loaded in memory
|
|
f vfrom @ `f~vaddr[0]#0`
|
|
|
|
# 'virtual to' flag points to vfrom+ptload segment size
|
|
f vto @ vfrom+`f~vaddr[1]#0`
|
|
|
|
# Display this information
|
|
?e Range of data decipher : `?v vfrom`- `?v vto`= `?v vto-vfrom`bytes
|
|
|
|
# Create two flags to store the position of the physical address of the
|
|
# PT_LOAD segment.
|
|
f pfrom @ `f~paddr[0]#0`
|
|
f pto @ pfrom+`f~paddr[1]#0`
|
|
|
|
# Adjust deltas against data section
|
|
# ----------------------------------
|
|
# pdelta flag is not an address, we set flag from to 0
|
|
# pdelta = (address of .data section)-(physical address of PTLOAD segment)
|
|
ff 0 && f pdelta @ section._data-pfrom
|
|
?e Delta is `?v pdelta`
|
|
|
|
# reset flag from again, we are calculating virtual and physical addresses
|
|
ff $${io.vaddr}
|
|
f pfrom @ pfrom+pdelta
|
|
f vfrom @ vfrom+pdelta
|
|
ff 0
|
|
|
|
?e Range of data to cipher: `?v pfrom`- `?v pto`= `?v pto-pfrom`bytes
|
|
|
|
# Calculate the new physical size of bytes to be ciphered.
|
|
# we dont want to overwrite bytes not related to the data section
|
|
f psize @ section._data_end - section._data
|
|
?e PSIZE = `?v psize`
|
|
|
|
# 'wox' stands for 'write operation xor' which accepts an hexpair list as
|
|
# a cyclic XOR key to be applied to at 'pfrom' for 'psize' bytes.
|
|
wox a8 @ pfrom:psize
|
|
|
|
# Inject shellcode
|
|
#------------------
|
|
|
|
# Setup the simple profile for the disassembler to simplify the parsing
|
|
# of the code with the internal grep
|
|
e asm.profile=simple
|
|
|
|
# Seek at entrypoint
|
|
s entrypoint
|
|
|
|
# Seek at address (fourth word '[3]') at line'#1' line of the disassembly
|
|
# matching the string 'push dword' which stands for the '__libc_csu_init'
|
|
# symbol. This is the constructor of the program, and we can modify it
|
|
# without many consequences for the target program (it is not widely used).
|
|
s `pd 20~push dword[3]#1`
|
|
|
|
# Compile the 'uxor.S' specifying the virtual from and virtual to addresses
|
|
!gcc uxor.S -DFROM=`?v vfrom` -DTO=`?v vfrom+psize`
|
|
|
|
# Extract the symbol 'main' of the previously compiled program and write it
|
|
# in current seek (libc_csu_init)
|
|
wx `!!rabin -o d/s a.out|grep ^main|cut -d ' ' -f 2`
|
|
|
|
# Cleanup!
|
|
?e Uncipher code: `?v $$+$${io.vaddr}`
|
|
!!rm -f a.out
|
|
q!
|
|
-------8<------------
|
|
|
|
The unciphering snippet of code looks like:
|
|
|
|
-------8<------------
|
|
.global main
|
|
main:
|
|
mov $FROM, %esi
|
|
mov $TO, %edi
|
|
0:
|
|
inc %esi
|
|
xorb $0xa8, 0(%esi)
|
|
cmp %edi, %esi
|
|
jbe 0b
|
|
xor %edi, %edi
|
|
ret
|
|
-------8<------------
|
|
|
|
Finally we run the code:
|
|
|
|
$ cat hello.c
|
|
#include <stdio.h>
|
|
char message[128] = "Hello World";
|
|
|
|
int main()
|
|
{
|
|
message[1]='a';
|
|
if (message[0]!='H')
|
|
printf("Oops\n");
|
|
else
|
|
printf("%s\n", message);
|
|
return 0;
|
|
}
|
|
|
|
$ gcc hello.c -o hello
|
|
$ strings hello | grep Hello
|
|
Hello World
|
|
$ ./hello
|
|
Hallo World
|
|
$ radare -w -i xordata.rs hello
|
|
$ strings hello | grep Hello
|
|
$ ./hello
|
|
Hallo World
|
|
|
|
The 'Hello World' string is no longer in plain text. The ciphering
|
|
algorithm is really stupid. The reconstructed data section can be
|
|
extracted from a running process using the debugger:
|
|
|
|
$ cat unpack-xordata.rs
|
|
e asm.profile=simple
|
|
s entrypoint
|
|
!cont `pd 20~push dword[3]#2`
|
|
f delta @ section._data- segment.phdr_0.rw_.paddr-section.
|
|
s segment.phdr_0.rw_.vaddr+delta
|
|
b section._data_end - section._data
|
|
wt dump
|
|
q!
|
|
$ radare -i unpack-xordata.rs -d ./hello
|
|
$ strings dump
|
|
Hello World
|
|
|
|
Or statically:
|
|
|
|
e asm.profile=simple
|
|
wa ret @ `pd 20~push dword[3]#1`
|
|
wox a8 @ section._data:section._data_end-section.data
|
|
|
|
This is a plain example of how complicated can be to add a protection and
|
|
how easy is to bypass it.
|
|
|
|
As a final touch for the unpacker, just write a 'ret' instruction at symbol
|
|
'__libc_csu_init' where the deciphering loop lives:
|
|
|
|
wa ret @ sym.__libc_csu_init
|
|
|
|
Or just push a null pointer instead of the csu_init pointer in the entrypoint.
|
|
|
|
e asm.profile=simple
|
|
s entrypoint
|
|
wa push 0 @ `pd 20~push dword[0]#1`
|
|
|
|
|
|
--[ 3.4 - Finding differences in binaries
|
|
|
|
Lets draw a scenario where a modified binary (detected by a checksum file
|
|
analyzer) has been found on a server, and the administrator or the forensic
|
|
analyst wants to understand whats going on that file.
|
|
|
|
To simplify the explanation the explanation will be based on the previous
|
|
chapter example.
|
|
|
|
At first point, checksum analysis will determine if there's any collision
|
|
in the checksum algorithm, and if the files are really different:
|
|
|
|
$ rahash -a all hello
|
|
par: 0
|
|
xor: 00
|
|
hamdist: 01
|
|
xorpair: 2121
|
|
entropy: 4.46
|
|
mod255: 84
|
|
crc16: 2654
|
|
crc32: a8a3f265
|
|
md4: cbfc07c65d377596dd27e64e0dcac6cd
|
|
md5: 2b3b691d92e34ad04b1affea0ad2e5ab
|
|
sha1: 8ac3f9165456e3ac1219745031426f54c6997781
|
|
sha256: 749e6a1b06d0cf56f178181c608fc6bbd7452e361b1e8bfbcfac1310662d4775
|
|
sha384: c00abb14d483d25d552c3f881334ba60d77559ef2cb01ff0739121a178b05...
|
|
sha512: 099b42a52b106efea0dc73791a12209854bc02474d312e6d3c3ebf2c272ac...
|
|
|
|
$ rahash -a all hello.orig
|
|
par: 0
|
|
xor: de
|
|
hamdist: 01
|
|
xorpair: a876
|
|
entropy: 4.29
|
|
mod255: 7b
|
|
crc16: 2f1f
|
|
crc32: b8f63e24
|
|
md4: bc9907d433d9b6de7c66730a09eed6f4
|
|
md5: 6085b754b02ec0fc371a0bb7f3d2f34e
|
|
sha1: 6b0d34489c5ad9f3443aadfaf7d8afca6d3eb101
|
|
sha256: 8d58d685cf3c2f55030e06e3a00b011c7177869058a1dcd063ad1e0312fa1de1
|
|
sha384: 6e23826cc1ee9f2a3e5f650dde52c5da44dc395268152fd5c98694ec9afa0...
|
|
sha512: 09b7394c1e03decde83327fdfd678de97696901e6880e779c640ae6b4f3bf...
|
|
|
|
There is the possibility to generate two different files matching until 3
|
|
different checksum algorithms (and maybe more in the future). If any of
|
|
them differs we can determine that the file is different.
|
|
|
|
$ du -bs hello hello.orig
|
|
4913 hello
|
|
4913 hello.orig
|
|
|
|
As both files have the same size it will be worth using a byte-level
|
|
diffing, because it is more probable easy to read the differences than
|
|
trying to understand them as deltified ones (which is the default diffing
|
|
algorithm).
|
|
|
|
$ radiff -b hello.orig hello
|
|
------
|
|
000003ed 00
|
|
000003f0 55 | 60
|
|
000003f1 89 | be
|
|
000003f2 e5 | c0
|
|
...
|
|
00000406 fe | 61
|
|
00000407 ff | c3
|
|
00000408 ff | 90
|
|
00000409 8d | 90
|
|
0000040a bb | 90
|
|
0000040b 18 | 90
|
|
0000040c ff
|
|
------
|
|
000005bd 00
|
|
000005c0 00 | a8
|
|
000005c1 00 | a8
|
|
000005c2 00 | a8
|
|
...
|
|
000005df 00 | a8
|
|
000005e0 48 | e0
|
|
000005e1 65 | cd
|
|
000005e2 6c | c4
|
|
000005e3 6c | c4
|
|
000005e4 6f | c7
|
|
0000065f 00 | a8
|
|
00000660 00
|
|
|
|
Radiff detects two blocks of changes determined by the '------' lines.
|
|
|
|
The first block starts at 0x3f0 and the second one (which is bigger) begins
|
|
at address 0x5c0.
|
|
|
|
While watching the massive 00->a8 conversion it is possible to blindly
|
|
determine that the ciphering algorithm is XOR and the key is 0xa8. But
|
|
let's get a look on the first block of changes and disassemble the code.
|
|
|
|
$ BYTES=$(radiff -f 0x3f0 -t 0x40b -b hello.orig hello | awk '{print $4}')
|
|
$ rasm -d "$(echo ${BYTES})"
|
|
pushad
|
|
mov esi, 0x80495c0
|
|
mov edi, 0x8049660
|
|
inc esi
|
|
xor byte [esi], 0xa8
|
|
cmp esi, edi
|
|
jbe 0x38
|
|
popad
|
|
ret
|
|
...
|
|
|
|
Here it is! this code of assembler is looping from 0x80495c0 to 0x8049660
|
|
and XORing each byte with 0xa8.
|
|
|
|
radiff can also dump the results of the diff in radare commands. The
|
|
execution of the script will result on a binary patch script to convert one
|
|
file to another.
|
|
|
|
$ radiff -rb hello hello.orig > binpatch.rs
|
|
$ cat binpatch.rs | radare -w hello
|
|
$ md5sum hello hello.orig
|
|
6085b754b02ec0fc371a0bb7f3d2f34e hello
|
|
6085b754b02ec0fc371a0bb7f3d2f34e hello.orig
|
|
|
|
|
|
--[ 3.5 - Removing library dependencies
|
|
|
|
In some gnu/linux distributions, all the ELF binaries are linked against
|
|
libselinux, which is an undesired dependency if we want to make our hello
|
|
world portable. SElinux is not available in all gnu/linux distros.
|
|
|
|
This is a simple example, but we can extend the issue to bad pkg-config
|
|
configuration files that can make a binary depend on more libraries than
|
|
the minimum required.
|
|
|
|
Perhaps most users will not care and will probably install those
|
|
dependencies, recompile the kernel or whatever else. It is a pity to make
|
|
steps in backward instead of telling gcc to not link against those
|
|
libraries.
|
|
|
|
But we can avoid these dependencies easily: all the library names are
|
|
stored as zero-terminated strings in the ELF header, and the loader (at
|
|
least the Solaris, Linux and BSD ones) will ignore dependency entries if
|
|
the library name is zero length.
|
|
|
|
The patch consists on finding the non-desired library name (or the prefix,
|
|
if there are more than one libraries with similar names) and then write
|
|
0x00 in each search result.
|
|
|
|
$ radare -nw hello
|
|
[0x00000000]> / libselinux
|
|
... (results) ...
|
|
[0x00000000]> wx 00 @@ hit
|
|
[0x00000000]> q!
|
|
|
|
Using -nw radare will run without interpreting the ~/.radarerc and it will
|
|
not detect symbols or analyze code (we only need to use the basic
|
|
hexadecimal capabilities). The -w flag will open the file in read-write.
|
|
|
|
Each search result will be stored in the 'search' flagspace with names
|
|
predefined by the 'search.flagname' eval variable that will be hit%d_%d by
|
|
default. (This is keyword_index and hit_index).
|
|
|
|
The last command will run 'wx 00' at every '@@' flag matching 'hit' in the
|
|
current flagspace.
|
|
|
|
Use 'q!' if you don't want to be prompted to restore the changes done
|
|
before exiting.
|
|
|
|
|
|
--[ 3.6 - Syscall obfuscation
|
|
|
|
Looking for raw int 0x80 instructions on standard binaries it's going
|
|
probably to be a hard task unless these are statically compiled. This is
|
|
because they usually live in libraries which are embedded in the program.
|
|
|
|
Libraries like libc have loads of symbols that are directly mapped to
|
|
syscalls, and the reverser could use this information to identify code in a
|
|
statically compiled program.
|
|
|
|
To obfuscate symbols we can embed syscalls directly on the host program,
|
|
after trashing the stack to break standard backtraces.
|
|
|
|
If you go deep enough on the new syscalling system of Linux you will notice
|
|
that the int 0x80 syscall mechanism is no longer used in the current x86
|
|
platforms because performance reasons.
|
|
|
|
The new syscall mechanism consists in calling a trampoline function
|
|
installed by the kernel as a 'fake' ELF shared object mapped on the running
|
|
process memory. This trampoline function implements the most suitable
|
|
syscall mechanism for the current platform. This shared object is called
|
|
VDSO.
|
|
|
|
In Linux only two segment registers are used: cs and gs. This is for
|
|
simplicity reasons. The first one allows to access to the whole program
|
|
memory maps, and the second one is used to store Thread related
|
|
information.
|
|
|
|
Here is a simple segment dumper to get the contents from %gs.
|
|
|
|
----------------------8<----------------------------
|
|
$ cat segdump.S
|
|
#define FROM 0 /* 0x8048000 */
|
|
#define LEN 0x940
|
|
#define TIMES 1
|
|
#define SEGMENT %gs /* %ds */
|
|
|
|
.global main
|
|
|
|
.data
|
|
buffer:
|
|
.fill LEN, TIMES, 0x00
|
|
|
|
.text
|
|
main:
|
|
mov $TIMES, %ecx
|
|
loopme:
|
|
push %ecx
|
|
|
|
sub $TIMES, %ecx
|
|
movl $LEN, %eax
|
|
imul %eax, %ecx
|
|
movl %ecx, %esi /* esi = ($TIMES-%ecx)*$LEN */
|
|
addl $FROM, %esi
|
|
lea buffer, %edi
|
|
movl $LEN, %ecx
|
|
|
|
rep movsb SEGMENT:(%esi),%es:(%edi)
|
|
|
|
movl $4, %eax
|
|
movl $1, %ebx
|
|
lea buffer, %ecx
|
|
movl $LEN, %edx
|
|
int $0x80
|
|
|
|
pop %ecx
|
|
xor %eax, %eax
|
|
cmp %eax, %ecx
|
|
dec %ecx
|
|
jnz loopme
|
|
|
|
ret
|
|
----------------------8<----------------------------
|
|
|
|
At this point we disable the random va space to analyze the contents of the
|
|
VDSO map always at the same address during executions.
|
|
|
|
$ sudo sysctl -w kernel.randomize_va_space=0
|
|
|
|
Compile and dump the segment to a file:
|
|
|
|
$ gcc segdump.S
|
|
$ ./a.out > gs.segment
|
|
|
|
And now let's do some little analysis in the debugger:
|
|
|
|
$ radare -d ls ; debug 'ls' program
|
|
...
|
|
; Inside the debugger we write the contents of the gs.segment file.
|
|
; In the initial state, the current seek points to 'eip'.
|
|
|
|
[0x4A13B8C0]> wf gs.segment
|
|
Poke 100 bytes from gs.segment 1 times? (y/N) y
|
|
file poked.
|
|
offset 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
|
|
0x4a13b8c0, c006 e8b7 580b e8b7 c006 e8b7 0000 0000 ....X...........
|
|
0x4a13b8d0, 1414 feb7 ec06 0a93 b305 6beb 0000 0000 ..........k.....
|
|
0x4a13b8e0, 0000 0000 0000 0000 0000 0000 0000 0000 ................
|
|
0x4a13b8f0, 0000 0000 0000 0000 0000 0000 0000 0000 ................
|
|
0x4a13b900, 0000 0000 0000 0000 0000 0000 0000 0000 ................
|
|
0x4a13b910, 0000 0000 0000 0000 0000 0000 0000 0000 ................
|
|
|
|
; Once the file is poked in process memory we have different ways to
|
|
; analyze the contents of the current data block. One possible option
|
|
; is to use the 'pm' command which prints memory contents following the
|
|
; format string we use as argument, it is also possible to specify the
|
|
; element names as in a structure.
|
|
;
|
|
; Another option is the 'ad' command which 'analyzes data' reading the
|
|
; memory as 32bit primitives following pointers and identifying
|
|
; references to code, data, strings or null pointers.
|
|
|
|
[0x4A13B8C0]> pm XXXXXXXXX
|
|
0x4a13b8c0 = 0xb7fe46c0
|
|
0x4a13b8c4 = 0xb7fe4ac8
|
|
0x4a13b8c8 = 0xb7fe46c0
|
|
0x4a13b8cc = 0x00000000 ; eax
|
|
0x4a13b8d0 = 0xb7fff400 ; maps._vdso__0+0x400
|
|
0x4a13b8d4 = 0xff0a0000
|
|
0x4a13b8d8 = 0x856003b1
|
|
0x4a13b8dc = 0x00000000 ; eax
|
|
0x4a13b8e0 = 0x00000000 ; eax
|
|
|
|
[0x4A13B8C0]> ad
|
|
0x4a13b8c0, int be=0xc046feb7 le=0xb7fe46c0
|
|
0x4a13b8c4, int be=0xc84afeb7 le=0xb7fe4ac8
|
|
0x4a13b8c8, int be=0xc046feb7 le=0xb7fe46c0
|
|
0x4a13b8cc, (NULL)
|
|
0x4a13b8d0, int be=0x00f4ffb7 le=0xb7fff400 maps._vdso__0+0x400
|
|
0xb7fff400, string "QRU"
|
|
0xb7fff404, int be=0xe50f3490 le=0x90340fe5
|
|
|
|
; Both analysis result in a pointer to the vdso map at 0x4a13b8d0
|
|
; Disassembling these contents with 'pd' (print disassembly) we get that
|
|
; the code uses sysenter.
|
|
|
|
[0x4A13B8C0]> pd 17 @ [+0x10]
|
|
_vdso__0:0xb7fff400, 51 push ecx
|
|
_vdso__0:0xb7fff401 52 push edx
|
|
_vdso__0:0xb7fff402 55 push ebp
|
|
.-> _vdso__0:0xb7fff403 89e5 mov ebp, esp
|
|
| _vdso__0:0xb7fff405 0f34 sysenter
|
|
| _vdso__0:0xb7fff407 90 nop
|
|
| _vdso__0:0xb7fff408, 90 nop
|
|
| _vdso__0:0xb7fff409 90 nop
|
|
| _vdso__0:0xb7fff40a 90 nop
|
|
| _vdso__0:0xb7fff40b 90 nop
|
|
| _vdso__0:0xb7fff40c, 90 nop
|
|
| _vdso__0:0xb7fff40d 90 nop
|
|
`=< _vdso__0:0xb7fff40e ebf3 jmp 0xb7fff403
|
|
_vdso__0:0xb7fff410, 5d pop ebp
|
|
_vdso__0:0xb7fff411 5a pop edx
|
|
_vdso__0:0xb7fff412 59 pop ecx
|
|
_vdso__0:0xb7fff413 c3 ret
|
|
|
|
After this little analysis we can use a new way to call syscalls and
|
|
transform the "int 0x80" instructions into a bunch of operations.
|
|
|
|
new_syscall:
|
|
push %edi
|
|
push %esi
|
|
movl $0x10, %esi
|
|
movl %gs:(%esi), %edi
|
|
call *%edi
|
|
pop %esi
|
|
pop %edi
|
|
ret
|
|
|
|
Using this 'big' primitive we are able to add more trash code and make
|
|
static analysis harder.
|
|
|
|
Using dynamic analysis it is possible to bypass this protection by writing
|
|
a simple script that dumps the backtrace every time a syscall is executed.
|
|
|
|
(syscall-bt
|
|
e scr.tee=log.txt
|
|
label:
|
|
!contsc
|
|
!bt
|
|
.label:
|
|
)
|
|
|
|
Or with a shell single liner:
|
|
|
|
$ echo '(syscall-bt,e scr.tee=log.txt,label:,!contsc,!bt,.label:,) \
|
|
&&.(syscall-bt)' | radare -d ./a.out
|
|
|
|
A way to make the analysis harder is to trash the stack contents with a
|
|
random offset every time we run a syscall. By decreasing the stack pointer
|
|
and later incrementing it we will make the backtrace fail.
|
|
|
|
To trash the stack we can just increment the stack before the syscall and
|
|
decrement it after calling it. The analyst will have to patch these inc/dec
|
|
opcodes for every syscall trampolines to get proper backtrace and resolve
|
|
the xrefs for each entry.
|
|
|
|
Radare implements a "!contsc" command that makes the debugger stop before
|
|
and after a syscall. This can be scripted together with "!bt" and "!st" to
|
|
get a handmade version of the strace utility.
|
|
|
|
Another trick that we can exploit for writing our homebrew packer is to use
|
|
instructions needed to be patched for proper analysis as cipher keys.
|
|
Making the program crash when trying to decipher a section using invalid
|
|
keys for example.
|
|
|
|
|
|
--[ 3.7 - Replacing library symbols
|
|
|
|
Statically linked binaries are full of unused bytes, we can take use of
|
|
this to inject our code.
|
|
|
|
Lot of library symbols (for example from libc) can be directly replaced by
|
|
system calls at the symbol or directly in-place where it is called. For
|
|
example:
|
|
|
|
call exit
|
|
|
|
can be replaced these (also 5 byte length) instructions:
|
|
|
|
xor eax,eax
|
|
inc eax
|
|
int 0x80
|
|
|
|
$ rasm 'xor eax,eax;inc eax;int 0x80'
|
|
31 c0 40 cd 80
|
|
|
|
This way we can just go into the statified libc's exit implementation and
|
|
overwrite it with our own code.
|
|
|
|
If the binary is statically compiled with stripped symbols we will have to
|
|
look for signatures or directly find the 'int 0x80' instruction and then
|
|
analyze code backward looking for the beginning of the implementation and
|
|
then find for xrefs to this point.
|
|
|
|
Most compilers append libraries at the end of the binary, so we can easily
|
|
'draw' an imaginary line where the program ends and libraries start.
|
|
|
|
There's another tool that may help to identify which libraries and versions
|
|
are embedded in our target program. It is called 'rsc gokolu'. The name
|
|
stands for 'g**gle kode lurker' and it will look for strings in the given
|
|
program in the famous online code browser and give approximated results of
|
|
the source of the string.
|
|
|
|
When we have debugging or symbolic information the process becomes much
|
|
easier and we can directly overwrite the first bytes of the embedded glibc
|
|
code with the small implementation in raw assembly using syscalls.
|
|
|
|
rabin is the utility in radare which extracts information from the headers
|
|
of a given program. This is like a multi-architecture multi-platform and
|
|
multi-format 'readelf' or 'nm' replacement which supports ELF32/64,
|
|
PE32/32+, CLASS and MACH0.
|
|
|
|
The use of this tool is quite simple and the output is quite clean to ease
|
|
the parsing with sed and other shell tools.
|
|
|
|
$ rabin -vi a.out
|
|
[Imports]
|
|
Memory address File offset Name
|
|
0x08049034 0x00001034 __gmpz_mul_ui
|
|
0x08049044 0x00001044 __cxa_atexit
|
|
0x08049054 0x00001054 memcmp
|
|
0x08049064 0x00001064 pcap_close
|
|
0x08049074 0x00001074 __gmpz_set_ui
|
|
...
|
|
|
|
$ rabin -ve a.out
|
|
[Entrypoint]
|
|
Memory address: 0x080494d0
|
|
|
|
$ rabin -vs a.out
|
|
[Symbols]
|
|
Memory address File offset Name
|
|
0x0804d86c 0x0000586c __CTOR_LIST__
|
|
0x0804d874 0x00005874 __DTOR_LIST__
|
|
0x0804d87c 0x0000587c __JCR_LIST__
|
|
0x08049500 0x00001500 __do_global_dtors_aux
|
|
0x0804dae8 0x00005ae8 completed.5703
|
|
0x0804daec 0x00005aec dtor_idx.5705
|
|
0x08049560 0x00001560 frame_dummy
|
|
...
|
|
|
|
$ rabin -h
|
|
rabin [options] [bin-file]
|
|
-e shows entrypoints one per line
|
|
-i imports (symbols imported from libraries)
|
|
-s symbols (exports)
|
|
-c header checksum
|
|
-S show sections
|
|
-l linked libraries
|
|
-H header information
|
|
-L [lib] dlopen library and show address
|
|
-z search for strings in elf non-executable sections
|
|
-x show xrefs of symbols (-s/-i/-z required)
|
|
-I show binary info
|
|
-r output in radare commands
|
|
-o [str] operation action (str=help for help)
|
|
-v be verbose
|
|
|
|
See manpage for more information.
|
|
|
|
The same flags can be used for any type of binary for any architecture.
|
|
|
|
Using the -r flag the output will be formatted in radare commands, this way
|
|
you can import the binary information. This command imports the symbols,
|
|
sections, imports and binary information.
|
|
|
|
[0x00000000]> .!!rabin -rsSziI $FILE
|
|
|
|
This is done automatically when loading a file if you have file.id=true, if
|
|
you want to disable the analysis run 'radare' with '-n' (because this way
|
|
it will not interpret the ~/.radarerc).
|
|
|
|
Programs with symbols or debug information are mostly analyzed by the code
|
|
analysis of 'radare' thanks to the hints given by 'rabin'. So,
|
|
theoretically we will be able to automatically resolve the crossed
|
|
references to functions. But if we are not that lucky we can just setup a
|
|
simple disassembly output format (e asm.profile=simple) and disassemble the
|
|
whole program code grepping for 'call 0x???????' to resolve calls to our
|
|
desired owned symbol.
|
|
|
|
[0x08049A00]> e scr.color=false
|
|
[0x08049A00]> e asm.profile=simple
|
|
[0x08049A00]> s section._text
|
|
[0x08049A00]> b section._text_end-section._text
|
|
[0x08049A00]> pd~call 0x80499e4
|
|
0x08049fd9 call 0x80499e4 ; imp.exit
|
|
0x0804fa89 call 0x80499e4 ; imp.exit
|
|
0x0805047b call 0x80499e4 ; imp.exit
|
|
[0x08049A00]>
|
|
|
|
If we are not lucky and do not have symbols we will have to identify which
|
|
parts of the binary are part of our target binary.
|
|
|
|
$ rsc syms-dump /lib/libc.so.6 24 > symbols
|
|
|
|
|
|
--[ 3.8 - Checksumming
|
|
|
|
The most common procedure to check for undesired runtime or static program
|
|
patches is done by using checksumming operations over sections, functions,
|
|
code or data blocks.
|
|
|
|
These kind of protections are usually easy to patch, but allow the end user
|
|
to get a grateful error message instead of a raw Segmentation Fault.
|
|
|
|
The error message string can be a simple entrypoint for the reverser to
|
|
catch the position of the checksumming code.
|
|
|
|
Radare offers a hashing command under the "h" char and allows you to
|
|
calculate block-based checksums while debugging or statically by opening
|
|
the file from disk.
|
|
|
|
Heres an usage example:
|
|
|
|
hmd5
|
|
|
|
It is also possible to use these hashing instructions to find pieces of the
|
|
program matching a certain checksum value, or just recalculate the new sum
|
|
to not patch the hash check.
|
|
|
|
The 'h' command can be also performed from the shell by using the "rahash"
|
|
program which permits to calculate multiple checksums for a single file
|
|
based on blocks of bytes. This is commonly used on forensics, when you want
|
|
a checksum to not only give you a boolean reply (if the program has been
|
|
modified), because this way you can reduce the problem on a smaller piece
|
|
of the binary because you have multiple checksums at different offsets for
|
|
the same file.
|
|
|
|
This action can be done over hard disks, device firmwares or memory dumps
|
|
of the same size.
|
|
|
|
If you are dealing similar file with deltified contents or you require a
|
|
more fine grained result you should try "radiff" which implements multiple
|
|
bindiffing algorithms that goes from byte level detecting deltas to code
|
|
level diffing using information of basic blocks, variables accessed and
|
|
functions called.
|
|
|
|
We will also have to decide a protection method for the checksumming
|
|
algorithm, because this will be the first patch required to enable software
|
|
breakpoints and user defined patches to be done.
|
|
|
|
To protect the hasher we just put it distributed along the program by
|
|
mixing its code with the rest of the program making more difficult to
|
|
identify the its position. We can also choose to put this code into a
|
|
ciphered area in .data together with other necessary code to make the
|
|
program run.
|
|
|
|
One of the main headaches for reversers is the layering of protections that
|
|
can be represented by lot of entrypoints to the same code making more
|
|
difficult to locate and patch in a proper way.
|
|
|
|
The checksumming of a section can be used as a deciphering key for other
|
|
ciphered sections, to avoid easy patching, the program should check for the
|
|
consistence of the deciphered code. This can be done by checking for
|
|
another checksum or by comparing few bytes at the beginning of the
|
|
deciphered and mmap code.
|
|
|
|
There is another possible design to implement the checksumming algorithm in
|
|
a more flexible way which consists on checksumming the code in runtime
|
|
instead of hardcoding the resulting value somewhere.
|
|
|
|
This method allows us to 'protect' multiple functions without having to
|
|
statically calculate the checksum value of the region. The problem with
|
|
this method is that it doesn't provide us protection against static
|
|
modifications on the binary.
|
|
|
|
This snippet can be compiled with DEBUG=0 to get a fully working
|
|
checksumming function which returns 0 when the checksum matches.
|
|
|
|
---------------------8<--------------------------
|
|
#define CKSUM 0x68
|
|
#define FROM 0x8048000
|
|
#define SIZE 1024
|
|
|
|
str:
|
|
.asciz "Checksum: 0x%08x\n"
|
|
|
|
.global cksum
|
|
cksum:
|
|
pusha
|
|
movl $FROM, %esi
|
|
movl $SIZE, %ecx
|
|
xor %eax, %eax
|
|
1:
|
|
xorb 0(%esi), %al
|
|
add $1, %esi
|
|
loopnz 1b
|
|
#if DEBUG
|
|
pushl %eax
|
|
pushl $str
|
|
call printf
|
|
addl $8, %esp
|
|
#else
|
|
sub $CKSUM, %eax
|
|
jnz 2f
|
|
#endif
|
|
popa
|
|
ret
|
|
|
|
#if DEBUG
|
|
.global main
|
|
main:
|
|
call cksum;
|
|
ret
|
|
#else
|
|
2:
|
|
mov $20, %eax
|
|
int $0x80
|
|
mov $37, %ebx
|
|
mov $9, %ecx
|
|
xchg %eax, %ebx
|
|
int $0x80
|
|
#endif
|
|
---------------------8<--------------------------
|
|
|
|
To get the bytes of the function to be pushed we can use an 'rsc' script that
|
|
dumps the hexpairs contained managed by a symbol in a binary:
|
|
|
|
$ gcc cksum.S -DDEBUG
|
|
$ ./a.out
|
|
Checksum: 0x69
|
|
|
|
$ gcc -c cksum.S
|
|
$ rsc syms-dump cksum.o | grep cksum
|
|
cksum: 60 be 00 80 04 08 b9 00 04 00 00 31 c0 32 06 83 c6 01 e0 f9 2d d2 04 \
|
|
00 00 75 02 61 c3 b8 14 00 00 00 cd 80 bb 25 00 00 00 b9 09 00 00 00 93 cd 80
|
|
|
|
The offsets to patch with our ranged values are:
|
|
|
|
+2: original address (4 bytes in little endian)
|
|
+7: size of buffer (4 bytes in little endian)
|
|
+21: checksum byte (1 byte)
|
|
|
|
We can use this hexdump from a radare macro that automatically patches the
|
|
correct bytes while inserting the code in the target program.
|
|
|
|
$ rsc syms-dump cksum.o | grep cksum | cut -d : -f 2 > cksum.hex
|
|
$ cat cksum.macro
|
|
(cksum-init ckaddr size,f ckaddr @ $0,f cksize @ $1,)
|
|
(cksum hexfile size
|
|
# kidnap some 4 opcodes from the function
|
|
f kidnap @ `aos 4`
|
|
yt kidnap ckaddr+cksize
|
|
wa jmp $$+kidnap @ ckaddr+cksize+kidnap
|
|
wa call `?v ckaddr`
|
|
wF $0 @ ckaddr
|
|
f cksize @ $$w
|
|
f ckaddr_new @ ckaddr+cksize+50
|
|
wv $$ @ ckaddr+2
|
|
wv $1 @ ckaddr+7
|
|
wx `hxor $1` @ ckaddr+21
|
|
f ckaddr @ ckaddr_new
|
|
)
|
|
|
|
$ cat cksum.hooks
|
|
0x8048300
|
|
0x8048400
|
|
0x8048800
|
|
|
|
$ radare -w target
|
|
[0x8048923]> . cksum.macro
|
|
[0x8048923]> .(cksum-init 0x8050300 30)
|
|
[0x8048923]> .(cksum cksum.hex 100) @@. cksum.hooks
|
|
|
|
|
|
--[ 4 - Playing with code references
|
|
|
|
The relationship between parts of code in the program are a good source of
|
|
information for understanding the dependencies between code blocks,
|
|
functions.
|
|
|
|
Following sub-chapters will explain how to generate, retrieve and use the
|
|
code references for retrieving information from a program in memory or
|
|
statically on-disk.
|
|
|
|
|
|
--[ 4.1 - Finding xrefs
|
|
|
|
One of the most interesting information we need when understanding how a
|
|
program works or what it does is the flow graph between functions. This
|
|
give us the big picture of the program and allows us to find faster the
|
|
points of interest.
|
|
|
|
There are different ways to obtain this information in radare, but you will
|
|
finally manage it with the 'Cx' and 'CX' commands (x=code xref, X=data
|
|
xref). This means that you can even write your own program to extract this
|
|
information and import it into radare by just interpreting the output of
|
|
the program or a file.
|
|
|
|
[0x8048000]> !!my-xrefs-analysis a.out > a.out.xrefs
|
|
[0x8048000]> . a.out.xrefs
|
|
or
|
|
[0x8048000]> .!my-xrefs-analysis a.out
|
|
|
|
At the time of writing this, radare implements a basic code analysis
|
|
algorithm that identifies function boundaries, splits basic blocks, finds
|
|
local variables and arguments, tracks their access and also identifies code
|
|
and data references of individual instructions.
|
|
|
|
When 'file.analyze' is 'true', this code analysis is done recursively over
|
|
the entrypoint and all the symbols identified by rabin. This means that it
|
|
will not be able to automatically resolve code references following pointer
|
|
tables, register branches and so on and lot of code will remain unanalyzed.
|
|
|
|
We can manually force to analyze a function at a certain offset by using
|
|
the '.afr@addr' command which will interpret ('.') the output of the
|
|
command 'afr' that stands for 'analyze function recursively' at (@) a
|
|
certain address (addr).
|
|
|
|
When a 'call' instruction is identified, the code analysis will register a
|
|
code xref from the instruction to the endpoint. This is because compilers
|
|
uses these instructions to call into subroutines or symbols.
|
|
|
|
As we did previously, we can disassemble the whole text section looking for
|
|
a 'call' instruction and register code xrefs between the source and
|
|
destination address with these two lines:
|
|
|
|
[0x000074E0]> e asm.profile=simple
|
|
[0x000074E0]> "(xref,Cx `pd 1~[2]`)"
|
|
[0x000074E0]> .(xref) @@=`pd 100~call[0]`
|
|
|
|
We use the simple asm.profile to make the disassembly text parsing easier.
|
|
|
|
The first line will register a macro named 'xref' that creates a code xref
|
|
at address pointed by the 3rd word of the disassembled instruction.
|
|
|
|
In the second line it will run the previous macro at every offset specified
|
|
by the output of the "pd 100~call[0]" instruction. This is the offset of
|
|
every call instruction of the following 100 instructions.
|
|
|
|
When disassembling code (using the 'pd' command (print disasm)) xrefs will
|
|
be displayed if asm.xrefs is enabled. Optionally, we can set asm.xrefsto to
|
|
visualize the xref information at both end-points of the code reference.
|
|
|
|
|
|
--[ 4.2 - Blind code references
|
|
|
|
The xrefs(1) program that comes with radare implements a blind code
|
|
reference search algorithm based on identifying sequences of bytes matching
|
|
a certain pattern.
|
|
|
|
The pattern is calculated depending on the offset. This way the program
|
|
calculates the relative delta between the current offset and the
|
|
destination address and tries to find relative branches from one point to
|
|
another of the program.
|
|
|
|
The current implementation supports x86, arm and powerpc, but allows to
|
|
define templates of branch instructions for other architectures by using
|
|
the option flags to determine the instruction size, endianness, the way to
|
|
calculate the delta, etc.
|
|
|
|
The main purpose of this program was to identify code xrefs on big files
|
|
(like firmwares), where the code analysis takes too long or requires many
|
|
manual interaction, it is not designed to identify absolute calls or jumps.
|
|
|
|
Here's an usage example:
|
|
|
|
$ xrefs -a arm bigarmblob.bin 0x6a390
|
|
0x80490
|
|
0x812c4
|
|
|
|
|
|
--[ 4.3 - Graphing xrefs
|
|
|
|
Radare ships some commands to manage and create interactive graphs to
|
|
display code analysis based on basic blocks, function blocks or whatever we
|
|
want by using the 'gu' command (graph user)
|
|
|
|
The most simple way to use the graphing capabilities is to use the 'ag'
|
|
command that will directly popup a gtk window with a graphed code analysis
|
|
from the current seek, splitting the code by basic blocks. (use the eval
|
|
graph.jmpblocks and graph.callblocks to split blocks by jumps or calls to
|
|
analyze basic blocks or function blocks).
|
|
|
|
The graph is stored internally, but if we want to create our own graphs
|
|
we can use the 'gu' command which provides subcommands to reset a graph,
|
|
create nodes, edges and display them.
|
|
|
|
[0x08049A00]> gu?
|
|
Usage: gu[dnerv] [args]
|
|
gur user graph reset
|
|
gun $$ $$b pd add node
|
|
gue $$F $$t add edge
|
|
gud display graph in dot format
|
|
guv visualize user defined graph
|
|
|
|
This is a radare script that will generate a graph between analyzed
|
|
function by using the xrefs metadata information.
|
|
|
|
After configuring the graph we can display it using the internal viewer
|
|
(guv) or using graphviz's dot.
|
|
|
|
--------8<---------------
|
|
"(xrefs-to from,f XT @ `Cx~$0[3]`,)"
|
|
"(graph-xrefs-init,gur,)"
|
|
"(graph-xrefs,.(xrefs-to `?v $$`),gue $$F XT,)"
|
|
"(graph-viz,gud > dot,!!dot -Tpng -o graph.png dot,!!gqview graph.png,)"
|
|
|
|
.(graph-xrefs-init)
|
|
.(graph-xrefs) @@= `Cx~[1]`
|
|
.(graph-viz)
|
|
--------8<---------------
|
|
|
|
Here's the explanation of the previous script:
|
|
|
|
Note that all commands are quoted '"' because this way the inner special
|
|
characters are ignored ('>', '~' ...)
|
|
|
|
"(xrefs-to from,f XT @ `Cx~$0[3]`,)"
|
|
|
|
Register a macro named 'xrefs-to' that accepts one argument named 'from'
|
|
and aliased as $0 inside the macro body.
|
|
|
|
The macro executes the 'f XT' (create a flag named XT) at ('@') the
|
|
address where the code xref (Cx) points (3rd column of the row
|
|
matching $0 (the first argument of the macro (from)).
|
|
|
|
"(graph-xrefs-init,gur,)"
|
|
|
|
The 'graph-xrefs-init' macro resets the graph user metadata.
|
|
|
|
"(graph-xrefs,.(xrefs-to `?v $$`),gue $$F XT,)"
|
|
|
|
This macro named 'graphs-xrefs' that run commands separated by commas:
|
|
|
|
.(xrefs-to `?v $$`)
|
|
|
|
execute the macro 'xrefs-to' at current seek ($$), the value is
|
|
concatenated to the output of executing '?v $$' which evaluates
|
|
the '$$' special variable and prints the value in hexadecimal.
|
|
|
|
gue XF XT
|
|
|
|
Adds an 'graph-user' edge between the beginning of the function at
|
|
current seek and XT.
|
|
|
|
"(graph-viz,gud > dot,!!dot -Tpng -o graph.png dot,!!rsc view graph.png,)"
|
|
|
|
These commands will generate a dot file, render it in png and display it.
|
|
|
|
.(graph-xrefs-init)
|
|
.(graph-xrefs) @@= `Cx~[1]`
|
|
.(graph-viz)
|
|
|
|
Then, we bundle all those macros and the xrefs graph will be displayed
|
|
|
|
|
|
--[ 4.4 - Randomizing xrefs
|
|
|
|
Another annoying low level manipulation we can do is to repeat functions at
|
|
random places with random modifications to make the program code more
|
|
verbose and hard to analyze because of complex control flow paths.
|
|
|
|
By doing a cross-references (xrefs) analysis between the symbols we will be
|
|
able to determine which symbols are referenced more from than one single
|
|
point.
|
|
|
|
The internal grep syntax can be tricky to retrieve lists of offsets for
|
|
xrefs at a given position.
|
|
|
|
[0x0000c830]> Cx
|
|
Cx 0x00000c59 @ 0x00000790
|
|
Cx 0x00000c44 @ 0x000007b0
|
|
Cx 0x00000c35 @ 0x00000810
|
|
Cx 0x00000c26 @ 0x000008a0
|
|
...
|
|
|
|
These numbers specify the caller address and the called one. That is:
|
|
|
|
0xc59 -(calls)-> 0x790
|
|
|
|
To disassemble the opcodes referencing code to other positions we can use the
|
|
'pd' command with a foreach statement:
|
|
|
|
[0x0000c830]> pd 1 @@= `C*~Cx[1]`
|
|
|
|
This command will run 'pd 1' on every address specified in the 1st column
|
|
of the output of the 'C*~Cx[1]'
|
|
|
|
|
|
--[ 5 - Conclussion
|
|
|
|
This article has covered many aspects of reverse engineering, low level
|
|
program manipulation, binary patching/diffing and more, despite being a
|
|
concrete task it has tried to expose a variety of scenarios where a command
|
|
line hexadecimal editor becomes an essential tool.
|
|
|
|
Several external scripting languages can be used, don't care about the
|
|
cryptic scripting of the radare commands, there are APIs for perl, ruby and
|
|
python for controlling radare and analyze code or connect to remote
|
|
radares.
|
|
|
|
Being used in batch mode enables the ability to automatize many analysis
|
|
tasks, generate graphs of code, determine if it is packed or infected, find
|
|
vulnerabilities, etc.
|
|
|
|
Maybe the more important characteristic of this software is the abstraction
|
|
of the IO which allows you to open hard disk images, serial connections,
|
|
process memory, haret wince devices, gdb-enabled applications like bochs,
|
|
vmware, qemu, shared memory areas, winedbg and more.
|
|
|
|
There are lot of things we have not covered in this article but the main
|
|
concepts have been exposed while explaining multiple protection techniques
|
|
for binaries.
|
|
|
|
Using scriptable tools for implementing binary protection techniques eases
|
|
the development and permits to recycle and accelerate the analysis and
|
|
manipulation of files.
|
|
|
|
|
|
--[ 6 - Future work
|
|
|
|
The project was born in 2006, and it has grown pretty fast.
|
|
|
|
At the moment of writing this paper the development is done on two
|
|
different branches: radare and radare2.
|
|
|
|
The second one is a complete refactoring of the original code aiming to
|
|
reimplement all the features of the radare framework into multiple
|
|
libraries.
|
|
|
|
This design permits to bypass some limitations of the design of r1 enabling
|
|
full scripting support for specific or composed set of libraries. Each
|
|
module extends its functionalities into plugins.
|
|
|
|
While using parrot as a virtual machine, multiple scripting languages can
|
|
be used together. Being scripting language agnostic enables the possibility
|
|
to mix multiple frameworks together (like metasploit [5], ERESI [6], etc)
|
|
|
|
The modular design based on libraries, plugins and scripts will open new
|
|
ways to extend the framework by third party developers and additionally
|
|
make the code much more maintainable and bugfree.
|
|
|
|
New applications will appear on top of the r2 set of libraries, there are
|
|
some projects aiming to integrate it from web frontends and graphical
|
|
frontends.
|
|
|
|
Plugins extend the modules adding support for new architectures, binary
|
|
formats, code analysis modules, debugging backends, remote protocols,
|
|
scripting facilities and more.
|
|
|
|
At the moment of writing this, radare2 (aka libr) is actually under
|
|
development but implements some test programs that try to reimplement the
|
|
applications distributed with radare.
|
|
|
|
|
|
--[ 7 - References
|
|
|
|
[0] radare homepage
|
|
http://www.radare.org
|
|
|
|
[1] radare book (pdf)
|
|
http://www.radare.org/get/radare.pdf
|
|
|
|
[2] x86-32/64 architecture
|
|
http://www.intel.com/Assets/PDF/manual/253666.pdf
|
|
|
|
[3] ELF file format
|
|
http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
|
|
|
|
[4] AT&T syntax
|
|
http://sig9.com/articles/att-syntax
|
|
|
|
[5] Metasploit, ruby based exploit development framework
|
|
http://www.metasploit.com/
|
|
|
|
[6] ERESI
|
|
http://www.eresi-project.org/
|
|
|
|
|
|
--[ 8 - Greetings
|
|
|
|
Thanks to JV for his support during the write-up of this article, and to
|
|
the Phrack staff to make the publication possible.
|
|
|
|
I would also like to greet some people for their support while the
|
|
development of radare and this article:
|
|
|
|
SexyPandas .. for the great quals, ctfs and moments
|
|
nopcode .. for those great cons, feedback and beers
|
|
nibble .. for those pingpong development sessions and support
|
|
killabyte .. for the funny abstract situations we faced everyday
|
|
esteve .. for the talks, knowledge and the arm/wince reversing tips
|
|
wzzx .. for the motivators, feedback and luls
|
|
lia .. my gf which supported me all those nightly hacking hours :)
|
|
..
|
|
|
|
Certainly...
|
|
|
|
Thanks to all the people who managed to read the whole text and also to the
|
|
rest for being part of that ecosystem named 'underground' and computer
|
|
security which bring us lot of funny hours.
|
|
|
|
There is still lot of work to be done, and ideas or contributions are
|
|
always welcome. Feel free to send me comments.
|
|
|
|
--pancake
|
|
|
|
--------[ EOF
|