mirror of https://github.com/fdiskyou/Zines.git
896 lines
42 KiB
Plaintext
896 lines
42 KiB
Plaintext
Locreate: An Anagram for Relocate
|
|
skape
|
|
12/2006
|
|
mmiller@hick.org
|
|
|
|
1) Foreword
|
|
|
|
Abstract: This paper presents a proof of concept executable packer
|
|
that does not use any custom code to unpack binaries at execution time. This
|
|
is different from typical packers which generally rely on packed executables
|
|
containing code that is used to perform the inverse of the packing operation
|
|
at runtime. Instead of depending on custom code, the technique described in
|
|
this paper uses documented behavior of the dynamic loader as a mechanism for
|
|
performing the unpacking operation. This difference can make binaries packed
|
|
using this technique more difficult to signature and analyze, but only when
|
|
presented to an untrained eye. The description of this technique is meant to
|
|
be an example of a fun thought exercise and not as some sort of revolutionary
|
|
packer. In fact, it's been used in the virus world many years prior to this
|
|
paper.
|
|
|
|
Thanks: The author would like to thank Skywing, spoonm, deft,
|
|
intropy, Orlando Padilla, nemo, Richard Johnson, Rolf Rolles, Derek Soeder,
|
|
and Andre Protas for their discussions and feedback.
|
|
|
|
Challenge: Prior to reading this paper, the author recommends that
|
|
the reader attempt to determine the behavior of the packer that was used on
|
|
the binary included in the attached code sample. The binary itself is
|
|
innocuous and just performs a few simple printf operations.
|
|
|
|
Previous Research: This technique has been used in the virus world far in
|
|
advance of this writing. Examples that apply this technique include
|
|
W95/Resurrel and W95/Silcer. Further research indicates that Peter Szor did a
|
|
write-up on this technique entitled ``Tricky Relocations'' in the April 2001
|
|
edition of Virus Bulletin[2,3].
|
|
|
|
2) Locreate
|
|
|
|
Executable packers, such as UPX, are commonly employed by malware as a means
|
|
of delaying or otherwise thwarting the process of static analysis. Packers
|
|
also have perfectly legitimate uses, but these uses fall outside of the scope
|
|
of this paper. The reason packers make static analysis more difficult is
|
|
because they alter the form of the binary to the point that what appears on
|
|
disk is entirely different from what actually ends up executing in memory.
|
|
This alteration is typically accomplished by encapsulating a pre-existing
|
|
binary in a ``host'' binary. The algorithm used to encapsulate the
|
|
pre-existing binary in the host binary is what differs from one packer to the
|
|
next. In most cases, the host binary must contain code that will perform the
|
|
inverse of the packing operation in order to decapsulate the original binary.
|
|
The code that is responsible for performing this operation is typically
|
|
referred to as an unpacker. The process of unpacking the original binary is
|
|
usually done entirely in memory without writing the original version out to
|
|
disk. Once the original binary is unpacked, execution control is transferred
|
|
to the original binary which begins executing as if nothing had changed.
|
|
|
|
This general approach represents an easy way of altering the form of a binary
|
|
without changing its effective behavior. In fact, it's pretty much analagous
|
|
to payload encoders that are used in conjunction with exploits to alter the
|
|
form of a payload in order to satisify some character restrictions without
|
|
changing the payload's effective behavior. In the case of payload encoders,
|
|
some arbitrary code must be prefixed to the encoded payload in order to
|
|
perform the inverse of the encoding operation once the payload is executed.
|
|
However, like payload encoders, the use of custom code to perform the inverse
|
|
of the packing or encoding operation can lead to a few problems.
|
|
|
|
The most apparent of these problems has to do with the fact that while the
|
|
packed form of an executable may be entirely different from its original, the
|
|
code used to perform the unpacking operation may be static. In the event that
|
|
the unpacker consists of static code, either in whole or in part, it may be
|
|
possible to signature or otherwise identify that a particular packing
|
|
algorithm has been used to produce a binary and thus make it easier to restore
|
|
the original form of the binary. This ability is especially important when it
|
|
comes to attempting to heuristically identify malware prior to allowing a user
|
|
to execute it.
|
|
|
|
The use of custom code can also make it possible for tools to be developed
|
|
that attempt to identify unpackers based on their behavior. Ero Carrera has
|
|
provided some excellent illustrations relating to the feasibility of this type
|
|
of attack against unpackers[1]. An understanding of an unpacker's behavior may
|
|
also make it possible to acquire the original binary without allowing it to
|
|
actually execute by simply tracing the unpacker up until the point where it
|
|
transfers execution control to the original binary. In the case of malware,
|
|
this weakness means that benefits gained from packing an executable can be
|
|
completely nullified.
|
|
|
|
Both of these problems are meant to illustrate that even though custom unpacking
|
|
code is often a requirement, its mere presence exposes a potential point of
|
|
weakness. If it were possible to eliminate the custom code required to unpack
|
|
a binary, it could make the two problems described previously much more difficult
|
|
to realize. To that point, the technique described in this paper does not
|
|
rely on the presence of custom code in a packed binary in order to unpack
|
|
itself. Instead, documented behavior of the dynamic loader is used to perform
|
|
the unpacking whenever the packed binary is executed. While this approach has
|
|
its benefits, there are a number of problems with it that will be discussed
|
|
later on. In the interest of brevity, the packer described in this paper will
|
|
simply be referred to as locreate. As was already mentioned,
|
|
locreate leverages a documented feature of most dynamic loaders in order to
|
|
perform its unpacking operation. Given that the process of unpacking
|
|
typically involves transforming the original binary's contents back into its
|
|
original form, there are only a finite number of dynamic loader features that
|
|
might be abused. Perhaps the feature that is best suited for transforming the
|
|
contents of a binary at runtime is the dynamic loader feature that was
|
|
designed to do just that: relocations.
|
|
|
|
In the event that a binary is unable to be loaded at its preferred base
|
|
address at runtime, the dynamic loader is responsible for attempting to move
|
|
the binary to another location in memory. The act of moving a binary from its
|
|
preferred base address to a new base address is more commonly referred to as
|
|
relocating. When a binary is relocated to a new base address, any references
|
|
the binary might have to addresses that are relative to its preferred base
|
|
address will no longer be valid. As such, references that are relative to the
|
|
preferred base address must be updated by the dynamic loader in order to make
|
|
them relative to the new base address. Of course, this presupposes that the
|
|
dynamic loader has some knowledge of where in the binary these address
|
|
references are made. To satisfy this presupposition, binaries will typically
|
|
include relocation information to provide the dynamic loader with a map to the
|
|
locations within the binary that need to be adjusted. When a binary does not
|
|
include relocation information, it's classified as a non-relocatable binary.
|
|
Without relocation information, a binary cannot be relocated to an alternate
|
|
base address in an elegant manner (ignoring position independent executables).
|
|
|
|
The structures used to convey relocation information differs from one binary
|
|
format to the next. For the purpose of this paper, only the structures used
|
|
to describe relocations of Portable Executable (PE) binaries will be
|
|
discussed. However, it should be noted that the approaches described in this
|
|
paper should be equally applicable to other binary formats, such as ELF. In
|
|
fact, other binary formats make the technique used by locreate even easier.
|
|
For example, ELF supports applying relocation fixups with an addend. This
|
|
addend is basically an arbitrary value that is used in conjunction with a
|
|
transformation. The PE binary format conveys relocation information through
|
|
one of the data directories that is included within the optional header
|
|
portion of the NT header. This data directory is symbolically referred to
|
|
through the use of the IMAGE_DIRECTORY_ENTRY_BASERELOC. The base relocation
|
|
data directory consists of zero or more IMAGE_BASE_RELOCATION structures which
|
|
are defined as:
|
|
|
|
typedef struct _IMAGE_BASE_RELOCATION {
|
|
ULONG VirtualAddress;
|
|
ULONG SizeOfBlock;
|
|
// USHORT TypeOffset[1];
|
|
} IMAGE_BASE_RELOCATION, *PIMAGE_BASE_RELOCATION;
|
|
|
|
The base relocation data directory is a little bit different from most other
|
|
data directories. The IMAGE_BASE_RELOCATION structures embedded in the data
|
|
directory do not occur immediately one after the other. Instead, there are a
|
|
variable number of USHORT sized fixup descriptors that separate each
|
|
structure. The SizeOfBlock attribute of each structure describes the entire
|
|
size of a relocation block. Each relocation block consists of the base
|
|
relocation structure and the variable number of fixup descriptors. Therefore,
|
|
enumeration of the base relocation data directory is best performed by using
|
|
the SizeOfBlock attribute of each structure to proceed to the next relocation
|
|
block until none are remaining. The VirtualAddress attribute of each
|
|
relocation block is a page-aligned relative virtual address (RVA) that is used
|
|
as the base address when processing its associated fixup descriptors. In this
|
|
manner, each relocation block describes the relocations that should be applied
|
|
to exactly one page.
|
|
|
|
The fixup descriptors contained within a relocation block describe the address
|
|
of the value that should be transformed and the method that should be used to
|
|
transform it. The PE format describes about 10 different transformations that
|
|
can be used to fixup an address reference. These transformations are conveyed
|
|
through the top 4 bits of each fixup descriptor. The bottom 12 bits are used
|
|
to describe the offset into the VirtualAddress of the containing relocation
|
|
block. Adding the bottom 12 bits of a fixup descriptor to the VirtualAddress
|
|
of a relocation block produces the RVA that contains a value that needs to be
|
|
transformed. Of the transformation methods that exist, the one most commonly
|
|
used on x86 is IMAGE_REL_BASED_HIGHLOW, or 3. This transformation dictates that
|
|
the 32-bit displacement between the original base address and the new base
|
|
address should be added to the value that exists at the RVA described by the
|
|
fixup descriptor. The act of adding the displacement means that the value
|
|
will be transformed to make it relative to the new base address rather than
|
|
the original base address. To better understand how all of these things tie
|
|
together, consider the following source code example:
|
|
|
|
#include <stdlib.h>
|
|
#include <stdio.h>
|
|
|
|
int main(int argc, char **argv)
|
|
{
|
|
printf("Hello World.\n");
|
|
|
|
return 0;
|
|
}
|
|
|
|
When compiled down, this function appears as the following:
|
|
|
|
sample!main:
|
|
00401010 55 push ebp
|
|
00401011 8bec mov ebp,esp
|
|
00401013 6800104200 push offset sample!__rtc_tzz <PERF> (sample+0x21000) (00421000)
|
|
00401018 e80c000000 call sample!printf (00401029)
|
|
0040101d 83c404 add esp,4
|
|
00401020 33c0 xor eax,eax
|
|
00401022 5d pop ebp
|
|
00401023 c3 ret
|
|
|
|
At address 0x00401013, main pushes the address of the string that contains
|
|
``Hello World!'':
|
|
|
|
0:000> db 00421000 L 10
|
|
00421000 48 65 6c 6c 6f 20 57 6f-72 6c 64 2e 0a 00 00 00 Hello World.....
|
|
|
|
In this case, the push instruction is referring to the string using an
|
|
absolute address. If the sample executable must be relocated at runtime, the
|
|
dynamic loader must be provided with the relocation information necessary to
|
|
fixup the reference to the absolute address. The dumpbin.exe utility from
|
|
Visual Studio can be used to confirm that this information exists. The first
|
|
requirement is that the binary must have relocation information. By default,
|
|
all DLLs will contain relocation information, but executables typically do
|
|
not. Executables can be compiled with relocation information by using the
|
|
/fixed:no linker flag. When a binary is compiled with relocations, the
|
|
presence of relocation information is simply indicated by a non-zero
|
|
VirtualAddress and Size for the base relocation data directory. These values
|
|
can be determined through dumpbin.exe /headers:
|
|
|
|
26000 [ EE8] RVA [size] of Base Relocation Directory
|
|
|
|
Since relocation information must be present at runtime, there should also be
|
|
a section, typically named .reloc, that contains the virtual mapping
|
|
information for the relocation information:
|
|
|
|
SECTION HEADER #5
|
|
.reloc name
|
|
1165 virtual size
|
|
26000 virtual address (00426000 to 00427164)
|
|
2000 size of raw data
|
|
24000 file pointer to raw data (00024000 to 00025FFF)
|
|
0 file pointer to relocation table
|
|
0 file pointer to line numbers
|
|
0 number of relocations
|
|
0 number of line numbers
|
|
42000040 flags
|
|
Initialized Data
|
|
Discardable
|
|
Read Only
|
|
|
|
In order to validate that this executable contains relocation information for
|
|
the absolute address reference made to the ``Hello World!'' string, the
|
|
dumpbin.exe /relocations command can be used:
|
|
|
|
File Type: EXECUTABLE IMAGE
|
|
|
|
BASE RELOCATIONS #5
|
|
1000 RVA, A8 SizeOfBlock
|
|
14 HIGHLOW 00421000
|
|
2C HIGHLOW 00420350
|
|
...
|
|
|
|
This output shows the first relocation block which describes the RVA 0x1000.
|
|
Each line below the relocation block header describes the individual fixup
|
|
descriptors. The information displayed includes the offset into the page, the
|
|
type of transformation being performed, and the current value at that location
|
|
in the binary. From the disassembly above, the location of the address
|
|
reference that is being made is 0x00401014. Therefore, the very first fixup
|
|
in this relocation block provides the dynamic loader within the information
|
|
necessary to change the address reference to the new base address when the
|
|
binary is relocated. If this binary were to be relocated to 0x50000000, the
|
|
HIGHLOW transformation would be applied to 0x00401014 as follows. The
|
|
displacement between the new base address and the old address would be
|
|
calculated as 0x50000000 - 0x00400000, or 0x4fc00000. Adding 0x4fc00000 to
|
|
the existing value of 0x00421000 produces 0x50021000 which is subsequently
|
|
stored in 0x00401014. This causes the absolute address reference to become
|
|
relative to the new base address.
|
|
|
|
Based on this basic understanding of how relocations are processed, it's now
|
|
possible to describe how a packer can be implemented that takes advantage of
|
|
the way the dynamic loader processes relocation information. As has been
|
|
illustrated above, relocation information is designed to make it possible to
|
|
fixup absolute address references at runtime when a binary is relocated.
|
|
These fixups are applied by taking into account the displacement between the
|
|
new base address and the original base address. More often than not, this
|
|
displacement isn't known ahead of time, thus making it impossible to reliably
|
|
predict how the content at a specific location in the binary will be altered.
|
|
But what if it were possible to deterministically know the displacement in
|
|
advance? Knowing the displacement in advance would make it possible to alter
|
|
various locations of the binary in a manner that would permit the original
|
|
values to be restored by relocations at runtime. In effect, the on-disk
|
|
version of the binary could be made to appear quite different from the
|
|
in-memory version at runtime. This is the basic concept behind locreate.
|
|
|
|
In order for locreate to work it must be possible to predict the displacement
|
|
reliably. Since the displacement is calculated in relation to the preferred
|
|
base address and the expected base address, both values must be known.
|
|
Furthermore, the binary must be relocated every time it executes in order for
|
|
the relocations to be applied. As it happens, both of these problems can be
|
|
solved at once. Since a binary is only guaranteed to be relocated if its
|
|
preferred base address is in conflict with an existing address, a preferred
|
|
base address must be selected that will always lead to a conflict. This can
|
|
be accomplished by setting the preferred base address to any invalid user-mode
|
|
address (any address above 0x80000000 inclusive). This assumes that the machine
|
|
that the executable will run on is not running with /3GB. If so, a higher
|
|
address would have to be used.. Alternatively, the base address can be set to
|
|
SharedUserData which is guaranteed to be located at 0x7ffe0000 in every
|
|
process. Setting the binary's preferred base address to any of these
|
|
addresses will force it to be relocated every time it executes. The only
|
|
unknown is what address the binary is expected to be relocated to.
|
|
|
|
Determining the address that will be relocated to depends on the state of the
|
|
process' address space at the time that the binary is relocated. If the
|
|
binary that's being relocated is an executable, then the process' address
|
|
space is generally in a pristine state since the executable is one of the
|
|
first things to be mapped into the address space. As such, the first
|
|
available address will always be 0x10000 on default installations of Windows.
|
|
If the binary is a DLL, it's hard to predict what the state of the address
|
|
space will be in all cases. When a conflict does occur, the kernel searches
|
|
for an available address region by traversing from lowest to highest address.
|
|
For the purposes of this paper, it will be assumed that an executable is being
|
|
packed and that the address being relocated to is 0x10000. Further research
|
|
may provide insight into how to better control or alter the expected base
|
|
address.
|
|
|
|
With both the preferred base address and the expected base address known, the
|
|
only thing that remains is to perform the operations that will transform the
|
|
on-disk version of the binary in a manner that causes custom relocations to
|
|
restore the binary to its original form at runtime. This process can be both
|
|
simplistic and complicated. The simplest approach would be to enumerate over
|
|
the contents of each section in the binary, altering the value at each
|
|
location by subtracting the displacement and then creating a relocation fixup
|
|
descriptor that will ensure that the contents are restored to the expected
|
|
value at runtime. This is how the proof of concept works. A more complicated
|
|
approach would be to create multiple relocation fixup descriptors per-address.
|
|
This would mean that the displacement would need to be subtracted once for
|
|
each fixup descriptor. It should also be possible to apply relocations to
|
|
individual bytes within a four byte span rather than applying relocations in
|
|
four byte increments. Even more interesting would be to use some fixup types
|
|
other than HIGHLOW, although this could be seen as something that might make
|
|
generating a signature easier.
|
|
|
|
The end result of this whole process is a functional proof of concept that
|
|
packs a binary in the manner described above. To get a feel for how different
|
|
the binary looks after being packed, consider what the implementation of main
|
|
from earlier in this paper looks like. Notice how the first two instructions
|
|
are the same as they were previously. This has to do with the fact that base
|
|
addresses must align on 64KB boundaries, and thus the lower two bottoms are
|
|
not changed. This could be further improved such as through the strategies
|
|
described above:
|
|
|
|
.text:84011000 loc_84011000:
|
|
.text:84011000 push ebp
|
|
.text:84011001 mov ebp, esp
|
|
.text:84011003 in al, dx
|
|
.text:84011004 add [eax+0], dh
|
|
.text:84011006 add [edi+edi*8+1209C15h], eax
|
|
.text:8401100D test [ebx-3FCCFB3Ch], al
|
|
.text:84011013 loope near ptr 84010FD8h
|
|
.text:84011015
|
|
.text:84011015 loc_84011015:
|
|
.text:84011015 push (offset off_8401139C+1)
|
|
|
|
The locreate proof of concept has been tested on Windows XP and Windows 2003
|
|
Server. Initial testing on Windows Vista indicates that Vista does not
|
|
properly alter the entry point address after relocations have been applied
|
|
when an executable is packed. Even though the proof of concept implementation
|
|
works, there are a number of more fundamental problems with the technique
|
|
itself.
|
|
|
|
The first set of problems has to do with techniques that can be used to
|
|
signature locreate packed executables. Since locreate relies on injecting a
|
|
large number of relocation fixups, it may be possible to heuristically detect
|
|
an increased number of relocation fixups with relation to the size of
|
|
individual segments. This particular attack could be solved by decreasing the
|
|
number of relocation fixups injected by locreate. This would have the effect
|
|
of only partially mangling the binary, but it might be enough to make people
|
|
wonder what's going on without giving things away. Even if it weren't
|
|
possible to heuristically detect an increased number of relocation fixups,
|
|
it's definitely possible to detect the fact that an executable packed by
|
|
locreate will have an invalid preferred base address that will always result
|
|
in a conflict. This fact alone makes it mostly trivial to at least detect
|
|
that something odd is going on.
|
|
|
|
Detection is only the first problem, however. Once a locreate packed
|
|
executable has been detected, the next logical step is to attempt to figure
|
|
out some way of obtaining the original executable. Since locreate relies on
|
|
relocation fixups to do this, the only thing one would have to do in order to
|
|
obtain the original binary would be to relocate the executable to the expected
|
|
base address that was used when the binary was packed, such as 0x10000. While
|
|
it's trivial to develop tools to perform this action, the Interactive
|
|
Disassembler (IDA) already supports it. When opening an executable, the
|
|
``Manual Load'' checkbox can be toggled. This will cause IDA to prompt the
|
|
user to enter the base address that the binary should be loaded at. When the
|
|
base address is entered, IDA processes relocations and presents the relocated
|
|
binary image. The mitigating factor here is that the user must know the
|
|
expected base address, otherwise the binary will still appear completely
|
|
mangled when it's relocated to the wrong base address.
|
|
|
|
In the author's opinion, these problems make locreate a sub-par packer. At
|
|
best it should be viewed as an interesting approach to the problem of packing
|
|
executables, but it should not be relied upon as a means of thwarting static
|
|
analysis. Anyone who reads this paper will have the tools necessary to unpack
|
|
executables that have been packed by locreate. With that said, it should be
|
|
noted that there is still an opportunity for further research that could help
|
|
to identify ways of improving locreate. For instance, a better understanding
|
|
of differences in the way the dynamic loader and existing static analysis
|
|
tools process relocation fixups could provide some opportunity for
|
|
improvement. Results from some of the author's initial tests of these ideas
|
|
are included in appendix A. Here's a brief list of some differences that could
|
|
exist:
|
|
|
|
1. Different behaviors when processing fixups
|
|
|
|
It's possible that the dynamic loader and static analysis tools such as IDA
|
|
may not support the same set of fixup types. Furthermore, they may not
|
|
process fixup types in the same way. If differences do exist, it may be
|
|
possible to create a packed executable that will work correctly when used
|
|
against the dynamic loader but not render properly when relocated using a
|
|
static analysis tool such as IDA.
|
|
|
|
2. Relocation blocks with non-page-aligned VirtualAddress fields
|
|
|
|
It's unknown whether or not the dynamic loader and static analysis tools are
|
|
able to properly handle relocation blocks that have non-page-aligned
|
|
VirtualAddress's. In all normal circumstances, VirtualAddress will be
|
|
page aligned.
|
|
|
|
3. Relocation blocks that modify other relocation blocks
|
|
|
|
An interesting situation that may lead to differences between the dynamic
|
|
loader and static analysis tools has to do with relocation blocks that modify
|
|
other relocation blocks. In this way, the relocation information that exists
|
|
on disk is not what is actually used, in its entirety, when relocating an
|
|
image during runtime.
|
|
|
|
Even if research into these topics doesn't yield any direct improvements to
|
|
locreate, it should nonetheless provide some interesting insight into the way
|
|
that different applications handle relocation processing. And after all,
|
|
gaining knowledge is what it's really all about.
|
|
|
|
Appendix A) Differences in Relocation Processing
|
|
|
|
This appendix attempts to describe some tests that were run on different
|
|
applications that process relocation entries for binary files. Identifying
|
|
differences may make it possible to have a binary that will work correctly
|
|
when executed but not when analyzed by a static analysis tool such as IDA. To
|
|
test out these ideas, the author threw together a small relocation fuzzing
|
|
tool that is aptly named relocfuzz. This tool will take a pre-existing binary
|
|
and create a new one with custom relocations. The code for this tool can be
|
|
found in the other code associated with this paper.
|
|
|
|
The tests included in this appendix were performed against three different
|
|
applications: the dynamic loader (ntdll.dll), IDA, and dumpbin. If the same
|
|
tests are run against other applications, the author would be interested in
|
|
knowing the results.
|
|
|
|
A.1) Non-page-aligned Block VirtualAddress
|
|
|
|
In all normal cases, relocation blocks will be created with a page-aligned
|
|
VirtualAddress. However, it's unclear if non-page-aligned VirtualAddress
|
|
fields will be handled correctly when relocations are processed. There are
|
|
some interesting implications of non-page-aligned VirtualAddress's. In many
|
|
applications, such as the dynamic loader, it's critical that addresses
|
|
referenced through RVAs are validated so as to prevent references being made
|
|
to external addresses. For example, if relocations were processed in
|
|
kernel-mode, it would be critical that checks be performed to ensure that RVAs
|
|
don't end up making it possible to reference kernel-mode addresses. The
|
|
reason why non-page-aligned VirtualAddress's are interesting is because they
|
|
leave open the possibility of this type of attack.
|
|
|
|
Consider the scenario of a binary that is relocated to 0x7ffe0000, ignoring
|
|
for the moment that SharedUserData already exists at this location. Now,
|
|
consider that this binary has a relocation block with a virtual address of
|
|
0x1ffff. This address is not page-aligned. Now, consider that this
|
|
relocation block has a fixup descriptor that indicates that at offset 0x4 into
|
|
this page, a certain type of fixup should be performed. This would equate to
|
|
modifying memory at 0x80000003, a kernel-mode address. If relocations were
|
|
being processed in kernel-mode, like they are on Windows Vista for ASLR, then
|
|
a failure to check that the actual address being written to would result in a
|
|
dangerous condition.
|
|
|
|
Here's an example of some code that attempts to test out this idea:
|
|
|
|
static VOID TestNonPageAlignedBlocks(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT KillerBlock = AllocateRelocationBlockContext(1);
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
KillerBlock);
|
|
|
|
KillerBlock->Rva = 0x10001;
|
|
KillerBlock->Fixups[0] = (3 << 12) | 0;
|
|
}
|
|
|
|
In this example, a custom relocation block is created with one fixup
|
|
descriptor. The VirtualAddress associated with the block is set to 0x10001
|
|
and the first fixup descriptor is set to modify offset 0 into that RVA. If
|
|
the binary that is hosting these relocations is relocated to 0x10000, a write
|
|
should occur to 0x20001 when processing the relocations. Here are the results
|
|
from a few initial tests:
|
|
|
|
ntdll.dll: The relocation fixup is processed and results in a write
|
|
to 0x20001.
|
|
|
|
IDA: Ignores the relocation fixup, but only because it writes outside of the
|
|
executable from what it would appear.
|
|
|
|
dumpbin.exe: Parses the relocation block without issue.
|
|
|
|
A.2) Writing to External Addresses
|
|
|
|
Due to the fact that the VirtualAddress associated with each relocation block
|
|
is a 32-bit RVA, it is possible to create relocation blocks that have RVAs
|
|
that actually reside outside of the mapped executable that is being relocated.
|
|
This is important because if steps aren't taken to detect this scenario, the
|
|
application processing the relocation fixups might be tricked into writing to
|
|
memory that is external to the mapped binary. Creating a test-case for this
|
|
example is trivial:
|
|
|
|
static VOID CreateExternalWriteRelocationBlock(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT ExtBlock = AllocateRelocationBlockContext(2);
|
|
|
|
ExtBlock->Rva = 0x10000;
|
|
ExtBlock->Fixups[0] = (3 << 12) | 0x0;
|
|
ExtBlock->Fixups[1] = (3 << 12) | 0x1;
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
ExtBlock);
|
|
}
|
|
|
|
In this test, a relocation block is created that has a VirtualAddress of
|
|
0x10000. When the binary is relocated to 0x10000, the actual address of the
|
|
region that will be written to is 0x20000. In almost all versions of Windows
|
|
NT, this address is the location of the process parameters structure. The
|
|
block itself contains two fixup descriptors, each of which will result in a
|
|
write to the first few bytes of the process parameters structure. The results
|
|
after running this test are:
|
|
|
|
ntdll.dll: The relocation fixup is processed and results in two 32-bit writes
|
|
to 0x20000 and 0x20001.
|
|
|
|
IDA: Ignores RVAs outside of the executable.
|
|
|
|
dumpbin.exe: N/A, dumpbin doesn't actually perform relocation fixups.
|
|
|
|
A.3) Self-updating Relocation Blocks
|
|
|
|
One of the more interesting nuisances about the way relocation fixups are
|
|
processed is that it's actually possible to create a relocation block that
|
|
will perform fixups against other relocation blocks. This has the effect of
|
|
making it such that the relocation information that appears on disk is
|
|
actually different than what is processed when relocation fixups are applied.
|
|
The basic idea behind this approach is to prepend certain relocation blocks
|
|
that apply fixups to subsequent relocation blocks. This all works because
|
|
relocation blocks are typically processed in the order that they appear. An
|
|
example of this basic concept is described shown below:
|
|
|
|
static VOID PrependSelfUpdatingRelocations(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT SelfBlock;
|
|
PRELOCATION_BLOCK_CONTEXT RealBlock;
|
|
ULONG RelocBaseRva;
|
|
ULONG NumberOfBlocks = FuzzContext->NumberOfBlocks;
|
|
ULONG Count;
|
|
|
|
//
|
|
// Grab the base address that relocations will be loaded at
|
|
//
|
|
RelocBaseRva = FuzzContext->BaseRelocationSection->VirtualAddress;
|
|
|
|
//
|
|
// Grab the first block before we start prepending
|
|
//
|
|
RealBlock = FuzzContext->NewRelocationBlocks;
|
|
|
|
//
|
|
// Prepend self-updating relocation blocks for each block that exists
|
|
//
|
|
for (Count = 0; Count < NumberOfBlocks; Count++)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT RelocationBlock;
|
|
|
|
RelocationBlock = AllocateRelocationBlockContext(2);
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
RelocationBlock);
|
|
}
|
|
|
|
//
|
|
// Walk through each self updating block, fixing up the real blocks to
|
|
// account for the amount of displacement that will be added to their Rva
|
|
// attributes.
|
|
//
|
|
for (SelfBlock = FuzzContext->NewRelocationBlocks, Count = 0;
|
|
Count < NumberOfBlocks;
|
|
Count++, SelfBlock = SelfBlock->Next, RealBlock = RealBlock->Next)
|
|
{
|
|
SelfBlock->Rva = RelocBaseRva + RealBlock->RelocOffset;
|
|
|
|
//
|
|
// We'll relocate the two least significant bytes of the real block's RVA
|
|
// and SizeOfBlock.
|
|
//
|
|
SelfBlock->Fixups[0] = (USHORT)((IMAGE_REL_BASED_HIGHLOW << 12) |
|
|
(((RealBlock->RelocOffset - 2) & 0xfff)));
|
|
SelfBlock->Fixups[1] = (USHORT)((IMAGE_REL_BASED_HIGHLOW << 12) |
|
|
(((RealBlock->RelocOffset + 2) & 0xfff)));
|
|
SelfBlock->Rva &= ~(PAGE_SIZE-1);
|
|
|
|
//
|
|
// Account for the amount that will be added by the dynamic loader after
|
|
// the first self-updating relocation blocks are processed.
|
|
//
|
|
*(PUSHORT)(&RealBlock->Rva) -= (USHORT)(FuzzContext->Displacement >> 16) + 2;
|
|
*(PUSHORT)(&RealBlock->SizeOfBlock) -= (USHORT)(FuzzContext->Displacement >> 16) + 2;
|
|
}
|
|
}
|
|
|
|
This test works by prepending a self-updating relocation block for each
|
|
relocation block that exists in the binary. In this way, if there were two
|
|
relocations blocks that already existed, two self-updating relocation blocks
|
|
would be prepended, one for each of the two existing relocation blocks.
|
|
Following that, the self-updating relocation blocks are populated. Each
|
|
self-updating relocation block is created with two fixup descriptors. These
|
|
fixup descriptors are used to apply fixups to the VirtualAddress and
|
|
SizeOfBlock attributes of its corresponding existing relocation block. Since
|
|
a HIGHLOW fixup only applies to two most significant bytes, the RVAs of the
|
|
corresponding fields are adjusted down by two. The end result of this
|
|
operation is that the first n relocation blocks are responsible for fixing up
|
|
the VirtualAddress and SizeOfBlock attributes associated with subsequent
|
|
relocation blocks. When relocations are processed in a linear fashion, the
|
|
subsequent relocation blocks are updated in a way that allows them to be
|
|
processed correctly.
|
|
|
|
Running this test against the set of test applications produces the following
|
|
results:
|
|
|
|
ntdll.dll: The relocation blocks are fixed up accordingly and the application
|
|
executes as expected.
|
|
|
|
IDA: Initial testing indicates that IDA is capable of handling self-updating
|
|
relocation blocks.
|
|
|
|
dumpbin.exe: Crashes as the result of apparently corrupt relocation blocks:
|
|
|
|
DUMPBIN : fatal error LNK1000:
|
|
Internal error during
|
|
DumpBaseRelocations
|
|
|
|
Version 8.00.50727.42
|
|
|
|
ExceptionCode = C0000005
|
|
ExceptionFlags = 00000000
|
|
ExceptionAddress = 00443334
|
|
NumberParameters = 00000002
|
|
ExceptionInformation[ 0] = 00000000
|
|
ExceptionInformation[ 1] = 7FFA2000
|
|
|
|
CONTEXT:
|
|
Eax = 0000000A Esp = 0012E500
|
|
Ebx = 00004F00 Ebp = 00000000
|
|
Ecx = 7FFA2000 Esi = 00000000
|
|
Edx = 781C3B68 Edi = 7FFA2000
|
|
Eip = 00443334 EFlags = 00010293
|
|
SegCs = 0000001B SegDs = 00000023
|
|
SegSs = 00000023 SegEs = 00000023
|
|
SegFs = 0000003B SegGs = 00000000
|
|
Dr0 = 00000000 Dr3 = 00000000
|
|
Dr1 = 00000000 Dr6 = 00000000
|
|
Dr2 = 00000000 Dr7 = 00000000
|
|
|
|
A.4) Integer Overflows in Size Calculations
|
|
|
|
A potential source of mistakes that could be made when processing relocations
|
|
has to do with the handling of the SizeOfBlock attribute of a relocation
|
|
block. There is a potential for an integer overflow to occur in applications
|
|
that don't properly handle situations where the SizeOfBlock attribute is less
|
|
than the size of the base relocation structure (which is 8 bytes). In order
|
|
to calculate the total number of fixups in a section, it's common to see a
|
|
calculation like (Block->SizeOfBlock - 8) / 2. However, if a check isn't made
|
|
to ensure that SizeOfBlock is at least 8, an integer overflow will occur. If
|
|
this happens, the application processing relocations would be tricked into
|
|
processing a very large number of relocations. An example of a test for this
|
|
issue is shown below:
|
|
|
|
static VOID TestIntegerOverflow(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT EvilBlock = AllocateRelocationBlockContext(0);
|
|
|
|
EvilBlock->SizeOfBlock = 0;
|
|
EvilBlock->Rva = 0x1000;
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
EvilBlock);
|
|
}
|
|
|
|
In this example, a relocation block is created that has its SizeOfBlock
|
|
attribute set to zero. This is invalid because the minimum size of a block is
|
|
8 bytes. The results of this test against different applications are shown
|
|
below:
|
|
|
|
ntdll.dll: Does not perform appropriate checks which appears to result in an
|
|
integer overflow:
|
|
|
|
(9d4.6dc): Access violation - code c0000005 (first chance)
|
|
First chance exceptions are reported before any exception handling.
|
|
This exception may be expected and handled.
|
|
eax=00000000 ebx=00014008 ecx=00011000 edx=80010000 esi=00015000 edi=ffffffff
|
|
eip=7c91e163 esp=0013fa98 ebp=0013faac iopl=0 nv up ei pl nz na pe nc
|
|
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206
|
|
ntdll!LdrProcessRelocationBlockLongLong+0x1a:
|
|
7c91e163 0fb706 movzx eax,word ptr [esi] ds:0023:00015000=????
|
|
|
|
IDA: Ignores the relocation block, but may not process relocations correctly
|
|
as a result (unclear at this point).
|
|
|
|
dumpbin.exe: Refuses to show relocations:
|
|
|
|
Microsoft (R) COFF/PE Dumper Version 8.00.50727.42
|
|
Copyright (C) Microsoft Corporation. All rights reserved.
|
|
|
|
Dump of file foo.exe
|
|
|
|
File Type: EXECUTABLE IMAGE
|
|
|
|
BASE RELOCATIONS #4
|
|
|
|
Summary
|
|
|
|
1000 .data
|
|
1000 .rdata
|
|
1000 .reloc
|
|
1000 .text
|
|
|
|
A.5) Consistent Handling of Fixup Types
|
|
|
|
Applications that process relocation fixups may also differ in their level of
|
|
support for different types of fixups. While most binaries today use the
|
|
HIGHLOW fixup exclusively, there are still quite a few other types of fixups
|
|
that can be applied. If differences in the way relocation fixups are
|
|
processed can be identified, it may be possible to create a binary that
|
|
relocates correctly in one application but not in another application. The
|
|
following code demonstrates an example of this type of test:
|
|
|
|
static VOID TestConsistentRelocations(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT Block = AllocateRelocationBlockContext(16);
|
|
ULONG Rva = FuzzContext->BaseRelocationSection->VirtualAddress;
|
|
INT Index;
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
Block);
|
|
|
|
Block->Rva = 0x1000;
|
|
|
|
for (Index = 0; Index < 16; Index++)
|
|
{
|
|
//
|
|
// Skip invalid fixup types
|
|
//
|
|
if ((Index >= 6 && Index <= 8) ||
|
|
(Index >= 0xb && Index <= 0x10))
|
|
continue;
|
|
|
|
Block->Fixups[Index] = (Index << 12) | Index;
|
|
}
|
|
}
|
|
|
|
This test works by prepending a relocation block that contains a relocation
|
|
fixup for each different valid fixup type. This results in a relocation block
|
|
that looks something like this:
|
|
|
|
BASE RELOCATIONS #4
|
|
1000 RVA, 28 SizeOfBlock
|
|
0 ABS
|
|
1 HIGH EC8B
|
|
2 LOW 8BEC
|
|
3 HIGHLOW 5008458B
|
|
4 HIGHADJ 0845 (5005)
|
|
0 ABS
|
|
0 ABS
|
|
0 ABS
|
|
9 IMM64
|
|
A DIR64 8000209C15FF8000
|
|
0 ABS
|
|
0 ABS
|
|
0 ABS
|
|
0 ABS
|
|
0 ABS
|
|
|
|
The results for this test are shown below:
|
|
|
|
|
|
ntdll.dll: While not confirmed, it is assumed that the dynamic loader performs
|
|
all fixup types correctly. This results in the following code being produced
|
|
in the test binary:
|
|
|
|
foo+0x1000:
|
|
00011000 55 push ebp
|
|
00011001 8c6c8b46 mov word ptr [ebx+ecx*4+46h],gs
|
|
00011005 895068 mov dword ptr [eax+68h],edx
|
|
00011008 1830 sbb byte ptr [eax],dh
|
|
0001100a 0100 add dword ptr [eax],eax
|
|
0001100c 00b69b200100 add byte ptr foo+0x209b (0001209b)[esi],dh
|
|
00011012 83c408 add esp,8
|
|
|
|
IDA: Appears to handle some relocation fixup types differently than the
|
|
dynamic loader. The result of IDA relocating the same binary results in the
|
|
following being produced:
|
|
|
|
.text:00011000 push ebp
|
|
.text:00011001 mov ebp, esp
|
|
.text:00011003 mov eax, [ebp+9]
|
|
.text:00011006 shr byte ptr [eax+18h], 1 ; "Called TestFunction()\n"
|
|
.text:00011009 xor [ecx], al
|
|
.text:00011009
|
|
.text:0001100B db 0
|
|
.text:0001100C
|
|
.text:0001100C add byte ptr ds:printf[esi], dl
|
|
.text:00011012 add esp, 8
|
|
|
|
Equates to:
|
|
|
|
.text:00011000 55 8B EC 8B 45 09 D0 68 18 30 01 00 00 96 9C 20
|
|
.text:00011010 01 00 83 C4 08 C7 05 50
|
|
|
|
dumpbin.exe: N/A, dumpbin doesn't actually perform relocation fixups.
|
|
|
|
A.6) Hijacking the Dynamic Loader
|
|
|
|
Since the dynamic loader in previous tests proved to be capable of writing to
|
|
areas of memory external to the executable binary, it makes sense to test to
|
|
see if it's possible to hijack execution control. One method of approaching
|
|
this would be to have the dynamic loader apply a relocation to the return
|
|
address of the function used to process relocations. When the function
|
|
returns, it'll transfer control to whatever address the relocations have
|
|
caused it to point to. An example of this code for this test is shown below:
|
|
|
|
static VOID TestHijackLoader(
|
|
__in PPE_IMAGE Image,
|
|
__in PRELOC_FUZZ_CONTEXT FuzzContext)
|
|
{
|
|
PRELOCATION_BLOCK_CONTEXT Block = AllocateRelocationBlockContext(1);
|
|
|
|
PrependRelocationBlockContext(
|
|
FuzzContext,
|
|
Block);
|
|
|
|
//
|
|
// Set the RVA to the address of the return address on the stack taking into
|
|
// account the displacement.
|
|
//
|
|
Block->Rva = 0x0012fab0;
|
|
Block->Fixups[0] = (3 << 12) | 0;
|
|
}
|
|
|
|
When a binary is executed that contains this relocation block, the dynamic
|
|
loader ends up applying a relocation to the return address located at
|
|
0x13fab0. Obviously, this address may be subject to change quite frequently,
|
|
but as a means of illustrating a proof of concept it should be sufficient.
|
|
And, just as one would expect, the dynamic loader does indeed overwrite the
|
|
return address and make it possible to gain control of execution:
|
|
|
|
(c88.184): Access violation - code c0000005 (first chance)
|
|
First chance exceptions are reported before any exception handling.
|
|
This exception may be expected and handled.
|
|
eax=0001400a ebx=00014008 ecx=0013fab0 edx=80010000 esi=00000001
|
|
edi=ffffffff eip=fc92e10b esp=0013fac8 ebp=0013fae4 iopl=0 nv up ei pl zr na pe nc
|
|
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
|
|
fc92e10b ?? ???
|
|
0:000> kv
|
|
ChildEBP RetAddr Args to Child
|
|
WARNING: Frame IP not in any known module. Following frames may be wrong.
|
|
0013fac4 00010000 00261f18 7ffdc000 80010000 0xfc92e10b
|
|
0013fae4 7c91e08c 00010000 00000000 00000000 image00010000
|
|
0013fb08 7c93ecd3 00010000 7c93f584 00000000 ntdll!LdrRelocateImage+0x1d (FPO: [Non-Fpo])
|
|
0013fc94 7c921639 0013fd30 7c900000 0013fce0 ntdll!LdrpInitializeProcess+0xea0 (FPO: [Non-Fpo])
|
|
0013fd1c 7c90eac7 0013fd30 7c900000 00000000 ntdll!_LdrpInitialize+0x183 (FPO: [Non-Fpo])
|
|
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x7
|
|
|
|
Bibliography
|
|
|
|
[1] Carrera, Ero. Packer Tracing.
|
|
http://nzight.blogspot.com/2006/06/packer-tracing.html;
|
|
accessed Dec 15, 2006.
|
|
|
|
[2] Szor, Peter. Advanced Code Evolution Techniques and Computer Virus Generator Kits.
|
|
http://www.informit.com/articles/article.asp?p=366890&seqNum=3&rl=1;
|
|
accessed Jan 8, 2007.
|
|
|
|
[3] Szor, Peter. Tricky Relocations.
|
|
http://peterszor.com/resurrel.pdf;
|
|
accessed Jan 8, 2007.
|