Preventing the Exploitation of SEH Overwrites
9/2006
skape
mmiller@hick.org

1) Foreword

Abstract: This paper proposes a technique that can be used to prevent
the exploitation of SEH overwrites on 32-bit Windows applications
without requiring any recompilation. While Microsoft has attempted to
address this attack vector through changes to the exception dispatcher
and through enhanced compiler support, such as with /SAFESEH and /GS,
the majority of benefits they offer are limited to image files that have
been compiled to make use of the compiler enhancements. This limitation
means that without all image files being compiled with these
enhancements, it may still be possible to leverage an SEH overwrite to
gain code execution. In particular, many third-party applications are
still vulnerable to SEH overwrites even on the latest versions of
Windows because they have not been recompiled to incorporate these
enhancements. To that point, the technique described in this paper does
not rely on any compile time support and instead can be applied at
runtime to existing applications without any noticeable performance
degradation. This technique is also backward compatible with all
versions of Windows NT+, thus making it a viable and proactive solution
for legacy installations.

Thanks: The author would like to thank all of the people who have helped
with offering feedback and ideas on this technique. In particular, the
author would like to thank spoonm, H D Moore, Skywing, Richard Johnson,
and Alexander Sotirov.

2) Introduction

Like other operating systems, the Windows operating system finds itself
vulnerable to the same classes of vulnerabilities that affect other
platforms, such as stack-based buffer overflows and heap-based buffer
overflows. Where the platforms differ is in terms of how these
vulnerabilities can be leveraged to gain code execution. In the case of
a conventional stack-based buffer overflow, the overwriting of the
return address is the most obvious and universal approach. However,
unlike other platforms, the Windows platform has a unique vector that
can, in many cases, be used to gain code execution through a stack-based
overflow that is more reliable than overwriting the return address.
This vector is known as a Structured Exception Handler (SEH) overwrite.
This attack vector was publicly discussed for the first time, as far as
the author is aware, by David Litchfield in his paper entitled Defeating
the Stack Based Buffer Overflow Prevention Mechanism of Microsoft
Windows 2003 Server. However, exploits had been using this technique
prior to the publication, so it is unclear who originally found the
technique.

In order to completely understand how to go about protecting against SEH
overwrites, it's prudent to first spend some time describing the
intention of the facility itself and how it can be abused to gain code
execution. To provide this background information, a description of
structured exception handling will be given in section 2.1. Section 2.2
provides an illustration of how an SEH overwrite can be used to gain
code execution. If the reader already understands how structured
exception handling works and can be exploited, feel free to skip ahead.
The design of the technique that is the focus of this paper will be
described in chapter 3 followed by a description of a proof of concept
implementation in chapter 4. Finally, potential compatibility issues are
noted in chapter 5.

2.1) Structured Exception Handling

Structured Exception Handling (SEH) is a uniform system for dispatching
and handling exceptions that occur during the normal course of a
program's execution. This system is similar in spirit to the way that
UNIX derivatives use signals to dispatch and handle exceptions, such as
through SIGPIPE and SIGSEGV. SEH, however, is a more generalized and
powerful system for accomplishing this task, in the author's opinion.
Microsoft's integration of SEH spans both user-mode and kernel-mode and
is a licensed implementation of what is described in a patent owned by
Borland. In fact, this patent is one of the reasons why open source
operating systems have not chosen to integrate this style of exception
dispatching.

In terms of implementation, structured exception handling works by
defining a uniform way of handling all exceptions that occur during the
normal course of process execution. In this context, an exception is
defined as an event that occurs during execution that necessitates some
form of extended handling. There are two primary types of exceptions.
The first type, known as a hardware exception, is used to categorize
exceptions that originate from hardware. For example, when a program
makes reference to an invalid memory address, the processor will raise
an exception through an interrupt that gives the operating system an
opportunity to handle the error. Other examples of hardware exceptions
include illegal instructions, alignment faults, and other
architecture-specific issues. The second type of exception is known as
a software exception. A software exception, as one might expect,
originates from software rather than from the hardware. For example, in
the event that a process attempts to close an invalid handle, the
operating system may generate an exception.

One of the reasons that the word structured is included in structured
exception handling is because of the fact that it is used to dispatch
both hardware and software exceptions. This generalization makes it
possible for applications to handle all types of exceptions using a
common system, thus allowing for greater application flexibility when it
comes to error handling.

The most important detail of SEH, insofar as it pertains to this
document, is the mechanism through which applications can dynamically
register handlers to be called when various types of exceptions occur.
The act of registering an exception handler is most easily described as
inserting a function pointer into a chain of function pointers that are
called whenever an exception occurs. Each exception handler in the
chain is given the opportunity to either handle the exception or pass it
on to the next exception handler.

At a higher level, the majority of compiler-generated C/C++ functions
will register exception handlers in their prologue and remove them in
their epilogue. In this way, the exception handler chain mirrors the
structure of a thread's stack in that they are both LIFOs
(last-in-first-out). The exception handler that was registered last
will be the first to be removed from the chain, much the same as the
last function to be called will be the first to be returned from.

To understand how the process of registering an exception handler
actually works in practice, it makes sense to analyze code that makes
use of exception handling. For instance, the code below illustrates what
would be required to catch all exceptions and then display the type of
exception that occurred:

    __try
    {
        ...
    } __except(EXCEPTION_EXECUTE_HANDLER)
    {
        printf("Exception code: %.8x\n", GetExceptionCode());
    }

In the event that an exception occurs from code inside of the __try /
__except block, the printf call will be issued and GetExceptionCode will
return the actual exception that occurred. For instance, if code made
reference to an invalid memory address, the exception code would be
0xc0000005, or EXCEPTION_ACCESS_VIOLATION. To completely understand how
this works, it is necessary to dive deeper and take a look at the assembly
that is generated from the C code described above. When disassembled, the
code looks something like what is shown below:

    00401000 55               push    ebp
    00401001 8bec             mov     ebp,esp
    00401003 6aff             push    0xff
    00401005 6818714000       push    0x407118
    0040100a 68a4114000       push    0x4011a4
    0040100f 64a100000000     mov     eax,fs:[00000000]
    00401015 50               push    eax
    00401016 64892500000000   mov     fs:[00000000],esp
    0040101d 83c4f4           add     esp,0xfffffff4
    00401020 53               push    ebx
    00401021 56               push    esi
    00401022 57               push    edi
    00401023 8965e8           mov     [ebp-0x18],esp
    00401026 c745fc00000000   mov     dword ptr [ebp-0x4],0x0
    0040102d c6050000000001   mov     byte ptr [00000000],0x1
    00401034 c745fcffffffff   mov     dword ptr [ebp-0x4],0xffffffff
    0040103b eb2b             jmp     ex!main+0x68 (00401068)
    0040103d 8b45ec           mov     eax,[ebp-0x14]
    00401040 8b08             mov     ecx,[eax]
    00401042 8b11             mov     edx,[ecx]
    00401044 8955e4           mov     [ebp-0x1c],edx
    00401047 b801000000       mov     eax,0x1
    0040104c c3               ret

    0040104d 8b65e8           mov     esp,[ebp-0x18]
    00401050 8b45e4           mov     eax,[ebp-0x1c]
    00401053 50               push    eax
    00401054 6830804000       push    0x408030
    00401059 e81b000000       call    ex!printf (00401079)
    0040105e 83c408           add     esp,0x8
    00401061 c745fcffffffff   mov     dword ptr [ebp-0x4],0xffffffff
    00401068 8b4df0           mov     ecx,[ebp-0x10]
    0040106b 64890d00000000   mov     fs:[00000000],ecx
    00401072 5f               pop     edi
    00401073 5e               pop     esi
    00401074 5b               pop     ebx
    00401075 8be5             mov     esp,ebp
    00401077 5d               pop     ebp
    00401078 c3               ret

The actual registration of the exception handler all occurs behind the scenes
in the C code. However, in the assembly code, the registration of the
exception handler starts at 0x0040100a and spans four instructions. It is
these four instructions that are responsible for registering the exception
handler for the calling thread. The way that this actually works is by
chaining an EXCEPTION_REGISTRATION_RECORD to the front of the list of exception
handlers. The head of the list of already registered exception handlers is
found in the ExceptionList attribute of the NT_TIB structure. If no exception
handlers are registered, this value will be set to 0xffffffff. The NT_TIB
structure makes up the first part of the TEB, or Thread Environment Block,
which is an undocumented structure used internally by Windows to keep track of
per-thread state in user-mode. A thread's TEB can be accessed in a
position-independent fashion by referencing addresses relative to the fs
segment register. For example, the head of the exception list chain can be
obtained through fs:[0].

To make sense of the four assembly instructions that register the custom
exception handler, each of the four instructions will be described
individually. For reference purposes, the layout of the
EXCEPTION_REGISTRATION_RECORD is described below:

    +0x000 Next             : Ptr32 _EXCEPTION_REGISTRATION_RECORD
    +0x004 Handler          : Ptr32

1. push 0x4011a4

The first instruction pushes the address of the CRT generated _except_handler3
symbol. This routine is responsible for dispatching general exceptions that
are registered through the __except compiler intrinsic. The key thing to note
here is that the virtual address of a function is pushed onto the stack that is
expected to be referenced in the event that an exception is thrown. This push
operation is the first step in dynamically constructing an
EXCEPTION_REGISTRATION_RECORD on the stack by first setting the Handler
attribute.

2. mov eax,fs:[00000000]

The second instruction takes the current pointer to the first
EXCEPTION_REGISTRATION_RECORD and stores it in eax.

3. push eax

The third instruction takes the pointer to the first exception registration
record in the exception list and pushes it onto the stack. This, in turn, sets
the Next attribute of the record that is being dynamically generated on the
stack. Once this instruction completes, a populated
EXCEPTION_REGISTRATION_RECORD will exist on the stack that takes the following
form:

    +0x000 Next             : 0x0012ffb0
    +0x004 Handler          : 0x004011a4 ex!_except_handler3+0

4. mov fs:[00000000],esp

Finally, the dynamically generated exception registration record is stored as
the first exception registration record in the list for the current thread.
This completes the process of inserting a new registration record into the
chain of exception handlers.

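The four steps above amount to pushing a node onto the front of a
per-thread singly linked list. The following is a minimal, portable sketch
of that idea and not the code the compiler emits: the names
registration_record, thread_state, register_handler and unregister_handler
are hypothetical, and the thread_state structure merely stands in for the
ExceptionList slot reached through fs:[0].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for EXCEPTION_REGISTRATION_RECORD:
 * Next at +0x000, Handler at +0x004 on 32-bit x86. */
typedef struct registration_record {
    struct registration_record *next;
    void (*handler)(void);
} registration_record;

/* End-of-chain marker; on Windows this is the value 0xffffffff. */
#define CHAIN_END ((registration_record *)(uintptr_t)-1)

/* Hypothetical stand-in for the per-thread slot that fs:[0] refers to. */
typedef struct thread_state {
    registration_record *exception_list;
} thread_state;

/* A thread with no registered handlers has an empty chain. */
void thread_init(thread_state *t)
{
    assert(t != NULL);
    t->exception_list = CHAIN_END;
}

/* Models: push handler / mov eax,fs:[0] / push eax / mov fs:[0],esp.
 * In real code 'rec' lives in the stack frame of the calling function. */
void register_handler(thread_state *t, registration_record *rec,
                      void (*handler)(void))
{
    assert(t != NULL && rec != NULL);
    rec->handler = handler;          /* step 1: set Handler          */
    rec->next = t->exception_list;   /* steps 2-3: set Next          */
    t->exception_list = rec;         /* step 4: new head of the list */
}

/* Models the epilogue: mov ecx,[ebp-0x10] / mov fs:[0],ecx. */
void unregister_handler(thread_state *t, registration_record *rec)
{
    assert(t != NULL && rec != NULL && t->exception_list == rec);
    t->exception_list = rec->next;
}
```

Registering an outer record and then an inner record leaves the inner one
at the head of the list with its next field pointing at the outer one,
which is the LIFO behaviour described earlier in this section.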
The important things to take away from this description of exception handler
registration are as follows. First, the registration of exception handlers is
a runtime operation. This means that whenever a function is entered that makes
use of an exception handler, it must dynamically register the exception
handler. This has implications as it relates to performance overhead. Second,
the list of registered exception handlers is stored on a per-thread basis.
This makes sense because threads are considered isolated units of execution and
therefore exception handlers are only relevant to a particular thread. The
final, and perhaps most important, thing to take away from this is that the
assembly generated by the compiler to register an exception handler at runtime
makes use of the current thread's stack. This fact will be revisited later in
this section.

In the event that an exception occurs during the course of normal execution,
the operating system will step in and take the necessary steps to dispatch the
exception. In the event that the exception occurred in the context of a thread
that is running in user-mode, the kernel will take the exception information
and generate an EXCEPTION_RECORD that is used to encapsulate all of the
exception information. Furthermore, a snapshot of the executing state of the
thread is created in the form of a populated CONTEXT structure. The kernel
then passes this information off to the user-mode thread by transferring
execution from the location that the fault occurred at to the address of
ntdll!KiUserExceptionDispatcher. The important thing to understand about this
is that execution of the exception dispatcher occurs in the context of the
thread that generated the exception.

The job of ntdll!KiUserExceptionDispatcher is, as the name implies, to dispatch
user-mode exceptions. As one might guess, the way that it goes about doing
this is by walking the chain of registered exception handlers stored relative
to the current thread. As the exception dispatcher walks the chain, it calls the
handler associated with each registration record, giving that handler the
opportunity to handle, fail, or pass on the exception.

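The chain walk described above can be modelled in a few lines. This is a
deliberately simplified sketch, not the logic of the real dispatcher: the
handler prototype is reduced to a single exception code argument, and the
names disposition, record and dispatch_exception are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of outcomes a handler can report. */
typedef enum { PASS_ON, HANDLED } disposition;

/* Simplified registration record; real handlers take four arguments. */
typedef struct record {
    struct record *next;
    disposition (*handler)(uint32_t code);
} record;

/* End-of-chain marker; on Windows this is the value 0xffffffff. */
#define CHAIN_END ((record *)(uintptr_t)-1)

/* Walk the chain from the most recently registered record and stop at
 * the first handler that claims the exception.
 * Returns 1 if some handler handled it, 0 if the chain was exhausted. */
int dispatch_exception(record *head, uint32_t code)
{
    for (record *cur = head; cur != CHAIN_END; cur = cur->next) {
        assert(cur != NULL && cur->handler != NULL);
        if (cur->handler(code) == HANDLED)
            return 1;
    }
    return 0;
}

/* Example handler that never handles anything. */
disposition ignore_everything(uint32_t code)
{
    (void)code;
    return PASS_ON;
}

/* Example handler that only handles EXCEPTION_ACCESS_VIOLATION. */
disposition handle_access_violation(uint32_t code)
{
    return code == 0xc0000005u ? HANDLED : PASS_ON;
}
```

The point to carry forward is that the dispatcher trusts both fields of
every record it visits: next decides where the walk goes, and handler is
called as a function pointer.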
While there are other things involved in the exception dispatching process,
this description will suffice to set the stage for how it might be abused to
gain code execution.

2.2) Gaining Code Execution

There is one important thing to remember when it comes to trying to gain code
execution through an SEH overwrite. Put simply, the fact that each exception
registration record is stored on the stack lends itself well to abuse when
considered in conjunction with a conventional stack-based buffer overflow. As
described in section 2.1, each exception registration record is composed of a
Next pointer and a Handler function pointer. Of most interest in terms of
exploitation is the Handler attribute. Since the exception dispatcher makes use
of this attribute as a function pointer, it makes sense that should this
attribute be overwritten with attacker controlled data, it would be possible to
gain code execution. In fact, that's exactly what happens, but with an added
catch.

While typical stack-based buffer overflows work by overwriting the return
address, an SEH overwrite works by overwriting the Handler attribute of an
exception registration record that has been stored on the stack. Unlike
overwriting the return address, where control is gained immediately upon return
from the function, an SEH overwrite does not actually gain code execution until
after an exception has been generated. The exception is necessary in order to
cause the exception dispatcher to call the overwritten Handler.

While this may seem like something of a nuisance that would make SEH overwrites
harder to exploit, it's not. Generating an exception that leads to the calling
of the Handler is as simple as overwriting the return address with an invalid
address in most cases. When the function returns, it attempts to execute code
from an invalid memory address which generates an access violation exception.
This exception is then passed onto the exception dispatcher which calls the
overwritten Handler.

The obvious question to ask at this point is what benefit SEH overwrites have
over the conventional practice of overwriting the return address. To
understand this, it's important to consider one of the common practices
employed in Windows-based exploits. On Windows, thread stack addresses tend to
change quite frequently between operating system revisions and even across
process instances. This differs from most UNIX derivatives where stack
addresses are typically predictable across multiple operating system revisions.
Due to this fact, most Windows-based exploits will indirectly transfer control
into the thread's stack by first bouncing off an instruction that exists
somewhere in the address space. This instruction must typically reside at an
address that is less prone to change, such as within the code section of a
binary. The purpose of this instruction is to transfer control back to the
stack in a position-independent fashion. For example, a jmp esp instruction
might be used. While this approach works perfectly fine, it's limited by
whether or not an instruction can be located that is both portable and reliable
in terms of the address that it resides at. This is where the benefits of SEH
overwrites begin to become clear.

When simply overwriting the return address, an attacker is often limited to a
small set of instructions that are not commonly found at a reliable and
portable location in the address space. On the other hand, SEH overwrites
have the advantage of being able to use another set of instructions that are
far more prevalent in the address space of most every process. This set of
instructions is commonly referred to as pop/pop/ret. The reason this class of
instructions can be used with SEH overwrites and not general stack overflows
has to do with the method in which exception handlers are called by the
exception dispatcher. To understand this, it is first necessary to know what
the specific prototype is for the Handler field in the
EXCEPTION_REGISTRATION_RECORD structure:

    typedef EXCEPTION_DISPOSITION (*ExceptionHandler)(
        IN PEXCEPTION_RECORD ExceptionRecord,
        IN PVOID EstablisherFrame,
        IN PCONTEXT ContextRecord,
        IN PVOID DispatcherContext);

The field of most importance is the EstablisherFrame. This field actually
points to the address of the exception registration record that was pushed onto
the stack. It is also located at [esp+8] when the Handler is called.
Therefore, if the Handler is overwritten with the address of a pop/pop/ret
sequence, the result will be that the execution path of the current thread will
be transferred to the address of the Next attribute for the current exception
registration record. While this field would normally hold the address of the
next registration record, it instead can hold four bytes of arbitrary code that
an attacker can supply when triggering the SEH overwrite. Since there are only
four contiguous bytes of memory to work with before hitting the Handler field,
most attackers will use a simple short jump sequence to jump past the handler
and into the attacker controlled code that comes after it.

3) Design

The one basic requirement of any solution attempting to prevent the leveraging
of SEH overwrites is that it must not be possible for an attacker to be able to
supply a value for the Handler attribute of an exception registration record
that is subsequently used in an unchecked fashion by the exception dispatcher
when an exception occurs. If a solution can claim to have satisfied this
requirement, then it should be true that the solution is secure.

To that point, Microsoft's solution is secure, but only if all of the images
loaded in the address space have been compiled with /SAFESEH. Even then, it's
possible that it may not be completely secure. For example, it should be
possible to overwrite the Handler with the address of some non-image associated
executable region, if one can be found. If there are any images that have not
been compiled with /SAFESEH, it may be possible for an attacker to overwrite
the Handler with an address of an instruction that resides within an
unprotected image. The reason Microsoft's implementation cannot protect
against this is because SafeSEH works by having the exception dispatcher
validate handlers against a table of image-specific safe exception handlers
prior to calling an exception handler. Safe exception handlers are stored in a
table that is contained in any executable compiled with /SAFESEH. Given this
limitation, it can also be said that Microsoft's implementation is not secure
given the appropriate conditions. In fact, for third-party applications, and
even some Microsoft-provided applications, these conditions are considered by
the author to be the norm rather than the exception. In the end, it all boils
down to the fact that Microsoft's solution is a compile-time solution rather
than a runtime solution. With these limitations in mind, it makes sense to
attempt to approach the problem from the angle of a runtime solution rather
than a compile-time solution.

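To make the limitation concrete, the table check described above can be
modelled as follows. This is a simplified model of the behaviour as the
text describes it and not Microsoft's implementation: the image structure
and the handler_allowed function are hypothetical, the table holds absolute
addresses purely for readability, and the choice to let through handlers
that fall outside of every image is an assumption taken from the paragraph
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loaded image. */
typedef struct image {
    uint32_t base;                  /* load address                       */
    uint32_t size;                  /* size of the mapping in bytes       */
    const uint32_t *safe_handlers;  /* NULL if not built with /SAFESEH    */
    size_t safe_handler_count;
} image;

/* Find the image that contains 'addr', or NULL if none does. */
static const image *find_image(const image *images, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= images[i].base && addr - images[i].base < images[i].size)
            return &images[i];
    }
    return NULL;
}

/* Returns 1 if the modelled dispatcher would call 'handler', else 0. */
int handler_allowed(const image *images, size_t n, uint32_t handler)
{
    const image *img = find_image(images, n, handler);

    if (img == NULL)
        return 1;   /* assumption: non-image region, no table to consult  */
    if (img->safe_handlers == NULL)
        return 1;   /* image without a table: nothing to validate against */

    for (size_t i = 0; i < img->safe_handler_count; i++) {
        if (img->safe_handlers[i] == handler)
            return 1;   /* registered safe handler */
    }
    return 0;           /* protected image, handler not in its table */
}
```

In this model a single image without a table is enough to give an attacker
a usable Handler value, no matter how many other images are protected.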
When it comes to designing a runtime solution, the important consideration that
has to be made is that it will be necessary to intercept exceptions before they
are passed off to the registered exception handlers by the exception
dispatcher. The particulars of how this can be accomplished will be discussed
in chapter 4. Assuming a solution is found to the layering problem, the next
step is to come up with a solution for determining whether or not an exception
handler is valid and has not been tampered with. While there are many
inefficient solutions to this problem, such as coming up with a solution to
keep a "secure" list of registered exception handlers, there is one solution
in particular that the author feels is best suited for the problem.

One of the side effects of an SEH overwrite is that the attacker will typically
clobber the value of the Next attribute associated with the exception
registration record that is overwritten. This occurs because the Next
attribute precedes the Handler attribute in memory, and therefore must be
overwritten before the Handler in the case of a typical buffer overflow. This
has a very important side effect that is the key to facilitating the
implementation of a runtime solution. In particular, the clobbering of the
Next attribute means that all subsequent exception registration records would
not be reachable by the exception dispatcher when walking the chain.

Consider for the moment a solution that, during thread startup, places a custom
exception registration record as the very last exception registration record in
the chain. This exception registration record will be symbolically referred to
as the validation frame henceforth. From that point forward, whenever an
exception is about to be dispatched, the solution could walk the chain prior to
allowing the exception dispatcher to handle the exception. The purpose of
walking the chain beforehand is to ensure that the validation frame can be
reached. As such, the validation frame's purpose is similar to that of stack
canaries. If the validation frame can be reached, then that is evidence of the
fact that the chain of exception handlers has not been corrupted. As described
above, the act of overwriting the Handler attribute also requires that the Next
pointer be overwritten. If the Next pointer is not overwritten with an address
that ensures the integrity of the exception handler chain, then this solution
can immediately detect that the integrity of the chain is in question and
prevent the exception dispatcher from calling the overwritten Handler.

Using this technique, the act of ensuring that the integrity of the exception
handler chain is kept intact results in the ability to prevent SEH overwrites.
The important questions to ask at this point center around what limitations
this solution might have. The most obvious question to ask is what's to stop
an attacker from simply overwriting the Next pointer with the value that was
already there. There are a few things that stop this. First of all, it will
be common that the attacker does not know the value of the Next pointer.
Second, and perhaps most important, is that one of the benefits of using an SEH
overwrite is that an attacker can make use of a pop/pop/ret sequence. By
forcing an attacker to retain the value of the Next pointer, the major benefit
of using an SEH overwrite in the first place is gone. Even conceding this
point, an attacker who is able to retain the value of the Next pointer would
find themselves limited to overwriting the Handler with the address of
instructions that indirectly transfer control back to their code. However, the
attacker won't simply be able to use an instruction like jmp esp because the
Handler will be called in the context of the exception dispatcher. It's at
this point that diminishing returns are reached and an attacker is better off
simply overwriting the return address, if possible.

Another important question to ask is what's to stop the attacker from
overwriting the Next pointer with the address of the validation frame itself
or, more easily, with 0xffffffff. The answer to this is much the same as
described in the above paragraph. Specifically, by forcing an attacker away
from the pop/pop/ret sequence, the usefulness of the SEH overwrite vector
quickly degrades to the point of it being better to simply overwrite the return
address, if possible. However, in order to be sure, the author feels that
implementations of this solution would be wise to randomize the location of the
validation frame.

It is the author's opinion that the solution described above satisfies the
requirement outlined in the beginning of this chapter and therefore qualifies
as a secure solution. However, there's always a chance that something has been
missed. For that reason, the author is more than happy to be proven wrong on
this point.

4) Implementation

The implementation of the solution described in the previous chapter relies on
intercepting exceptions prior to allowing the native exception dispatcher to
handle them such that the exception handler chain can be validated. First and
foremost, it is important to identify a way of layering prior to the point that
the exception dispatcher transfers control to the registered exception
handlers. There are a few different places that this layering could occur at,
but the one that is best suited to catch the majority of user-mode exceptions
is at the location that ntdll!KiUserExceptionDispatcher gains control.
However, by hooking ntdll!KiUserExceptionDispatcher, it is possible that this
implementation may not be able to intercept all cases of an exception being
raised, thus making it potentially feasible to bypass the exception handler
chain validation.

The best location to layer at would be ntdll!RtlDispatchException. The
reason for this is that exceptions raised through ntdll!RtlRaiseException, such
as software exceptions, may be passed directly to ntdll!RtlDispatchException
rather than going through ntdll!KiUserExceptionDispatcher first. The condition
that controls this is whether or not a debugger is attached to the user-mode
process when ntdll!RtlRaiseException is called. The reason
ntdll!RtlDispatchException is not hooked in this implementation is because it
is not directly exported. There are, however, fairly reliable techniques that
could be used to determine its address. As far as the author is aware, the act
of hooking ntdll!KiUserExceptionDispatcher should mean that it's only possible
to miss software exceptions which are much harder, and in most cases
impossible, for an attacker to generate.

In order to layer at ntdll!KiUserExceptionDispatcher, the first few
instructions of its prologue can be overwritten with an indirect jump to a
function that will be responsible for performing any sanity checks necessary.
Once the function has completed its sanity checks, it can transfer control back
to the original exception dispatcher by executing the overwritten instructions
and then jumping back into ntdll!KiUserExceptionDispatcher at the offset of the
next instruction to be executed. This is a nice and "clean" way of
accomplishing this, where "clean" is defined as the best it can get from a
third-party perspective, and the performance overhead is minuscule.

In order to hook ntdll!KiUserExceptionDispatcher, the first n instructions,
where n is the number of instructions that it takes to cover at least 6 bytes,
must be copied to a location that will be used by the hook to execute the
actual ntdll!KiUserExceptionDispatcher. Following that, the first n
instructions of ntdll!KiUserExceptionDispatcher can then be overwritten with an
indirect jump. This indirect jump will be used to transfer control to the
function that will validate the exception handler chain prior to allowing the
original exception dispatcher to handle the exception.

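The 6 byte figure follows from the encoding of an indirect jump through a
32-bit absolute memory operand on x86: the opcode bytes FF 25 followed by
the four byte address of the slot that holds the destination. The sketch
below shows how n could be derived and how the jump could be encoded. It is
an illustration under stated assumptions and not the proof of concept's
code: bytes_to_relocate and encode_indirect_jmp are hypothetical names, and
the per-instruction lengths are assumed to come from some length
disassembler that is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of "jmp dword ptr [addr32]": FF 25 plus a 4-byte address. */
#define INDIRECT_JMP_SIZE 6

/* Given the lengths of the instructions at the start of the function to
 * be hooked, find the smallest n whose instructions cover at least
 * INDIRECT_JMP_SIZE bytes, so that no instruction is cut in half.
 * Stores n in *count and returns the number of bytes to relocate.
 * Returns 0 (and stores 0) if the supplied instructions are too short. */
size_t bytes_to_relocate(const uint8_t *insn_lengths, size_t total,
                         size_t *count)
{
    size_t covered = 0;

    for (size_t i = 0; i < total; i++) {
        covered += insn_lengths[i];
        if (covered >= INDIRECT_JMP_SIZE) {
            *count = i + 1;
            return covered;
        }
    }
    *count = 0;
    return 0;
}

/* Encode "jmp dword ptr [slot_address]", where slot_address is the
 * address of a 32-bit variable holding the address of the hook. */
void encode_indirect_jmp(uint8_t out[INDIRECT_JMP_SIZE], uint32_t slot_address)
{
    out[0] = 0xff;
    out[1] = 0x25;
    out[2] = (uint8_t)(slot_address & 0xff);          /* little-endian */
    out[3] = (uint8_t)((slot_address >> 8) & 0xff);
    out[4] = (uint8_t)((slot_address >> 16) & 0xff);
    out[5] = (uint8_t)((slot_address >> 24) & 0xff);
}
```

As a worked example, the prologue disassembled in section 2.1 begins with
instructions of 1, 2, 2 and 5 bytes. Three instructions cover only 5 bytes,
so n is 4 and 10 bytes would have to be relocated. The actual prologue of
ntdll!KiUserExceptionDispatcher differs between Windows versions and has to
be measured at runtime.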
With the hook installed, the next step is to implement the function that will
actually validate the exception handler chain. The basic steps involved in
this are to first extract the head of the list from fs:[0] and then iterate
over each entry in the list. For each entry, the function should validate that
the Next attribute points to a valid memory location. If it does not, then the
chain can be assumed to be corrupt. However, if it does point to valid memory,
then the routine should check to see if the Next pointer is equal to the
address of the validation frame that was previously stored at the end of the
exception handler chain for this thread. If it is equal to the validation
frame, then the integrity of the chain is confirmed and the exception can be
passed to the actual exception dispatcher.

However, if the function reaches an invalid Next pointer, or it reaches
0xffffffff without encountering the validation frame, then it can assume that
the exception handler chain is corrupt. It's at this point that the function
can take whatever steps are necessary to discard the exception, log that a
potential exploitation attempt occurred, and so on. The end result should be
the termination of either the thread or the process, depending on
circumstances. This algorithm is captured by the pseudo-code below:

    /* Start at the head of the current thread's exception handler chain. */
    CurrentRecord = fs:[0];
    ChainCorrupt = TRUE;
    while (CurrentRecord != 0xffffffff) {
        /* A Next pointer that is not valid memory means corruption. */
        if (IsInvalidAddress(CurrentRecord->Next))
            break;
        /* Reaching the validation frame confirms the chain is intact. */
        if (CurrentRecord->Next == ValidationFrame) {
            ChainCorrupt = FALSE;
            break;
        }
        CurrentRecord = CurrentRecord->Next;
    }
    if (ChainCorrupt == TRUE)
        ReportExploitationAttempt();
    else
        CallOriginalKiUserExceptionDispatcher();
|
|
|
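The walk described by the pseudo-code can also be modeled in portable C so
that it can be exercised outside of a hooked process. This is only a sketch
under stated assumptions: record_t, chain_is_intact and is_invalid_address are
hypothetical names, the "valid memory" test is replaced by a check against a
caller-supplied pool of records, and the bound on the number of steps is an
addition that is not part of the pseudo-code (it keeps a cyclic chain from
looping forever).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_END ((uintptr_t)-1)   /* models the 0xffffffff terminator */

typedef struct record {
    uintptr_t next;     /* Next: address of the following record */
    uintptr_t handler;  /* Handler: not inspected by the walk */
} record_t;

/* Hypothetical stand-in for IsInvalidAddress: an address is accepted only
 * if it is an aligned slot inside the caller's pool of n records. */
static bool is_invalid_address(uintptr_t addr, const record_t *pool, size_t n)
{
    uintptr_t lo = (uintptr_t)pool;
    uintptr_t hi = lo + n * sizeof(record_t);
    return addr < lo || addr >= hi || (addr - lo) % sizeof(record_t) != 0;
}

/* Returns true only when the walk reaches the validation frame. */
bool chain_is_intact(const record_t *head, const record_t *validation,
                     const record_t *pool, size_t n)
{
    assert(pool != NULL);
    uintptr_t cur = (uintptr_t)head;
    size_t steps = 0;

    /* The head is dereferenced below, so it must be valid as well. */
    if (is_invalid_address(cur, pool, n))
        return false;

    while (cur != CHAIN_END && steps++ < n) {
        const record_t *rec = (const record_t *)cur;
        if (is_invalid_address(rec->next, pool, n))
            return false;               /* corrupt or unterminated chain */
        if (rec->next == (uintptr_t)validation)
            return true;                /* validation frame reached */
        cur = rec->next;
    }
    return false;                       /* step bound hit (cycle) */
}
```

As in the pseudo-code, the walk expects at least one record to sit in front
of the validation frame. A Next field that has been replaced with something
other than a record address, such as the bytes of a short jump, fails the
address check and the chain is reported as corrupt.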
|
|
|
The above algorithm describes how the exception dispatching path should be
|
|
handled. However, there is one important part remaining in order to implement
|
|
this solution. Specifically, there must be some way of registering the
|
|
validation frame with a thread prior to any exceptions being dispatched on that
|
|
thread. There are a few ways that this can be accomplished. In terms of a
|
|
proof of concept, the easiest way of doing this is to implement a DLL that,
|
|
when loaded into a process' address space, catches the creation notification of
|
|
new threads through a mechanism like DllMain or through the use of a TLS
|
|
callback in the case of a statically linked library. Both of these approaches
|
|
provide a location for the solution to establish the validation frame with the
|
|
thread early on in its execution. However, if there were ever a case where the
|
|
thread were to raise an exception prior to one of these routines being called,
|
|
then the solution would improperly detect that the exception handler chain was
|
|
corrupt.
|
|
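Whichever notification mechanism is used, the registration step amounts to
linking the validation frame in as the last record of the thread's chain. The
sketch below models this in portable C. The names frame_t,
register_validation_frame and push_frame are hypothetical, the head variable
stands in for fs:[0], and the handler value is passed in by the caller
because the choice of handler is outside the scope of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_END ((uintptr_t)-1)   /* models the 0xffffffff terminator */

typedef struct frame {
    uintptr_t next;
    uintptr_t handler;
} frame_t;

/* Link `validation` in as the last frame of the chain that starts at *head.
 * *head models fs:[0]; an empty chain is represented by CHAIN_END. */
void register_validation_frame(uintptr_t *head, frame_t *validation,
                               uintptr_t handler)
{
    assert(head != NULL && validation != NULL);
    validation->next = CHAIN_END;
    validation->handler = handler;

    /* Find the link that currently terminates the chain and redirect it. */
    uintptr_t *link = head;
    while (*link != CHAIN_END)
        link = &((frame_t *)*link)->next;
    *link = (uintptr_t)validation;
}

/* Model of ordinary handler registration: new frames go on the front. */
void push_frame(uintptr_t *head, frame_t *f)
{
    assert(head != NULL && f != NULL);
    f->next = *head;
    *head = (uintptr_t)f;
}
```

Because later registrations only ever add frames at the front, a frame that
is linked in this way stays at the end of the chain for the life of the
thread.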
|
|
One solution to this potential problem is to store state relative to each
|
|
thread that keeps track of whether or not the validation frame has been
|
|
registered. There are certain implications to doing this, however. First,
|
|
it could introduce a security problem in that an attacker might be able to
|
|
bypass the protection by somehow toggling the flag that tracks whether or not
|
|
the validation frame has been registered. If this flag were to be cleared
|
|
and an exception were generated in the thread, then the solution would have
|
|
to assume that it can't validate the chain because no validation frame has been
|
|
installed. Another issue with this is that it would require some location to
|
|
store this state on a per-thread basis. A good example of a place to store
|
|
this is in TLS, but again, it has the security implications described above.
|
|
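The trade-off described above can be made concrete with a small C sketch. It
assumes a C11 compiler and uses a thread-local variable as a stand-in for
whatever per-thread storage a real implementation would choose, such as a TLS
slot. The names verdict_t, mark_frame_registered and classify_exception are
hypothetical.

```c
#include <stdbool.h>

typedef enum {
    VERDICT_DISPATCH,    /* chain validated, pass to the real dispatcher */
    VERDICT_REPORT,      /* chain corrupt, treat as an exploitation attempt */
    VERDICT_UNVERIFIED   /* no validation frame yet, chain cannot be judged */
} verdict_t;

/* Per-thread flag; it starts out false for every new thread. */
static _Thread_local bool frame_registered;

/* Called once the validation frame has been installed for this thread. */
void mark_frame_registered(void)
{
    frame_registered = true;
}

/* Decide how to treat an exception given the result of the chain walk. */
verdict_t classify_exception(bool chain_intact)
{
    if (!frame_registered)
        return VERDICT_UNVERIFIED;
    return chain_intact ? VERDICT_DISPATCH : VERDICT_REPORT;
}
```

The weakness is visible in the first branch: if an attacker can clear the
flag, a corrupt chain is classified as unverified rather than reported.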
|
|
A more invasive solution to the problem of registering the validation frame
|
|
would be to somehow gain control very early on in the thread's execution -- perhaps
|
|
even before it begins executing from its entry point. The author is aware of a
|
|
good way to accomplish this, but what it might be is left as an exercise for
|
|
the reader. This more invasive solution is something that would be
|
|
an easy and elegant way for Microsoft to include support for this, should they
|
|
ever choose to do so.
|
|
|
|
The final matter of how to go about implementing this solution centers around
|
|
how it could be deployed and used with existing applications without requiring
|
|
a recompile. The easiest way to do this in a proof of concept setting would be
|
|
to implement these protection mechanisms in the form of a DLL that can be
|
|
dynamically loaded into the address space of a process that is to be protected.
|
|
Once loaded, the DLL's DllMain can take care of getting everything set up. A
|
|
simple way to cause the DLL to be loaded is through the use of AppInit_DLLs,
|
|
although this has some limitations. Alternatively, there are more invasive
|
|
options that can be considered that will accomplish the goal of loading and
|
|
initializing the DLL early on in process creation.
|
|
|
|
One interesting thing about this approach is that while it is targeted at being
|
|
used as a runtime solution, it can also be used as a compile-time solution.
|
|
This means that applications can use this solution at compile-time to protect
|
|
themselves from SEH overwrites. Unlike Microsoft's solution, this will even
|
|
protect them in the presence of third-party images that have not been compiled
|
|
with the support. This can be accomplished through the use of a static library
|
|
that uses TLS callbacks to receive notifications when threads are created, much
|
|
like DllMain is used for DLL implementations of this solution.
|
|
|
|
All things considered, the author believes that the implementation described
|
|
above, for all intents and purposes, is a fairly simple way of providing
|
|
runtime protection against SEH overwrites that has minimal overhead. While the
|
|
implementation described in this document is considered more suitable for a
|
|
proof-of-concept or application-specific solution, there are real-world
|
|
examples of more robust implementations, such as in Wehnus's WehnTrust product,
|
|
a commercial side-project of the author's. Apologies for the shameless plug.
|
|
|
|
|
|
5) Compatibility
|
|
|
|
As with most security solutions, there are compatibility problems that must
|
|
be considered. As it relates to the solution described in this paper, there
|
|
are a couple of important things to keep in mind.
|
|
|
|
The first compatibility issue that might happen in the real world is a scenario
|
|
where an application invalidates the exception handler chain in a legitimate
|
|
fashion. The author is not currently aware of situations where an application
|
|
would legitimately need to do this, but it has been observed that some
|
|
applications, such as cygwin, will do funny things with the exception handler
|
|
chain that are not likely to play nice with this form of protection. In the
|
|
event that an application invalidates the exception handler chain, the solution
|
|
described in this paper may inadvertently detect that an SEH overwrite has
|
|
occurred simply because it is no longer able to reach the validation frame.
|
|
|
|
Another compatibility issue that may occur centers around the fact that the
|
|
implementation described in this paper relies on the hooking of functions. In
|
|
almost every situation it is a bad idea to use function hooking, but there are
|
|
often situations where there is no alternative, especially in closed source
|
|
environments. The use of function hooking can lead to compatibility problems
|
|
with other applications that also hook ntdll!KiUserExceptionDispatcher. There
|
|
may also be instances of security products that detect the hooking of
|
|
ntdll!KiUserExceptionDispatcher and classify it as malware-like behavior. In
|
|
any case, these compatibility concerns center less around the fundamental
|
|
concept and more around the specific implementation that would be required of a
|
|
third-party.
|
|
|
|
|
|
6) Conclusion
|
|
|
|
Software-based vulnerabilities are a common problem that affects a wide array of
|
|
operating systems. In some cases, these vulnerabilities can be exploited with
|
|
greater ease depending on operating system specific features. One particular
|
|
case where this is possible is the use of an SEH overwrite on 32-bit
|
|
applications on the Windows platform. An SEH overwrite involves overwriting the
|
|
Handler associated with an exception registration record. Once this occurs, an
|
|
exception is generated that results in the overwritten Handler being called.
|
|
As a result of this, the attacker can more easily gain control of code
|
|
execution due to the context that the exception handler is called in.
|
|
|
|
Microsoft has attempted to address the problem of SEH overwrites with
|
|
enhancements to the exception dispatcher itself and with solutions like SafeSEH
|
|
and the /GS compiler flag. However, these solutions are limited because they
|
|
require a recompilation of code and therefore only protect images that have
|
|
been compiled with these flags enabled. This limitation is something that
|
|
Microsoft is aware of and it was most likely chosen to reduce the potential for
|
|
compatibility issues.
|
|
|
|
To help solve the problem of not offering complete protection against SEH
|
|
overwrites, this paper has suggested a solution that can be used without any
|
|
code recompilation and with negligible performance overhead. The solution
|
|
involves appending a custom exception registration record, known as a
|
|
validation frame, to the end of the exception list early on in thread startup.
|
|
When an exception occurs in the context of a thread, the solution intercepts
|
|
the exception and validates the exception handler chain for the thread by
|
|
making sure that it can walk the chain until it reaches the validation frame.
|
|
If it is able to reach the validation frame, then the exception is dispatched
|
|
as normal. However, if the validation frame cannot be reached, then it is
|
|
assumed that the exception handler chain is corrupt and that
|
|
an exploit attempt may have occurred. Since exception registration records are
|
|
always prepended to the exception handler chain, the validation frame is
|
|
guaranteed to always be the last handler.
|
|
|
|
This solution relies on the fact that when an SEH overwrite occurs, the Next
|
|
attribute is overwritten before overwriting the Handler attribute. Due to the
|
|
fact that attackers typically use the Next attribute as the location at which
|
|
to store a short jump, it is not possible for them to both retain the integrity
|
|
of the list and also use it as a location to store code. This important
|
|
consequence is the key to being able to detect and prevent the leveraging of an
|
|
SEH overwrite to gain code execution.
|
|
|
|
Looking toward the future, the usefulness of this solution will begin to wane
|
|
as 64-bit versions of Windows begin to dominate the desktop environment. The
|
|
reason 64-bit versions do not need this solution is that exception handling
|
|
on 64-bit versions of Windows is table-based rather than stack-based, which
|
|
makes it inherently resistant to this style of overwrite. However, this only
|
|
applies to 64-bit binaries. Legacy 32-bit binaries running on 64-bit Windows will
|
|
continue to use the old style of exception handling, thus potentially leaving
|
|
them vulnerable to the same style of attacks depending on what compiler flags
|
|
were used. On the other hand, this solution will also become less necessary due
|
|
to the fact that modern 32-bit x86 machines support hardware NX and can
|
|
therefore help to mitigate the execution of code from the stack. Regardless of
|
|
these facts, there will always be a legacy need to protect against SEH
|
|
overwrites, and the solution described in this paper is one method of providing
|
|
that protection.
|
|
|
|
A. References
|
|
|
|
Borland. United States Patent: 5628016.
|
|
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=5,628,016.PN.&OS=PN/5,628,016&RS=PN/5,628,016;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
Litchfield, David. Defeating the Stack based Buffer
|
|
Overflow Prevention Mechanism of Microsoft Windows 2003 Server.
|
|
|
|
http://www.blackhat.com/presentations/bh-asia-03/bh-asia-03-litchfield.pdf;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
Microsoft Corporation. Structured Exception Handling.
|
|
|
|
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/structured_exception_handling.asp;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
Microsoft Corporation. Working with the AppInit_DLLs
|
|
registry value.
|
|
|
|
http://support.microsoft.com/default.aspx?scid=kb;en-us;197571;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
Microsoft Corporation. /GS (Buffer Security Check)
|
|
|
|
http://msdn2.microsoft.com/en-us/library/8dbf701c.aspx;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
Nagy, Ben. SEH (Structured Exception Handling) Security
|
|
Changes in XPSP2 and 2003 SP1.
|
|
|
|
http://www.eeye.com/html/resources/newsletters/vice/VI20060830.html#vexposed;
|
|
accessed Sep 8, 2006.
|
|
|
|
|
|
Pietrek, Matt. A Crash Course on the Depths of Win32
|
|
Structured Exception Handling.
|
|
|
|
http://www.microsoft.com/msj/0197/exception/exception.aspx;
|
|
accessed Sep 8, 2006.
|
|
|
|
|
|
skape. Improving Automated Analysis of Windows x64
|
|
Binaries.
|
|
http://www.uninformed.org/?v=4&a=1&t=sumry; accessed
|
|
Sep 5, 2006.
|
|
|
|
|
|
Wehnus. WehnTrust.
|
|
http://www.wehnus.com/products.pl; accessed Sep 5,
|
|
2006.
|
|
|
|
|
|
Wikipedia. Matryoshka Doll.
|
|
http://en.wikipedia.org/wiki/Matryoshka_doll;
|
|
accessed Sep 18, 2006.
|
|
|
|
|
|
Wine. CompilerExceptionSupport.
|
|
http://wiki.winehq.org/CompilerExceptionSupport;
|
|
accessed Sep 5, 2006.
|
|
|
|
|
|
|