                              ==Phrack Inc.==

               Volume 0x0d, Issue 0x42, Phile #0x0F of 0x11

|=-----------------------------------------------------------------------=|
|=--------------=[ Linux Kernel Heap Tampering Detection ]=--------------=|
|=-----------------------------------------------------------------------=|
|=------------------=[ Larry H. <larry@subreption.com> ]=----------------=|
|=-----------------------------------------------------------------------=|


------[ Index

    1 - History and background of the Linux kernel heap allocators

        1.1 - SLAB
        1.2 - SLOB
        1.3 - SLUB
        1.4 - SLQB
        1.5 - The future

    2 - Introduction: What is KERNHEAP?

    3 - Integrity assurance for kernel heap allocators

        3.1 - Meta-data protection against full and partial overwrites
        3.2 - Detection of arbitrary free pointers and freelist corruption
        3.3 - Overview of NetBSD and OpenBSD kernel heap safety checks
        3.4 - Microsoft Windows 7 kernel pool allocator safe unlinking

    4 - Sanitizing memory of the look-aside caches

    5 - Deterrence of IPC based kmalloc() overflow exploitation

    6 - Prevention of copy_to_user() and copy_from_user() abuse

    7 - Prevention of vsyscall overwrites on x86_64

    8 - Developing the right regression testsuite for KERNHEAP

    9 - The Inevitability of Failure

        9.1 - Subverting SELinux and the audit subsystem
        9.2 - Subverting AppArmor

    10 - References

    11 - Thanks and final statements

    12 - Source code
|
|
|
|
------[ 1. History and background of the Linux kernel heap allocators

Before discussing what KERNHEAP is, its internals and design, we will
have a glance at the background and history of Linux kernel heap
allocators.

In 1994, Jeff Bonwick from Sun Microsystems presented the SunOS 5.4
kernel heap allocator at USENIX Summer [1]. This allocator produced higher
performance results thanks to its use of caches to hold invariable state
information about the objects, and reduced fragmentation significantly,
grouping similar objects together in caches. When memory was under stress,
the allocator could check the caches for unused objects and let the system
reclaim the memory (that is, shrinking the caches on demand).

We will refer to these units composing the caches as "slabs". A slab
comprises contiguous pages of memory. Each page in the slab holds chunks
(objects or buffers) of the same size. This minimizes internal
fragmentation, since a slab will only contain same-sized chunks, and
only the 'trailing' or free space in the page will be wasted, until it
is required for a new allocation. The following diagram shows the
layout of Bonwick's slab allocator:

   +-------+
   | CACHE |
   +-------+    +---------+
   | CACHE |----|  EMPTY  |
   +-------+    +---------+    +------+      +------+
                | PARTIAL |----| SLAB |------| PAGE |  (objects)
                +---------+    +------+      +------+      +-------+
                |  FULL   |      ...            |----------| CHUNK |
                +---------+                                +-------+
                                                           | CHUNK |
                                                           +-------+
                                                           | CHUNK |
                                                           +-------+
                                                              ...

These caches operated in a LIFO manner: when an allocation was requested
for a given size, the allocator would look for the first available free
object in the appropriate slab. This saved the cost of page allocation
and creation of the object altogether.

    "A slab consists of one or more pages of virtually contiguous
    memory carved up into equal-size chunks, with a reference count
    indicating how many of those chunks have been allocated."
    Page 5, 3.2 Slabs. [1]

Each slab was managed with a kmem_slab structure, which contained its
reference count, freelist of chunks and linkage to the associated
kmem_cache. Each chunk had a header defined as the kmem_bufctl (chunks
are commonly referred to as buffers in the paper and implementation),
which contained the freelist linkage, address to the buffer and a
pointer to the slab it belongs to. The following diagram shows the
layout of a slab:

         .-------------------.
         | SLAB (kmem_slab)  |
         `-------+--+--------'
                /    \
          +----+---+--+-----+
          | bufctl | bufctl |
          +-.-'----+.-'-----+
        _.-'     .-'
    +-.-'------.-'-----------------+
    |        |        | ':>=jJ6XKNM|
    | buffer | buffer | Unused XQNM|
    |        |        | ':>=jJ6XKNM|
    +------------------------------+
              [ Page (s) ]

For chunk sizes smaller than 1/8 of a page (ex. 512 bytes for x86), the
meta-data of the slab is contained within the page, at the very end.
The rest of the space is then divided into equally sized chunks. Because
all buffers have the same size, only linkage information is required,
allowing the rest of the values to be computed at runtime, saving space.
The freelist pointer is stored at the end of the chunk. Bonwick
states that this is due to the end of data structures being less active
than the beginning, and to permit debugging to work even when a
use-after-free situation has occurred, overwriting data in the buffer,
relying on the freelist pointer being intact. In deliberate attack
scenarios this is obviously a flawed assumption. An additional word was
also reserved to hold a pointer to state information used by objects
initialized through a constructor.
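
The LIFO behaviour and the in-buffer freelist link described above can be
sketched in a few lines of userland C. This is only an illustration under
assumed names and sizes (toy_slab, BUF_SIZE, NBUFS and the helpers are
hypothetical, not taken from SunOS or Linux): each free buffer keeps the
pointer to the next free buffer in its last word, and the allocator trusts
that word blindly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE  64                      /* hypothetical object size */
#define NBUFS     4                       /* buffers carved from one slab */
#define LINK_OFF  (BUF_SIZE - sizeof(void *))

struct toy_slab {
    unsigned char mem[NBUFS * BUF_SIZE];  /* the "page(s)" of the slab */
    void *freelist;                       /* most recently freed buffer */
    unsigned int refcnt;                  /* buffers currently allocated */
};

/* The freelist link lives in the last word of every free buffer. */
static void *link_get(void *buf)
{
    void *next;
    memcpy(&next, (unsigned char *)buf + LINK_OFF, sizeof(next));
    return next;
}

static void link_set(void *buf, void *next)
{
    memcpy((unsigned char *)buf + LINK_OFF, &next, sizeof(next));
}

static void toy_slab_init(struct toy_slab *s)
{
    s->freelist = NULL;
    s->refcnt = 0;
    /* Push buffers in reverse so that buffer 0 ends up on top. */
    for (int i = NBUFS - 1; i >= 0; i--) {
        void *buf = s->mem + (size_t)i * BUF_SIZE;
        link_set(buf, s->freelist);
        s->freelist = buf;
    }
}

static void *toy_alloc(struct toy_slab *s)
{
    void *buf = s->freelist;
    if (buf == NULL)
        return NULL;                      /* slab has no free buffers */
    s->freelist = link_get(buf);          /* link is trusted, unchecked */
    s->refcnt++;
    return buf;
}

static void toy_free(struct toy_slab *s, void *buf)
{
    link_set(buf, s->freelist);           /* freed buffer becomes the head */
    s->freelist = buf;
    s->refcnt--;
}
```

Freeing a buffer and allocating again returns the same buffer, which is
the LIFO property. Note that a write past the end of a neighbouring
allocated buffer, or into a freed one, lands on the link word that
toy_alloc() later dereferences without validation.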
|
|
|
|
For larger allocations, the meta-data resides out of the page.

The freelist management was simple: each cache maintained a circular
doubly-linked list sorted to put the empty slabs (all buffers
allocated) first, then the partial slabs (free and allocated buffers)
and finally the full slabs (reference counter set to zero, all buffers
free). The cache freelist pointer points to the first non-empty slab,
and each slab then contains its own freelist. Bonwick chose this
approach to simplify the memory reclaiming process.

The process of reclaiming memory started at the original
kmem_cache_free() function, which verified the reference counter. If
its value was zero (all buffers free), it moved the full slab to the
tail of the freelist with the rest of the full slabs. Section 4 of the
paper explains the intrinsic details of hardware cache side effects and
optimization. It is an interesting read due to the hardware used at the
time the paper was written. In order to optimize cache utilization and
bus balance, Bonwick devised 'slab coloring'. Slab coloring is simple:
when a slab is created, the buffer address starts at a different offset
(referred to as the color) from the slab base (since a slab is an
allocated page or pages, this is always aligned to page size).
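
A minimal sketch of how successive colors could be handed out follows.
The names (color_state, slack, align) are hypothetical and the stepping
rule is an assumption made for illustration: the color advances by one
alignment unit per new slab and wraps around once it would no longer fit
in the unused space left in the slab.

```c
#include <assert.h>
#include <stddef.h>

struct color_state {
    size_t align;    /* step between colors, e.g. a cache line size */
    size_t ncolors;  /* distinct colors that fit in the slab's slack */
    size_t next;     /* color to hand to the next slab created */
};

/* slack: bytes left unused in a slab after carving out all buffers. */
static void color_init(struct color_state *c, size_t slack, size_t align)
{
    c->align = align;
    c->ncolors = slack / align + 1;   /* color 0 is always usable */
    c->next = 0;
}

/* Offset from the slab base at which the first buffer of a new slab
 * is placed. Consecutive slabs get different offsets, so equivalent
 * buffers in different slabs map to different cache lines. */
static size_t color_next_offset(struct color_state *c)
{
    size_t off = c->next * c->align;
    c->next = (c->next + 1) % c->ncolors;
    return off;
}
```

With 100 bytes of slack and a 32 byte step, new slabs would start their
buffers at offsets 0, 32, 64, 96 and then 0 again.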
|
|
|
|
It is interesting to note that Bonwick already studied different
approaches to detect kernel heap corruption, and implemented them in
the SunOS 5.4 kernel (possibly predating every other kernel in terms of
heap corruption detection). Furthermore, Bonwick noted the performance
impact of these features was minimal.

    "Programming errors that corrupt the kernel heap - such as
    modifying freed memory, freeing a buffer twice, freeing an
    uninitialized pointer, or writing beyond the end of a buffer - are
    often difficult to debug. Fortunately, a thoroughly instrumented
    kernel memory allocator can detect many of these problems."
    Page 10, 6. Debugging features. [1]

The audit mode enabled storage of the user of every allocation (an
equivalent of the Linux feature that will be briefly described in
the allocator subsections) and provided these traces when corruption
was detected.

Invalid free pointers were detected using a hash lookup in the
kmem_cache_free() function. Once an object was freed, and after the
destructor was called, the allocator filled the space with 0xdeadbeef.
When this object was allocated again, the pattern would be verified to
see that no modifications occurred (that is, detection of use-after-free
conditions, or write-after-free more specifically). Allocated objects
were filled with 0xbaddcafe, which marked them as uninitialized.

Redzone checking was also implemented to detect overwrites past the end
of an object, adding a guard value at that position. This was verified
upon free.
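
The poisoning and redzone checks above can be modelled as follows. This
is a sketch under stated assumptions: the two poison patterns come from
the text, while the redzone value, the object size and every name
(dbg_buf, dbg_on_alloc, dbg_on_free) are hypothetical choices made for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_FREE   0xdeadbeefU  /* freed buffers (from the text) */
#define POISON_UNINIT 0xbaddcafeU  /* fresh allocations (from the text) */
#define REDZONE       0xfeedfaceU  /* guard word: assumed for this sketch */
#define OBJ_WORDS     8            /* hypothetical object size in words */

struct dbg_buf {
    uint32_t data[OBJ_WORDS];
    uint32_t redzone;              /* sits right past the end of the object */
};

static void dbg_fill(struct dbg_buf *b, uint32_t pattern)
{
    for (size_t i = 0; i < OBJ_WORDS; i++)
        b->data[i] = pattern;
}

/* Put a buffer into the "free" state. */
static void dbg_init(struct dbg_buf *b)
{
    dbg_fill(b, POISON_FREE);
    b->redzone = REDZONE;
}

/* On allocation: the free pattern must be intact, otherwise something
 * wrote to the buffer after it was freed. Returns 0 if clean, -1 if not. */
static int dbg_on_alloc(struct dbg_buf *b)
{
    for (size_t i = 0; i < OBJ_WORDS; i++)
        if (b->data[i] != POISON_FREE)
            return -1;
    dbg_fill(b, POISON_UNINIT);
    b->redzone = REDZONE;
    return 0;
}

/* On free: the redzone must be intact, otherwise the owner wrote past
 * the end of the object. Returns 0 if clean, -1 if not. */
static int dbg_on_free(struct dbg_buf *b)
{
    int ok = (b->redzone == REDZONE);
    dbg_fill(b, POISON_FREE);
    return ok ? 0 : -1;
}
```

Both checks compare against fixed, publicly known values, so they catch
programming errors but not an attacker who simply writes the expected
pattern back.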
|
|
|
|
Finally, a simple but possibly effective approach to detect memory
leaks used the timestamps from the audit log to find allocations which
had been online for a suspiciously long time. In modern times, this
could be implemented using a kernel thread. SunOS did it from userland
via /dev/kmem, which would be unacceptable in security terms.

For more information about the concepts of slab allocation, refer to
Bonwick's paper [1], which provides an in-depth overview of the theory
and implementation.

---[ 1.1 SLAB

The SLAB allocator in Linux (mm/slab.c) was written by Mark Hemment
in 1996-1997, and further improved through the years by Manfred
Spraul and others. The design closely follows that presented by Bonwick
for his Solaris allocator. It was first integrated in the 2.2 series.
This subsection will avoid describing more theory than strictly
necessary, but those interested in a more in-depth overview of SLAB
can refer to "Understanding the Linux Virtual Memory Manager" by
Mel Gorman, and its eighth chapter "Slab Allocator" [X].

The caches are defined as a kmem_cache structure, comprised of
(most commonly) page sized slabs, containing initialized objects.
Each cache holds its own GFP flags, the order of pages per slab
(2^n), the number of objects (chunks) per slab, coloring offsets
and range, a pointer to a constructor function, a printable name
and linkage to other caches. Optionally, if enabled, it can define
a set of fields to hold statistics and debugging related
information.

Each kmem_cache has an array of kmem_list3 structures, which contain
the information about partial, full and free slab lists:

    struct kmem_list3 {
        struct list_head slabs_partial;
        struct list_head slabs_full;
        struct list_head slabs_free;
        unsigned long free_objects;
        unsigned int free_limit;
        unsigned int colour_next;
        ...
        unsigned long next_reap;
        int free_touched;
    };

These structures are initialized with kmem_list3_init(), setting
all the reference counters to zero and preparing the list3 to be
linked to its respective cache nodelists list for the proper NUMA
node. This can be found in cpuup_prepare() and kmem_cache_init().

The "reaping" or draining of the cache free lists is done with the
drain_freelist() function, which returns the total number of slabs
released, initiated via cache_reap(). A slab is released using
slab_destroy(), and allocated with the cache_grow() function for a
given NUMA node, flags and cache.

The cache contains the doubly-linked lists for the partial, full
and free slabs, and a free object count in free_objects.

A slab is defined with the following structure:

    struct slab {
        struct list_head list;    /* linkage/pointer to freelist */
        unsigned long colouroff;  /* color / offset */
        void *s_mem;              /* start address of first object */
        unsigned int inuse;       /* num of objs active in slab */
        kmem_bufctl_t free;       /* first free chunk (or none) */
        unsigned short nodeid;    /* NUMA node id for nodelists */
    };

The list member points to the freelist the slab belongs to:
partial, full or empty. The s_mem member is used, together with the
color offset, to calculate the address of a specific object. The free
member holds the head of the list of free objects. The cache of the
slab is tracked in the page structure.
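
To make the roles of s_mem, inuse and free more concrete, here is a
simplified userland model. It assumes an index-based freelist in which
each entry names the next free object; the names (idx_slab,
toy_bufctl_t, OBJ_SIZE, NUM_OBJS) are hypothetical, the bufctl array is
embedded in the structure for brevity, and the color handling is reduced
to a plain offset, so this is not a copy of mm/slab.c.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int toy_bufctl_t;
#define BUFCTL_END ((toy_bufctl_t)~0U)    /* "no more free objects" */

#define OBJ_SIZE 32                       /* hypothetical object size */
#define NUM_OBJS 4                        /* objects per slab */

struct idx_slab {
    unsigned long colouroff;              /* color offset into the page */
    unsigned char *s_mem;                 /* address of the first object */
    unsigned int inuse;                   /* objects currently allocated */
    toy_bufctl_t free;                    /* index of first free object */
    toy_bufctl_t bufctl[NUM_OBJS];        /* bufctl[i]: free index after i */
};

static void idx_slab_init(struct idx_slab *s, unsigned char *page,
                          unsigned long colouroff)
{
    s->colouroff = colouroff;
    s->s_mem = page + colouroff;
    s->inuse = 0;
    s->free = 0;
    for (toy_bufctl_t i = 0; i < NUM_OBJS; i++)
        s->bufctl[i] = i + 1;
    s->bufctl[NUM_OBJS - 1] = BUFCTL_END;
}

/* Object addresses are computed, not stored: s_mem + index * size. */
static void *index_to_obj(const struct idx_slab *s, toy_bufctl_t idx)
{
    return s->s_mem + (size_t)idx * OBJ_SIZE;
}

static toy_bufctl_t obj_to_index(const struct idx_slab *s, const void *obj)
{
    return (toy_bufctl_t)(((const unsigned char *)obj - s->s_mem) / OBJ_SIZE);
}

static void *idx_slab_alloc(struct idx_slab *s)
{
    if (s->free == BUFCTL_END)
        return NULL;                      /* slab is full */
    void *obj = index_to_obj(s, s->free);
    s->free = s->bufctl[s->free];         /* pop the head index */
    s->inuse++;
    return obj;
}

static void idx_slab_free(struct idx_slab *s, void *obj)
{
    toy_bufctl_t idx = obj_to_index(s, obj);
    s->bufctl[idx] = s->free;             /* push the index back */
    s->free = idx;
    s->inuse--;
}
```

In this model the free member and the index array are the only state
needed to find the next object, so corrupting either one redirects the
next allocation.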
|
|
|
|
The function used to retrieve the cache a potential object belongs
to is virt_to_cache(), which itself relies on page_get_cache() on a
page structure pointer. It checks that the Slab page flag is set,
and takes the lru.next pointer of the head page (to be compatible
with compound pages; this is no different for normal pages). The
cache is set with page_set_cache(). The behavior to assign pages to
a slab and cache can be seen in slab_map_pages().

The internal function used for cache shrinking is __cache_shrink(),
called from kmem_cache_shrink() and during cache destruction. SLAB
clearly scales poorly: on NUMA systems with a large number of nodes,
substantial time will be spent walking the nodelists, draining each
freelist, and so forth. In the process, it is most likely that some of
those nodes won't be under memory pressure.

Slab management data is stored inside the slab itself when the size
is under 1/8 of PAGE_SIZE (512 bytes for x86, same as Bonwick's
allocator). This is done by alloc_slabmgmt(), which either stores
the management structure within the slab, or allocates space for it
from the kmalloc caches (slabp_cache within the kmem_cache
structure, assigned with kmem_find_general_cachep() given the slab
size). Again, this is reflected in slab_destroy() which takes care
of freeing the off-slab management structure when applicable.

The interesting security impact of this logic in managing control
structures is that slabs with their meta-data stored off-slab, in
one of the general kmalloc caches, will be exposed to potential
abuse (ex. in a slab overflow scenario in some adjacent object, the
freelist pointer could be overwritten to leverage a
write4-primitive during unlinking). This is one of the loopholes
which KERNHEAP, as described in this paper, will close or at the very
least do everything feasible to deter reliable exploitation.
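
The reason unlinking matters can be shown with a generic doubly-linked
list. The sketch below is not the SLAB code and not KERNHEAP's check; it
is a hypothetical illustration (node, unlink_unchecked and
unlink_checked are made-up names) of how an unlink operation stores
through pointers read from the node itself, and of the kind of
consistency test that can refuse to do so when the links were tampered
with.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next, *prev;
};

static void list_init(struct node *head)
{
    head->next = head->prev = head;
}

/* Insert n right after head. */
static void list_add(struct node *n, struct node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Plain unlink: two stores through pointers taken from the node. If an
 * overflow replaced n->next or n->prev, these stores go wherever the
 * overwritten values point. */
static void unlink_unchecked(struct node *n)
{
    n->next->prev = n->prev;
    n->prev->next = n->next;
}

/* Checked unlink: both neighbours must point back at n, otherwise the
 * links are inconsistent and nothing is written. */
static int unlink_checked(struct node *n)
{
    if (n->next->prev != n || n->prev->next != n)
        return -1;
    unlink_unchecked(n);
    return 0;
}
```

The checked variant only tests that the neighbours agree with the node,
so it detects a corrupted link but says nothing about how the corruption
happened.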
|
|
|
|
Since the basic technical aspects of the SLAB allocator are now
covered, the reader can refer to mm/slab.c in any current kernel
release for further information.

---[ 1.2 SLOB

SLOB was released in November 2005, after being developed since 2003 by
Matt Mackall for use in embedded systems due to its smaller memory
footprint. It lacks the complexity of all other allocators.

The granularity of the SLOB allocator supports objects as small as 2
bytes in size, though this is subject to architecture-dependent
restrictions (alignment, etc). The author notes that this will
normally be 4 bytes for 32-bit architectures, and 8 bytes on 64-bit.

The chunks (referred to as blocks in his comments in mm/slob.c) are
referenced from a singly-linked list within each page. His approach to
reduce fragmentation is to place all objects within three distinct
lists: under 256 bytes, under 1024 bytes and then any other objects
of size greater than 1024 bytes.
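
The three-way split can be expressed as a small selection function. The
thresholds come from the text; the names (BREAK1, BREAK2, pick_list) are
hypothetical, and treating a request of exactly 1024 bytes as "large" is
an assumption that follows the "under 1024" wording.

```c
#include <assert.h>
#include <stddef.h>

#define BREAK1 256    /* below this: small list */
#define BREAK2 1024   /* below this: medium list, otherwise large */

enum slob_list { LIST_SMALL, LIST_MEDIUM, LIST_LARGE };

/* Choose the page list to search for a request of the given size. */
static enum slob_list pick_list(size_t size)
{
    if (size < BREAK1)
        return LIST_SMALL;
    if (size < BREAK2)
        return LIST_MEDIUM;
    return LIST_LARGE;
}
```

Keeping small objects away from large ones means a page on the small
list is never pinned by a single long-lived large block, which is the
fragmentation argument made above.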
|
|
|
|
The allocation algorithm is a classic next-fit, returning the first
slab containing enough chunks to hold the object. Released objects are
re-introduced into the freelist in address order.

The kmalloc and kfree layer (that is, the public API exposed from
SLOB) places a 4 byte header in objects within page size, or uses the
lower level page allocator directly if greater in size to allocate
compound pages. In such cases, it stores the size in the page
structure (in page->private). This poses a problem when detecting the
size of an allocated object, since essentially the slob_page and
page structures are the same: it is a union and the values of the
structure members overlap. Size is enforced to match, but using the
wrong place to store a custom value means a corrupted page state.

Before put_page() or free_pages(), SLOB clears the Slob bit, resets
the mapcount atomically and sets the mapping to NULL, then the page
is released back to the low-level page allocator. This prevents the
overlapping fields from leading to the aforementioned corrupted
state situation. This hack allows both SLOB and the page
allocator meta-data to coexist, allowing a lower memory footprint
and overhead.
|
|
|
|
---[ 1.3 SLUB aka The Unqueued Allocator

SLUB is the default allocator in several GNU/Linux distributions at the
moment, including Ubuntu and Fedora. It was developed by
Christoph Lameter and merged into the -mm tree in early 2007.

    "SLUB is a slab allocator that minimizes cache line usage
    instead of managing queues of cached objects (SLAB approach).
    Per cpu caching is realized using slabs of objects instead of
    queues of objects. SLUB can use memory efficiently and has
    enhanced diagnostics." CONFIG_SLUB documentation, Linux kernel.

The SLUB allocator was the first to introduce merging, the concept
of grouping slabs of similar properties together, reducing the
number of caches present in the system and internal fragmentation.

This, however, has detrimental security side effects which are
explained in section 3.1. Fortunately, even without a patched
kernel, merging can be disabled at runtime.

The debugging facilities are far more flexible than those in SLAB.
They can be enabled without rebuilding the kernel, using a boot command
line option, and per cache.

DMA caches are created on demand, or not created at all if support
isn't required.

Another important change is the lack of SLAB's per-node partial
lists. SLUB has a single partial list, which prevents partially
free-allocated slabs from being scattered around, reducing
internal fragmentation in such cases, since otherwise those node
local lists would only be filled when allocations happen in that
particular node.

Its cache reaping has better performance than SLAB's, especially on
SMP systems, where it scales better. It does not require walking
the lists every time a slab is to be pushed into the partial list.
For non-SMP systems it doesn't use reaping at all.

Meta-data is stored using the page structure, instead of within
the beginning of each slab, allowing better data alignment. Again,
this reduces internal fragmentation since objects can be
packed tightly together without leaving unused trailing space in
the page(s). Memory requirements to hold control structures are much
lower than SLAB's, as Lameter explains:

    "SLAB Object queues exist per node, per CPU. The alien cache
    queue even has a queue array that contain a queue for each
    processor on each node. For very large systems the number of
    queues and the number of objects that may be caught in those
    queues grows exponentially. On our systems with 1k nodes /
    processors we have several gigabytes just tied up for storing
    references to objects for those queues. This does not include
    the objects that could be on those queues."

To sum it up in a single paragraph: SLUB is a clever allocator
which is designed for modern systems, to scale well, work reliably
in SMP environments and reduce the memory footprint of control and
meta-data structures and internal/external fragmentation. This
makes SLUB the best current target for KERNHEAP development.
|
|
|
|
---[ 1.4 SLQB

The SLQB allocator was developed by Nick Piggin to provide better
scalability and avoid fragmentation as much as possible. It makes a
great deal of effort to avoid allocation of compound pages,
which is optimal when memory starts running low. Overall, it is a
per-CPU allocator.

The structures used to define the caches are slightly different,
and it shows that the allocator has been designed from the ground
up to scale on high-end systems. It tries to optimize remote
freeing situations (when an object is freed in a different node/CPU
than it was allocated at). This is relevant to NUMA environments,
mostly. Objects more likely to be subjected to this situation are
long-lived ones, on systems with large numbers of processors.

It defines a slqb_page structure which "overloads" the lower level
page structure, in the same fashion as SLOB does. Instead of an
unused padding, it introduces kmem_cache_list and freelist pointers.

For each lookaside cache, each CPU has a LIFO list of the objects
local to that node (used for local allocation and freeing), free
and partial page lists, a queue for objects being freed remotely
and a queue of already free objects that come from other CPUs' remote
free queues. Locking is minimal, but sufficient to control
cross-CPU access to these queues.

Some of the debugging facilities include tracking the user of the
allocated object (storing the caller address, cpu, pid and the
timestamp). This track structure is stored within the allocated
object space, which makes it subject to partial or full overwrites,
thus unsuitable for security purposes like similar facilities in
other allocators (SLAB and SLUB, since SLOB is impaired for
debugging).

Back on SLQB-specific changes, the use of a kmem_cache_cpu
structure per CPU can be observed. An article at LWN.net by
Jonathan Corbet in December 2008 provides a summary about the
significance of this structure:

    "Within that per-CPU structure one will find a number of lists
    of objects. One of those (freelist) contains a list of
    available objects; when a request is made to allocate an
    object, the free list will be consulted first. When objects are
    freed, they are returned to this list. Since this list is part
    of a per-CPU data structure, objects normally remain on the
    same processor, minimizing cache line bouncing. More
    importantly, the allocation decisions are all done per-CPU,
    with no bad cache behavior and no locking required beyond the
    disabling of interrupts. The free list is managed as a stack,
    so allocation requests will return the most recently freed
    objects; again, this approach is taken in an attempt to
    optimize memory cache behavior." [5]

In order to cope with memory stress situations, the freelists
can be flushed to return unused partial objects back to the page
allocator when necessary. This works by moving the object to the
remote freelist (rlist) from the CPU-local freelist, and keeping a
reference in the remote_free list.

The SLQB allocator is described in depth in the aforementioned
article and the source code comments. Feel free to refer to these
sources for more in-depth information about its design and
implementation. The original RFC and patch can be found at
http://lkml.org/lkml/2008/12/11/417
|
|
|
|
---[ 1.5 The future

As architectures and computing platforms evolve, so will the
allocators in the Linux kernel. The current development process
doesn't contribute to a more stable, smaller set of options, and new
allocators will inevitably be introduced into the kernel
mainline, possibly specialized for certain environments.

In the short term, SLUB will remain the default, and there seems to
be an intention to remove SLOB. It is unclear if SLQB will see
widespread deployment.

Newly developed allocators will require careful assessment, since
KERNHEAP is tied to certain assumptions about their internals. For
instance, we depend on the ability to track object sizes properly,
and it remains untested for some obscure architectures, NUMA
systems and so forth. Even a simple allocator like SLOB posed a
challenge to implement safety checks, since the internals are
greatly convoluted. Thus, it's uncertain if future ones will
require a redesign of the concepts composing KERNHEAP.
|
|
|
|
------[ 2. Introduction: What is KERNHEAP?

As of April 2009, no operating system has implemented any form of
hardening in its kernel heap management interfaces. Attacks against the
SLAB allocator in Linux have been documented and made available to the
public as early as 2005, and used to develop highly reliable exploits
to abuse different kernel vulnerabilities involving heap allocated
buffers. The first public exploit making use of kmalloc() exploitation
techniques was the MCAST_MSFILTER exploit by twiz [10].

In January 2009, an obscure, non-advertised advisory surfaced about a
buffer overflow in the SCTP implementation in the Linux kernel, which
could be abused remotely, provided that a SCTP based service was
listening on the target host. More specifically, the issue was located
in the code which processes the stream numbers contained in FORWARD-TSN
chunks.

During a SCTP association, a client sends an INIT chunk specifying a
number of inbound and outbound streams, which causes the kernel in the
server to allocate space for them via kmalloc(). After the association
is made effective (involving the exchange of INIT-ACK, COOKIE and
COOKIE-ECHO chunks), the attacker can send a FORWARD-TSN chunk with
more streams than those specified initially in the INIT chunk, leading
to the overflow condition which can be used to overwrite adjacent heap
objects with attacker controlled data. The vulnerability itself had
certain quirks and requirements which made it a good candidate for a
complex exploit, unlikely to be available to the general public, thus
restricted to more technically adept circles in kernel exploitation.
Nonetheless, reliable exploits for this issue were developed and
successfully used in different scenarios (including all major
distributions, such as Red Hat with SELinux enabled, and Ubuntu with
AppArmor).

At some point, Brad Spengler expressed interest in a potential
protection against this vulnerability class, and asked the author what
kind of measures could be taken to prevent new kernel-land heap related
bugs from being exploited. Shortly afterwards, KERNHEAP was born.

After development started, a fully remote exploit against the SCTP flaw
surfaced, developed by sgrakkyu [15]. In private discussions with a few
individuals, a technique for executing a successful attack remotely was
proposed: overwrite a syscall pointer to an attacker controlled
location (like a hook) to safely execute our payload out of the
interrupt context. This is exactly what sgrakkyu implemented for
x86_64, using the vsyscall table, which bypasses CONFIG_DEBUG_RODATA
(read-only .rodata) restrictions altogether. His exploit exposed not
only the flawed nature of the vulnerability classification process of
several organizations and the hypocritical and unethical handling of
security flaws by the Linux kernel developers, but also the futility of
SELinux and other security models against kernel vulnerabilities.

In order to prevent and detect exploitation of this class of security
flaws in the kernel, a new set of protections had to be designed and
implemented: KERNHEAP.

KERNHEAP encompasses different concepts to prevent and detect heap
overflows in the Linux kernel, as well as other well known heap related
vulnerabilities, namely double frees, partial overwrites, etc.

These concepts have been implemented by introducing modifications into
the different allocators, as well as common interfaces, not only
preventing generic forms of memory corruption but also hardening
specific areas of the kernel which have been used or could be
potentially used to leverage attacks corrupting the heap. For instance,
the IPC subsystem, the copy_to_user() and copy_from_user() APIs and
others.

This is still ongoing research and the Linux kernel is an ever evolving
project which poses significant challenges. The inclusion of new
allocators will always pose a risk for new issues to surface, requiring
these protections to be adapted, or new ones developed for them.
|
|
|
|
------[ 3. Integrity assurance for kernel heap allocators

---[ 3.1 Meta-data protection against full and partial overwrites

Given the current (yet ever changing) upstream design of the
kernel allocators (SLUB, SLAB, SLOB, future SLQB, etc.), we assume:

    1. A set of caches exist which hold dynamically allocated slabs,
       composed of one or more physically contiguous pages, containing
       same size chunks.

    2. These are initialized by default or created explicitly, always
       with a known size. For example, multiple default caches exist to
       hold slabs of common sizes which are a power of two (32, 64,
       128, 256 and so forth).

    3. These caches grow or shrink in size as required by the
       allocator.

    4. At the end of a kmem cache life, it must be destroyed and its
       slabs released. The linked list of slabs is implicitly trusted
       in this context.

    5. The caches can be allocated contiguously, or adjacent to an
       actual chain of slabs from another cache. Because the current
       kmem_cache structure holds potentially harmful information
       (including a pointer to the constructor of the cache), this
       could be leveraged in an attack to subvert the execution flow.

    6. The debugging facilities of these allocators provide merely
       informational value with their error detection mechanisms, which
       are also inherently insecure. They are not enabled by default
       and have an extremely high performance impact (accounting for
       up to 50 to 70% slowdown). In addition, they leak information
       which could be invaluable for a local attacker (ex. fixed known
       values).
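
Assumption 2 can be illustrated with a small helper that maps a request
to its general cache. This is a sketch with hypothetical names and
limits (MIN_CLASS, MAX_CLASS, size_class); it models only the
power-of-two classes mentioned above and is not the kernel's kmalloc()
lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CLASS 32UL                 /* smallest cache in this sketch */
#define MAX_CLASS (4UL * 1024 * 1024)  /* largest cache: an assumption */

/* Return the size of the cache serving a request, or 0 if too large. */
static size_t size_class(size_t size)
{
    size_t c = MIN_CLASS;

    if (size > MAX_CLASS)
        return 0;
    while (c < size)
        c <<= 1;                       /* next power of two */
    return c;
}

/* Bytes between the end of the requested object and the end of the
 * chunk that actually backs it. */
static size_t slack_bytes(size_t size)
{
    size_t c = size_class(size);
    return c ? c - size : 0;
}
```

A 100 byte request is served from the 128 byte cache, so an overflow has
to cross 28 bytes of slack before reaching the neighbouring chunk, while
a 128 byte request has no slack at all. Objects of unrelated types that
round up to the same class share the same slabs.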
|
|
|
|
We are facing multiple issues in this scenario. First, the kernel
|
|
developers expect the third-party to handle situations like a cache
|
|
being destroyed while an object is being allocated. Albeit highly
|
|
unusual, such circumstances (like {6}) can arise provided the right
|
|
conditions are present.
|
|
|
|
In order to prevent {5} from being abused, we are left with two
|
|
realistic possibilities to deter a potential attack: randomization of
|
|
the allocator routines (see ASLR from the PaX documentation in [7] for
|
|
the concept) or introduce a guard (known in modern times as a 'cookie')
|
|
which contains information to validate the integrity of the kmem_cache
|
|
structure.
|
|
|
|
Thus, a decision was made to introduce a guard which works in
|
|
'cascade':
|
|
|
|
+--------------+
|
|
| global guard |------------------+
|
|
+--------------| kmem_cache guard |------------+
|
|
+------------------| slab guard | ...
|
|
+------------+
|
|
|
|
The idea is simple: break down every potential path of abuse and add
|
|
integrity information to each lower level structure. By deploying a
|
|
check which relies in all the upper level guards, we can detect
|
|
corruption of the data at any stage. In addition, this makes the safety
|
|
checks more resilient against information leaks, since an attacker will
|
|
be forced to access and read a wider range of values than one single
|
|
cookie. Such data could be out of range to the context of the execution
|
|
path being abused.
|
|
|
|
The global guard is initialized at the kernheap_init()
|
|
function, called from init/main.c during kernel start. In order to
|
|
gather entropy for its value, we need to initialize the random32 PRNG
|
|
earlier than in a default, upstream kernel. On x86, this is done with
|
|
the rdtsc xor'd with the jiffies value, and then seeded multiple times
|
|
during different stages of the kernel initialization, ensuring we have
|
|
a decent amount of entropy to avoid an easily predictable result.
|
|
|
|
Unfortunately, an architecture-independent method to seed the PRNG
|
|
hasn't been devised yet. Right now this is specific to platforms with a
|
|
working get_cycles() implementation (otherwise it falls back to a more
|
|
insecure seeding using different counters), though it is intended to
|
|
support all architectures where PaX is currently supported.
|
|
|
|
The slab and kmem_cache structures are defined in mm/slab.c and
|
|
mm/slub.c for the SLAB and SLUB allocators, respectively. The kernel
|
|
developers have chosen to make their type information static to those
|
|
files, and not available in the mm/slab.h header file. Since the
|
|
available allocators have generally different internals, they only
|
|
export a common API (even though a few functions remain no-ops, for
|
|
example in SLOB).
|
|
|
|
A guard field has been added at the start of the kmem_cache structure,
|
|
and other structures might be modified to include a similar field
|
|
(depending on the allocator). The approach is to add a guard anywhere
|
|
where it can provide balanced performance (including memory footprint)
|
|
and security results.
|
|
|
|
In order to calculate the final checksum used in each kmem_cache and
|
|
their slabs, a high performance, yet collision resistant hash function
|
|
was required. This instantly left options such as the CRC family, FNV,
|
|
etc. out, since they are inefficient for our purposes. Therefore,
|
|
Murmur2 was chosen [9]. It's an exceptionally fast, yet simple
|
|
algorithm created by Austin Appleby, currently used by libmemcached and
|
|
other software.
|
|
|
|
Custom optimized versions were developed to calculate hashes for the
|
|
slab and cache structures, taking advantage of the fact that only a
|
|
relatively small set of word values need to be hashed.
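
As a rough idea of what such a specialization can look like, the sketch
below applies the MurmurHash2 mixing steps (constants from Appleby's
public reference code) to an array of 32-bit words. Since the input is
always a whole number of words, the byte-wise tail handling of the generic
function can be dropped. The function name and signature are our own for
this example; the actual KERNHEAP routines are not shown in this paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MURMUR_M 0x5bd1e995u
#define MURMUR_R 24

/* MurmurHash2 mixing over whole 32-bit words; seed would be the guard
 * of the level above in a cascaded scheme. */
static uint32_t murmur2_words(const uint32_t *words, size_t nwords,
                              uint32_t seed)
{
    uint32_t h;
    size_t i;

    assert(words != NULL || nwords == 0);

    /* The reference code mixes the length in bytes into the seed. */
    h = seed ^ (uint32_t)(nwords * 4);

    for (i = 0; i < nwords; i++) {
        uint32_t k = words[i];

        k *= MURMUR_M;
        k ^= k >> MURMUR_R;
        k *= MURMUR_M;

        h *= MURMUR_M;
        h ^= k;
    }

    /* Final avalanche. */
    h ^= h >> 13;
    h *= MURMUR_M;
    h ^= h >> 15;
    return h;
}
```

Note that Murmur2 is not a cryptographic hash. In a scheme like this one
its output is only as unpredictable as the seed it is keyed with.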
|
|
|
|
The coverage of the guard checks is obviously limited to the meta-data,
|
|
but yields reliable protection for all objects of 1/8 page size and any
|
|
adjacent ones, during allocation and release operations. The
|
|
copy_from_user() and copy_to_user() functions have been modified to
|
|
include a slab and cache integrity check as well, which is orthogonal
|
|
to the boundary enforcement modifications explained in another section
|
|
of this paper.
|
|
|
|
The redzone approach used by the SLAB/SLUB/SLQB allocators uses a fixed
|
|
known value to detect certain scenarios (explained in the next
|
|
subsection). The values are 64 bits long:
|
|
|
|
#define RED_INACTIVE 0x09F911029D74E35BULL
|
|
#define RED_ACTIVE 0xD84156C5635688C0ULL
|
|
|
|
This is clearly suitable for debugging purposes, but largely
|
|
ineffective for security. An immediate improvement would be to generate
|
|
these values at runtime, but then it is still possible to avoid writing
|
|
over them and still modify the meta-data. This is exactly what is being
|
|
prevented by using a checksum guard, which depends on a runtime
|
|
generated cookie (at boot time). The examples below show an overwrite
|
|
of an object in the size-64 cache (named kmalloc-64 under SLUB):
|
|
|
|
slab error in verify_redzone_free(): cache `size-64': memory outside
|
|
object was overwritten
|
|
Pid: 6643, comm: insmod Not tainted 2.6.29.2-grsec #1
|
|
Call Trace:
|
|
[<c0889a81>] __slab_error+0x1a/0x1c
|
|
[<c088aee9>] cache_free_debugcheck+0x137/0x1f5
|
|
[<c088ba14>] kfree+0x9d/0xd2
|
|
[<c0802f22>] syscall_call+0x7/0xb
|
|
df271338: redzone 1:0xd84156c5635688c0, redzone 2:0x4141414141414141.
|
|
|
|
|
|
Slab corruption: size-64 start=df271398, len=64
|
|
Redzone: 0x4141414141414141/0x9f911029d74e35b.
|
|
Last user: [<c08d1da5>](free_rb_tree_fname+0x38/0x6f)
|
|
000: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41
|
|
010: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41
|
|
020: 41 41 41 41 41 41 41 41 6b 6b 6b 6b 6b 6b 6b 6b
|
|
Prev obj: start=df271340, len=64
|
|
|
|
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
|
|
Last user: [<c08d1e55>](ext3_htree_store_dirent+0x34/0x124)
|
|
000: 48 8e 78 08 3b 49 86 3d a8 1f 27 df e0 10 27 df
|
|
010: a8 14 27 df 00 00 00 00 62 d3 03 00 0c 01 75 64
|
|
Next obj: start=df2713f0, len=64
|
|
|
|
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
|
|
Last user: [<c08d1da5>](free_rb_tree_fname+0x38/0x6f)
|
|
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
|
|
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
|
|
|
|
The trail of 0x6B bytes can be observed in the output above. This is
|
|
the SLAB_POISON feature. Poisoning is the approach that will be
|
|
described in the next subsection. It's basically overwriting the object
|
|
contents with a known value to detect modifications post-release or
|
|
uninitialized usage. The values are defined (like the redzone ones) at
|
|
include/linux/poison.h:
|
|
|
|
#define POISON_INUSE 0x5a
|
|
#define POISON_FREE 0x6b
|
|
#define POISON_END 0xa5
|
|
|
|
KERNHEAP performs validation of the cache guards at allocation and
|
|
release related functions. This allows detection of corruption in the
|
|
chain of guards and results in a system halt and a stack dump.
|
|
|
|
The safety checks are triggered from kfree() and kmem_cache_free(),
|
|
kmem_cache_destroy() and other places. Additional checkpoints are being
|
|
considered, since taking a wrong approach could lead to TOCTOU issues,
|
|
again depending on the allocator. In SLUB, merging is disabled to avoid
|
|
the potentially detrimental effects (to security) of this feature. This
|
|
might kill one of the most attractive points of SLUB, but merging comes
|
|
at the cost of letting objects be neighbors to other objects which
|
|
would have been placed elsewhere out of reach, allowing overflow
|
|
conditions to produce likely exploitable conditions. Even with guard
|
|
checks in place, this is still a scenario to be avoided.
|
|
|
|
One additional change, first introduced by PaX, is to change the
|
|
address of the ZERO_SIZE_PTR. In the mainline kernel, this address points
|
|
to 0x00000010. An address reachable in userland is clearly a bad idea
|
|
in security terms, and PaX wisely solves this by setting it to
|
|
0xfffffc00, and modifying the ZERO_OR_NULL_PTR macro. This protects
|
|
against a situation in which kmalloc is called with a zero size (for
|
|
example due to an integer overflow in a length parameter) and the
|
|
pointer is used to read or write information from or to userland.
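
To see why the macro has to change together with the address, consider
the following user-space model. Addresses are modelled as 32-bit integers
and the helper names are ours. The wrap-around comparison is one way to
express the test and we assume, without quoting the patch, that the PaX
macro works along these lines.

```c
#include <stdint.h>

typedef uint32_t addr32_t;      /* models a 32-bit kernel address */

#define MAINLINE_ZERO_SIZE_PTR ((addr32_t)0x00000010u)
#define PAX_ZERO_SIZE_PTR      ((addr32_t)0xfffffc00u)

/* Mainline style test: anything from NULL up to the marker. */
static int mainline_zero_or_null(addr32_t p)
{
    return p <= MAINLINE_ZERO_SIZE_PTR;
}

/* With the marker near the top of the address space, NULL is no longer
 * "below or equal" in a useful sense. Subtracting 1 wraps NULL around to
 * 0xffffffff, so a single unsigned comparison covers both NULL and the
 * marker (and everything above the marker). */
static int pax_zero_or_null(addr32_t p)
{
    return (addr32_t)(p - 1u) >= (addr32_t)(PAX_ZERO_SIZE_PTR - 1u);
}
```

With the relocated marker, a low address such as 0x10 is no longer treated
as a special value, and an attempt to dereference the marker itself faults
in kernel space instead of landing on a page that userland can map.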
|
|
|
|
---[ 3.2 Detection of arbitrary free pointers and freelist corruption
|
|
|
|
In the history of heap related memory corruption vulnerabilities, a
|
|
more obscure class of flaws has long been known, albeit less
|
|
publicized: arbitrary pointer and double free issues.
|
|
|
|
The idea is simple: a programming mistake leads to an exploitable
|
|
condition in which the state of the heap allocator can be made
|
|
inconsistent when an already freed object is being released again, or
|
|
an arbitrary pointer is passed to the free function. This is a strictly
|
|
allocator internals-dependent scenario, but generally the goal is to
|
|
control a function pointer (for example, a constructor/destructor
|
|
function used for object initialization, which is later called) or a
|
|
write-n primitive (a single byte, four bytes and so forth).
|
|
|
|
In practice, these vulnerabilities can pose a true challenge for
|
|
exploitation, since thorough knowledge of the allocator and state of
|
|
the heap is required. Manipulating the free list (also known as the
|
|
freelist in the kernel) might cause the state of the heap to be
|
|
unstable post-exploitation and thwart cleanup efforts or graceful
|
|
returns. In addition, another thread might try to access it or perform
|
|
operations (such as an allocation) which yield a page fault.
|
|
|
|
In an environment with 2.6.29.2 (grsecurity patch applied, full PaX
|
|
feature set enabled except for KERNEXEC, RANDKSTACK and UDEREF) and the
|
|
SLAB allocator, the following scenarios could be observed:
|
|
|
|
1. An object is allocated and shortly afterwards, the object is
|
|
released via kfree(). Another allocation follows, and a pointer
|
|
   referencing the previous allocation is passed to kfree(),
|
|
therefore the newly allocated object is released instead due to the
|
|
LIFO nature of the allocator.
|
|
|
|
void *a = kmalloc(64, GFP_KERNEL);
|
|
foo_t *b = (foo_t *) a;
|
|
|
|
/* ... */
|
|
kfree(a);
|
|
a = kmalloc(64, GFP_KERNEL);
|
|
/* ... */
|
|
kfree(b);
|
|
|
|
2. An object is allocated, and two successive calls to kfree() take
|
|
place with no allocation in-between.
|
|
|
|
void *a = kmalloc(64, GFP_KERNEL);
|
|
foo_t *b = (foo_t *) a;
|
|
|
|
kfree(a);
|
|
kfree(b);
|
|
|
|
In both cases we are releasing an object twice, but the state of the
|
|
allocator changes slightly. Also, there could be more than just a
|
|
single allocation in-between (for example, if this condition existed
|
|
within filesystem or network stack code) leading to less predictable
|
|
results. The more obvious result of the first scenario is corruption of
|
|
the freelist, and a potential information leak or arbitrary access to
|
|
memory in the second (for instance, if an attacker could force a new
|
|
allocation before the incorrectly released object is used, he could
|
|
control the information stored there).
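
Both scenarios can be reproduced with a toy LIFO allocator in user space.
The code below is a deliberately naive model with hypothetical names: the
real SLAB allocator uses per-CPU arrays and index based free lists rather
than a pointer embedded in each object, but the LIFO behaviour that makes
these bugs dangerous is the same.

```c
#include <stddef.h>

#define TOY_OBJS 4

struct toy_obj {
    struct toy_obj *next;   /* free list link, only valid while free */
    char payload[56];
};

static struct toy_obj toy_pool[TOY_OBJS];
static struct toy_obj *toy_freelist;

static void toy_init(void)
{
    int i;

    toy_freelist = NULL;
    for (i = TOY_OBJS - 1; i >= 0; i--) {
        toy_pool[i].next = toy_freelist;
        toy_freelist = &toy_pool[i];
    }
}

/* Pops the most recently freed object (LIFO). */
static void *toy_alloc(void)
{
    struct toy_obj *o = toy_freelist;

    if (o)
        toy_freelist = o->next;
    return o;
}

/* Pushes the object back without checking whether it is already free. */
static void toy_free(void *p)
{
    struct toy_obj *o = p;

    o->next = toy_freelist;
    toy_freelist = o;
}
```

Freeing the same object twice in a row makes it point to itself on the
free list, so every following toy_alloc() call returns the same address.
Freeing a stale pointer after a new allocation took its place releases an
object that is still in use, and the next allocation hands it out again.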
|
|
|
|
The following output can be observed in a system using the SLAB
|
|
allocator with its debugging facilities enabled:
|
|
|
|
slab error in verify_redzone_free(): cache `size-64': double free detected
|
|
Pid: 4078, comm: insmod Not tainted 2.6.29.2-grsec #1
|
|
Call Trace:
|
|
[<c0889a81>] __slab_error+0x1a/0x1c
|
|
[<c088aee9>] cache_free_debugcheck+0x137/0x1f5
|
|
[<c088ba14>] kfree+0x9d/0xd2
|
|
[<c0802f22>] syscall_call+0x7/0xb
|
|
df2e42e0: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b.
|
|
|
|
The debugging facilities of SLAB and SLUB provide a redzone-based
|
|
approach to detect the first scenario, but introduce a performance
|
|
impact while being useless security-wise, since the system won't halt
|
|
and the state of the allocator will be left unstable. Therefore, their
|
|
value is only informational and useful for debugging purposes, not as a
|
|
security measure. The redzone values are also static.
|
|
|
|
The other approach taken by the debugging facilities is poisoning, as
|
|
mentioned in the previous subsection. An object is 'poisoned' with a
|
|
value, which can be checked at different places to detect if the object
|
|
is being used uninitialized or post-release. This rudimentary but
|
|
effective method is implemented upstream in a manner which makes it
|
|
inefficient for security purposes.
|
|
|
|
Currently, upstream poisoning is clearly oriented to debugging. It
|
|
writes a single-byte pattern in the whole object space, marking the end
|
|
with a known value. This incurs a significant performance impact.
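
A user-space sketch of this scheme, using the POISON_FREE and POISON_END
values shown in the previous subsection, could look like this. The helper
names are ours and the object size is assumed to be at least one byte.

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b
#define POISON_END  0xa5

/* Fills a released object with the pattern and marks its last byte. */
static void poison_obj(unsigned char *obj, size_t size)
{
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/* Returns the offset of the first modified byte, or -1 if the pattern
 * is intact. */
static long check_poison(const unsigned char *obj, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        unsigned char expected =
            (i == size - 1) ? POISON_END : POISON_FREE;

        if (obj[i] != expected)
            return (long)i;
    }
    return -1;
}
```

Both helpers touch every byte of the object, which is where the cost comes
from, and an attacker who knows the pattern can simply write it back.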
|
|
|
|
KERNHEAP performs the following safety checks at the time of this
|
|
writing:
|
|
|
|
1. During cache destruction:
|
|
|
|
a) The guard value is verified.
|
|
|
|
b) The entire cache is walked, verifying the freelists for
|
|
potential corruption. Reference counters, guards, validity of
|
|
pointers and other structures are checked. If any mismatch is
|
|
found, a system halt ensues.
|
|
|
|
c) The pointer to the cache itself is changed to ZERO_SIZE_PTR.
|
|
This should not affect any well behaving (that is, not broken)
|
|
kernel code.
|
|
|
|
2. After successful kfree, a word value is written to the memory
|
|
      and the pointer location is changed to ZERO_SIZE_PTR. This will
|
|
trigger a distinctive page fault if the pointer is accessed
|
|
again somewhere. Currently this operation could be invasive for
|
|
drivers or code with dubious coding practices.
|
|
|
|
3. During allocation, if the word value at the start of the
|
|
to-be-returned object doesn't match our post-free value, a
|
|
system halt ensues.
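
Checks (2) and (3) can be modelled in user space as shown below. The
names, the stand-in pointer value and the use of a single 32-bit word are
assumptions made for this example only. Where the sketch returns 0, the
real check would halt the system.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in value for illustration only. */
#define TOY_ZERO_SIZE_PTR ((void *)16)

static uint32_t free_cookie;    /* would be generated at boot time */

/* Step 2: stamp the start of the released object. */
static void toy_mark_freed(void *obj)
{
    memcpy(obj, &free_cookie, sizeof(free_cookie));
}

/* Step 3: returns 1 if the object still carries the post-free word,
 * 0 if something wrote to it while it was free. */
static int toy_check_on_alloc(const void *obj)
{
    uint32_t w;

    memcpy(&w, obj, sizeof(w));
    return w == free_cookie;
}

/* Releases the object and invalidates the caller's pointer. */
#define TOY_KFREE(p)                    \
    do {                                \
        toy_mark_freed(p);              \
        (p) = TOY_ZERO_SIZE_PTR;        \
    } while (0)
```

Compared to poisoning the whole object, only one word is written and
compared, at the price of missing writes that land elsewhere in the
object.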
|
|
|
|
The object-level guard values (equivalent to the redzoning) are
|
|
calculated at runtime. This deters bypassing of the checks via fake
|
|
objects, resulting from a slab overflow scenario. It does introduce a
|
|
low performance impact on setup and verification, minimized by the use
|
|
of inline functions, instead of external definitions like those used
|
|
for some of the more general cache checks.
|
|
|
|
The effectiveness of the reference counter checks is orthogonal
|
|
to the deployment of PaX's REFCOUNT, which protects many object
|
|
reference counters against overflows (including SLAB/SLUB).
|
|
|
|
Safe unlinking is enforced in all LIST_HEAD based linked lists, which
|
|
obviously includes the partial/empty/full lists for SLAB and several
|
|
other structures (including the freelists) in other allocators. If a
|
|
corrupted entry is being unlinked, a system halt is forced. The values
|
|
used for list pointer poisoning have been changed to point to
|
|
non-userland-reachable addresses (this change has been taken from PaX).
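
The safe unlinking check itself is small. A minimal user-space sketch on
a LIST_HEAD style doubly linked list follows. Function names are ours,
and an error code is returned where the kernel would force a halt.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Inserts n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlinks e only if both neighbours still point back at it.
 * Returns 0 on success, -1 if the entry looks corrupted. */
static int safe_list_del(struct list_head *e)
{
    if (e->prev->next != e || e->next->prev != e)
        return -1;

    e->next->prev = e->prev;
    e->prev->next = e->next;

    /* The kernel stores poison values here; NULL keeps the sketch short. */
    e->next = NULL;
    e->prev = NULL;
    return 0;
}
```

Without the check, the two stores in the middle of safe_list_del() are
the classic write primitive: an attacker who controls e->next and e->prev
decides where they land.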
|
|
|
|
The use-after-free and double-free detection mechanisms in KERNHEAP are
|
|
still under development, and it's very likely that substantial design
|
|
changes will occur after the release of this paper.
|
|
|
|
---[ 3.3 Overview of NetBSD and OpenBSD kernel heap safety checks
|
|
|
|
At the moment KERNHEAP exclusively covers the Linux kernel, but it is
|
|
interesting to observe the approaches taken by other projects to detect
|
|
kernel heap integrity issues. In this section we will briefly analyze
|
|
the NetBSD and OpenBSD kernels, which are largely the same code base in
|
|
terms of kernel malloc implementation and diagnostic checks.
|
|
|
|
Both currently implement rudimentary but effective measures to detect
|
|
use-after-free and double-free scenarios, although these are only enabled as
|
|
part of the DIAGNOSTIC and DEBUG configurations.
|
|
|
|
The following source code is taken from NetBSD 4.0 and should be almost
|
|
identical to OpenBSD. Their approach to detect use-after-free relies on
|
|
copying a known 32-bit value (WEIRD_ADDR, from kern/kern_malloc.c):
|
|
|
|
/*
|
|
* The WEIRD_ADDR is used as known text to copy into free objects so
|
|
* that modifications after frees can be detected.
|
|
*/
|
|
#define WEIRD_ADDR ((uint32_t) 0xdeadbeef)
|
|
...
|
|
|
|
void *malloc(unsigned long size, struct malloc_type *ksp, int flags)
|
|
...
|
|
{
|
|
...
|
|
#ifdef DIAGNOSTIC
|
|
/*
|
|
* Copy in known text to detect modification
|
|
* after freeing.
|
|
*/
|
|
end = (uint32_t *)&cp[copysize];
|
|
for (lp = (uint32_t *)cp; lp < end; lp++)
|
|
*lp = WEIRD_ADDR;
|
|
freep->type = M_FREE;
|
|
#endif /* DIAGNOSTIC */
|
|
|
|
The following checks are the counterparts in free(), which call panic() when
|
|
the checks fail, causing a system halt (this obviously has a better security
|
|
benefit than just the information approach taken by Linux's SLAB
|
|
diagnostics):
|
|
|
|
#ifdef DIAGNOSTIC
|
|
...
|
|
if (__predict_false(freep->spare0 == WEIRD_ADDR)) {
|
|
for (cp = kbp->kb_next; cp;
|
|
cp = ((struct freelist *)cp)->next) {
|
|
if (addr != cp)
|
|
continue;
|
|
printf("multiply freed item %p\n", addr);
|
|
panic("free: duplicated free");
|
|
}
|
|
}
|
|
...
|
|
copysize = size < MAX_COPY ? size : MAX_COPY;
|
|
end = (int32_t *)&((caddr_t)addr)[copysize];
|
|
for (lp = (int32_t *)addr; lp < end; lp++)
|
|
*lp = WEIRD_ADDR;
|
|
freep->type = ksp;
|
|
#endif /* DIAGNOSTIC */
|
|
|
|
Once the object is released, the 32-bit value is copied, along with the type
|
|
information to detect the potential origin of the problem. This should be
|
|
enough to catch basic forms of freelist corruption.
|
|
|
|
It's worth noting that the freelist_sanitycheck() function provides
|
|
integrity checking for the freelist, but is enclosed in an ifdef 0 block.
|
|
|
|
The problem affecting these diagnostic checks is the use of known values, as
|
|
with Linux's own SLAB redzoning and poisoning, which might be easily bypassed in
|
|
a deliberate attack scenario. It still remains slightly more effective due
|
|
to the system halt enforced upon detection, which isn't present in Linux.
|
|
|
|
Other sanity checks are done with the reference counters in free():
|
|
|
|
if (ksp->ks_inuse == 0)
|
|
panic("free 1: inuse 0, probable double free");
|
|
|
|
And validating (with a simple address range test) if the pointer being
|
|
freed looks sane:
|
|
|
|
if (__predict_false((vaddr_t)addr < vm_map_min(kmem_map) ||
|
|
(vaddr_t)addr >= vm_map_max(kmem_map)))
|
|
panic("free: addr %p not within kmem_map", addr);
|
|
|
|
Ultimately, users of either NetBSD or OpenBSD might want to enable
|
|
KMEMSTATS or DIAGNOSTIC configurations to provide basic protection against
|
|
heap corruption in those systems.
|
|
|
|
---[ 3.4 Microsoft Windows 7 kernel pool allocator safe unlinking
|
|
|
|
On 26 May 2009, a suspiciously timed article was published by Peter
|
|
Beck from the Microsoft Security Engineering Center (MSEC) Security
|
|
Science team, about the inclusion of safe unlinking into the Windows 7
|
|
kernel pool (the equivalent to the slab allocators in Linux).
|
|
|
|
This has received a great deal of publicity for a change which amounts to
|
|
two lines of effective code, and surprisingly enough, was already
|
|
present in non-retail versions of Vista. In addition, safe unlinking
|
|
has been present in other heap allocators for a long time: in the GNU
|
|
libc since at least 2.3.5 (proposed by Stefan Esser originally to Solar
|
|
Designer for the Owl libc) and the Linux kernel since 2006
|
|
(CONFIG_DEBUG_LIST).
|
|
|
|
While it is out of scope for this paper to explain the internals of the
|
|
Windows kernel pool allocator, this section will provide a short
|
|
overview of it. For true insight, the slides by Kostya Kortchinsky,
|
|
"Exploiting Kernel Pool Overflows" [14], can provide a thorough look at
|
|
it from a sound security perspective.
|
|
|
|
The allocator is very similar to SLAB and the API to obtain allocations
|
|
and release them is straightforward (nt!ExAllocatePool(WithTag),
|
|
nt!ExFreePool(WithTag) and so forth). The default pools (sort of a
|
|
kmem_cache equivalent) are the (two) paged, non-paged and session paged
|
|
ones: non-paged for physical memory allocations and paged for pageable
|
|
memory. The structure defining a pool can be seen below:
|
|
|
|
kd> dt nt!_POOL_DESCRIPTOR
|
|
+0x000 PoolType : _POOL_TYPE
|
|
+0x004 PoolIndex : Uint4B
|
|
+0x008 RunningAllocs : Uint4B
|
|
+0x00c RunningDeAllocs : Uint4B
|
|
+0x010 TotalPages : Uint4B
|
|
+0x014 TotalBigPages : Uint4B
|
|
+0x018 Threshold : Uint4B
|
|
+0x01c LockAddress : Ptr32 Void
|
|
+0x020 PendingFrees : Ptr32 Void
|
|
+0x024 PendingFreeDepth : Int4B
|
|
+0x028 ListHeads : [512] _LIST_ENTRY
|
|
|
|
The most important member in the structure is ListHeads, which contains
|
|
512 linked lists, to hold the free chunks. The granularity of
|
|
the allocator is 8 bytes for Windows XP and up, and 32 bytes for
|
|
Windows 2000. The maximum allocation size possible is 4080 bytes.
|
|
LIST_ENTRY is exactly the same as LIST_HEAD in Linux.
|
|
|
|
Each chunk contains an 8 byte header. The chunk header is defined as
|
|
follows for Windows XP and up:
|
|
|
|
kd> dt nt!_POOL_HEADER
|
|
+0x000 PreviousSize : Pos 0, 9 Bits
|
|
+0x000 PoolIndex : Pos 9, 7 Bits
|
|
+0x002 BlockSize : Pos 0, 9 Bits
|
|
+0x002 PoolType : Pos 9, 7 Bits
|
|
+0x000 Ulong1 : Uint4B
|
|
+0x004 ProcessBilled : Ptr32 _EPROCESS
|
|
+0x004 PoolTag : Uint4B
|
|
+0x004 AllocatorBackTraceIndex : Uint2B
|
|
+0x006 PoolTagHash : Uint2B
|
|
|
|
|
|
The PreviousSize contains the value of the BlockSize of the previous
|
|
chunk, or zero if it's the first. This value could be checked during
|
|
unlinking for additional safety, but this isn't the case (their checks
|
|
are limited to validity of prev/next pointers relative to the entry
|
|
being deleted). PoolType is zero if free, and PoolTag contains four
|
|
printable characters to identify the user of the allocation. This isn't
|
|
authenticated nor verified in any way, therefore it is possible to
|
|
provide a bogus tag to one of the allocation or free APIs.
|
|
|
|
For small allocations, the pool allocator uses lookaside caches, with a
|
|
maximum BlockSize of 256 bytes.
|
|
|
|
Kostya's approach to abuse pool allocator overflows involves the
|
|
classic write-4 primitive through unlinking of a fake chunk under his
|
|
control. For the rest of information about the allocator internals,
|
|
please refer to his excellent slides [14].
|
|
|
|
The minimal change introduced by Microsoft to enable safe unlinking in
|
|
Windows 7 was already present in Vista non-retail builds, thus it is
|
|
likely that the announcement was merely a marketing exercise.
|
|
Furthermore, Beck states that this allows detection of "memory corruption
|
|
at the earliest opportunity", which would only be correct if they
|
|
had pursued a more complete solution (for example, verifying that
|
|
pointers belong to actual freelist chunks). Such checks might incur a
|
|
higher performance overhead, but provide far more consistent
|
|
protection.
|
|
|
|
The affected API is RemoveEntryList(), and the result of unlinking an
|
|
entry with incorrect prev/next pointers will be a BugCheck:
|
|
|
|
Flink = Entry->Flink;
|
|
Blink = Entry->Blink;
|
|
if (Flink->Blink != Entry) KeBugCheckEx(...);
|
|
if (Blink->Flink != Entry) KeBugCheckEx(...);
|
|
|
|
It's unlikely that there will be further changes to the pool allocator
|
|
for Windows 7, but there's still time for this to change before release
|
|
date.
|
|
|
|
------[ 4. Sanitizing memory of the look-aside caches
|
|
|
|
The objects and data contained in slabs allocated within the kmem
|
|
caches could be of sensitive nature, including but not limited to:
|
|
cryptographic secrets, PRNG state information, network information,
|
|
userland credentials and potentially useful internal kernel state
|
|
information to leverage an attack (including our guards or cookie
|
|
values).
|
|
|
|
In addition, neither kfree() nor kmalloc() zero memory, thus allowing
|
|
the information to stay there for an indefinite time, unless it is
|
|
overwritten after the space is claimed in an allocation procedure. This
|
|
is a security risk by itself, since an attacker could essentially rely
|
|
on this condition to "spray" the kernel heap with his own fake
|
|
structures or machine instructions to further improve the reliability
|
|
of his attack.
|
|
|
|
PaX already provides a feature to sanitize memory upon release, at a
|
|
performance cost of roughly 3%. This is an opt-all policy, thus it
|
|
is not possible to choose in a fine-grained manner what memory is
|
|
sanitized and what isn't. Also, it works at the lowest level possible,
|
|
the page allocator. While this is a safe approach and ensures that all
|
|
allocated memory is properly sanitized, it is desirable to be able to
|
|
opt-in voluntarily to have your newly allocated memory treated as
|
|
sensitive.
|
|
|
|
Hence, a GFP_SENSITIVE flag has been introduced. While a security
|
|
conscious developer could zero memory on his own, the availability of a
|
|
flag to assure this behavior (as well as other enhancements and safety
|
|
checks) is convenient. Also, the performance cost is negligible, if
|
|
any, since the flag could be applied to specific allocations or caches
|
|
altogether.
|
|
|
|
The low level page allocator uses a PG_sensitive flag internally, with
|
|
the associated SetPageSensitive, ClearPageSensitive and PageSensitive
|
|
macros. These changes have been introduced in the linux/page-flags.h
|
|
header and mm/page_alloc.c.
|
|
|
|
SLAB / kmalloc layer Low-level page allocator
|
|
include/linux/slab.h include/linux/page-flags.h
|
|
|
|
+----------------. +--------------+
|
|
| SLAB_SENSITIVE | ->| PG_sensitive |
|
|
+----------------. | +--------------+
|
|
| | |-> SetPageSensitive
|
|
| +---------------+ | |-> ClearPageSensitive
|
|
\---> | GFP_SENSITIVE |-/ |-> PageSensitive
|
|
+---------------+ ...
|
|
|
|
This will prevent the aforementioned leak of information post-release,
|
|
and provide an easy to use mechanism for third-party developers to take
|
|
advantage of the additional assurance provided by this feature.
|
|
|
|
In addition, another loophole that has been removed is related to
|
|
situations in which successive allocations are done via kmalloc(), and
|
|
the information is still accessible through the newly allocated object.
|
|
This happens when the slab is never released back to the page
|
|
allocator, since slabs can live for an indefinite amount of time
|
|
(there's no assurance as to when the cache will go through shrinkage or
|
|
reaping). Upon release, the cache can be checked for the SLAB_SENSITIVE
|
|
flag, the page can be checked for the PG_sensitive bit, and the
|
|
allocation flags can be checked for GFP_SENSITIVE.
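
The release-time decision can be sketched in user space as below. The
flag values, structures and function name are hypothetical. In particular
this toy keeps the allocation flag in a per-object record, whereas the
patch described here tracks it with a page bit.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SLAB_SENSITIVE 0x1u    /* cache-wide flag, value made up */
#define TOY_GFP_SENSITIVE  0x2u    /* per-allocation flag, value made up */

struct toy_cache {
    unsigned int flags;
    size_t objsize;
};

struct toy_obj_meta {
    unsigned int gfp;       /* flags the object was allocated with */
};

/* Zeroes the object on release if either the cache or the allocation
 * was marked sensitive. Returns 1 if the object was sanitized. */
static int toy_release(const struct toy_cache *c,
                       const struct toy_obj_meta *m, void *obj)
{
    if ((c->flags & TOY_SLAB_SENSITIVE) ||
        (m->gfp & TOY_GFP_SENSITIVE)) {
        memset(obj, 0, c->objsize);
        return 1;
    }
    return 0;
}
```

Because the object is cleared when it is released and not when the slab
returns to the page allocator, the next kmalloc() from the same slab
cannot observe the previous contents.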
|
|
|
|
Currently, the following interfaces have been modified to operate with
|
|
this flag when appropriate:
|
|
|
|
- IPC kmem cache
|
|
- Cryptographic subsystem (CryptoAPI)
|
|
- TTY buffer and auditing API
|
|
- WEP encryption and decryption in mac80211 (key storage only)
|
|
- AF_KEY sockets implementation
|
|
- Audit subsystem
|
|
|
|
The RBAC engine in grsecurity can be modified to add support for
|
|
enabling the sensitive memory flag per-process. Also, a group id based
|
|
check could be added, configurable via sysctl. This will allow
|
|
fine-grained policy or group based deployment of the current and future
|
|
benefits of this flag. SELinux and any other policy based security
|
|
frameworks could benefit from this feature as well.
|
|
|
|
This patchset has been proposed to the mainline kernel developers as of
|
|
May 21st 2009 (see http://patchwork.kernel.org/patch/25062). It
|
|
received feedback from Alan Cox and Rik van Riel and a different
|
|
approach was used after some developers objected to the use of a page
|
|
flag, since the functionality can be provided to SLAB/SLUB allocators
|
|
and the VMA interfaces without the use of a page flag. Also, the naming
|
|
changed to CONFIDENTIAL, to avoid confusion with the term 'sensitive'.
|
|
|
|
Unfortunately, without a page bit, it's impossible to track down what
|
|
pages shall be sanitized upon release, and provide fine-grained control
|
|
over these operations, making the gfp flag almost useless, as well as
|
|
other interesting features, like sanitizing pages locked via mlock().
|
|
The mainline kernel developers oppose the introduction of a new page
|
|
flag, even though SLUB and SLOB introduced their own flags when they
|
|
were merged, and this wasn't frowned upon in such cases. Hopefully this
|
|
will change in the future, and allow a more complete approach to be
|
|
merged in mainline at some point.
|
|
|
|
Despite the fact that Ingo Molnar, Pekka Enberg and Peter Zijlstra
|
|
completely missed the point about the initially proposed patches,
|
|
new ones performing selective sanitization were sent following up their
|
|
recommendations of a completely flawed approach. This case serves as a
|
|
good example of how kernel developers without security knowledge or
|
|
experience make decisions that negatively impact security-conscious users of the
|
|
Linux kernel as a whole.
|
|
|
|
Hopefully, in order to provide a reliable protection, the upstream
|
|
approach will finally be selective sanitization using kzfree(),
|
|
allowing us to redefine it to kfree() in the appropriate header file,
|
|
and use something that actually works. Fixing a broken implementation
|
|
is an undesirable burden often found when dealing with the 2.6 branch
|
|
of the kernel, as usual.
|
|
|
|
------[ 5. Deterrence of IPC based kmalloc() overflow exploitation
|
|
|
|
In addition to the rest of the features which provide a generic
|
|
protection against common scenarios of kernel heap corruption, a
|
|
modification has been introduced to deter a specific local attack for
|
|
abusing kmalloc() overflows successfully. This technique is currently
|
|
the only public approach to kernel heap buffer overflow exploitation
|
|
and relies on the following circumstances:
|
|
|
|
1. The attacker has local access to the system and can use the IPC
|
|
subsystem, more specifically, create, destroy and perform
|
|
operations on semaphores.
|
|
|
|
   2. The attacker is able to abuse an allocate-overflow-free situation
|
|
which can be leveraged to overwrite adjacent objects, also
|
|
allocated via kmalloc() within the same kmem cache.
|
|
|
|
   3. The attacker can trigger the overflow with the right timing to
|
|
ensure that the adjacent object overwritten is under his
|
|
      control. In this case, that object is the shmid_kernel structure (used
|
|
internally within the IPC subsystem), leading to a userland
|
|
pointer dereference, pointing at attacker controlled structures.
|
|
|
|
4. Ultimately, when these attacker controlled structures are used
|
|
by the IPC subsystem, a function pointer is called. Since the
|
|
attacker controls this information, this is essentially a
|
|
game-over scenario. The kernel will execute arbitrary code of
|
|
the attacker's choice and this will lead to elevation of
|
|
privileges.
|
|
|
|
Currently, PaX UDEREF [8] on x86 provides solid protection against
|
|
(3) and (4). The attacker will be unable to force the kernel into
|
|
executing instructions located in the userland address space. A
|
|
specific class of vulnerabilities, kernel NULL pointer dereferences
|
|
(which were, for a long time, overlooked and not considered exploitable
|
|
by most of the public players in the security community, with few
|
|
exceptions) was mostly eradicated (thanks to both UDEREF and further
|
|
restrictions imposed on mmap(), later implemented by Red Hat and
|
|
accepted into mainline, albeit containing flaws which made the
|
|
restriction effectively useless).
|
|
|
|
On systems where using UDEREF is unbearable for performance or
|
|
functionality reasons (for example, virtualization), a workaround to
|
|
harden the IPC subsystem was necessary. Hence, a set of simple safety
|
|
checks were devised for the shmid_kernel structure, and the allocation
|
|
helper functions have been modified to use their own private cache.
|
|
|
|
The function pointer verification checks if the pointers located within
|
|
the file structure are actually addresses within the kernel text range
|
|
(including modules).
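
A user-space model of such a range test is shown below. The structure,
the function names and the example addresses are assumptions for this
sketch. In the kernel, the core text boundaries and the module list would
come from the existing kernel symbols and module bookkeeping.

```c
#include <stdint.h>

/* Half-open address range [start, end). */
struct text_range {
    uintptr_t start;
    uintptr_t end;
};

static int in_range(const struct text_range *r, uintptr_t addr)
{
    return addr >= r->start && addr < r->end;
}

/* Returns 1 if fptr lies in the core kernel text or in the text of any
 * loaded module, 0 otherwise (for example, a userland address). */
static int fptr_is_kernel_text(uintptr_t fptr,
                               const struct text_range *core,
                               const struct text_range *mods, int nmods)
{
    int i;

    if (in_range(core, fptr))
        return 1;
    for (i = 0; i < nmods; i++)
        if (in_range(&mods[i], fptr))
            return 1;
    return 0;
}
```

A function pointer planted by the attacker in a userland structure fails
this test before it is ever called.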
|
|
|
|
The internal allocation procedures of the IPC code make use of both
|
|
vmalloc() and kmalloc(), for sizes greater than a page or lower than a
|
|
page, respectively. Thus, the size for the cache objects is PAGE_SIZE,
|
|
which might be suboptimal in terms of memory space, but does not impact
|
|
performance. These changes have been tested using the IBM ipc_stress
|
|
test suite distributed in the Linux Test Project sources, with
|
|
successful results (can be obtained from http://ltp.sourceforge.net).
|
|
|
|
------[ 6. Prevention of copy_to_user() and copy_from_user() abuse
|
|
|
|
A vast amount of kernel vulnerabilities involving information leaks to
|
|
userland, as well as buffer overflows when copying data from userland,
|
|
are caused by signedness issues (meaning integer overflows, reference
|
|
counter overflows, et cetera). The common scenario is an invalid
|
|
integer passed to the copy_to_user() or copy_from_user() functions.
|
|
|
|
During the development of KERNHEAP, a question was raised about these
|
|
functions: Is there an existing, reliable API which allows retrieval of
|
|
the target buffer information in both copy-to and copy-from scenarios?
|
|
|
|
Introducing size awareness in these functions would provide a simple,
|
|
yet effective method to deter both information leaks and buffer
|
|
overflows through them. Obviously, like in every security system, the
|
|
effectiveness of this approach is orthogonal to the deployment of other
|
|
measures, to prevent potential corner cases and rare situations useful
|
|
for an attacker to bypass the safety checks.
|
|
|
|
The current kernel heap allocators (including SLOB) provide a function
|
|
to retrieve the size of a slab object, as well as testing the validity
|
|
of a pointer to see if it's within the known caches (excluding SLOB
|
|
which required this function to be written since it's essentially a
|
|
no-op in upstream sources). These functions are ksize() and
|
|
kmem_validate_ptr() respectively (in each pertinent allocator source:
|
|
mm/slab.c, mm/slub.c and mm/slob.c).
|
|
|
|
In order to detect whether a buffer is stack or heap based in the
|
|
kernel, the object_is_on_stack() function (from include/linux/sched.h)
|
|
can be used. The drawback of these functions is the computational cost
|
|
of looking up the page where this buffer is located, checking its
|
|
validity wherever applicable (in the case of kmem_validate_ptr() this
|
|
involves validating against a known cache) and performing other tasks
|
|
to determine the validity and properties of the buffer. Nonetheless,
|
|
the performance impact might be negligible and reasonable for the
|
|
additional assurance provided with these changes.
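
The idea of a size-aware copy can be sketched in userland as follows.
This is a toy model under stated assumptions: ToyHeap stands in for the
allocator's object lookup (the role ksize() and kmem_validate_ptr()
play in the text), and copy_allowed() is a hypothetical name, not the
check shipped in the PaX or grsecurity patches.

```python
class ToyHeap:
    """Tracks live objects as base address -> size (stand-in for ksize())."""

    def __init__(self):
        self.objects = {}

    def add(self, base, size):
        self.objects[base] = size

    def remaining(self, ptr):
        """Bytes from ptr to the end of its object, or None if unknown."""
        for base, size in self.objects.items():
            if base <= ptr < base + size:
                return base + size - ptr
        return None


def copy_allowed(heap, ptr, n):
    """Allow the copy unless ptr is a known object and n overruns it."""
    left = heap.remaining(ptr)
    if left is None:
        return True   # not a tracked heap object: this sketch cannot judge
    return 0 <= n <= left
```

The linear search in remaining() is where the lookup cost mentioned
above would be paid; a real allocator resolves the owning page and
cache instead of walking a dictionary.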
|
|
|
|
Brad Spengler devised this idea, developed and introduced the checks
|
|
into the latest test patches as of April 27th (test10 to test11 from
|
|
PaX and the grsecurity counterparts for the current kernel stable
|
|
release, 2.6.29.1).
|
|
|
|
A reliable method to detect stack-based objects is still being
|
|
considered for implementation, and might require access to meta-data
|
|
used for debuggers or future GCC built-ins.
|
|
|
|
------[ 7. Prevention of vsyscall overwrites on x86_64
|
|
|
|
This technique is used in sgrakkyu's exploit for CVE-2009-0065. It
|
|
involves overwriting an x86_64-specific location within a top memory
|
|
allocated page, containing the vsyscall mapping. This mapping is used
|
|
to implement a high performance entry point for the gettimeofday()
|
|
system call, and other functionality.
|
|
|
|
An attacker can target this mapping by means of an arbitrary write-N
|
|
primitive and overwrite the machine instructions there to produce a
|
|
reliable return vector, for both remote and local attacks. For remote
|
|
attacks the attacker will likely use an offset-aware approach for
|
|
reliability, but locally it can be used to execute an offset-less
|
|
attack, and force the kernel into dereferencing userland memory. This
|
|
is problematic since presently PaX does not support UDEREF on x86_64
|
|
and the performance cost of its implementation could be significant,
|
|
making abuse a safe bet even against hardened environments.
|
|
|
|
Therefore, contrary to past popular belief, x86_64 systems are more
|
|
exposed than i386 in this regard.
|
|
|
|
During conversations with the PaX Team, some difficulties came to
|
|
attention regarding potential approaches to deter this technique:
|
|
|
|
1. Modifying the location of the vsyscall mapping will break
|
|
compatibility. Thus, glibc and other userland software would
|
|
require further changes. See arch/x86/kernel/vmlinux_64.lds.S
|
|
and arch/x86/kernel/vsyscall_64.c
|
|
|
|
2. The vsyscall page is defined within the ld linker script for
|
|
x86_64 (arch/x86/kernel/vmlinux_64.lds.S). It is defined by
|
|
default (as of 2.6.29.3) within the boundaries of the .data
|
|
section, thus writable for the kernel. The userland mapping
|
|
is read-execute only.
|
|
|
|
3. Removing vsyscall support might have a large performance impact
|
|
on applications making extensive use of gettimeofday().
|
|
|
|
4. Some data has to be written in this region, therefore it can't
|
|
be permanently read-only.
|
|
|
|
PaX provides a write-protect mechanism used by KERNEXEC, together with
|
|
its definition for an actual working read-only .rodata implementation.
|
|
Moving the vsyscall page into the .rodata section provides reliable
|
|
protection against this technique. In order to prevent sections from
|
|
overlapping, some changes had to be introduced, since the section has
|
|
to be aligned to page size. In non-PaX kernels, .rodata is only
|
|
protected if the CONFIG_DEBUG_RODATA option is enabled.
|
|
|
|
The PaX Team solved {4} using pax_open_kernel() and pax_close_kernel()
|
|
to allow writes temporarily. This has some performance impact but is
|
|
most likely far lower than removing vsyscall support completely.
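
The open/close pattern can be pictured with a small userland model.
This is only an analogy: ProtectedRegion and open_for_write() are
invented names, and nothing here reflects how pax_open_kernel() and
pax_close_kernel() are implemented.

```python
from contextlib import contextmanager


class ProtectedRegion:
    """Byte buffer that refuses writes unless explicitly opened."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._writable = False

    def write(self, offset, payload):
        if not self._writable:
            raise PermissionError("region is read-only")
        self._data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        return bytes(self._data[offset:offset + length])

    @contextmanager
    def open_for_write(self):
        """Rough analogue of an open/close pair around a legitimate write."""
        self._writable = True
        try:
            yield self
        finally:
            self._writable = False
```

Legitimate updates happen inside the with block; a stray write outside
of it fails, which is the property the read-only mapping is meant to
provide.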
|
|
|
|
This deters abuse of the vsyscall page on x86_64, and prevents
|
|
offset-based remote and offset-less local exploits from leveraging a
|
|
reliable attack against a kernel vulnerability. Nonetheless, protection
|
|
against this avenue of attack is still a work in progress.
|
|
|
|
------[ 8. Developing the right regression testsuite for KERNHEAP
|
|
|
|
Shortly after the initial development process started, it became
|
|
evident that a decent set of regression tests was required to check if
|
|
the implementation worked as expected. While using single loadable
|
|
modules for each test was a straightforward solution, in the long term,
|
|
having a real tool to perform thorough testing seemed the most logical
|
|
approach.
|
|
|
|
Hence, KHTEST has been developed. It's composed of a kernel module
|
|
which communicates with a userland Python program over Netlink sockets.
|
|
The ctypes API is used to handle the low level structures that define
|
|
commands and replies. The kernel module exposes internal APIs to the
|
|
userland process, such as:
|
|
|
|
- kmalloc
|
|
- kfree
|
|
- memset and memcpy
|
|
- copy_to_user and copy_from_user
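
To give an idea of how ctypes can describe such command structures, a
minimal sketch follows. The layout, the field names and the opcode
values are hypothetical; the actual KHTEST definitions live in the
sources distributed with the paper and may differ.

```python
import ctypes

# Hypothetical opcodes; the real KHTEST values are not given in the text.
CMD_KMALLOC, CMD_KFREE = 1, 2


class KhCommand(ctypes.LittleEndianStructure):
    """Made-up fixed-size request layout, for illustration only."""
    _fields_ = [
        ("opcode", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("size", ctypes.c_uint64),
        ("addr", ctypes.c_uint64),
    ]


def pack(cmd):
    """Serialize a command into the raw bytes sent over the socket."""
    return bytes(cmd)


def unpack(raw):
    """Rebuild a command from raw bytes received from the other side."""
    return KhCommand.from_buffer_copy(raw)
```

A fixed-size structure keeps the kernel side simple, since the packet
handler can validate the length before touching any field.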
|
|
|
|
Using this interface, allocation and release of kernel memory can be
|
|
controlled with a simple Python script, allowing efficient development
|
|
of testcases:
|
|
|
|
e = KernHeapTester()
|
|
addr = e.kmalloc(size)
|
|
e.kfree(addr)
|
|
e.kfree(addr)
|
|
|
|
When this test runs on an unprotected 2.6.29.2 system (SLAB as
|
|
allocator, debugging capabilities enabled) the following output can be
|
|
observed in the kernel message buffer, with a subsequent BUG on cache
|
|
reaping:
|
|
|
|
KERNHEAP test-suite loaded.
|
|
run_cmd_kmalloc: kmalloc(64, 000000b0) returned 0xDF1BEC30
|
|
run_cmd_kfree: kfree(0xDF1BEC30)
|
|
run_cmd_kfree: kfree(0xDF1BEC30)
|
|
slab error in verify_redzone_free(): cache `size-64': double free detected
|
|
Pid: 3726, comm: python Not tainted 2.6.29.2-grsec #1
|
|
Call Trace:
|
|
[<c0889a81>] __slab_error+0x1a/0x1c
|
|
[<c088aee9>] cache_free_debugcheck+0x137/0x1f5
|
|
[<e082f25c>] ? run_cmd_kfree+0x1e/0x23 [kernheap_test]
|
|
[<c088ba14>] kfree+0x9d/0xd2
|
|
[<e082f25c>] run_cmd_kfree+0x1e/0x23
|
|
|
|
kernel BUG at mm/slab.c:2720!
|
|
invalid opcode: 0000 [#1] SMP
|
|
last sysfs file: /sys/kernel/uevent_seqnum
|
|
Pid: 10, comm: events/0 Not tainted (2.6.29.2-grsec #1) VMware Virtual Platform
|
|
EIP: 0060:[<c088ac00>] EFLAGS: 00010092 CPU: 0
|
|
EIP is at slab_put_obj+0x59/0x75
|
|
EAX: 0000004f EBX: df1be000 ECX: c0828819 EDX: c197c000
|
|
ESI: 00000021 EDI: df1bec28 EBP: dfb3deb8 ESP: dfb3de9c
|
|
DS: 0068 ES: 0068 FS: 00d8 GS: 0000 SS: 0068
|
|
Process events/0 (pid: 10, ti=dfb3c000 task=dfb3ae30 task.ti=dfb3c000)
|
|
Stack:
|
|
c0bc24ee c0bc1fd7 df1bec28 df800040 df1be000 df8065e8 df800040 dfb3dee0
|
|
c088b42d 00000000 df1bec28 00000000 00000001 df809db4 df809db4 00000001
|
|
df809d80 dfb3df00 c088be34 00000000 df8065e8 df800040 df8065e8 df800040
|
|
Call Trace:
|
|
[<c088b42d>] ? free_block+0x98/0x103
|
|
[<c088be34>] ? drain_array+0x85/0xad
|
|
[<c088beba>] ? cache_reap+0x5e/0xfe
|
|
[<c083586a>] ? run_workqueue+0xc4/0x18c
|
|
[<c088be5c>] ? cache_reap+0x0/0xfe
|
|
[<c0838593>] ? kthread+0x0/0x59
|
|
[<c0803717>] ? kernel_thread_helper+0x7/0x10
|
|
|
|
The following code presents a more complex test to evaluate a
|
|
double-free situation which will put a random kmalloc cache into an
|
|
unpredictable state:
|
|
|
|
import os
import random

e = KernHeapTester()
addrs = []
kmalloc_sizes = [32, 64, 96, 128, 196, 256, 1024, 2048, 4096]

# Allocate 1024 objects of randomly chosen sizes.
for _ in range(1024):
    addrs.append(e.kmalloc(random.choice(kmalloc_sizes)))

random.seed(os.urandom(32))
random.shuffle(addrs)

# Free one random object now; the loop below frees it a second time.
e.kfree(random.choice(addrs))
random.shuffle(addrs)

for addr in addrs:
    e.kfree(addr)
|
|
|
|
On a KERNHEAP protected host:
|
|
|
|
Kernel panic - not syncing: KERNHEAP: Invalid kfree() in (objp
|
|
df38e000) by python:3643, UID:0 EUID:0
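
Test scripts of this kind can be dry-run without loading the module by
using a userland stand-in. The class below is a hypothetical sketch
that only mimics the bookkeeping (it allocates no kernel memory and is
not part of KHTEST); it raises an exception where a protected kernel
would refuse the invalid kfree().

```python
import itertools


class InvalidFree(Exception):
    """Raised when an address is released that is not currently allocated."""


class FakeKernHeapTester:
    """Userland stand-in that mimics kmalloc/kfree bookkeeping only."""

    def __init__(self):
        self._live = {}
        self._next = itertools.count(0x1000, 0x1000)  # fake addresses

    def kmalloc(self, size):
        addr = next(self._next)
        self._live[addr] = size
        return addr

    def kfree(self, addr):
        if addr not in self._live:
            raise InvalidFree("invalid kfree(0x%x)" % addr)
        del self._live[addr]
```

Swapping FakeKernHeapTester for the real tester lets the logic of a
testcase be checked before it is allowed to panic a machine.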
|
|
|
|
The testsuite sources (including both the Python module and the LKM for
|
|
the 2.6 series, tested with 2.6.29) are included along with this paper.
|
|
Adding support for new kernel APIs should be a trivial task, requiring
|
|
only modification of the packet handler and the appropriate addition of
|
|
a new command structure. Potential improvements include the use of a
|
|
shared memory page instead of Netlink responses, to avoid impacting the
|
|
allocator state or conflicting with our tests.
|
|
|
|
------[ 9. The Inevitability of Failure
|
|
|
|
In 1998, members (Loscocco, Smalley et al.) of the Information Assurance
|
|
Group at the NSA published a paper titled "The Inevitability of Failure:
|
|
The Flawed Assumption of Security in Modern Computing Environments"
|
|
[12].
|
|
|
|
The paper explains how modern computing systems lacked the necessary
|
|
features and capabilities for providing true assurance, to prevent
|
|
compromise of the information contained in them. As systems were
|
|
becoming more and more connected to networks, which were growing
|
|
exponentially, the exposure of these systems grew proportionally.
|
|
Therefore, the state of the art in security had to progress at a
similar pace.
|
|
|
|
From an academic standpoint, it is interesting to observe that more
|
|
than 10 years later, the state of the art in security hasn't evolved
|
|
dramatically, but threats have gone well beyond the initial
|
|
expectations.
|
|
|
|
"Although public awareness of the need for security
|
|
in computing systems is growing rapidly, current
|
|
efforts to provide security are unlikely to succeed.
|
|
Current security efforts suffer from the flawed
|
|
assumption that adequate security can be provided in
|
|
applications with the existing security mechanisms of
|
|
mainstream operating systems. In reality, the need for
|
|
secure operating systems is growing in today's computing
|
|
environment due to substantial increases in
|
|
connectivity and data sharing." Page 1, [12]
|
|
|
|
Most of the authors of this paper were involved in the development of
|
|
the Flux Advanced Security Kernel (FLASK), at the University of Utah.
|
|
Flask itself has its roots in the Distributed Trusted Operating System
(DTOS), a joint project carried out in 1992 and 1993 by the National
Security Agency and what was then known as Secure Computing Corporation
(SCC, acquired by McAfee in 2008). DTOS inherited the
|
|
development and design ideas of a previous project named DTMach
|
|
(Distributed Trusted Mach) which aimed to introduce a flexible access
|
|
control framework into the GNU Mach microkernel. Type Enforcement was
|
|
first introduced in DTMach, superseded in Flask with a more flexible
|
|
design which allowed far greater granularity (supporting mixing of
|
|
different types of labels, beyond only types, such as sensitivity,
|
|
roles and domains).
|
|
|
|
Type Enforcement is a simple concept: Mandatory Access Control (MAC)
takes precedence over Discretionary Access Control (DAC) to keep
subjects (processes, users) from accessing or manipulating objects
(files, sockets, directories), based on the decision the security system
makes from a policy and the subject's attached security context.
|
|
A subject can undergo a transition from one security context to another
|
|
(for example, due to role change) if it's explicitly allowed by the
|
|
policy. This design allows fine-grained, albeit complex, decision
|
|
making.
|
|
|
|
Essentially, MAC means that everything is forbidden unless explicitly
|
|
allowed by a policy. Moreover, the MAC framework is fully integrated
|
|
into the system internals in order to catch every possible data access
|
|
situation and store state information.
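
The default-deny principle and the explicit transitions described above
can be condensed into a few lines. This is a toy sketch, not SELinux
policy syntax: the rule tuples, the transition table and the function
names are assumptions made for the example.

```python
# Hypothetical allow rules: (source type, target type, class, permission)
POLICY = {
    ("httpd_t", "httpd_sys_content_t", "file", "read"),
    ("httpd_t", "httpd_log_t", "file", "append"),
}

# Hypothetical permitted domain transitions: (old type, new type)
TRANSITIONS = {("init_t", "httpd_t")}


def allowed(source, target, obj_class, perm, policy=POLICY):
    """Default deny: access is granted only if an explicit rule exists."""
    return (source, target, obj_class, perm) in policy


def transition(current, new, transitions=TRANSITIONS):
    """Return the new type if the policy permits the transition."""
    if (current, new) not in transitions:
        raise PermissionError("transition %s -> %s denied" % (current, new))
    return new
```

Note that there is no "deny" rule anywhere: whatever the policy does
not list is refused, which is what separates this model from DAC.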
|
|
|
|
The true benefits of these systems could be exercised mostly in
|
|
military or government environments, where models such as Multi-Level
|
|
Security (MLS) are far more applicable than for the general public.
|
|
|
|
Flask was implemented in the Fluke research operating system (using the
|
|
OSKit framework) and ultimately led to the development of SELinux, a
|
|
modification of the Linux kernel, initially standalone and ported
|
|
afterwards to use the Linux Security Modules (LSM) framework when its
|
|
inclusion into mainline was rejected by Linus Torvalds. Flask is also
|
|
the basis for TrustedBSD and OpenSolaris FMAC. Apple's XNU kernel,
|
|
albeit being largely based off FreeBSD (which includes TrustedBSD
|
|
modifications since 6.0) decided to implement its own security
|
|
mechanism (non-MAC) known as Seatbelt, with its own policy language.
|
|
|
|
While the development of these systems represents a significant step
|
|
towards more secure operating systems, without doubt, the real-world
|
|
perspective is of a slightly more bleak nature. These systems have
|
|
steep learning curves (their policy languages are powerful but complex,
|
|
their nature is intrinsically complicated and there's little freely
|
|
available support for them, plus the communities dedicated to them are
|
|
fairly small and generally oriented towards development), impose strict
|
|
restrictions to the system and applications, and in several cases,
|
|
might be overkill to the average user or administrator.
|
|
|
|
A security system which requires (expensive, lengthy) specialized
|
|
training is dramatically prone to being disabled by most of its
|
|
potential users. This is the reality of SELinux in Fedora and other
|
|
systems. The default policies aren't realistic and users will need to
|
|
write their own modules if they want to use custom software. In
|
|
addition, the solution to this problem was less than optimal: the
|
|
targeted (now modular) policy was born.
|
|
|
|
The SELinux targeted policy (used by default in Fedora 10) is
|
|
essentially a contradiction of the premises of MAC altogether. Most
|
|
applications run under the unconfined_t domain, while a small set of
|
|
daemons and other tools run confined under their own domains. While
|
|
this allows basic, usable security to be deployed (on a related note,
|
|
XNU Seatbelt follows a similar approach, although unsuccessfully), its
|
|
effectiveness to stop determined attackers is doubtful.
|
|
|
|
For instance, the Apache web server daemon (httpd) runs under the
|
|
httpd_t domain, and is allowed to access only those files labeled with
|
|
the httpd_sys_content_t type. In a PHP local file include scenario this
|
|
will prevent an attacker from loading system configuration files, but
|
|
won't prevent him from reading passwords from a PHP configuration file
|
|
which could provide credentials to connect to the back-end database
|
|
server, and further compromise the system by obtaining any access
|
|
information stored there. In a relatively more complex scenario, a PHP
|
|
code execution vulnerability could be leveraged to access the apache
|
|
process file descriptors, and perhaps abuse a vulnerability to leak
|
|
memory or inject code to intercept requests. Either way, if an attacker
|
|
obtains unconfined_t access, it's a game over situation. This is
|
|
acknowledged in [13], along with an interesting quotation about the
managerial decisions that led to the targeted policy being developed:
|
|
|
|
"SELinux can not cause the phones to ring"
|
|
"SELinux can not cause our support costs to rise."
|
|
Strict Policy Problems, slide 5. [13]
|
|
|
|
---[ 9.1 Subverting SELinux and the audit subsystem
|
|
|
|
Fedora comes with SELinux enabled by default, using the targeted
|
|
policy. In remote and local kernel exploitation scenarios, disabling
|
|
SELinux and the audit framework is desirable, or outright necessary if
|
|
MLS or more restrictive policies are used.
|
|
|
|
In March 2007, Brad Spengler sent a message to a public mailing-list,
|
|
announcing the availability of an exploit abusing a kernel NULL pointer
|
|
dereference (more specifically, an offset from NULL) which disabled all
|
|
LSM modules atomically, including SELinux. tee42-24tee.c exploited a
|
|
vulnerability in the tee() system call, which was silently fixed by
|
|
Jens Axboe from SUSE (as "[patch 25/45] splice: fix problems with
|
|
sys_tee()").
|
|
|
|
Its approach to disable SELinux locally was extremely reliable and
|
|
simple at the same time. Once the kernel continues execution at the code
|
|
in userland, using shellcode is unnecessary. This applies only to local
|
|
exploits normally, and allows offset-less exploitation, resulting in
|
|
greater reliability. All the LSM disabling logic in tee42-24tee.c is
|
|
written in C which can be easily integrated in other local exploits.
|
|
|
|
The disable_selinux() function has two different stages independent
|
|
of each other. The first finds the selinux_enabled 32-bit integer,
|
|
through a linear memory search that looks for a cmp opcode within the
|
|
selinux_ctxid_to_string() function (defined in selinux/exports.c and
|
|
present only in older kernels). In current kernels, a suitable
|
|
replacement is the selinux_string_to_sid() function.
|
|
|
|
Once the address to selinux_enabled is found, its value is set to zero.
|
|
This is the first step towards disabling SELinux. Currently, additional
|
|
targets should be selinux_enforcing (to disable enforcement mode) and
|
|
selinux_mls_enabled.
|
|
|
|
The next step is the atomic disabling of all LSM modules. This stage
|
|
also relies on finding an old function of the LSM framework,
|
|
unregister_security(), which replaced the security_ops with
|
|
dummy_security_ops (a set of default hooks that perform simple DAC
|
|
without any further checks), given that the current security_ops
|
|
matched the ops parameter.
|
|
|
|
This function has disappeared in current kernels, but setting the
|
|
security_ops to default_security_ops achieves the same effect, and it
|
|
should be reasonably easy to find another function to use as reference
|
|
in the memory search. This change was likely part of the facelift that
|
|
LSM underwent to remove the possibility of using the framework in
|
|
loadable kernel modules.
|
|
|
|
With proper fine-tuning and changes to perform additional opcode
checks, it should be just as easy to write SELinux/LSM disabling
functionality for recent kernels that works across different
architectures.
|
|
|
|
For remote exploitation, a typical offset-based approach like that used
|
|
in sgrakkyu's sctp_houdini.c exploit (against x86_64) should be reliable
|
|
and painless. Simply write a zero value to selinux_enforcing,
|
|
selinux_enabled and selinux_mls_enabled (although the first is
sufficient). Furthermore, if we already know the address of security_ops
|
|
and default_security_ops, we can disable LSMs altogether that way too.
|
|
|
|
If an attacker has enough permissions to control a SCTP listener or run
|
|
his own, then remote exploitation on x86_64 platforms can be made
|
|
completely reliable against unknown kernels through the use of the
|
|
vsyscall exploitation technique, to return control to the
attacker-controlled listener at a previously mapped, fixed address of
his choice.
|
|
In this scenario, offset-less SELinux/LSM disabling functionality can
|
|
be used.
|
|
|
|
Fortunately, this isn't even necessary since most Linux distributions
|
|
still ship with world-readable /boot mount points, and their package
|
|
managers don't do anything to solve this when new kernel packages are
|
|
installed:
|
|
|
|
Ubuntu 8.04 (Hardy Heron)
|
|
-rw-r--r-- 1 root 413K /boot/abi-2.6.24-24-generic
|
|
-rw-r--r-- 1 root 79K /boot/config-2.6.24-24-generic
|
|
-rw-r--r-- 1 root 8.0M /boot/initrd.img-2.6.24-24-generic
|
|
-rw-r--r-- 1 root 885K /boot/System.map-2.6.24-24-generic
|
|
-rw-r--r-- 1 root 62M /boot/vmlinux-debug-2.6.24-24-generic
|
|
-rw-r--r-- 1 root 1.9M /boot/vmlinuz-2.6.24-24-generic
|
|
|
|
Fedora release 10 (Cambridge)
|
|
-rw-r--r-- 1 root 84K /boot/config-2.6.27.21-170.2.56.fc10.x86_64
|
|
-rw------- 1 root 3.5M /boot/initrd-2.6.27.21-170.2.56.fc10.x86_64.img
|
|
-rw-r--r-- 1 root 1.4M /boot/System.map-2.6.27.21-170.2.56.fc10.x86_64
|
|
-rwxr-xr-x 1 root 2.6M /boot/vmlinuz-2.6.27.21-170.2.56.fc10.x86_64
|
|
|
|
Perhaps one easy step, before including complex MAC policy based
security frameworks, would be to learn how to use DAC properly. Contact
|
|
your nearest distribution security officer for more information.
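
For administrators who want to check their own systems, a short audit
script is enough. The sketch below lists regular files in a directory
that are readable by "other"; the function name world_readable() is
chosen for the example.

```python
import os
import stat


def world_readable(directory):
    """Return sorted names of regular files readable by 'other'."""
    found = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        st = os.lstat(path)
        if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_IROTH:
            found.append(name)
    return sorted(found)
```

Calling world_readable("/boot") on a host returns the files whose mode
would need tightening (for instance with chmod 600) to stop exposing
symbol addresses to unprivileged users.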
|
|
|
|
---[ 9.2 Subverting AppArmor
|
|
|
|
Ubuntu and SUSE decided to bundle AppArmor (aka SubDomain) instead
|
|
(Novell acquired Immunix in May 2005, only to lay off their developers
|
|
in September 2007, leaving AppArmor development "open for the
|
|
community"). AppArmor is completely different from SELinux in both
|
|
design and implementation.
|
|
|
|
It uses pathname based security, instead of using filesystem object
|
|
labeling. This represents a significant security drawback itself, since
|
|
different policies can apply to the same object when it's accessed by
|
|
different names. For example, through a symlink. In other words, the
|
|
security decision making logic can be forced into using a less secure
|
|
policy by accessing the object through a pathname that matches an
existing policy. It's been argued that labeling-based approaches stem
from requirements of secrecy and information containment, but in
practice, security itself amounts to information containment.
|
|
Theory-related discussions aside, this section will provide a basic
|
|
overview on how AppArmor policy enforcement works, and some techniques
|
|
that might be suitable in local and remote exploitation scenarios to
|
|
disable it.
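
The pathname ambiguity discussed above can be shown with a toy lookup.
The sketch gives one file a second name (a hard link here, since no
amount of name resolution can canonicalize it) and asks a name-keyed
policy about both. path_policy() does no resolution at all and is not a
model of AppArmor's real matching; the rules and names are made up.

```python
import os
import tempfile


def path_policy(path, rules):
    """Toy pathname-based lookup: longest matching prefix wins."""
    best, decision = "", "deny"
    for prefix, verdict in rules.items():
        if path.startswith(prefix) and len(prefix) > len(best):
            best, decision = prefix, verdict
    return decision


def demo():
    """Create one file with two names and compare the decisions."""
    base = tempfile.mkdtemp()
    secret = os.path.join(base, "secret")
    public = os.path.join(base, "public")
    os.mkdir(secret)
    os.mkdir(public)
    target = os.path.join(secret, "data")
    alias = os.path.join(public, "data")
    with open(target, "w") as handle:
        handle.write("x")
    os.link(target, alias)   # second name for the same inode
    rules = {secret: "deny", public: "allow"}
    return (os.path.samefile(target, alias),
            path_policy(target, rules),
            path_policy(alias, rules))
```

demo() reports that both names refer to the same object while the two
lookups disagree, which is the situation a label attached to the object
itself avoids.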
|
|
|
|
The simplest method to disable AppArmor is to target the 32-bit
|
|
integers used to determine if it's initialized or enabled. In case
|
|
the system being targeted runs a stock kernel, the task of accessing
|
|
these symbols is trivial, although an offset-dependent exploit is
|
|
certainly suboptimal:
|
|
|
|
c03fa7ac D apparmorfs_profiles_op
|
|
c03fa7c0 D apparmor_path_max
|
|
(Determines the maximum length of paths before access is rejected
|
|
by default)
|
|
|
|
c03fa7c4 D apparmor_enabled
|
|
(Determines if AppArmor is currently enabled - used at runtime)
|
|
|
|
c04eb918 B apparmor_initialized
|
|
(Determines if AppArmor was enabled at boot time)
|
|
|
|
c04eb91c B apparmor_complain
|
|
(The equivalent to SELinux permissive mode, no enforcement)
|
|
|
|
c04eb924 B apparmor_audit
|
|
(Determines if the audit subsystem will be used to log messages)
|
|
|
|
c04eb928 B apparmor_logsyscall
|
|
(Determines if system call logging is enabled - used at runtime)
|
|
|
|
A NULL-write primitive suffices to overwrite the values of any of those
|
|
integers. But for local or shellcode based exploitation, a function
|
|
exists that can disable AppArmor at runtime, apparmor_disable(). This
|
|
function is straightforward and reasonably easy to fingerprint:
|
|
|
|
0xc0200e60 mov eax,0xc03fad54
|
|
0xc0200e65 call 0xc031bcd0 <mutex_lock>
|
|
0xc0200e6a call 0xc0200110 <aa_profile_ns_list_release>
|
|
0xc0200e6f call 0xc01ff260 <free_default_namespace>
|
|
0xc0200e74 call 0xc013e910 <synchronize_rcu>
|
|
0xc0200e79 call 0xc0201c30 <destroy_apparmorfs>
|
|
0xc0200e7e mov eax,0xc03fad54
|
|
0xc0200e83 call 0xc031bc80 <mutex_unlock>
|
|
0xc0200e88 mov eax,0xc03bba13
|
|
0xc0200e8d mov DWORD PTR ds:0xc04eb918,0x0
|
|
0xc0200e97 jmp 0xc0200df0 <info_message>
|
|
|
|
It takes a lock to prevent modifications to the profile list, and
releases the list. Afterwards, it unloads the apparmorfs and releases
the lock, resetting the apparmor_initialized variable. This method is
not stealthy by any means. A message will be printed to the kernel
|
|
message buffer notifying that AppArmor has been unloaded and the lack
|
|
of the apparmor directory within /sys/kernel (or the mount-point of the
|
|
sysfs) can be easily observed.
|
|
|
|
The apparmor_audit variable should preferably be reset to turn off
|
|
logging to the audit subsystem (which can be disabled itself as
|
|
explained in the previous section).
|
|
|
|
Both AppArmor and SELinux should be disabled together with their
|
|
logging facilities, since disabling enforcement alone will turn off
|
|
their effective restrictions, but denied operations will still get
|
|
recorded. Therefore, it's recommended to reset apparmor_logsyscall,
|
|
apparmor_audit, apparmor_enabled and apparmor_complain together.
|
|
|
|
Another viable option, albeit slightly more complex, is to target the
|
|
internals of AppArmor, more specifically, the profile list. The main
|
|
data structure related to profiles in AppArmor is 'aa_profile' (defined
|
|
in apparmor.h):
|
|
|
|
struct aa_profile {
    char *name;
    struct list_head list;
    struct aa_namespace *ns;

    int exec_table_size;
    char **exec_table;
    struct aa_dfa *file_rules;
    struct {
        int hat;
        int complain;
        int audit;
    } flags;
    int isstale;

    kernel_cap_t set_caps;
    kernel_cap_t capabilities;
    kernel_cap_t audit_caps;
    kernel_cap_t quiet_caps;

    struct aa_rlimit rlimits;
    unsigned int task_count;

    struct kref count;
    struct list_head task_contexts;
    spinlock_t lock;
    unsigned long int_flags;
    u16 network_families[AF_MAX];
    u16 audit_network[AF_MAX];
    u16 quiet_network[AF_MAX];
};
|
|
|
|
The definition in the header file is well commented, thus we will look
|
|
only at the interesting fields from an attacker's perspective. The
|
|
flags structure contains relevant fields:
|
|
|
|
1. audit: checked by the PROFILE_AUDIT macro, used to determine if
|
|
an event shall be passed to the audit subsystem.
|
|
|
|
2. hat: checked by the PROFILE_IS_HAT macro, used to determine if
|
|
this profile is a subprofile ('hat').
|
|
|
|
3. complain: checked by the PROFILE_COMPLAIN macro, used to
|
|
determine if this profile is in complain/non-enforcement mode
|
|
(for example in aa_audit(), from main.c). Events are logged but
|
|
no policy is enforced.
|
|
|
|
From the flags, the immediately useful ones are audit and complain, but
|
|
the hat flag is interesting nonetheless. AppArmor supports 'hats',
|
|
being subprofiles which are used for transitions from a different
|
|
profile to enable different permissions for the same subject. A
|
|
subprofile belongs to a profile and has its hat flag set. This is worth
|
|
looking at if, for example, altering the hat flag leads to a subprofile
|
|
being handled differently (e.g. it remains set even though the normal
behavior would be to fall back to the original profile). Investigating
|
|
this possibility in depth is out of the scope of this article.
|
|
|
|
The task_contexts field holds a list of the tasks confined by the profile
|
|
(the number of tasks is stored in task_count). This is an interesting
|
|
target for overwrites, and a look at the aa_unconfine_tasks() function
|
|
shows the logic to unconfine all tasks associated with a given profile.
|
|
The change itself is done by aa_change_task_context() with NULL
|
|
parameters. Each task has an associated context (struct
|
|
aa_task_context) which contains references to the applied profile, the
|
|
magic cookie, the previous profile, its task struct and other
|
|
information. The task context is retrieved using an inlined function:
|
|
|
|
static inline struct aa_task_context
*aa_task_context(struct task_struct *task)
{
    return (struct aa_task_context *) rcu_dereference(task->security);
}
|
|
|
|
And after this dissertation on AppArmor internals, the long awaited
|
|
method to unconfine tasks is revealed: set task->security to NULL. It's
|
|
that simple, but it would have been unfair to provide the answer
|
|
without a little analytical effort. It should be noted that this method
|
|
likely works for most LSM based solutions, unless they specifically
|
|
handle the case of a NULL security context with a denial response.
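
The difference between the two behaviours mentioned above, seen from
the defender's side, fits in a few lines. Both hooks are hypothetical
models written for this example (contexts are plain dictionaries); they
do not reproduce the code of any LSM.

```python
def check_fail_open(context, request):
    """A hook that treats a missing context as 'unconfined'."""
    if context is None:
        return True
    return request in context["allowed"]


def check_fail_closed(context, request):
    """A hook that answers a missing context with a denial."""
    if context is None:
        return False
    return request in context["allowed"]
```

With a valid context both hooks agree. Only the second one keeps
denying once the context pointer has been cleared, at the price of
having to guarantee that every legitimate task carries a context.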
|
|
|
|
The serialized profiles passed to the kernel are unpacked by the
|
|
aa_unpack_profile() function (defined in module_interface.c).
|
|
|
|
Finally, these structures are allocated within one of the standard kmem
|
|
caches, via kmalloc. AppArmor does not use a private cache, therefore
|
|
it is feasible to reach these structures in a slab overflow scenario.
|
|
|
|
The approach to abuse AppArmor isn't really different from that of any
|
|
other kernel security framework, technical details aside.
|
|
|
|
------[ 10. References
|
|
|
|
[1] "The Slab Allocator: An Object-Caching Kernel Memory Allocator"
|
|
Jeff Bonwick, Sun Microsystems. USENIX Summer, 1994.
|
|
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4759
|
|
|
|
[2] "Anatomy of the Linux slab allocator" M. Tim Jones, Consultant
|
|
Engineer, Emulex Corp. 15 May 2007, IBM developerWorks.
|
|
http://www.ibm.com/developerworks/linux/library/l-linux-slab-allocator
|
|
|
|
[3] "Magazines and vmem: Extending the slab allocator to many CPUs
|
|
and arbitrary resources" Jeff Bonwick, Sun Microsystems. In Proc.
|
|
2001 USENIX Technical Conference. USENIX Association.
|
|
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.97.708
|
|
|
|
[4] "The Linux Slab Allocator" Brad Fitzgibbons, 2000.
|
|
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4759
|
|
|
|
[5] "SLQB - and then there were four" Jonathan Corbet, 16 December 2008.
|
|
http://lwn.net/Articles/311502/
|
|
|
|
[6] "Kmalloc Internals: Exploring Linux Kernel Memory Allocation"
|
|
Sean.
|
|
http://jikos.jikos.cz/Kmalloc_Internals.html
|
|
|
|
[7] "Address Space Layout Randomization" PaX Team, 2003.
|
|
http://pax.grsecurity.net/docs/aslr.txt
|
|
|
|
[8] In-depth description of PaX UDEREF, the PaX Team.
|
|
http://grsecurity.net/~spender/uderef.txt
|
|
|
|
[9] "MurmurHash2" Austin Appleby, 2007.
|
|
http://murmurhash.googlepages.com
|
|
|
|
[10] "Attacking the Core : Kernel Exploiting Notes" sgrakkyu and twiz,
|
|
Phrack #64 file 6.
|
|
http://phrack.org/issues.html?issue=64&id=6&mode=txt
|
|
|
|
[11] "Sysenter and the vsyscall page" The Linux kernel. Andries
|
|
Brouwer, 2003.
|
|
http://www.win.tue.nl/~aeb/linux/lk/lk-4.html
|
|
|
|
[12] "The Inevitability of Failure: The Flawed Assumption of Security in
|
|
Modern Computing Environments" Peter A. Loscocco, Stephen D.
|
|
Smalley, Patrick A. Muckelbauer, Ruth C. Taylor, S. Jeff Turner,
|
|
John F. Farrell. In Proceedings of the 21st National Information
|
|
Systems Security Conference.
|
|
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.117.5890
|
|
|
|
[13] "Targeted vs Strict policy History and Strategy" Dan Walsh. 3 March
|
|
2005. In Proceedings of the 2005 SELinux Symposium.
|
|
http://selinux-symposium.org/2005/presentations/session4/4-1-walsh.pdf
|
|
|
|
[14] "Exploiting Kernel Pool Overflows" Kostya Kortchinsky. 11 June
|
|
2008. In Proceedings of SyScan'08 Hong Kong.
|
|
http://immunitysec.com/downloads/KernelPool.odp
|
|
|
|
[15] "When a "potential D.o.S." means a one-shot remote kernel exploit:
|
|
the SCTP story" sgrakkyu. 27 April 2009.
|
|
http://kernelbof.blogspot.com/2009/04/kernel-memory-corruptions-are-not-just.html
|
|
|
|
------[ 11. Thanks and final statements
|
|
|
|
"For there is nothing hid, which shall not be manifested; neither was
|
|
any thing kept secret, but that it should come abroad."
|
|
Mark IV:XXII
|
|
|
|
The research and work for KERNHEAP has been conducted by Larry
|
|
Highsmith of Subreption LLC. Thanks to Brad Spengler, for his
|
|
contributions to the otherwise collapsing Linux security in the past
|
|
decade, the PaX Team (for the same reason, and their
behind-the-iron-curtain support, technical acumen and patience). Thanks to the
|
|
editorial staff, for letting me publish this work in a convenient
|
|
technical channel away from the encumbrances and distractions present in
|
|
other forums, where facts and truth can't be expressed non-distilled,
|
|
for those morally obligated to do so. Thanks to sgrakkyu for his
|
|
feedback, attitude and technical discussions on kernel exploitation.
|
|
|
|
The decision of SUSE and Canonical to choose AppArmor over more complete
solutions like grsecurity will clearly take a toll on their security in
the long term. The same applies to Fedora and Red Hat Enterprise Linux,
although SELinux is well suited for federal customers, which are a
relevant part of their user base. The problem, though, is the inability
of SELinux to contemplate kernel vulnerabilities in its threat model, and
the lack of sound and well-informed interest in developing such
protections on the part of the Linux kernel developers. Hopefully, as
time passes and the current maintainers grow older, younger developers
will come to replace them in their management roles. If they get over
past mistakes and don't inherit old grudges and conflicts of interest,
there's hope the Linux kernel will be more receptive to security patches
which actually provide effective protections, for the benefit of the
whole community.
|
|
|
|
Paraphrasing the last words of a character from an Alexandre Dumas
novel: until the future deigns to reveal the fate of Linux security
to us, all wisdom can be summed up in these two words: Wait and hope.
|
|
|
|
Last but not least, it should be noted that currently no true mechanism
exists to enforce kernel security protections, and thus KERNHEAP and
grsecurity could also fall prey to more or less realistic attacks. The
requirements to do this go beyond the capabilities of currently available
hardware, and Trusted Computing seems to be taking a more DRM-oriented
direction, which serves some commercial interests well, but leaves
security lagging behind for another ten years.
|
|
|
|
We present the next kernel security technology from yesterday, to be
found independently implemented by OpenBSD, Red Hat, Microsoft or all
of them at once, tomorrow.
|
|
|
"And ye shall know the truth, and the truth shall make you free."
John VIII:XXXII
|
|
|
|
------[ 12. Source code
|
|
|
|
begin 644 kernheap_phrack-66.tgz
|
|
M'XL(`%U3/$H``^P]:U?;2++SU?H5'<]A(Q%C;&-,$H8YAX#)<,+K8+/#W"Q'
|
|
M1\AMK$66-)),8'=S?_NMJN[6VV!GD^S,7;09+/6C7EW=5=6OO>6A-^%68`:3
|
|
MT+)OUWJ]]1^^]M."9VMS$W_;6YNM[*]Z?FBW.^VMUF:[NP'EVMU6>^,'MOG5
|
|
M*:EX9E%LA8S]$+G6`P_GEWLJ_T_ZW);;/^)VR&.76[=-^ZO@P`;N=;OSVK_3
|
|
M[6Y1^[>[W?96IPWMW]MH0?NWO@KV)Y[_\O9?7]78*MOS@X?0N9G$3-\S6*?5
|
|
M>L,&L^N0!['C>^SH:*_)=EV749&(A3SBX1T?-;'J.0=-B?B(S;P1#UD\X2SF
|
|
MX31B_I@^,G!.`^ZQ@3\+;<Z.')M[$6?ZX'1P9+"[=K.%X-8U[4?'L]W9B+.?
|
|
MHGCD^,W)S_DDU[DNIH6.=Y-/&]M>[&*2!LT;.S:S?2^*@<;(N?&`6-?W;L2?
|
|
MP(J!7H_ML-9]M]79[6QVNKUWO79OL[MW<72TG4!P/-?Q.+OSG9&J9(X=U]43
|
|
MH/8$-&GU>C9NL,CY!S=CYG+/T/ZIU0I%`H!:<\9,AWRV0H7]L5XFSC"T6@WZ
|
|
MXBSTL$8`1`+T;:WV:>*X(+N`_<1T2&&O")/!`%5M5:^`Q%8-%AA07U(.,`#<
|
|
MJYU'<$.1S]IG37.\F$TMQ]/QQ0IO[(;D8A4^[H@]R>WT&E^@'I8<<>@PVR76
|
|
MB7[!/,("!C8$V>,`6C$>Z]#"/`P;K'X163?\+5N)V$<$"_)GQ^^NV$<"C%\P
|
|
M3D7-J[]Y]0:2=?>Q=84TU_B]$^O]R\.A>;![>'1QWA><:#5!'LC`BGU'IRKM
|
|
M*P-46&_#&`2_^(.%!89LN0Z"A@80)-9_A69GENOZMA5SMC)CUP\QCYB^,CLV
|
|
MFD20P-7(HR(8V%P[(%"LK8MBB!,%0ED[[`343L@D`$GXH5X7I>N/LZ>(^Q\>
|
|
M^M`?4%/&T"&MF*T$S6:3`560A*6G?!KQ6"=%;2E:,4.!V/<]CGSDF0Z=V,ET
|
|
MF-@'P,"SZUX:*?R&RA=UL_V$<A=$=A!R_C@38RBAR_='00U<S@.$-?9#MC)"
|
|
MM?&]4030J*6HL1%(A.5T]:G)?L=:V]@)_M/#]#=[*NR_2EH#G8ZCF1/S?],E
|
|
M7,;_V^CVP/YO];JM9__O>SR+M;_M.MR+OU0-EFK_SM8/+7`%VYO/[?\]GJ7:
|
|
M_W:"2<W@83D<C[=_N]>!8"]I?WAO=3I;6YUG__][/#^^6)]%X?JUXZUS[XX%
|
|
M#_'$][1ZO:YE8@)[F9C@I#\\.CSYP"+?ON4Q.-XC\(/L^"$`#VG$QX[G(("(
|
|
M6=Z(^1`AA.S:`0`^`'(\RWU;1KRU!G]>L]VC(80-WNR^P<XXN!7LKTTVL.XX
|
|
MZ.:=]C7#$&V?1W;H4.FWVG#B1`S^(131#\B3^-`_/_FEOWL&G-\`\Q%"QMZ!
|
|
MH0@[1+ZGTYGGH(<8:>`K877L6=QE4W\T`_?=OT//AN7EU2"YA/SWF0-06>C[
|
|
M,0-/Y@[\_1L!*)QY33:<H$,<`S&(%ZGS,<2)0]]EM]*Y-!KLEKPD>.$QN.UW
|
|
MC@7^_C1PN8;4`:*(C4-_RF;0=BY\(ER`A=4_1<0-`;[F"'LTLV.0+P>'S(X=
|
|
M$/M#4_O-G[$I-16TJVI.I"L4DD#($8DK1L!"%\#5?8!ZGYQHTB1-TX@(J2)`
|
|
MGQ_&;%4D2B62B;L'II16@PU.]SZ8Y[N_-I0`S8M!_QQ3,>_(Q+?^$-_-P<G^
|
|
MNXL#>CW?^ZM\A=K[[\]WCS4)VX_46T">(P2B;,P_L?<'9VSL6C>19IKP;OZZ
|
|
M>SBLU6H8+K9;,NWP5*9T5,K!0*:\;FGXC=K2/Z(T/87#_L44@.3U8&``9H[2
|
|
M'&>0:TK?S),C\_WYZ<49PGKS)DT_WGU_N">QMB"4L:[M4:;:WO&^.>P/AE0@
|
|
MG_SA>/?HZ'0/<]J%G(/S?A_3._GTX_[QH$^@-DH9>V>_848WG[%W>O:;.3S%
|
|
MG,V*G(/STV/,ZQ4I`WB[>[_TS;WS_NZP#R6VYI40/.RPU_,*$"\@LWGY^R"=
|
|
M\]/?4`HM[>3H>/`>9'II'O5/DO9NM337N;9KE+"W?W2DU_&[&?G-7MW0--NU
|
|
MHHC=3NSI:#(*]4$<0J>9A=QXJ]4@".+N*#(A^OL(<1Q&=3>.76_4;'-&`7>#
|
|
M4K$7E!(#:P1IA40(^6_B"945`;BAU:X4#9X[C6Z>IB$#8P9*O]$I40&I[9Y,
|
|
M)54L)T?\=TE=%D3@C'*I1>*>H`QHQ^J*#PET9,66(`!#;HC9<^TD<&1;P7P:
|
|
MD<(`8`O(1"M"NFK0)8C(TH`#Z>)$S*4B):/,IAKSOR86D4%P2<%$:TI5RRM$
|
|
M03/_P4-?I&85`FS"S(VSI4N2$O;J&S`Q&J&-?A2WG!7Y+L@+';@H5IPJ$\EB
|
|
MTHQU*\FU@X=O0.X(M+5,:T3^TD(\E$BUP:$ST<?X!M3.T<"OR03Z(-B&3U`_
|
|
MMJ:.^R#&A6@"CD1I[,ZE%D9'D7H3^K,@'6'SH^8S'7DZ/H`W_0O$J4-H:]`M
|
|
M__KOX)@B*>,1$''B>URK@1+!.RF3AA.[8_"R,`(Q33WB[AB<T<@<\>O9S<Z!
|
|
MY4;$2`TSF@1#VG9T0?62[TG>8]G[!"^@5A,5>8QU_0`&E6J'M+>YN=$SGBZO
|
|
MO%997M$882L`F4F#&+F<)L@4<OVH><-C>"]F"^%"B9)KF2\G&A+*I2)(2'!3
|
|
M.4$(.=*E[!KL^B'D8SV%8C344D,F#0F":""!]`*\+&P!`=MV_8@3T;70<B!(
|
|
MZ]_;(H33ZX3,8&,+XJ(1A!+QA*V,ZFR%Z0J807("Z*J%4\#T"60/PQG'&75H
|
|
M^'(FZ4/"9X!K+R(P:)Z%/(X?SD**<W0@!**P'1RAA8*1M<=*I$PP6$'-C!^0
|
|
M-@+X^'LB#-,AM\%*KGH"4EEWH;(HQYU>M\%&OHF6=J?5$$'"3AII)+C)4\W[
|
|
M"(;(:J:6'<O@K\P@8)@F0@^1B)@P32*%U,48D<&%6,D*2(^H%NCD.7[K\TO3
|
|
M_#M5:@K?(9$'N0E"&J1(14G+N%<Q*@PPKNC`F[8@W1@N&'+0D+Y!BK'!;!2[
|
|
M,!P[+45`*FQ9HT!"39(@$D5M2!,O,A5-/J39DQ"G/UZQEW^[;[74?R\7EKL(
|
|
MT1(5DNZ"X`"M(^A1:!,'E<1C844\%L>6AQ^9(@PI:DUH5S"SH(A%L)B02&Y"
|
|
M[`M/01`:^PV:H0CB<`ZIJ6]1H#;VB[1*0%].KXQA%U3E7.F,*LO&SO",A/W!
|
|
MN<;X?!F^T_+S.(?*$N<`1A[).6`G9F4U:3$PE:UEOIK2!<PG)BZCDBW9CV0D
|
|
M)B.5IH&54@6S_"=TT!\3XV`E?HFU27$[R]I-FGK)%\**-2RDH.2SP0FBW-;]
|
|
MV!;_R^<+464$G945RB/7>"I%RE4T2L))R@1`;;#0^J3@AMR^TZ41]OP8L\@0
|
|
M2O&3"R5R(2?/_HLB^U21#*3^TO%L/P3@N'$`"]]9[HR_8/H-X%BY-UZBG2Y"
|
|
M-(PG,2,?B%CQE$4I%)-*3)UH:L7V1"$<E1"20.;@4\!Q);[*-&4%E+.K3>K&
|
|
M<J#7Y2^H)N`UC!KY&8^!3[H-\\,G2@U/RT0D(\*C9&A:D66E_U)=H`=)!T/8
|
|
M-*D8F")<)+�O8Z*&U(@V21HY'W1"WI.F).QG>D'/!'Q:R:_+1RG\(UQ82R
|
|
M;PHE`&_2V],A3==UVEL`-K,+(T/;8']A_ZMWU]K4U+DZB+WH&6=\T=1?E-KU
|
|
M<?6*#4!*.*6=;O<`N=,&GIE0L`JJ,(H1J'/].M<0TO?W1K%?=)]1OJ(Y:*.&
|
|
M]*FMC"]M&:K?4U].!CL1]Z138,(9]I!<Z:YC>525^3A]]=$2_^1`(:'LL+4V
|
|
MB4B@4CU(,J53*=(G0.U@V.594VZ:6/&E:>*&(M-\^593$CZ?>1Y*5\>@1779
|
|
M3`,9Y":@[A7"/DCG3>597\_&;2C"F\K7[75%:B>7VNZ\)G"9UI59:[TN$RQ`
|
|
M6+'2>GU)="!8)*"B/("JKM"1%#?3W3;M!NXQ@[]$53:G0Q(69%$ZNE\B7=3#
|
|
M&EIMP@$GVA2:%S(#_>4N>Q=:=YR=\$_L5S]T1R]>`HCI0Q1C8&B'W(JY*3;(
|
|
MF6(OC2Z;EF`E3(%P"JZ(P)Z.'Z)\HAFJNJH/XV^H$]XFC?:8`P)7NW3:1NZS
|
|
M0Y])@!=`./__>(_-'_E9:OT_L,+8L=PU7+G\%$+&8EL!GMK_"__2]?\6KO_#
|
|
M[_/^W^_R?/WU_V^W$C^@M6LFM9`E6DA3+G(IN;2:&X+7[$\?66(6FUI48MZZ
|
|
MJ*G&,X'R5&',3#:6YA33.<1J:R6J7%N14YC0H8K"Q-31Q"B3J/@%#;PN,%T7
|
|
M<0!DX.S&QRO\$GN#P?KJE&ZPGUB[1Y9:31<2;8DQ))\I]74(]:[<VSI*O1TK
|
|
M!N,%!NY>S+*)W:UJ$D_RJRR:F"%I"<Y$/I+2M((`74VJAJ1"SP/[CB1A]L>4
|
|
MYO7.59K]*IF<4A2^N&))4Q0($W6,9)8MH4EDH`7>?-U@KPD_;DP@H>!N8D3\
|
|
M-BL'P**VH680+,&SE#,9O81GX2PHIVWF%71&Z`64+#I-=>4TU1.GJ5YTFNH5
|
|
M3A,J84E]R?PB<N/9[/YW/TO9_Y$_NW;Y&FGT$KL`G[+_&_+\C]C_V0'[WVUM
|
|
M=)_M__=X_H3V'S?(V3Z^WC.AD6)!0&Q86\0/6,3D[Q-D-`"E]<5O;_*I*/&#
|
|
MIM:[`1:SG!K"[C]FS_-P\W:,Z6,GC&(C;\^JS=5<8P@U;\`>+01$3@K0>IN:
|
|
M]Z;V:Z?B<YB8_9%\27^F5I-LT3H5);*-#L;##?:F1R$S_,&WSB9^PM`!KZTN
|
|
MI'9;;WI76K5\E?J0A-M"FHH`X3\YZ#8!M$<<)Z%137OB@^KJ.3J%@T!\Y)V>
|
|
M&N)YA?OL4"P"0,3Y"*WV3'SK&QUQ[DOF3B!L=X4PHW3:5\DX3X0H]&CMK,]#
|
|
MJ6_G^BK?R4.)J9^`?$M=#C)_9!<>;E95VVYG4;']L(P`T4RT*H'Z[.0\\2QE
|
|
M_W&*:"WVUW"":(T."'^-^']C8Z-3./_1Q>QG^_\=GC^3_1_2AO1WH35Z&>56
|
|
MBQGJ(K,GW+Z-Y![V:.+/W!&+@;P;VF8?6)YC-S1?4&2-_F[9.*;(`X:T@X1(
|
|
M]6^!ID]XP/.:$US@)/9AW/FF$PPH[*%_`;P<`<KOZFN4!2EG%IZ:,$!1+>9^
|
|
M/#:=`-H%!,AV,*HF%YZ(LRNS*<CO=)<C)%$*H01EDC"]:FH!DA\A26;?VQN"
|
|
M)"$Y3QP$KIRBQRT?[<.3@].C_NZ'EQF(A3EZ`3F=HU=PTVGZ)$5L2X/635:6
|
|
MI8>`/RB?U&G"D]"T*RV>!H\0F0(K.HJHPOF5,I3O&OXQZ-BPFDM)0324OR38
|
|
M>97)R?)'!!E&U@W*[1HI%56[=0K$@N<B"C1Q51C/KB22PA0#5YGE\E:&KV%V
|
|
MX,CJ"/MD17*L>$$=Y\?LJDB*Z>.C2*\>=[@K'>TG.D76GR/7^G'H2^JW@$Z5
|
|
M_E,S6X5Q\WE>:]EG,?\O*44S[<O>"O.X_]?I=CK)^=_61H?.?_8VGL]_?I='
|
|
MW/^2'&9,FAQ2_U`7PR#$U=55MG_*3DZ'[&+09\-?#@?L^'3_XJC/3D_8+CL[
|
|
MAX_A(;P/?AL,^\=8H7BGC(L'2->KKHP1.>)L9E4.^F#Y=(_'Z[CCI9P*_T&E
|
|
MVRHPT2W:C&H$?NR,'ZJR'H%W&T_`.H^JLN[DQJ2*K#BNQ#.=TITY/](Y7<Y*
|
|
M9PO%R<+==WL@]'*I](AAJSHS.6C8GI,OCAMVJG/5H<.-N=ET]+!;G9T<0-Q\
|
|
M)%\<0^S-H[YP&''KJ7*"V]=/%2.NWSQ52AU/;,\1[OG%B?EA^`N0ME]K5PB8
|
|
M-E"=02.^>8-W&N'1#29V);'L#45T`1#MR-LNIF)P4TH,K-%V<@,0N%C;VF>P
|
|
M[E8,/>QZ!DZCJ>NF&8!IX2/3I"MP4MQJZQQ+[Q"2-PCE<=`^]&PJ^E:X"[U4
|
|
M4NP37XX&FCLNB0"=IJ7@"&=I'J"<A&ITK,O^V+U:%H,-H5()PRB*2X*(0OO+
|
|
MFR793%C&56J*Q2A8`+]2V/?](>KSF1X8-5W7Z;ZM52/`;7UJUUM6>3$8R)`N
|
|
M]H&B/HFTZ);B%;8*XR[>%.2,@!KX"^\"LKA0*R>H[)U?,;OCX;4?@5+*-`48
|
|
M!GZVBO;2Q-=B]BTTEFE;-ABZ51(IOB:%]OL'AR=]\_ABV+_4"0:U+PCFWLAC
|
|
MQXV!@BE3#/12(%9X(Z[>*K*^JK9HZ^4L`R_*VL8Y`D)E0M>[U?]2PB]S9]Z<
|
|
M?*TF[9%)%QQ$L9Z(H2&V>J_]#.).WD'<N'T0`*=2,2EV2"0CRY8O7I*B(*93
|
|
M6=!=9`16R.CW!DM:,[E\#=[5_6LB$\_M;L_3#)FJ-G"R5<^=;%?)-[B-T_38
|
|
M`C#R?36.;G.JA-LBZ63PX&QWKZ]+>DC^``3W9Z;B$),G&7EDSO+(Z\&H3NYZ
|
|
M,(R5;FGWO=D_/V?UE>BM.A(%879R05E"O+CQRC3',\^&7K>MI;?+T15B(`D\
|
|
M)"/6,6ZO==K"62;D!>1A&'CC`Q9_%I-&N).$W;.+H4[M3RU$K2,R]D]/^@V6
|
|
MR@$JK?U,$C=I7,$5&`2_`V]\=[NI04MY<ACLC,5_>-I8`(YF2KN%)+>K@
|
|
ME"!XDKT?$\24ANCVE"`WQ7*Z8P_:$!M&]#531+&%'@A\W<8-5A?S>F::7T\$
|
|
M=#C`!M$!6.$*/S(V)%N4X;82^8FX:5"*`EIP%N)^:X0E95VL"&)_^VA_`AJ-
|
|
M4@\"?LSL4;"*KM!@53TA2<U:;1S8@DS_DD-I14$23]HW@#8A],H!O6J@5Z"H
|
|
MH==7V>7EY5MFTQ3O-8=_>,V<F*;E%%P$U@U7'<`/V82'G(T</)*`77A=7<,G
|
|
M/61=$%1U#U^N]0H7YZ6UTAV[C:+6-N98K5PM8+FZ&!W@""I@)`)1$E%3F\0^
|
|
MV>XHX+8S=M`>BSD_7#B/0!Q<3H&+4W&.!WTX9')T6]=J508D:4F#/4HOLB5'
|
|
M?>&*(9R<7V`DIQF1,^B?-+,JWFD0,-3]D-+P&H61#N=%Q5"G`*W,&CBGZ-X;
|
|
MZ5;LUOV*>YD;[L1>_#E($ULEJ"8:,N9&'?+=>T?=KTD#6V;THC%.W'Z8Z,1=
|
|
MYH;">=U0G&/\LDY(;FO:!1>3&"&D&3_WTLB+1XJ#)@O)1HG"R@'+9W^NO!95
|
|
M\:6.2WX18]*/7I(SB5*RUJ!9S<Y]`P]GS&4SJQ'VQ]:5RDNLM`1:(0)5M%SM
|
|
M2<G0,<POE0SZ_\M+!E'F)`-]9;YHP)G/]14\)IKE,!FX"G+!>JR05JH\IR?(
|
|
M6?PO%$P:KSQIDC)%_WVCE`#+W.SZ)S8I*3M"_:L,049^BYL"Y=RU%Q[:<ZLZ
|
|
M7T-UA>(R,)1)%@M\0$TWG,F+V2((2'EJ`Z$6+GYE25E.Y;WO9$>29<'G_O/<
|
|
M?]+^DRX4?[T>-*<#J1L$*SM/2L<?J_N0Z,P)OY<D!*5[V^6G(P)3]>F/QS*A
|
|
M?+EYSM\5J08+DONH<PV%J[#[LVF06RO'Y2BZ.E'<>4V-1'&X8$&C#7RZH(AV
|
|
M*6+@RIQ7K\2<`(6?#OO7O]@+*+2"*_DB@W(<VH&H2*FKJ[)K-63IU0YSMK5,
|
|
M]DJK.\(;MR&3Y@H^:ZF^U<'C[ER*Z[@_.E>Y2\@3R)]S4UD4+4,(QIT[3H=2
|
|
M%QZLA+.#IQ3#D$X)H^@W.L@Z?F8T`$>@M4)_2NX%R(UD!-NF.T-D]"UO$-$+
|
|
M98SR/$1Z>?]/<_KN_/D9T`P_%'M\\?]'`.\6FCLMP];:4JZ(#V"O_3SG]/UR
|
|
M",5VIT]A,M5/X=,ET4$*DAL5)%*B*^TR-EZ1(/O8?')11C]3L>4(Q-HLK5RF
|
|
M;!EJ(F#7GDB*Z.P_$6-;4<7BU=M,#R@,JGB;`!:9WVPU.3"+L1J^KT-NX:1@
|
|
M!;+L?0(Y>>U4F-`TW"<IE.9QL.>059\3M">3ZGA3A/$D:;@BM2AAM"&]0):(
|
|
M:ZN)$G'KLB2)Q;\%:1).08$H%9164J6"SB\@:^_LM\7)`N>B3!9%A//(HHAO
|
|
M6;*RUT0\35?&;\F1]G_M77MSVS82OW^M3\%HFD32R19?(J6X[=1QG-93)\[$
|
|
MN=Y=DXZ&DBA%9[V&E)WX;O+=;W?Q)$%*<N)SKXTPTZ01E\`"6(`+[&]WY:&L
|
|
MD+>,)O,9[*&M]2X89'KO7;.8-_8^V4`EU_(Z(K&LUM%P,V\QF6;CI:$#;>S-
|
|
M^;-STKJT[@SC470U76W8RJ*Y=36_G"\^S/F&UI1;<='6IC9;4\');+^2CT_*
|
|
MF`,?;D/]RN@#=+-MZ@/ZE[_<2(.Z0,(LJS(W#KS,/S]9*XS-/\[,8,&T!HR3
|
|
MR^_C21IK.8U"?N5)&:FC=J7J_S:G?^B!94BY2G13T-'9Z8\O<_7+=A/VQ1-5
|
|
MLXKXH8K_=E@1/,*(P.^F4L570EWH?<)6!U.L+1-XF^ET%H:_PYC<-<,:\PAY
|
|
M[CWO'1W_O+8N6VF'\&-O>36=LJ<)[]RG=5-?.NWK;92F]+`)W&"[S*JD#%EM
|
|
M9>!M]!.="*3PP61>1U-AG9+&3I0?/B#L`J''@+*U1X37GN.*,H.3YP`9I-;H
|
|
MG6F2/MHD>%./P9O$*5AK^19V0/2AYGSR>.IK%,Z3TY>_')WQQ2O-25E;)>\E
|
|
M-X#M`Q7]7BTZ$S.SKH@A0SP?2OU05'Z+SB3Q>(+!39`=BU[?MC/F1IA!O>T3
|
|
M[,V:+J)A/)1)A&3\,A0`8P?K]3`;4TY\*$&3%)^<@"0,%*=F,F<5QS!NR>)&
|
|
M#4Z]^.A8Q#K*O,8\\,K`;$R@31D'&DY`+)N]``(F?[VST^.3EQ<GM>H&T)YS
|
|
M8%?56T=_>_/3^>O,2V=GQ]:W"!.,DL'['U+Y`%VWOL=7?V\\Y-=6ML/_OH@N
|
|
MXQ$LP<]K8X/_E^U[#N)_$?8;AB'B?QW/;N_PO_=1%OU_[<_PZB<+\5Y4*G!V
|
|
M!`5R!C-O[1];K>FDS\&Q:>N;6HKQGV#'B6;P-*FW^E>3Z=!Z\=TWM5=_?U;G
|
|
M&4Y2]&Z*HSE4D\R`#+[M!POX[Q+_6&!(-.N@P?Y:S98].#NGE!FF<0#O'PSD
|
|
M6R\8)G<6)<!D*OZ9WLSP#='6P2(9HC\5[H'08'HU7,`G/H6GN:Y=+BILIQ14
|
|
MR<P@JGPU[@-%^5\'T?P.L[]N]O_T/5_F?VWC^H<__5W\AWLIN_ROV^9_Y9?_
|
|
MH$WW9M%'O)I16M[ST[,3JZ$!25>+%>J+E3TT"(P6T/%:M;5,%H,6O#>9CQ:@
|
|
MLE83"1@KR?M)+U9UE=;FRNP(%^FH-H):7L2S-]C:$XO*PROK\BG4_HA8H$R9
|
|
M+,[>2!BE6(!&>JR2GA;#%K`5X>^W/LUM,S>V_0C-4__)@X)9+MKE80[+30IN
|
|
M/)2Z;CP4AZVE`2"2^6S-5+B%=$:"7!SPQA)'7*0J98=;$[:-.$&\@_EK'L,$
|
|
MS>W33=1>.6_L+H3L,;Q/O!V9F=1ZOH!%(R60>:)./_*<JO+*F_M%-I:L1FV$
|
|
MV!WS)VF:V9#0U\IG],T<<D<:CO\CPSF"G.-OXD>0`L/FA>8?BK1!QMCH(W"E
|
|
MKQ")B(0?2O+\'D?SQW3$3B;Q=6PMW]^DDP&()KR^2&ZH/P_>S;=,?GL!`CO/
|
|
MV-,6(U$3.6.WAO$UKD&1_A7Y8G;-RAZLAM,1RU2&!\ZT27G*6'ZS]QAE,YY'
|
|
M?3R'XHX#YUM1%T+H;A!#QPYY5!':0H>S.!T+3W3XZP,<OV<QU(\S,KF,J:DG
|
|
M1/XJ68R3:&:IW.OHNLYQO(,!XO1D:_UX]2&&G1334>U_#YLU[)Q[=`5(R1CX
|
|
M9L.IH8OGO=?/SE^>_5-N-D-8*W9VIU$;3>D0\X5&H@$5R`FERRB4#?0I1H`Z
|
|
MQI)])+<'D$7FLZL:E:TB>35W%<NND;1]Q[QM9)77>*86DLLF$UFJBW$(2X$+
|
|
MK#1(L6UPR#C1.D\/,N?]/WG27ZT4Z'_H6(,KY*[4OXWZG\W]/U'_"T('];_`
|
|
M#7?ZWWV4G?ZWT_^X_J?T`OR?*!DC$(>V7/0VNM8!,3``36L<+9G-):/*30[S
|
|
M.E^CL5PE*0X&>P,5%]N&IFO8MF*!1@*;A?W990-1K%#D-(HO4RF*/WCL,]1J
|
|
MP%?X!K_![R>@Y4YF$Q:$*OT0+9G-#1F$+XT'O9##P4TJBO]HM9A@QZ[?.K_5
|
|
M"_I-"LR*0IYQ>WX-7VSQ*O&50JVN49>N0?1VD0BQ"C=\VKEI'&NA:_I;-*]4
|
|
MKU_ALU&L>644K3Q\*=N6AF1";MY.?E.#(DBX`,CG>K<+^YU3+O(,]]%4Q.!6
|
|
MH'IS[IML(3&0'&^*U''&A9!^X%S,HCZ*2(P#J5C^=)NNZ]U[D.T>YUT$[]!X
|
|
M/]!99?TF&(;^RR?.B/C]Z]1Y=D65HON_"%-T_SL&>2830.M+VUB?_YT*U_]<
|
|
M._1=T/]"S['_8K7OHH.;RE>N_VTS_X@6/%A]7'UN&YOB__H!L_\$KMMN!QC_
|
|
MS[']7?S_>RDON$K$YCQ:B33VJ;"$5RJ@5+Y$.\^>!<H^^?3M6><4H`X>[6]7
|
|
ML!)4X\B+E9PS]\Y_/H"*8M]V**\VTUXE_-]M!W5)$WANK&@DB>UW=!K/,6GP
|
|
M:BI#XYLT':?K9FA"22-(/#<,LFWU-9XMF0W&<R5-&PY17=LW:0)?Z[O?'Q31
|
|
M=`-5C^O9?5_1Z(EG%(WKC#HF#0RB:@O7E&W2M!U7XR?J=A2-;`K55=F6XX6^
|
|
M28.3H?$3PA@:-!1<3O'3'A30X&1H_'1=I-%%\/AF,,6P?==C)'F>P,^W$T,L
|
|
M?5!X8-WC?@<]"[NLL>%PT!]4\A0TT(ZG:$+;H*&!=CU/T/2''8/&<VV@Z822
|
|
MIF.VY76`'1!`2>.;;?DXT*'G.[*U:.0;5#2MON(ZBDR.VB%P[7=5/6V3H\`'
|
|
MKMN*H\@Q.0IMX+JMZND.3'Y@_0#7OJ()37XZN'P"-4)=U^2G@R(4JGYU8I.?
|
|
M;@`\AQU%T^7\:)-*(MUIVY+&[U0JY9NAQ-;\V13D;;[_7X;^V(S_\%R???_M
|
|
M=N![^/VW0U`)=M__>R@2_Z'-^A\=`O)[C^D?J6RS_@W1N*5E8/WZ=URO[6?/
|
|
M?Z#R^#O\U[V4_^?[_[N,U;@V[.&:0(DI[$I3(\9C)EIBQL)`%^9LF^KA1:+U
|
|
MG?7X'X\/,[#AQC*)KT>HN1X6!XN(,5)XKQ\-J8J\`72I82WPQGTN8T$LW\[I
|
|
MPC#[1EWGI\AE&*\*JT^1V8=75HW2-Y)W9YV9S'F,8W9!.DGQ:EWL",,'`J4P
|
|
M;UK8.OS)_):9]X.!*LDP3@';Z31(K*5&1\>YG@JWW+S!81NGV.48+T'S[BGD
|
|
M%(M#EG=1K.(5^<FS`^L==IM=K8ZY)28[.\CEI"[O_;4J\!23>UW>MDZ*W6%H
|
|
M,,3H*OM2?F"`JFDU%E.&F,'C=)';,1ZXT9@;#ZG>6D$D+_Q=AO)B@;\8UAXK
|
|
MAOHQEA:0'"H;`36M"533>G7TXTGOXO17,B?(`WXM'_:'FEK+;Z^7N2*X-;\T
|
|
MS5@YS"?PG@]1UFI8K_`I!V00HJ./0`HIS633,<'V5O[NXMV*#`N&_++!T<>#
|
|
MQ^G*#)Y]VR%3SD1ZCY]D63TZ.WG]QJH^-\._,1&@RH35RUR?2@2O12(;(84Y
|
|
MHVE^#6IP,":1++HEZH5O`_\W,[*#%M;A`4)#UHL?0WGI81XRPE?@Y5_92^?<
|
|
M3H),2!@*_@-3QCM-JRIX>3B]JE=E+<2S3HW6I\?O;-B_B\7BH1.D*`U-ZFZA
|
|
M3!#+(G#.ETTD>6T+SNO;S.7EY\ZE@?,JFUKXE,6SY:HWG*2(C2+[+'XON<-+
|
|
M38OL<:GFGT45/'IS_N+T>"M9V#3_NJQ<9F3AR]O&D!<XJ^@D=W8*,S_&G>,J
|
|
M)?4FQ>,/2]?1M!;P2_)ADF*V+OQ,4MX/W%.P#63Q`?&8WY@VBZNX'0-QO4MI
|
|
MI;ZEEY.E10G!1,LB(`=/WH##N5PE%J:.7,%V&8TPV![S^Z;>4="%[_2W&\@?
|
|
MX2[-G]>O$&`(U<L5#:X6MXY"C:`C+V^OH&)LKX"-7(08WA"/GZB)BA1EAO(C
|
|
M269NA)HLKUG!6[Q\BV5^N>4REU=:-1'U%,>45K7$9_*EVFKEAD*'YT$+#0S)
|
|
MBZZT'&)7)TS,,(Z&<$3SV0`QM14>]'IXWE]-YCWZI<>C"=9LMNA(<M1D\!&F
|
|
M2#[TZ'N&P%/<,[8W:8SH^&WTN$PKS,%0!G1Q_!:1JGFH[_48F8:_>%B7O#89
|
|
M:+HDKX:`":@5L'\S7(IB30`5RJAA7<M'2(=M@TCKOVG>TS5\W+*">OGTT\%<
|
|
M*8QJGW;M3(=I;)HR3K86K1966OH6^<1>8GX]RJU':?4\%Z3%Z\!O/B:):3LN
|
|
M_!$&F(;/1NT_M.%1&&!>74S.U\%\?-W`9OGXH">:#V-57.6#8IRYRX=_TV4^
|
|
MB#O46-WZ2E]X:AK@"BYZO&=U&#_AG<IPHQQP4=G#_F[>@M7=\4.^!].`<6#%
|
|
MNKUXZ\UX3X;,S,IYKBT9GN=*.UVP5^&4(=9H!AE=$I&'NS_G[W=R'M`F^QN-
|
|
MAF)25./ZM',K(OPO,R-^[I3#MX+`TQ,.VJ;/+4=NBS[U")^=6O$$O\T65U.&
|
|
M]!Y(S'RQLCXLDDL><`GV7`ZE+CB-B3:Y!L9[S$\36!$V?:#>-A1IM&D>ECV4
|
|
M1X*6NP71%B2-;>II='3T^Z7>,SQ[3Q%X/IXN^M'4NC@[>MJ[.'EY<?KF])<3
|
|
M-O"4ZFP:8W;&&/T%%NF*W_ND,4NB1F.BCZFADGJ*3^-9X)<_ZP;ESV#K*G^H
|
|
M3X/Q$/:V-=5RJ%=)O6B/+GUJ3ETY"9NZ_-JM[.5V>Q.]5>"<;BSNK'_Z;19W
|
|
MSL$\HY/HKN:%^TG.V[R0K2*'\_.+LP*7\K,H`1Y_FHS?IS-4EZ1/.:YBY58.
|
|
M>OC*$J[E.[M`0=D6_P/Z\F=#@#;A?T*[+?`_@>TXB/]IMW?W__=2\NOP"3>F
|
|
M65-^0?]X<\B'Q]8JHD"0[*ASL,:._K\&%8E;6P*8N,-28)%&%]CH0E4"+LK2
|
|
M>:4`HRQ=4`HRRM)U2X%&63J]'UFPD49'@".W!'"DTSE^,!J4@(YT.M#ZAP6@
|
|
M(L+#9-IUNGX9^"@['QVG#("DT]G#42D(2:>+O&!0!D3*M-ONEH*1,G3AJ!20
|
|
ME*'KV/<`2NJ&?-#:Y:`DQ-1PFE)0$H?E`$TI*,EK^Y*F#)3D.XJF#)3D=V1;
|
|
MI9"DMJ]HRB!)`1)RFC)(4A#:DJ8,DA1ZCJ`IAR2I<2Z%)'4"R4\I)*GK2II2
|
|
M2%*70Y*0I@R2Y-AM5=%7BDG:E5W9E5W9E5W9E5W9E5W9E5W9E;LO_P4A):,A
|
|
$`,@`````
|
|
`
|
|
end
|
|
|
|
--------[ EOF
|