                    - P H R A C K   M A G A Z I N E -

                         Volume 0xa Issue 0x38

                              05.01.2000

                              0x06[0x10]

|------------------------------ PROJECT AREA52 -------------------------------|
|-----------------------------------------------------------------------------|
|------------------------ Jitsu-Disk <jitsu@nmrc.org> ------------------------|
|----------- Simple Nomad <thegnome@nmrc.org> Irib <irib@nmrc.org> -----------|

                           "Delirium Tremens"

----| Background

Military tactics have evolved along with technology. Reaching an objective is
done with computed strategies that gather the impact of warfare on the field.
This information is used to plan the next offensive. As the NSA has pointed
out, cyber-warfare happens much like its real-life counterpart, hence the
same intelligence can be used. This draft will try to explore the means and
tools with which to build an automated attack engine based on a universal
classification of attack strategies (regardless of the actual attacks).

----| Classification

Writing a proper classification of computer attacks actually fills entire
books [1], yet we can devise levels of access -- Read, Write and Modify --
that an attacker can gain over a system. The steps to achieve your goal will
vary depending upon whether you are attacking remotely, locally, or even
physically. Achieving the goal is also dependent upon the security policy
of the targeted system.

The objective of the classification is to provide a means to universally
describe the levels of acquired access, depending on one's situation.

Later we will explore the building of a generic engine to defeat various
security policies on target systems through the steps described in the
classification.

To illustrate this we will attempt to define the classification of remote
intrusion, based upon the OSI model. A similar classification for physical
and local intrusion can be derived, although this paper will mainly focus
on the remote element.

The various levels of access hold both logical properties and mathematical
ones. For example, a logical property might be "if you can read the TCP/IP
stream you can read the networked layer". A mathematical example might be
"the Write property is intransitive; you can spoof traffic on the network
yet not Modify existing data or hijack a session". The mathematical issues
are left as an exercise to the reader; the logical ones will be used as
the basis for the attack engine.

The following is our classification:

[ Acc : Access level
  M = Modify capabilities
  W = Write capabilities
  R = Read capabilities ]

Situation : Remote
------------------

 OSI Layers       Acc  Implication

*--------------*
| Application  |
|      6       |   M   Application rights, compromise all layers below
|              |   W   DoS, unprivileged access
|              |   R   Data gathering
|--------------|
| Session      |
|      5       |   M   Session redirection, compromise all layers below
|              |   W   DoS, service scanning
|              |   R   Session Data gathering
|--------------|
| Presentation |
|      4       |   M   Redirection, compromise all layers below
|              |   W   DoS, scanning
|              |   R   Data gathering
|--------------|
| Transport    |
|      3       |   M   Redirection, compromise all layers below
|              |   W   DoS, scanning
|              |   R   Data gathering
|--------------|
| Network      |
|      2       |   M   Redirection, compromise all layers below
|              |   W   DoS, scanning
|              |   R   Data gathering
|--------------|
| Data Link    |
|      1       |   M   Redirection, compromise all layers below
|              |   W   DoS, scanning
|              |   R   Data gathering
|--------------|
| Physical     |
|      0       |   M   Redirection
|              |   W   DoS, scanning
|              |   R   Data gathering
*--------------*

This attack-based model works top-down: if you can control the Application
(Modification rights to what it does), all dependent layers are compromised.
To be more specific, all dependent layers of the specific process you control
are now "owned" by you. If you control sendmail you may fool around with
all associated network functions, within the scope of its access rights.
Hence, if we define our "attack goal" to be "running a shell as root on the
target system", a listening sendmail daemon running as root would be a good
target. If sendmail is compromised to the point of executing commands as
root, the remote attacker could easily gain a root shell, thereby meeting the
goal. If the goal was to establish a covert channel to the target for Denial
of Service (DoS) purposes or for launching further attacks, then appropriate
actions would be taken.

On the other hand, having control of a lower level layer doesn't automatically
guarantee you control of the layers above. For example, as an attacker you
might be sniffing the network and see two computers exchanging data. But if
this conversation is encrypted (and assuming you cannot decrypt the session)
you could at best simply disrupt the conversation -- you would not control it.

On the same layer there is a subtlety regarding the Read and Write
capabilities: being able to Read and Write only your own data is of limited
interest from an attacking standpoint, port scanning notwithstanding. Hence
we assume Read and Write capabilities are reached only if you can Read and/or
Write data you don't "own".

Given the above definition of Read/Write for a layer, if one can both Read and
Write a layer one MAY be able to Modify it at that layer as well. However if
encryption is in use, this is not guaranteed. Therefore Read/Write
capabilities on a layer are required yet insufficient for Modify capabilities.

On a perfectly designed and secured system, one should not be able to gain
additional rights on a higher layer. The attack engine works by exploiting
security breaches to progressively reach a desired goal given a starting
point -- for instance, achieving administrative access by starting with just
the IP address of the victim.

In order to illustrate some of this, let's define a very primitive
"Local Situation Access Classification":

    LS

    6   kernel level           (R,W,M)
    5   drivers level          (R,W,M)
    4   process level          (R,W,M)
    3   user/group admin       (R,W,M)
    2   user/group "average"   (R,W,M)
    1   user/group null        (R,W,M)

Now that we hold a classification hierarchy of access levels, we need to
apply this to the security breaches we know of.

For example, in the NMRC Pandora project, a hijacking attack called
"Level 3-1" would be referenced in this manner:

    Name          Systems   Versions        Level required  Level gained
    -----------   --------  --------------  --------------  -------------
    "Level 3-1"   "Novell"  "4.11","5 IPX"  "Remote 3 M"    "Local 3 R W"

This hack works on two levels -- a remote compromise of the IPX session,
and the payload that will actually give you admin privilege.

Another attack from Pandora called "GameOver" that exploits a bug looks
like so:

    Name          Systems   Versions        Level required  Level gained
    -----------   --------  --------------  --------------  -------------
    "GameOver"    "Novell"  "4.11","5 IPX"  "5 W"           "Local 4 R W"

In this case the process attacked is equivalent in rights to the Bindery
Supervisor, and the Bindery Supervisor holds a restricted set of the Admin
rights. In this example we clearly see that this primitive description
of the Local Situation doesn't quite fit -- although we have achieved a
higher level, we have a restricted set of rights compared to the previous
attack. A better Classification has yet to be devised.

The NMap[2] tool would be:

    Name          Systems   Versions        Level required  Level gained
    -----------   --------  --------------  --------------  -------------
    "NMAP"        "ALL"     "ALL"           "3 W*"          "5 R*"

W* and R* mean Write and Read in a restricted sense: Write implies writing
only valid data you can legitimately write, Read implies reading only data
that you "own".

Two advantages are immediately obvious from this approach:

 -- Recognition of re-usability in attacks (e.g. if you only have R*/W*
    access in a 3com switched environment, running an attack to overload
    the switch MAC table would provide you with R/W access and open doors
    to new attacks).

 -- Independence of the type of code used for the attacks (scripts, Perl,
    C, etc.) from the actual hack engine.

To facilitate the reference, the web's most popular hack archives [3][4][5]
could automate this in their commentary. This will be highlighted in the
next section.

Before we get there, let's refine the classification method:

Assumptions
-----------
(1) For each situation (via network, via local process, via physical
    access) a set of layers between you and the goal is defined.
(2) Each layer, independent from any other, is linked top-down to the next.
(3) A layer is defined by its uniqueness and the ability to associate
    Read/Write and Modify access levels with it.

Implications
------------
(1) Modify access in the highest layer implies control of all the preceding
    layers (Layer N+1 includes Layer N), restricted by the given
    Classification (in a Remote Situation that would be the process's
    dependent layers, in a Local Situation, the runlevels).
(2) R/W/M access is a superset of R*/W*/M*, where R*/W*/M* is the
    legitimate privilege access for a layer and R/W/M includes access
    to more privileges for the same layer (M>M*, W>W*, R>R*).
(3) Read/Write access to a layer is required to gain Modify access but
    is not sufficient.
(4) The concept of a security breach comes from the fact that there exists
    a way to gain access to a higher layer (or access level) by defeating
    the security policy protocol between two layers (or access levels).

For the classification to be really universal and easily implemented, the
three situations (Remote, Local, Physical) must be devised in layers that
apply to all known systems. This might sound a bit utopian, yet the OSI model
for remote access seems universal enough, since virtually every networked
system is either based on it or can be appropriately mapped against it. For
Local access to a system (via a remote shell, local session or whatever) to
be properly specified in layers, we should first look into what could be
universally considered as local system security layers, such as run levels,
groups and users, and hardware access (this has yet to be done). Physical
access brings to light a world ruled by other means than just electricity,
so things might not be so obvious.

----| Storage : A Hack Database

Now that we have a Universal Classification for Remote, Local and Physical
access, let's set the following abbreviations:

    Remote Situation         : RS
    Local Situation          : LS
    Physical Situation       : PS

    Layer N                  : L(N)
    Layer N-1                : L(N-1)

    Read access              : R
    Write access             : W
    Modify access            : M

    Restricted Read access   : R*
    Restricted Write access  : W*
    Restricted Modify access : M*

A privilege level is defined by the "tuple" (situation, layer(x), access).
For example, the ability to modify the application sendmail remotely (given
the OSI model above) would be sendmail (RS,L(6),M). A remote buffer overflow
in sendmail that just requires an attacker to send a mail to the daemon would
be listed this way:

    Name             Systems   Versions         Level required  Level gained
    ---------------  --------  ---------------  --------------  ------------
    Sendmail-sploit  All Unix  Sendmail 8.10.1  (RS,L(6),W*)    (LS,L(3),M)

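To make the tuple notation concrete, here is a minimal sketch (Python used
purely as pseudo-code -- the names, fields and the total ordering of access
levels are our own illustration, not part of any existing tool) of how a
privilege level and a database entry like the one above could be represented
and compared:

    # Illustrative only: (situation, layer, access) tuples and a hack entry.
    # The total order below is a simplifying assumption; the text only
    # requires M>M*, W>W*, R>R*.
    ACCESS_ORDER = {"R*": 0, "W*": 1, "M*": 2, "R": 3, "W": 4, "M": 5}

    def level(situation, layer, access):
        """Build a privilege tuple such as ('RS', 6, 'W*')."""
        return (situation, layer, access)

    # The Sendmail example from the table above.
    sendmail_sploit = {
        "name":     "Sendmail-sploit",
        "systems":  "All Unix",
        "versions": "Sendmail 8.10.1",
        "required": level("RS", 6, "W*"),    # (RS,L(6),W*)
        "gained":   level("LS", 3, "M"),     # (LS,L(3),M)
    }

    def satisfies(current, required):
        """True if the attacker's current level meets an attack's
        prerequisite: same situation, and either Modify rights on the same
        or a higher layer (the top-down rule), or at least the required
        access on the same layer."""
        s, l, a = current
        rs, rl, ra = required
        if s != rs:
            return False
        if a == "M" and l >= rl:     # M on layer N compromises layers below
            return True
        return l == rl and ACCESS_ORDER[a] >= ACCESS_ORDER[ra]

    # Being able to write legitimate data to the remote application layer
    # is enough to attempt the exploit:
    print(satisfies(level("RS", 6, "W*"), sendmail_sploit["required"]))  # True
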
We would also store the attack code itself in the database (remembering that
the actual attack engine will be separate).

The stored code would return a value indicating attack success or failure,
and on completion could also return parameters to be used with further
attacks. For instance, a successful remote Sendmail buffer overflow would
return TRUE and a handle to the remote shell; then the attack would be taken
to the LS level, where local attacks would be run to get runlevel 0 access
(or root). This means the attack engine would run stored functions in a
dynamic database, such as:

    *------------*                *-----------*
    | Attacks    |                | Results   |
    *------------*                *-----------*
    | Attack_ID  |                | Result_ID |
    | Name       |                | Type      |
    | System     |  0,1---0,N->   | Identifier|
    | Version    |                *-----------*
    | Level Req  |                | Handle    |
    | Level Gain |                *-----------*
    *------------*
    | Code       |
    *------------*

Attack_ID and Result_ID are unique.

The relation between the Attack table and the Result table is "one to many":
an attack could have been completed successfully on various targets, while a
"result" is linked to one and only one attack.

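As a rough illustration of that schema (again plain Python standing in for
whatever database the engine would really use; the field names simply mirror
the diagram above), the two tables and their one-to-many link could look
like this:

    from dataclasses import dataclass, field
    from typing import Any, List

    @dataclass
    class Result:
        result_id: int
        type: str            # temporary hack or permanent (e.g. a backdoor)
        identifier: str      # unique target name: IP address, DNS name...
        handle: Any = None   # e.g. a Libnet structure remotely, a shell locally

    @dataclass
    class Attack:
        attack_id: int
        name: str
        system: str
        version: str
        level_req: tuple     # (situation, layer, access)
        level_gain: tuple
        code: str            # source, binary or script -- see below
        results: List[Result] = field(default_factory=list)   # 0,1 --- 0,N

    # One attack, many results: the same exploit run against several targets.
    bof = Attack(1, "Sendmail-sploit", "All Unix", "Sendmail 8.10.1",
                 ("RS", 6, "W*"), ("LS", 3, "M"), code="...")
    bof.results.append(Result(1, "temporary", "10.0.0.1", handle="shell"))
    bof.results.append(Result(2, "permanent", "10.0.0.2"))
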
In the Result table, the Type defines whether the result is a temporary hack
or a permanent one (like a backdoor), and the Identifier specifies a unique
name for the target (IP address, DNS name...).

The Handle would be a pointer to a successful hack, based on the situation,
i.e. in a Remote attack a pointer to a Libnet[6] structure, in a Local attack
a pointer to a shell, a remote cmd.exe...

The "Code" part in the "Attack" table would be either the source code (which
means we have a built-in compiler in the engine), the attack binary code
(which would require platform-specific code to be pre-built), or some sort of
scripting language we would rewrite all attacks with (see Nessus in the
comparison chapter below).

Those specifications are far from complete and the database is very simple,
but you get the point. The idea is to separate in the diagram what is gained
from knowledge (Attacks) from what is gained in the wild (Results). Just as
an example, that could be:

                   (known exploits code)
    Systems-0,1-0,N-Vulnerabilities-0,N-0,1-Instructions
    (known systems)      |     (known related instructions/daemons/programs...)
                         |
                        0,1
                         |
                        0,N
                         |
    Result (handles to hack, Libnet stack, shell ...)
      |          (& collected info, e.g. [10.0.0.1] is [Novell 5sp3])
     0,N
      |
     0,1
      |
    Target (standard specification of target IP,Name...)

This approach implies either standardized interfaces for hacks (normalized
input parameters and output handles), or that a "Superset Code" be written
which, given the attack's specifications (input parameters, Level Required,
Level Gained), would wrap the attack, run it, and return values in our
standard form. Since writing regular expression engines is, ahem, NOT fun,
maybe we should decide on the first solution.

With respect to what we have seen in the Classification of the Remote
Situation, we stated that compromising a layer is understood in the
restricted sense of the attacked application's layers. Yet we could assume
that compromising an application, say Sendmail, would give you control over
another one, maybe DNS in this case. We need to be able to describe this in
the database -- compromising an application might give you control over some
others. A schematic representation would be:

    0,1-[hack_id]-0,N   (recursive link - a hack grants you access to more than
     |           |       one system/instructions)
    (known exploits code)
    (and access levels)
    Systems-0,1-0,N-Vulnerabilities-0,N-0,1-Instructions
    (known systems)      |     (known related instructions/daemons/programs...)
                         |
                        0,1
                         |
                        0,N
                         |
    Result (handles to hack, Libnet stack, shell ...)
      |          (& collected info, e.g. [10.0.0.1] is [Novell 5sp3])
     0,N
      |
     0,1
      |
    Target (standard specification of target IP,Name...)

So we now have a pretty good idea of what the unified hack database would
look like:

1) A knowledge database of known systems, system instructions and associated
   exploits.
2) The database would have a standard for describing all fields.
3) It would define the level required/level gained "tuples" (situation,
   layer(x), access) for each known exploit.
4) Exploit code would be stored in the database and follow a standard
   representation of the interface (normalized input parameters and
   output handles).

There exists today an international effort for a standard way to describe
exploits. Such databases are in their infancy, but strong projects like
CVE[7] are certainly breaking new ground.

The aim of such standardization is to achieve unified descriptions of attack
scenarios (to be used in attack automation, either via vulnerability
assessment tools or actual penetration tools). Therefore our attack engine
would offer three modes:

 - Simulation (no actual attack performed, but we could use the results for
   vulnerability assessments, future attack scenarios, etc),
 - Manual (attack performed manually, no wrap code, like the mils ;-)),
 - Automated (the ultimate Hacking Machine).

----| Artificial Intelligence

The reader might not be trained in AI, so let's attempt to define some
of the principles we need for this discussion.

--| Intelligence

AI is by no means meant to "create", but rather to "think". Thinking,
logically and reproducibly, is a process, therefore it may be mimicked by a
machine. In fact, given the proper thinking strategy and process, a computer
solves known problems much faster than humans. Building a new Hack is a
simple process if the methodology is known. If the methodology is not known
you must create it: when no logical path takes you to where you want to go,
and the problem can't be related to any other hacks, you have to create a
new Hack. The new Hack then enriches the world of known hacks, and can
possibly be added to the overall Hack process. It is assumed that AI can
solve our problems, given the following restrictions:

1) The problem solving time is, generally, unpredictable and may
   even take years if done manually.
2) Some problems can't be solved because an individual doesn't
   hold enough "process knowledge" for resolution (or the knowledge
   necessary can't be described with the formalism we've chosen, see
   Godel's incompleteness theorem and the book "Godel, Escher,
   Bach: An Eternal Golden Braid" by Douglas Hofstadter).

In other words, any system can be hacked, granted we have enough time and
known hacks for this purpose.

--| Inference

The "thinking engine" we want to use here will have to use known facts
(hacks) against field results, to explore the paths that take us to the
ultimate goal (or result). Such engines are described in AI as "inference
engines". Starting from the goal and finding a possible path to the knowledge
base is called "backward inference"; starting from the knowledge base and
finding a path to the goal is called "forward inference". In the present case
"backward inference" is only good for simulation, but in the field we can only
use "forward inference" (which is algorithmically known to be slower than
backward inference).

The initial theory behind inference engines is based on two "logic" rules,
one for forward inference called Modus Ponens (MP), the other for backward
inference called Modus Tollens (MT). MP states that [if (P) AND (P implies Q)
THEN (Q)], MT says [if (NOT Q) AND (P implies Q) THEN (NOT P)].

--| The Inference Engine

Algorithmically speaking, the Inference Engine is a recursive algorithm that
takes a set of known facts as input (target is www.blabla.bla), processes it
against the database of rules (if RedHat 5.0 then SendMail is vulnerable) and
adds new facts to the set (if target is RedHat 5.0 then target is vulnerable
to the SendMail bug). The engine stops when either we have reached our goal
(target is compromised) or we can't add anything new to the set of facts (all
possibilities have been explored). In order to optimize the process, the
Inference Engine is set to use strategies in choosing which rules to test
first (a buffer overflow might be easier to try than a "tmp race", so we set
the engine to try a buffer overflow first). As discussed in the "distributed"
section below, it is essential to see that the hacking process is not in the
engine itself, but in the database rulesets. For instance, tests would be
performed to understand the target installation/setup/OS and match the
subsequent hacks; the engine provides the mechanism for this and the rulesets
the paths to understand how one must attack. It is in the description of the
ruleset that we have the actual "Intelligence", hence if a new OS appears on
the market with a new security mechanism, we do not need to rewrite the
engine, but only to specify new rules specific to this OS.

--| An Inference Engine of order 0

Consider a ruleset that contains no variables, only static facts:

    If monkey close to tree, monkey on tree
    If monkey on tree AND banana on tree, monkey eat banana

We use an "order 0" inference engine (O.K. AI pals, this is not quite the
definition, yes there is a whole theory behind this, we know, don't flame us).

With the initializing fact

    monkey close to tree

we will get

    monkey on tree

and finally

    monkey eat banana

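A forward inference engine of order 0 boils down to a fixed-point loop over
the ruleset. The following toy sketch (plain Python, purely illustrative)
runs the monkey example; note that "banana on tree" is seeded as a fact,
which the example above implicitly assumes:

    # Toy order-0 forward inference: rules over static facts, no variables.
    rules = [
        ({"monkey close to tree"},             "monkey on tree"),
        ({"monkey on tree", "banana on tree"}, "monkey eat banana"),
    ]

    facts = {"monkey close to tree", "banana on tree"}

    changed = True
    while changed:                      # loop until no rule adds a new fact
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # fire the rule, enrich the fact base
                changed = True

    print(facts)    # now contains 'monkey on tree' and 'monkey eat banana'
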
--| An Inference Engine of order 1

If the ruleset contains variables:

    If monkey close to (X), monkey on (X)
    If monkey on (X) AND banana on (X), monkey eat banana

The inference engine that processes the rules and operates variable
substitution is said to be of order 1 (and if you're curious to know, there
is no engine of order 2 or higher; all problems are proven to be describable
in order 1). This is the type of engine we want to use, as it allows us to
use variables -- they will be the "handles" resulting from our hacks.

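Extending the toy engine to order 1 means adding variable substitution. Here
is a minimal sketch (again illustrative Python, nowhere near a real
unification engine) where "X" is the only variable and facts are tuples:

    # Toy order-1 forward inference: conditions may contain the variable "X",
    # which gets bound to a concrete value when matching facts are found.
    rules = [
        ([("monkey close to", "X")],               ("monkey on", "X")),
        ([("monkey on", "X"), ("banana on", "X")], ("monkey eat banana",)),
    ]

    facts = {("monkey close to", "tree"), ("banana on", "tree")}

    def match(conditions, facts):
        """Return every binding of X that satisfies all the conditions."""
        bindings = [{}]
        for cond in conditions:
            new = []
            for b in bindings:
                for fact in facts:
                    if len(fact) != len(cond):
                        continue
                    trial, ok = dict(b), True
                    for c, f in zip(cond, fact):
                        if c == "X":
                            if trial.get("X", f) != f:
                                ok = False
                            else:
                                trial["X"] = f
                        elif c != f:
                            ok = False
                    if ok:
                        new.append(trial)
            bindings = new
        return bindings

    changed = True
    while changed:
        changed = False
        for conds, concl in rules:
            for b in match(conds, facts):
                new_fact = tuple(b.get(t, t) for t in concl)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True

    print(facts)   # includes ('monkey on', 'tree') and ('monkey eat banana',)
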
--| Pattern Matching

Just as there are interpreted languages and faster-running compiled ones,
there are AI Inference Engines based on "interpreted rulesets" and others
based on "compiled rulesets". Compiling the ruleset means you have to
rearrange it in such a way that it is "immediately efficient". The
compilation method we're interested in is called Pattern Matching and is
based on binary trees. For instance, let's assume the following:

Initial database:

    Name        Systems   Versions        Level required  Level gained
    ----------- --------  --------------  --------------  ------------
    d0_v8-BOF   Unix,All  Sendmail 8.8.*  (RS,L(6),W*)    (LS,L(3),M)
    d0_v9-BOF   Unix,All  Sendmail 8.9.1  (RS,L(6),W*)    (LS,L(3),M)

Ruleset:

    if system[X] is Unix AND Version[Y] is Sendmail 8.8.* AND
       Level_s[Z] is RS AND Level_l[Z] is 6 AND Level_a[Z] is W* AND
       Hack(d0_v8-BOF,X) THEN Level_a[Z] is [LS,L3,M]

    if system[X] is Unix AND Version[Y] is Sendmail 8.9.1 AND
       Level_s[Z] is RS AND Level_l[Z] is 6 AND Level_a[Z] is W* AND
       Hack(d0_v9-BOF,X) THEN Level_a[Z] is [LS,L3,M]

Compiled ruleset for pattern matching:

                         system
                            |
                         [UNIX]
                            |
                         version
                            |
                       [Sendmail]
                            |
                         level_s
                            |
                          [RS]
                            |
                         level_l
                            |
                           [6]
                            |
                         level_a
                            |
                          [W*]
                            |
                          FUNC
                         /    \
                        /      \
                       /        \
                      /          \
              [d0_v8-BOF]    [d0_v9-BOF]
                      \          /
                       ----------
                            |
                            *
                      level_s [LS]
                      level_l [3]
                      level_a [M]

    [] are used to represent variables, filled in for clarity

The tree is parsed from the top every time a new fact is added to the
knowledge database, which allows for a dynamic algorithm (i.e. an
intelligent, self-modifying knowledge base). When parsing the tree brings in
a new fact, the knowledge base is increased with this fact, and the tree is
parsed again for more facts...

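One way to picture the compiled form is a nested dictionary acting as a
discrimination tree: each level tests one attribute of the collected facts
and only the matching branch is followed. This is a very rough sketch (toy
Python, nothing like a real RETE network) built from the two rules above:

    # Rough sketch of a compiled ruleset: each node tests one attribute,
    # leaves name the hacks to try and the level they would grant.
    tree = {
      "system": {"Unix": {
        "version": {
          "Sendmail 8.8.*": {"level": ("RS", 6, "W*"),
                             "hacks": ["d0_v8-BOF"], "gain": ("LS", 3, "M")},
          "Sendmail 8.9.1": {"level": ("RS", 6, "W*"),
                             "hacks": ["d0_v9-BOF"], "gain": ("LS", 3, "M")},
        }}}}

    def candidates(facts, node=None):
        """Walk the tree with the collected facts, yield (hack, gain) pairs."""
        node = tree if node is None else node
        if "hacks" in node:                        # reached a leaf
            if facts.get("level") == node["level"]:
                for h in node["hacks"]:
                    yield h, node["gain"]
            return
        (attr, branches), = node.items()           # one test per level
        value = facts.get(attr)
        for test, child in branches.items():
            if value == test or (test.endswith("*") and value
                                 and value.startswith(test[:-1])):
                yield from candidates(facts, child)

    facts = {"system": "Unix", "version": "Sendmail 8.8.12",
             "level": ("RS", 6, "W*")}
    print(list(candidates(facts)))      # [('d0_v8-BOF', ('LS', 3, 'M'))]
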
Since an attack happens in different phases (see the distributed chapter
below), facts may have different impacts. They may just be collected facts
(system is RH6.0, buffer overflow on sendmail possible, "poor default config"
exploit on sendmail possible), or facts that trigger attacks (buffer overflow
and "poor config" exploits possible, rule says test config first -- the
config exploit will be tested and the result added to the database, we gain
new rights or we move on).

Optimization comes from the fact that whereas in the flat ruleset all rules
must be parsed to find the matching one, in a tree-like representation a
simple pattern matching mechanism shows the right branch. Although compiling
such a ruleset into a tree is a pain, and the gain is not obvious for the few
rules in our database, it really shows once the database contains thousands
of facts. Besides, once the database is compiled into a tree, it's done and
you don't have to do it again (insertion of new elements into the tree is
possible, yet the tree could also be recompiled on each new addition).

More optimization, not of the engine itself but in the hacking sense, can be
achieved if we set some "grading rules" per attack and organize the tree this
way -- say we know two attacks for Sendmail, same version, one relying on a
complex buffer overflow and the other on misconfiguration. The
misconfiguration should be tried first (if the buffer overflow fails we might
kill Sendmail altogether), hence it is given a higher mark. This marking
process would look at two factors -- the level required to perform the attack
and the method used, for instance:

    Situation - Grade     Level - Grade     Method   - Grade

    Remote     +100         6    +60        Config     +3
    Local      +200         5    +50        Filesys    +2
                            4    +40        BuffOver   +1
                           ...

The grading mechanism can be automated in the AI; the method is another piece
of information to be Classified and stored in the Hack Database.

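As a small illustration of how such grades could be combined to order
candidate attacks (the numbers come from the table above, everything else is
invented for the example):

    # Illustrative grading of candidate attacks: higher grade = try it first.
    SITUATION_GRADE = {"Remote": 100, "Local": 200}
    LEVEL_GRADE     = {6: 60, 5: 50, 4: 40}
    METHOD_GRADE    = {"Config": 3, "Filesys": 2, "BuffOver": 1}

    def grade(attack):
        return (SITUATION_GRADE[attack["situation"]]
                + LEVEL_GRADE[attack["level"]]
                + METHOD_GRADE[attack["method"]])

    candidates = [
        {"name": "sendmail-bof",    "situation": "Remote",
         "level": 6, "method": "BuffOver"},
        {"name": "sendmail-config", "situation": "Remote",
         "level": 6, "method": "Config"},
    ]

    # The misconfiguration exploit sorts before the riskier buffer overflow.
    for attack in sorted(candidates, key=grade, reverse=True):
        print(grade(attack), attack["name"])
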
--| A Pattern Matching, Forward Inference Engine of Order 1

So what we're looking for is:

An AI engine, of forward inference type, of order 1. The engine should be
optimized, for instance through pattern matching, and it must allow for
function execution.

An academic sample of such an algorithm is the RETE algorithm (beyond the
scope of this preliminary discussion) and the interested reader is directed
to the paper by Charles L. Forgy in "Artificial Intelligence": "The RETE
Matching Algorithm" (Dept of Computer Science, Carnegie-Mellon University).
You could also look into similar systems called OPS and TANGO ("OPS5 user's
manual" by the same author, and "TANGO" by Cordier-Rousset from the L.R.I of
Orsay Faculty in France). Working code for RETE can be found at the MIT
repository [8]. You can also check Prof. Miranker's Venus project [9].
Original code for OPS exists in LISP [10]. However, the one piece of work
that would definitely match our expectations is a system called CLIPS,
written in C by NASA (initially by NASA, but now it is maintained in the
public domain) [11].

--| The Hacking Engine

The engine will first query the database of facts for all known hacks, sorted
in the classification form we defined along with systems and versions
information. These known hacks are written as a set of rules; the exact
representation of hacks as rules is linked to the engine itself and is yet to
be defined.

Then this ruleset is compiled into a binary tree (or some other efficient
data structure) for better optimization, provided a proper optimizing
strategy (which may branch to the left-most side for instance, maybe granted
a difficulty grade per attack). The optimizing strategy might take the
classification rules into account to decide that if a higher level is
reached, all branches that refer to lower level attacks must be ignored --
this would be called "restrictive optimization".

The engine is initialized with the initially known facts (target id), and
starts applying rules to these facts in order to get more information out of
them, until the goal is reached or all branches have been explored. The
engine in simulation mode would only use the initializing facts and match
function calls against them; in manual mode the hacker would be provided the
function code by the engine, which would then wait for the result; in
automatic mode the engine would run the code itself.

----| Distributed paradigm

The theory, analysis and advantages of distributed hacking have been
extensively reviewed in an excellent article by Andrew J. Stewart entitled
"Distributed Metastasis" [12]. Hence we will base this proposed
implementation on it; please refer to the above article.

--| Distributed Schematic

In a distributed attack, the attacker (A) is the "parent" of all nodes
(agents). Each node is characterized by a running agent (the hacking engine),
its address (IP, IPX...), and the level the agent is running at. For
instance:

    [10.0.0.1,A,parent], knows (10.0.2.1,10.0.2.5,10.0.3.1)
                 |
                 |
        -----------------------------------
        |                                 |
    [10.0.3.1,A1,(LS,L(3),M)]   [10.0.2.1,A2,(LS,L(3),M)], knows 10.0.2.5
                                          |
                                [10.0.2.5,A3,(LS,L(3),M)]

The attacker knows the existence of all nodes, but communicates through the
hierarchy (to send a command to 10.0.2.5, he issues it to 10.0.2.1 for
routing). This keeps risk to a minimum, should any of the agents be
discovered. When 10.0.2.5 tries to talk to the attacker, he sends stuff via
10.0.2.1 -- A3 knows A2 but not A. It is obvious that if any of the nodes
were to be uncovered, the attached parent node and child nodes could be too.
In this case, the Attacker could issue a direct order to any of the
potentially compromised agents to either "attach" themselves somewhere else,
or to sacrifice the agent's "territory" and have the agent eliminate itself.

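A minimal sketch of this hierarchical routing (toy Python; the addresses are
the ones from the schematic, the command string is invented) could look like
this:

    # Toy model of the agent hierarchy: commands travel down the tree, one
    # hop at a time, until they reach the destination agent.
    class Agent:
        def __init__(self, addr):
            self.addr = addr
            self.children = []

        def reachable(self):
            """Addresses served by this agent's branch."""
            out = {self.addr}
            for c in self.children:
                out |= c.reachable()
            return out

        def send(self, dest, command):
            if dest == self.addr:
                print("%s executes: %s" % (self.addr, command))
                return
            for child in self.children:
                if dest in child.reachable():
                    print("%s relays to %s" % (self.addr, child.addr))
                    child.send(dest, command)
                    return

    # The tree from the first schematic.
    a  = Agent("10.0.0.1")
    a1 = Agent("10.0.3.1"); a2 = Agent("10.0.2.1"); a3 = Agent("10.0.2.5")
    a.children  = [a1, a2]
    a2.children = [a3]

    a.send("10.0.2.5", "update hack database")
    # 10.0.0.1 relays to 10.0.2.1
    # 10.0.2.1 relays to 10.0.2.5
    # 10.0.2.5 executes: update hack database
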
Example: Agent 10.0.2.1 was discovered; the Attacker decides to attach
10.0.2.5 to 10.0.3.1 and sacrifice 10.0.2.1.

    [10.0.0.1,A,parent], knows (10.0.2.5,10.0.3.1)
                 |
                 |
        -----------------------------------
        |                                 |
    [10.0.3.1,A1,(LS,L(3),M)]             x
                 |
    [10.0.2.5,A3,(LS,L(3),M)]

To ensure better privacy, encryption is to be used at each node for the
database of "parent & child" nodes it holds.

At least two other secret-routing schemes can be used:

1. A child knows its parent's address, but the parent doesn't know its
   children. All communication to a child would first require a request to
   the top node (A) to learn the location of the children. This would lessen
   the risk of compromising an entire branch in case one of the nodes is
   uncovered.

    [10.0.0.1,A,parent], knows (10.0.2.5,10.0.3.1)
                 |
                 |
        -----------------------------------
        |                                 |
    [10.0.3.1,A1,(LS,L(3),M)]             |
        *                                 |
        |                                 x
    [10.0.2.5,A3,(LS,L(3),M)]

   A3 knows how to talk to A1, A1 asks A for who to talk with.

2. All nodes in the tree (except for A) don't know the other nodes' addresses,
   but they know the subnet on which the other node resides and may sniff
   packets. For instance A1 would send packets to 10.0.2.6; 10.0.2.6 discards
   them, but 10.0.2.5 sees the data and replies to 10.0.3.2. [13]

--| Distributed & Simultaneous Attack

Phase 0

The actual attack happens in phases. The attacker decides on a target and the
level desired. Then the AI will look into the known set of Agents, and the
defined rules for attack optimization. For instance, if we want to attack
10.0.3.2, the AI could decide to pass the attack to 10.0.3.1. The AI could
also decide for multiple agents to attack at once (hence the distributed
paradigm); in this case, collected information (the knowledge base) is passed
between each phase to the Attacker, who could decide to redistribute it to
the attacking agents.

Phase 1

Once a given Agent has received an order to attack, it queries its parent
node for updated hacking database entries. Depending on the initial Attack
order issued, this query might move up to the Attacker or not happen at all.
If the communication model used is hierarchical, we could even implement this
in DNS queries/replies to benefit from the existing code (see Phrack [14]
issues 50-53 on this).

Phase 2

The agent performs ruleset optimization as discussed in the previous chapter.

Phase 3

The agent updates or builds its RETE vulnerability tree.

Phase 4

The agent satisfies the first "target detection" ruleset (this includes host,
service, network topology, OS, and Application layer info detection) before
moving to the next phase. This happens exclusively as an RS. In the case of
a simultaneous attack (by many agents against one target) the information
gathered is moved to the Attacker, who might push back other info gathered by
the other agents.

Phase 5

The Agent actually attempts to compromise the target. This phase is not
completed until the level of access the attacker decided upon is reached, and
the "target clean-up" (cleaning the logs) rulesets are satisfied. The cleanup
rules might even trigger the necessary hack of another box where the logs may
reside -- it is common practice in security administration to log to a
different machine (especially at high profile sites with high profile
targets). This phase might fail upon unsuccessful hacks or a timeout.

Phase 6

Install the hacking engine child on the target. The target becomes part of
the tree as a subordinate of the successful attacking agent. The Attacker is
notified of the new acquisition.

Phase 7

The new agent goes into passive mode -- it waits for input from its parent
and monitors traffic and trust relationships locally to increase its local
knowledge database. On a regular basis the agent should "push" info to its
parent; this is necessary if the agent is behind a firewall or its address is
set dynamically.

Note: Phases 4+5+6 are the so-called "consolidation components".

The Simultaneous aspects of the attack are controlled by the Attacker and not
by delegation to other parent nodes. This could be called Centrally
Controlled, Distributed and Simultaneous Attack.

Let's summarize the phases:

    Engine       Phase  Comments
    -----------  -----  --------
    AI             0    Decide for agent(s) to attack target
    Incremental    1    Database query
    AI             2    Ruleset optimization
    Incremental    3    Tree build
    AI             4    Target information gathering
    AI             5    Compromise target, cleanup
    Incremental    6    Seed target
    AI             7    New agent enters passive mode

Other concepts can be put into this, such as cryptography and multiple target
acquisition at once. It would certainly be an interesting exercise to write a
valid algorithm for all this.

----| Comparison

--| COPS Kuang system

The "Kuang system", a security analysis tool developed by Robert W. Baldwin of
|
||
|
MIT is included in COPS security package available from Purdue University [15].
|
||
|
The Kuang system is a ruleset-based system used to check UID/GID's rights on a
|
||
|
local system, i.e. it processes a knowledge base (list of privilege
|
||
|
users/files, list of rights needed on users/files to attain their level of
|
||
|
privilege) against a set of rules (if any user can write a startup file of
|
||
|
root, any user can become root). The ruleset is written as such that it is
|
||
|
"string parsable" in the inference engine (which is a forward inference engine
|
||
|
of order 1). The system can perform tests stored in a shell script to decide
|
||
|
if a rule is satisfied by the configuration of the system it is currently
|
||
|
running on.
|
||
|
|
||
|
In comparison to what is described in this paper, the Kuang system evolves
|
||
|
between (LS,L(1)) and (LS,L(3)). It uses a non-optimized forward inference
|
||
|
engine that performs Phase(4) of our distributed scheme.
|
||
|
|
||
|
We should consider the Kuang system as a working-case study, to build Area52.
|
||
|
|
||
|
--| A sample vulnerability scanner : Nessus

The Nessus open source project [16] aims at providing a free security
scanner. It works by testing systems (remote/local) for known
vulnerabilities. The Nessus developers wrote a scripting language for this
purpose -- we mentioned earlier that the actual attack code for our proposed
system should be freely coded in a highly portable language. Yet the Nessus
approach is not to be neglected -- could we use the Nessus effort and extend
its scripting language so as to actually re-write all exploits in it? This
would mean a continuous effort in maintaining the project, but it alleviates
many compatibility and database issues. We could even hope for a "common
hacking language" relying on multi-platform libraries like libpcap and libnet
as core components. Until an open source vulnerability scanner that can run
on multiple platforms comes along, this is a fairly attractive piece of
technology.

--| Another Approach : Attack Trees

As is probably obvious, this "ultimate hack tool" could be used to help
protect as well as compromise. While most of the discussion has been from
the intruder's perspective, we could easily use the tool for our own
vulnerability assessment. If we feed the knowledge database with all relevant
information about our own network and run the engine in simulation mode, it
will output a possible sequence of attack. Then, if the engine is told to
search for ALL possible sequences of attack, and the output is arranged as a
tree of attack sequences (much like the tree of known vulnerabilities
described above), this would provide a means to help automatically generate
"Attack Trees", as described by Bruce Schneier of Counterpane Internet
Security in Dr. Dobb's Journal [17] (December 1999).

--| Others...

Some distributed denial of service tools have caused quite a stir in security
circles lately [18]. Those tools expose an interesting sample of distributed
communication and data tunneling, whose code could be reused in the project
outlined in this paper. The main problem with these denial of service tools
is that their main output (floods of packets against a target) is never seen
by the Attacker, which is something we would certainly require.

----| References

[1]  See discussions by Dr Ross Anderson from the University of Cambridge
     http://www.cl.cam.ac.uk/Teaching/1998/Security/

[2]  NMap by Fyodor
     http://www.insecure.org/nmap

[3]  PacketStorm
     http://packetstorm.securify.com

[4]  Security Bugware
     http://oliver.efri.hr/~crv

[5]  Security Focus
     http://www.securityfocus.com/

[6]  Libnet, multi-platform packet mangling
     http://www.packetfactory.net/libnet/

[7]  Common Vulnerabilities and Exposures, a unified hack database
     http://cve.mitre.org

[8]  RETE LISP implementation
     http://www.mit.edu/afs/cs.cmu.edu/project/ai-repository/ai/areas/expert/systems/frulekit/

[9]  Prof. Miranker's Venus project in C++
     http://www.arlut.utexas.edu/~miranker/

[10] Original OPS LISP code
     http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/expert/systems/ops5/

[11] CLIPS, NASA's RETE-like system, coded in C, very impressive!
     http://www.ghg.net/clips/CLIPS.html

[12] "Distributed Metastasis: A Computer Network Penetration Methodology"
     by Andrew J. Stewart
     http://www.phrack.com/search.phtml?view&article=p55-16

[13] "Strategies for Defeating Distributed Attacks" by Simple Nomad
     http://packetstorm.securify.com/papers/contest/Simple_Nomad.doc

[14] Phrack Magazine
     http://www.phrack.com/

[15] Home archive of the COPS system
     ftp://coast.cs.purdue.edu/pub/Purdue/cops/

[16] The Nessus Project
     http://www.nessus.org

[17] "Attack Trees: Modeling Security Threats" by Bruce Schneier,
     Dr. Dobb's Journal, December 1999
     http://www.ddj.com/articles/1999/9912/9912a/9912a.htm

[18] Analysis of distributed denial of service tools by David Dittrich
     http://staff.washington.edu/dittrich/
     Also, the source code for these DoS tools can be found at [3].

|EOF|-------------------------------------------------------------------------|