Gojko Vujovic Juniper Networks Amsterdam, NL
Administrator Member no.: 1 Posts: 13174 *.gojko.ss.
Site: www.gojkovujovic.com
Since this might disappear from the net, I am posting the whole text here:
Avoiding security holes when developing an application - Part 4: format strings
Abstract:
For some time now, reports of exploits based on format strings have become more and more numerous. This article explains where the danger comes from and shows that an attempt to save six bytes is enough to compromise the security of a program.
1. Where is the danger?
Most security flaws come from either bad configuration or laziness. This rule holds once more for format strings.
A program very often needs to write out a string (the "where" is not the point here, as it could be with buffer overflows; it may involve stdin, files, ...). A single instruction is enough:
printf("%s", str);
However, a programmer can decide to save time and six bytes by writing only:
printf(str);
With this "economy" in mind, the programmer opens a potential hole in his work. He is content to pass a single string as the argument, a string he simply wanted to display unchanged. However, this string is parsed for formatting directives (%d, %g, ...). When such a directive is found, the corresponding argument is fetched from the stack, even though no such argument was ever passed.
We will start by introducing the printf() functions. We expect everyone knows them, but probably not in detail, so we will cover some far less known aspects of these routines. Then we will see how to obtain the information needed to exploit such a mistake. Lastly, we will bring all this together in a single example.
2. Deep inside format strings
In this part, we look at format strings. We will summarize their use and uncover a rather little-known format directive.
printf() : they told me a lie !
Note for non-French residents: in our nice country we have a racing cyclist who claimed for months not to have taken dope while all the other members of his team admitted it. He claims that if he was doped, he did not know it. So a famous impersonators' show used the French sentence "on m'aurait menti !", which gave me the idea for this title.
Let us start with what we all learned in our programming handbooks: most C input/output functions use data formatting, which means one must provide not only the data to read or write, but also how to do it. The following program illustrates this:
/* display.c */
#include <stdio.h>
int main(void) {
    int i = 64;
    char a = 'a';
    printf("int : %d %d\n", i, a);   /* both values shown as integers */
    printf("char : %c %c\n", i, a);  /* both values shown as characters */
    return 0;
}
Running it displays :
>>gcc display.c -o display
>>./display
int : 64 97
char : @ a
The first printf() writes the value of the integer variable i and of the character variable a as int (this is done using %d), so a is displayed as its ASCII value, 97. The second printf(), on the other hand, converts the integer variable i to the character whose ASCII code is 64, that is @.
Nothing new so far: everything conforms with the many functions whose prototype is similar to that of printf():
one argument, a character string (const char *format), specifies the selected format;
one or more optional arguments contain the values that are formatted according to the directives given in that string.
Most of our programming lessons stop there, providing a non-exhaustive list of possible formats (%g, %h, %x, the use of the dot character . to set the precision...). But there is another one that is never talked about: %n. Here is what the printf() man page says about it:
The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. No argument is converted.
Here is the most important point of this article: this directive makes it possible to write into the variable designated by a pointer argument, even when used in a display function!
Before continuing, note that this format also exists for the functions of the scanf() and syslog() families, among others.
Time to play
We are going to study the use and the behavior of this format through small programs. The first, printf1, shows a very simple use :
/* printf1.c */
1: #include <stdio.h>
2:
3: int main(void) {
4:     const char *buf = "0123456789";
5:     int n;
6:
7:     printf("%s%n\n", buf, &n);
8:     printf("n = %d\n", n);
9: }
The first printf() call displays the string "0123456789", which contains 10 characters. The %n format that follows writes this count into the variable n:
>>gcc printf1.c -o printf1
>>./printf1
0123456789
n = 10
Let's slightly modify our program by replacing the printf() instruction on line 7 with the following one:
7:     printf("buf=%s%n\n", buf, &n);
Running this new program confirms our idea: the variable n is now 14 (the 10 characters of the buf string plus the 4 characters of the constant string "buf=" contained in the format string itself).
So we know that %n counts every character output so far, including those that appear literally in the format string. Moreover, as the printf2 program demonstrates, it counts even further:
/* printf2.c */
#include <stdio.h>
#include <string.h>   /* strlen() */
int main(void) {
    char buf[10];
    int n, x = 0;
    snprintf(buf, sizeof buf, "%.100d%n", x, &n);
    printf("l = %d\n", (int) strlen(buf));
    printf("n = %d\n", n);
    return 0;
}
snprintf() is used here to prevent buffer overflows. One might therefore expect the variable n to be 10:
>>gcc printf2.c -o printf2
>>./printf2
l = 9
n = 100
Strange? In fact, %n counts the number of characters that should have been written. This example shows that the truncation imposed by the size argument is ignored.
What really happens? The format string is fully expanded before being cut and then copied into the destination buffer:
/* printf3.c */
#include <stdio.h>
#include <string.h>   /* strlen() */
int main(void) {
    char buf[5];
    int n, x = 1234;
    snprintf(buf, sizeof buf, "%.5d%n", x, &n);
    printf("l = %d\n", (int) strlen(buf));
    printf("n = %d\n", n);
    printf("buf = [%s] (%d)\n", buf, (int) sizeof buf);
    return 0;
}
printf3 differs from printf2 in a few ways:
the buffer size is reduced to 5 bytes;
the precision in the format string is now set to 5;
the buffer content is finally displayed.
We get the following display :
>>gcc printf3.c -o printf3
>>./printf3
l = 4
n = 5
buf = [0123] (5)
The first two lines are not surprising. The last one illustrates the behavior of the printf() function :
the format string is expanded according to the directives it contains, which yields the string "01234