Starting Point

During routine testing, an integer overflow in apache2-mpm-worker 2.2.19 mod-setenvif was found. The crash occured when mangling request headers using a crafted .htaccess-file. The broken code was ap_pregsub in server/util.c, where the buffer size of a new header field could overflow, the value was then used for memory allocation. When copying data to the buffer an, overwrite of the an apr (apache portable runtime) memory-pool boundaries occured, similar to standard heap buffer overflows.

Outline of Exploit

The main goals creating the exploit were:

Two different exploit layouts were developed. The first one used multiple threads, so that one was overwriting the data of the second thread before hitting the end of the memory area. Precise timing was essential to get shell access.

The second one used a more crafted substitution expression, stopping the copy in a single thread by modifying the regular expression currently processed in the thread. Since there is race condition involved, this exploit was far more reliable than the first one.

First Exploitation Attempt

Due to the afore-mentioned requirements, an exploit with about 30% success rate for apache2-mpm-worker 2.2.19 on ubuntu oneiric 32bit was developed with the purpose to learn alternative programming techniques in a hands-on approach. To get hold of crucial apache data structures, the programs outlined here tried to exploit concurrent access to already corrupted data structures before the otherwise inevitable apache crash. Due to the additional

Step 0: Create an .htaccess-file that will copy more than 4GB of data into a 16MB buffer. Usually ap_pregsub will copy data to the buffer, overwriting the whole apr-pool memory until copy hits the upper mapped memory boundary (see second while-loop in ap_pregsub from server/util.c). Since the .htaccess will cause ap_pregsub to fill the whole heap with repeating copies of the HTTP-header block, this will also circumvent the heap randomization. The heap data will span such a large portion of heap, so that a pointer to heap will always hit one of the copies.

Step 1: To get a chance to execute code, ap_pregsub copy process has to be stopped without SEGV. Two ways are possible:

To extend the window of opportunity, an ap_pregsub-stop sequence is sent first using SendTrigger-SingleThreadAprPallocEndlessLoop.c. This will add a 16MB race buffer, slow down the server by sending one thread into endless loop, both helps to extend the race window in step 3 to 100ms on a 800MHz CPU, which would also be sufficient for remote TCP exploitation.

Step 2: Send traffic, that will make apache use one function pointer more frequently. For reproducibility it was important, to send data, that will make apache loop just over a very limited part of the whole apache binary code. Otherwise a wide variety of crashes at different code positions were observed. The RequestFlood.c program will open multiple connections to apache, send GET /AAA and then continue to send AAAA every 100ms, thus making the URL data on server side longer in each iteration. Due to the long time between the sending of GET header bytes, it is quite likely, that the heap is overwritten by thread started in step 3 while the current thread is in apr_socket_timeout_set (srclib/apr/network_io/unix/sockopt.c). apr_socket_timeout_set has also one other advantage, it will pick up the sock pointer from corrupted heap and write the timeout value to that location, thus giving the opportunity to write the first MSB 0-byte and using that value as function pointer later on in ap_get_brigade from server/util_filter.c. The RequestFlood program has to be running before sending the remote shell trigger in step 3.

Step 3: Send a trigger request, that will overwrite the apr-heap, similar to the request from step 1, but without any stop sequence. The HTTP request data will be overwrite the heap, thus the threads from step 2 can pick it up. This request contains also the remote shell code, but there are a few obstacles blocking code execution:

Workarounds for these problems are collected in SendTrigger-RemoteShell.c:

The trigger program takes the libc start address as argument. If it is possible to place a symbolic link to /proc/self/maps on the host, simple renaming of the link will defeat the NoFollowSymlinks options and allow to read the offsets from /proc/self/maps, thus defeating the ASLR ( more). If not known, address has to be guessed using different values in SendTrigger-RemoteShell --LibcMapPos 0xxxxxx.

Step 4: To reduce the code size, the remote command connection does not return stdout. To get stdout, the first command sent to remote should be exec 1>&0. Since SendTrigger-RemoteShell.c does not implement a nice shell gui, one can also telnet to apache before starting step 3, the telnet connection will turn to remote shell connection while open.

Second Exploitation Attempt

In contrast to the first attempt, this exploit overwrites the currently interpreted regex with a crafted stop sequence to terminate buffer overwriting before reaching the upper heap limit. In contrast to the first attempt, this program requires only a single apache thread to give remote shell and could also be used to take over process on non-mpm-worker apache servers. Steps:

Step 0: Create an .htaccess-file in /var/www, that will copy more than 4GB of data into a 16MB buffer. The new variable value expression is designed in such way, that when apache is copying data to the destination buffer and overwriting the variable definition data itself, the new definition corresponds to a variable size of zero. Hence the buffer-overflowing copy process is stopped as soon as the variable definition data is overwritten.

Step 1: Mix up heap data to get a layout favorable to our tasks. This can be done by just sending a normal GET request for an existing file via a Keep-Alive connection and using that connection to send the trigger afterwards.

Step 2: The most stable server code/data flow leading to successful exploitation would be one using a function pointer near to the point where the overflow begun. Otherwise the exploit code will depend also on other modules loaded or platform configuration parameters (the first attempt used a function pointer after mod_setenvif processing was completed). To archive that, apr_table_setn is used. The function usually would store the new variable value to a hash-table. Since it is operating on an overwritten table data structure, it can be used to create 0-bytes at appropriate locations and finally trigger an invalid allocation. Thus is leading to apr_palloc to call an abort-function and this function pointer can be controlled.

At the moment of this function call, only the function destination can be controlled, the content of all other registers cannot be used to call a suitable target function directly. Since the stack was not overwritten, standard ROP methods do not work. As a workaround, a part of the _IO_file_seekoff function can be used:

   0x00736ff7 <_IO_file_seekoff+407>:   mov    0x8(%ebp),%ecx
   0x00736ffa <_IO_file_seekoff+410>:   mov    0x14(%ebp),%edx
   0x00736ffd <_IO_file_seekoff+413>:   mov    0x4c(%ecx),%eax
   0x00737000 <_IO_file_seekoff+416>:   mov    %esi,0x4(%esp)
   0x00737004 <_IO_file_seekoff+420>:   mov    %edx,0xc(%esp)
   0x00737008 <_IO_file_seekoff+424>:   mov    %edi,0x8(%esp)
   0x0073700c <_IO_file_seekoff+428>:   mov    %ecx,(%esp)
   0x0073700f <_IO_file_seekoff+431>:   call   *0x40(%eax)

At the end of the sequence the stack will contain one pointer of our choice, the value 1 and the call will go to a controllable destination. That stack layout matches the function call of __libc_dlopen_mode, the internal symbol for dlload(). A sample program to archive this is TriggerRemoteShell.c.

Step 2: Since the attack assumed, that an attacker could place an .htaccess file on the server, it is also sensible to assume, that he could put a second file there also. This second file should be a shared library loaded by the dlload call. The library contains the _init symbol, this function is called during loading of the library, hence activating the exploit code. The library itself is not very special, it just tries to identify all open socket connections using a getsockopts call and forks a shell for every connection, e.g. ExploitLib.c. When the library is loaded, the open connection of TriggerRemoteShell.c is turned to a remote shell. Since apache server on ubuntu oneiric uses ASLR and the exploit needs the correct libc memory locations, the TriggerRemoteShell program can be started with the correct libc mapping information for testing. In real world examples, one might guess or try to get access to the /proc/[pid]/maps file before sending the exploit using an apache symlink timerace.

buildhost-ubuntuoneiric1110:~$ ./TriggerRemoteShell --LibcMapPos 0x6e5000
Using libc map pos at 0x6e5000
Opening ...
HTTP/1.1 200 OK
Date: Thu, 22 Dec 2011 08:56:36 GMT
Server: Apache/2.2.20 (Ubuntu)
Last-Modified: Sun, 20 Nov 2011 22:55:18 GMT
ETag: "1b76-4-4b23276d097f8"
Accept-Ranges: bytes
Content-Length: 4
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/plain

AAAA
Linux buildhost-ubuntuoneiric1110 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:50:42 UTC 2011 i686 athlon i386 GNU/Linux

/usr/sbin/apache2 -k start: Completed: not found
sh: turning off NDELAY mode
ls
bin
boot
dev
...

With a different payload library, the lower-privileged www-data process can modify the shared memory scoreboard data in a way to trigger an invalid free/gcc-lib load in the root-priv master process, see ApacheScoreboardInvalidFreeOnShutdown.

Thinking About Security

Due to my limited programming skills, getting this first and far-from-good POC exploit was not quite easy. Some apache, libc, linux software design decisions made it simpler or easier for me:

Last modified 20120322
Contact e-mail: me (%) halfdog.net