Shellcode Extraction, Automated Decoding and Carving using Pefile

A little while back a friend of mine asked about a tool for carving out executables from file streams. PE carvers are useful for analysts who spend time analyzing files on the command line or extracting PE files from malicious documents. The initial post was going to be a simple Python script using Pefile for carving out executables. The simplicity was destroyed when I chose a random sample that used an offset-specific encoding. I could have chosen another sample with a simpler encoding but that's boring. This post will cover three main topics. The first is extracting and analyzing the shellcode from a malicious document that exploits CVE-2010-3333. The second is using the patterns left by encoded null bytes to automatically decode (xor-count-up) embedded files. No static analysis or brute-forcing is used. The last topic is using Pefile for carving out executable files from file streams. This post is all Python except for the hexdump, which is Vim.
vim CVE-2010-3333_DOC.bad

{\rtf1{\shp{\*\shpinst{\sp{\sn pFragments}{\sv 1;1;ffffffffff050000000000000000000000..
..

Vim can be used as a hexdump with the following command:
:%!xxd 

0000000: 7b5c 7274 6631 7b5c 7368 707b 5c2a 5c73  {\rtf1{\shp{\*\s
0000010: 6870 696e 7374 7b5c 7370 7b5c 736e 2070  hpinst{\sp{\sn p
0000020: 4672 6167 6d65 6e74 737d 7b5c 7376 2031  Fragments}{\sv 1
0000030: 3b31 3b66 6666 6666 6666 6666 6630 3530  ;1;ffffffffff050
0000040: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
0000050: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
0000060: 3030 3030 3030 3062 3735 6631 6637 6430  0000000b75f1f7d0
0000070: 3030 3038 3037 6330 3030 3038 3037 6342  000807c0000807cB
0000080: 4242 4242 4242 4243 4343 4343 4343 4344  BBBBBBBCCCCCCCCD
0000090: 4444 4444 4444 4439 3039 3039 3039 3039  DDDDDDD909090909
00000a0: 3039 3034 3134 3134 3134 3134 3134 3134  0904141414141414
00000b0: 3134 3134 3134 3134 3134 3165 3830 3130  14141414141e8010
....
0000740: 6239 3838 3730 6436 3730 6439 3839 3833  b98870d670d98983
0000750: 3038 3366 3766 3766 3766 3766 3766 3766  083f7f7f7f7f7f7f
0000760: 3765 6230 3665 6230 3430 3932 3330 3033  7eb06eb040923003
0000770: 3065 6230 3665 6230 3438 6332 3430 3033  0eb06eb048c24003
0000780: 3065 3837 3766 3466 6666 667d 7d7d 7d5c  0e877f4ffff}}}}\
0000790: 6164 6566 6c61 6e67 3130 3235 5c61 6e73  adeflang1025\ans
00007a0: 695c 616e 7369 6370 6739 3336 5c75 6332  i\ansicpg936\uc2
00007b0: 5c61 6465 6666 3331 3530 375c 6465 6666  \adeff31507\def
.....
:%!xxd -r 
:q!
Some quick notes in regards to CVE-2010-3333 via Mitre: "Stack-based buffer overflow in Microsoft Office XP SP3, Office 2003 SP3, Office 2007 SP2, Office 2010, Office 2004 and 2008 for Mac, Office for Mac 2011, and Open XML File Format Converter for Mac allows remote attackers to execute arbitrary code via crafted RTF data, aka 'RTF Stack Buffer Overflow Vulnerability.'"

The intent of this post is not to analyze CVE-2010-3333 but rather the encoding algorithm in the shellcode. For more information on the exploit see the following link. The most relevant parts are the RTF file header, the fragment objects (pFragments) and the data following the semicolon. The shellcode resides in the large block of ASCII data made up of the characters 0-9 and A-F. If we read two ASCII characters at a time and treat each pair as one hex byte, we get valid Intel instructions. Let's see what this looks like in Python.
Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> import os
>>> import pydasm
>>> # Open our specimen 
>>> f = open("CVE-2010-3333_DOC.bad","rb")
>>> f.seek(0x97)
>>> # 0x97 is the start of the ascii encoded shellcode
>>> b = f.read(2)

>>> buff = ""
>>> while b != '':
        try:
            buff = buff+chr(int(b,16))
            b = f.read(2)
        except ValueError:
            break
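The read loop above can also be written as a small helper. This is a minimal sketch, not the code used in this session: the name hex_pairs_to_bytes is hypothetical, and it assumes the hex text has already been read into memory. Like the loop, it stops at the first pair that is not valid hex.

```python
def hex_pairs_to_bytes(text):
    """Convert ASCII hex pairs to raw bytes, stopping at the first non-hex pair."""
    text = bytearray(text)  # yields integers on both Python 2 and Python 3
    hexdigits = bytearray(b"0123456789abcdefABCDEF")
    out = bytearray()
    for pos in range(0, len(text) - 1, 2):
        pair = text[pos:pos + 2]
        if pair[0] not in hexdigits or pair[1] not in hexdigits:
            break  # e.g. the "}}" that closes the RTF group
        out.append(int(pair.decode("ascii"), 16))
    return bytes(out)

# Example: two inc ecx bytes, a call opcode, then the RTF closing braces.
sample = hex_pairs_to_bytes(b"4141e8}}")  # returns b'AA\xe8' on Python 3
```

A trailing single character is ignored here, which is a design choice of this sketch and not something taken from the session.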
The variable buff contains our ascii to binary shellcode. We can use Pydasm to disassemble it.
>>> offset = 0
>>> while offset < len(buff):
    i = pydasm.get_instruction(buff[offset:],pydasm.MODE_32)
    print offset , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
    if not i:
        break
    offset +=  i.length

    
0   inc ecx
1   inc ecx
2   inc ecx
3   inc ecx
4   inc ecx
5   inc ecx
6   inc ecx
7   inc ecx
8   inc ecx
9   inc ecx
10   inc ecx
11   inc ecx
12   call 0x12
17   add [ebx-0x3b7cdbf4],cl
23   add al,0x8d
25   dec ecx
26   adc al,[ecx-0x80]
29   xor [edi-0x80],esp
32   cmp [eax+0x600df775],edx
38   psubd mm6,[esi]
....
85   push dword 0x1300f74
90   push byte 0xffffff98
92   None
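Notice that after address 12 the code turns to junk instructions. The call at offset 12 targets 0x12 (18), so the byte at offset 17 is never executed, but our loop disassembles it as the start of an instruction and ends up off by 1 byte. We can modify our pydasm loop to skip this null byte.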
>>> offset = 0
>>> while offset < len(buff):
    i = pydasm.get_instruction(buff[offset:],pydasm.MODE_32)
    print hex(offset) , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
    if not i:
        break
    offset +=  i.length
    if offset == 17:
        offset += 1

        
0x0   inc ecx
0x1L   inc ecx
0x2L   inc ecx
0x3L   inc ecx
0x4L   inc ecx
0x5L   inc ecx
0x6L   inc ecx
0x7L   inc ecx
0x8L   inc ecx
0x9L   inc ecx
0xaL   inc ecx
0xbL   inc ecx
0xcL   call 0x12
0x12L   mov ecx,[esp]
0x15L   add esp,0x4
0x18L   lea ecx,[ecx+0x12]
0x1bL   inc ecx
0x1cL   xor byte [ecx],0x67
0x1fL   cmp byte [ecx],0x90
0x22L   jnz 0x1b
0x24L   or eax,0x36fa0f60
.......
0x53L   aad 0x51
0x55L   push dword 0x1300f74
0x5aL   push byte 0xffffff98
0x5cL   None
Note: the addresses are now printed in hex. At address 0x12 we can see the start of the XOR loop with a key of 0x67. The rest of the code is junk because it is still XORed. The next step is to get to the second stage of the shellcode.
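The stub can also be mirrored directly in Python. The sketch below is an illustration, not the code used in this session: decode_stage1 is a hypothetical name, and the start offset 0x24 comes from following the stub (the call pushes 0x11, the lea adds 0x12, and inc ecx runs once before the first xor). Like the stub, it stops at the first byte that decodes to 0x90.

```python
def decode_stage1(buff, start=0x24, key=0x67, stop=0x90):
    """Mimic the stub: XOR bytes from `start` with `key` until one decodes to `stop`."""
    out = bytearray(buff)
    pos = start
    while pos < len(out):
        out[pos] ^= key          # xor byte [ecx],0x67
        if out[pos] == stop:     # cmp byte [ecx],0x90 ; jnz falls through here
            break
        pos += 1                 # inc ecx
    return bytes(out)
```

Bytes before the start offset and after the 0x90 terminator are returned unchanged.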
>>> buff2 = buff[:35]
>>> # Read past XOR loop
>>> c = 36
>>> while c < (len(buff)-36):
    buff2 =  buff2 + chr(ord(buff[c])^0x67)
    c += 1
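The above code is a simple XOR loop. Note that buff[:35] keeps only bytes 0 through 34, so byte 35 (the displacement of the jnz at 0x22) is dropped: in the listing below the jnz target shows as 0x8e instead of 0x1b, and every offset from 0x23 onward is one lower than in the original buffer. Now we can disassemble the second stage using Pydasm.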
>>> offset = 0
>>> while offset < len(buff2):
        i = pydasm.get_instruction(buff2[offset:],pydasm.MODE_32)
        print hex(offset) , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
        if not i:
            break
        offset +=  i.length
        if offset == 17:
            offset += 1

        
0x0   inc ecx
0x1L   inc ecx
0x2L   inc ecx
0x3L   inc ecx
0x4L   inc ecx
0x5L   inc ecx
0x6L   inc ecx
0x7L   inc ecx
0x8L   inc ecx
0x9L   inc ecx
0xaL   inc ecx
0xbL   inc ecx
0xcL   call 0x12
0x12L   mov ecx,[esp]
0x15L   add esp,0x4
0x18L   lea ecx,[ecx+0x12]
0x1bL   inc ecx
0x1cL   xor byte [ecx],0x67
0x1fL   cmp byte [ecx],0x90
0x22L   jnz 0x8e
0x24L   pop es
0x25L   push dword 0x3519d
0x2aL   push dword 0x2d000
0x2fL   push dword 0x819d
........
0x8fL   xor edx,edx
0x91L   mov ebx,fs:[edx+0x30]    ; GET PEB
0x95L   mov ecx,[ebx+0xc]
0x98L   mov ecx,[ecx+0x1c]
....
0xbaL   mov esi,[ebx+edi*4]
0xbdL   add esi,ebp
0xbfL   cwd 
0xc0L   movsx eax,[esi]
0xc3L   cmp al,ah
0xc5L   jz 0xcf
0xc7L   ror edx,0x7        ; ROR API HASH
0xcaL   add edx,eax
0xccL   inc esi
0xcdL   jmp 0xc0
...
0x1b1L   mov [ebp+0x604],ebx
0x1b7L   xor ecx,ecx
0x1b9L   lea esi,[ebp+ecx+0x200]
0x1c0L   lodsb 
0x1c1L   xor al,cl            ; decode
0x1c3L   xchg edx,edi
0x1c5L   lea edi,[ebp+ecx+0x200]
0x1ccL   stosb 
0x1cdL   xchg edx,edi
0x1cfL   inc ecx
0x1d0L   cmp ecx,[ebp+0x604]
0x1d6L   jnz 0x1b9            ; loop
...
0x34eL   push ebp
0x34fL   mov ebp,esp
0x351L   mov eax,[edi-0x14]
0x354L   inc [eax]
Now we have our shellcode. If we scroll up we will see a loop with xor al, cl followed by an inc ecx. This is our xor-count-up loop. This type of encoding can be tricky because we have to know the exact offset where the loop starts decoding data. Usually this means doing some good old-fashioned static analysis. There are only so many times you can reverse shellcode before you get bored. Let's get creative. Let's try something that hasn't been done before (probably wrong?). We know it's a xor-count-up, so let's mimic the pattern the XOR would create in null data.
>>> o  = open('out.bin','wb+')
>>> d = f.read()
>>> k = ''
>>> for x in range(0,0x100):
        k = k + chr(x)

>>> k
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r...'
>>> # Get subset of the full XOR-Countup output
>>> key = k[:15]
Note: The k value was edited due to formatting issues.  For a  screenshot of the output please see the following link.
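To see why this pattern shows up, it helps to encode a run of null bytes by hand. The snippet below is only an illustration: xor_count_up is a hypothetical helper that XORs each byte with a counter wrapping after 0xFF, and the starting counter value of 0xFA is made up.

```python
def xor_count_up(data, start):
    """XOR every byte with a counter that begins at `start` and wraps after 0xFF."""
    out = bytearray(data)
    for pos in range(len(out)):
        out[pos] ^= (start + pos) & 0xFF
    return bytes(out)

# A run of nulls encodes to the counter itself, so the key becomes visible.
encoded = xor_count_up(b"\x00" * 10, 0xFA)
# encoded == b"\xfa\xfb\xfc\xfd\xfe\xff\x00\x01\x02\x03"
```

Because XOR is its own inverse, calling the same function again with the same starting value restores the original bytes.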

This is where things get kind of interesting with the encoding. First we search for a subset (key) of the recurring pattern (k). The pattern appears wherever null bytes have been XORed with the count-up. We use a subset because the full pattern is unlikely to be present: it would need a run of at least 255 null bytes. If we know the address of key and the algorithm, we can work out which counter value (0x0-0xFF) lines up with the first byte of the data we read (d). Once we have that information we just need to mimic the xor-count-up code from the start of that data.
>>> # the counter is 0 where key was found, so work back to its value at offset 0
>>> i = (0xff - (d.find(key)&0xff)) + 1
>>> b= ''
>>> for val in d:
        b = str(b) + chr(ord(val)^(i&0xff))
        i += 1
  
>>> o.write(b)
>>> o.close()
>>> f.close()
>>> 
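The steps above can be condensed into one function. This is a sketch under stated assumptions, not a drop-in replacement: decode_count_up and min_run are hypothetical names, and the data is assumed to contain at least one run of encoded null bytes that covers the point where the counter is zero. Unlike the session, it raises an error when the pattern is not found instead of decoding with a wrong key.

```python
def decode_count_up(data, min_run=15):
    """Recover the counter value at offset 0 from an encoded null run, then decode."""
    pattern = bytes(bytearray(range(min_run)))  # \x00\x01\x02... like `key`
    pos = data.find(pattern)
    if pos == -1:
        raise ValueError("count-up pattern not found")
    # The counter is 0 at `pos`, so at offset 0 it was -pos modulo 256.
    counter = (-pos) & 0xFF
    out = bytearray(data)
    for i in range(len(out)):
        out[i] ^= (counter + i) & 0xFF
    return bytes(out)
```

The expression (-pos) & 0xFF gives the same value, once masked, as the (0xff - (pos & 0xff)) + 1 used in the session.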
Let's see what our saved off buffer looks like in a VIM hexdump.
vim out.bin
0000000: 8d8c ae92 9090 909b 9997 9dca cccf cba3  ................
0000010: 616f 716a 5864 6874 616a 7a6c 353e 3853  aoqjXdhtajzl5>8S
0000020: 6572 204f 7571 7371 7e2a 2b2e 2c2a 427b  er Ouqsq~*+.,*B{
0000030: 4547 4413 7856 5254 404f 4e49 4f45 1d1e  EGD.xVRT@ONIOE..
0000040: 0501 076f 4741 455f 5e55 5558 540e 0f0a  ...oGAE_^UUXT...
0000050: 7077 1e30 3036 2e21 2020 2923 7f7c 7b7f  pw.006.!  )#.|{.
0000060: 660d 2127 273d 3035 316a 6b6e 6c6a 023b  f.!''=051jknlj.;
..
:/This
00255b0: 0000 0000 0408 b004 4d5a 9000 0300 0000  ........MZ......
00255c0: 0400 0000 ffff 0000 b800 0000 0000 0000  ................
00255d0: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
00255e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00255f0: 0000 0000 f000 0000 0e1f ba0e 00b4 09cd  ................
0025600: 21b8 014c cd21 5468 6973 2070 726f 6772  !..L.!This progr
0025610: 616d 2063 616e 6e6f 7420 6265 2072 756e  am cannot be run
0025620: 2069 6e20 444f 5320 6d6f 6465 2e0d 0d0a   in DOS mode....
0025630: 2400 0000 0000 0000 b3bf 303a f7de 5e69  $.........0:..^i
:q!
Yeah, that's hot. Now that we have the file decoded we can carve out the executable from the data stream. The first indicator of an embedded executable is the MZ header. If we find the MZ header we would need to jump 0x3c bytes, read the value, jump to that value, check for the PE header... or we could use pefile to validate the PE file for us. All we need to do is search the stream of data for the MZ header, set the file pointer to that address, read until the end of the file, pass the data to pefile and then check for errors. If there are no errors we have pefile trim the file and then write it to disk. This might not be the best method if the file is large or if we are concerned about file overlays, since trimming discards them.
import pefile
import re
import sys

def carve(f):
    c = 1
    data = f.read()
    # try every 'MZ' in the stream as a possible start of a PE file
    for y in [tmp.start() for tmp in re.finditer('\x4d\x5a', data)]:
        try:
            pe = pefile.PE(data=data[y:])
        except Exception:
            # pefile could not parse it, so this 'MZ' is not a valid PE
            continue
        # determine file ext, default to 'bin' if pefile cannot tell
        ext = 'bin'
        if pe.is_dll() == True:
            ext = 'dll'
        if pe.is_driver() == True:
            ext = 'sys'
        if pe.is_exe() == True:
            ext = 'exe'
        o = open(str(c) + '.' + ext, 'wb')
        print ext, 'found at offset', hex(y)
        # trim() drops any data past the end of the PE
        o.write(pe.trim())
        o.close()
        c = c + 1
        pe.close()

def main(argv):
    if len(argv) < 1:
        print "cpe.py <file-stream>"
    else:
        i = open(argv[0], "rb")
        carve(i)
        i.close()

if __name__ == '__main__':
    main(sys.argv[1:])
Output from the above script on the out.bin
python cpe.py out.bin
exe found at offset 0x7a10
dll found at offset 0x255b8
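For comparison, the manual check described earlier (read the 4-byte value at 0x3c from the MZ header, jump to it, and look for the PE signature) can be sketched with only the standard library. The function name find_pe_offsets is hypothetical, and this only confirms the two signatures; it does none of the header validation or trimming that pefile performs.

```python
import struct

def find_pe_offsets(data):
    """Return offsets of every 'MZ' whose 0x3c field points at a 'PE\\0\\0' signature."""
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        field = data[pos + 0x3C:pos + 0x40]  # little-endian DWORD at MZ + 0x3c
        if len(field) == 4:
            pe_offset = struct.unpack("<I", field)[0]
            sig_at = pos + pe_offset
            if data[sig_at:sig_at + 4] == b"PE\x00\x00":
                hits.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return hits
```

In the hexdump of out.bin above, the MZ at 0x255b8 has the value 0xf0 at 0x255f4, which is the field this function reads.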

