xxxswf.py is a Python script for carving, scanning, compressing, decompressing and analyzing Flash SWF files. The script can be used on an individual SWF, on one or more SWFs embedded in a file stream, or on all files in a directory. The tool could be useful for system administrators, incident responders, exploit analysts, malware analysts or web developers.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -h
    Usage: xxxswf.py [options] file.bad

    Options:
      -h, --help            show this help message and exit
      -x, --extract         Extracts the embedded SWF(s), names it MD5HASH.swf &
                            saves it in the working dir. No addition args needed
      -y, --yara            Scans the SWF(s) with yara. If the SWF(s) is
                            compressed it will be deflated. No addition args
                            needed
      -s, --md5scan         Scans the SWF(s) for MD5 signatures. Please see func
                            checkMD5 to define hashes. No addition args needed
      -H, --header          Displays the SWFs file header. No addition args
                            needed
      -d, --decompress      Deflates compressed SWFS(s)
      -r PATH, --recdir=PATH
                            Will recursively scan a directory for files that
                            contain SWFs. Must provide path in quotes
      -c, --compress        Compresses the SWF using Zlib

When xxxswf.py is run with no options and a file is passed, the output is extremely simple. The [SUMMARY] line shows the count of embedded SWFs along with the MD5 and name of the scanned file. Each [ADDR] line shows the address of an embedded SWF and the header of that SWF. FWS is uncompressed and CWS is compressed with zlib.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py test.swf
    [SUMMARY] 1 SWF(s) in MD5:7ca4ab177f480503653702b33366111f:test.swf
        [ADDR] SWF 1 at 0xa18 - CWS Header

When xxxswf.py is run with the -x (--extract) option, each SWF is carved and saved to the working directory. The file name is the MD5 of the deflated SWF plus the '.swf' extension. If there are multiple files with the same MD5, the file's name will be MD5.count.swf. The count will only go up to 50. A useful example of this will be given later.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -x x.bin
    [SUMMARY] 2 SWF(s) in MD5:32fed596fa850057211121488f6c6b75:x.bin
        [ADDR] SWF 1 at 0x0 - FWS Header
            [FILE] Carved SWF MD5: c46299a5015c6d31ad5766cb49e4ab4b.swf
        [ADDR] SWF 2 at 0x7774 - FWS Header
            [FILE] Carved SWF MD5: c46299a5015c6d31ad5766cb49e4ab4b.2.swf

The -r or --recdir option can be used to recursively search or carve out all SWFs in a directory. This could be used on a temporary internet files directory or a repository of malicious documents. It's recommended to pipe the output to a text file. The path will need to be in quotes. This can take a few minutes depending on the size of the directory and the speed of your processor.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -x -r "C:\Documents and Settings\XOR\Desktop\samples\mal" > out.txt
    vi out.txt

    [SUMMARY] 1 SWF(s) in MD5:93d63b5f9167d7ab579ca9bd70d1dd3e:C:\Documents and Settings\XOR\Desktop\samples\mal\301.xls=
        [ADDR] SWF 1 at 0x13ef81 - [ERROR]: Zlib decompression error. Invalid CWS SWF

    [SUMMARY] 1 SWF(s) in MD5:d2cad99c92a1a43b8ed0c217b6a501af:C:\Documents and Settings\XOR\Desktop\samples\mal\CVE-2009-3129.xls
        [ADDR] SWF 1 at 0x13ef81 - [ERROR]: Zlib decompression error. Invalid CWS SWF

    [SUMMARY] 1 SWF(s) in MD5:358895e898866ef0432391b931096209:C:\Documents and Settings\XOR\Desktop\samples\mal\CWS.swf
        [ADDR] SWF 1 at 0x0 - CWS Header
            [FILE] Carved SWF MD5: f05ba07d32e9a7b47a18aa3f172ad4e5.swf

    [SUMMARY] 1 SWF(s) in MD5:c46299a5015c6d31ad5766cb49e4ab4b:C:\Documents and Settings\XOR\Desktop\samples\mal\simple.swf
        [ADDR] SWF 1 at 0x0 - FWS Header
            [FILE] Carved SWF MD5: c46299a5015c6d31ad5766cb49e4ab4b.3.swf

    [SUMMARY] 7 SWF(s) in MD5:7089ec4198e70f58f09547201ae4e185:C:\Documents and Settings\XOR\Desktop\samples\mal\swfxxx.py
        [ADDR] SWF 1 at 0x607 - [ERROR] Invalid SWF Version
        [ADDR] SWF 2 at 0x60b - [ERROR] Invalid SWF Version
        [ADDR] SWF 3 at 0x958 - [ERROR] Invalid SWF Version
        [ADDR] SWF 4 at 0x981 - [ERROR] Invalid SWF Size
        [ADDR] SWF 5 at 0x18d0 - [ERROR] Invalid SWF Size
        [ADDR] SWF 6 at 0x1c45 - [ERROR] Invalid SWF Size
        [ADDR] SWF 7 at 0x1cc3 - [ERROR] Invalid SWF Size
    ....

The search for embedded SWFs is done simply by using a regular expression with "FWS" and "CWS" as the search criteria. This generic search will return false positives, so each hit is verified by checking for a valid version, a valid size and, if the SWF is compressed, a successful decompression. Please see the function verifySWF(). This approach is time consuming but it does work. Above we can see the different errors being generated. All errors will contain the string "[ERROR]".
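The search-then-verify idea described above can be sketched in a few lines. This is a simplified Python 3 illustration, not the script's own code (xxxswf.py is Python 2); the names find_candidates and verify_candidate are made up for the example, and the extra check that the decompressed length matches the header is an assumption of this sketch that the script does not make.

```python
import re
import struct
import zlib

def find_candidates(buf):
    """Return every offset where the bytes 'CWS' or 'FWS' appear."""
    return [m.start() for m in re.finditer(b"CWS|FWS", buf)]

def verify_candidate(buf, addr, max_version=20):
    """Return the uncompressed SWF starting at addr, or None if it looks bogus."""
    header = buf[addr:addr + 8]
    if len(header) < 8:
        return None
    signature, version = header[:3], header[3]
    (size,) = struct.unpack("<I", header[4:8])    # size of the *uncompressed* SWF
    if version > max_version or size < 8:
        return None
    if signature == b"FWS":
        swf = buf[addr:addr + size]
    else:
        try:
            # decompressobj tolerates trailing bytes after the zlib stream,
            # which is the normal case for an SWF embedded in a larger file.
            body = zlib.decompressobj().decompress(buf[addr + 8:])
        except zlib.error:
            return None
        swf = b"FWS" + header[3:8] + body
    return swf if len(swf) == size else None

# Build a tiny fake container: junk, a valid CWS, then a false-positive "FWS".
body = b"\x00" * 32
fws = b"FWS" + bytes([9]) + struct.pack("<I", 8 + len(body)) + body
cws = b"CWS" + fws[3:8] + zlib.compress(body)
blob = b"junk" + cws + b"pad FWS\xff\x00\x00\x00\x00 tail"

hits = find_candidates(blob)
for addr in hits:
    result = verify_candidate(blob, addr)
    print(hex(addr), "valid SWF" if result else "false positive")
```

The last hit in the sample has a version byte of 0xff, so it is rejected in the same way as the "Invalid SWF Version" lines in the output above.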
If the sample set is large enough, odds are there will be recurring MD5 file names. xxxswf.py can be used to classify or alert on SWFs whose MD5s are commonly seen. The function checkMD5 can be edited to alert on specific MD5s.

    def checkMD5(md5):
        # checks if MD5 has been seen in MD5 Dictionary
        # MD5Dict contains the MD5 and the CVE
        # For { 'MD5':'CVE', 'MD5-1':'CVE-1', 'MD5-2':'CVE-2'}
        MD5Dict = {'c46299a5015c6d31ad5766cb49e4ab4b':'CVE-XXXX-XXXX'}
        if MD5Dict.get(md5):
            print '\t[BAD] MD5 Match on', MD5Dict.get(md5)
        return

The MD5 "c46299a5015c6d31ad5766cb49e4ab4b" was found in the x.bin example a couple of examples above. MD5 scanning is done by passing the -s or --md5scan option. All hashing or signature alerts contain the string [BAD].

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -s x.bin
    [SUMMARY] 2 SWF(s) in MD5:32fed596fa850057211121488f6c6b75:x.bin
        [ADDR] SWF 1 at 0x0 - FWS Header
        [BAD] MD5 Match on CVE-XXXX-XXXX
        [ADDR] SWF 2 at 0x7774 - FWS Header
        [BAD] MD5 Match on CVE-XXXX-XXXX

xxxswf.py can be used to decompress a single SWF by using the -d or --decompress option.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -d test.swf
    [SUMMARY] 1 SWF(s) in MD5:7ca4ab177f480503653702b33366111f:test.swf
        [ADDR] SWF 1 at 0xa18 - CWS Header
            [FILE] Carved SWF MD5: f0f40a975ef68cf6358f84515a8f103e.4.swf

It can compress SWFs using the -c or --compress option. Note: In testing I wasn't able to decompress a SWF downloaded from the internet and compress it again to get a matching MD5. A single byte is off. If someone could give me a clue on this one or recommend another technique, please let me know.
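On the one-byte mismatch in the note above: one possible explanation, offered as a guess rather than a confirmed diagnosis, is the zlib compression level. The second byte of a zlib stream records the level the compressor claimed to use, so recompressing with Python's default level can change that byte (and possibly the deflate data) even though the decompressed content is identical. A small Python 3 sketch with made-up sample data (the script itself is Python 2):

```python
import hashlib
import zlib

# Stand-in for the body of an SWF; any byte string works for the demonstration.
body = b"example swf body " * 64

default_level = zlib.compress(body)       # default level (6) when none is given
best_level = zlib.compress(body, 9)       # maximum compression

# Both streams inflate to exactly the same bytes...
assert zlib.decompress(default_level) == body
assert zlib.decompress(best_level) == body

# ...but the 2-byte zlib header differs: 78 9c for the default, 78 da for level 9.
print(default_level[:2].hex(), best_level[:2].hex())

# Different streams mean different MD5s for the recompressed file.
same_md5 = hashlib.md5(default_level).digest() == hashlib.md5(best_level).digest()
print("MD5s match:", same_md5)
```

If the original file's compressed data starts with 78 da, passing level 9 to zlib.compress in compressSWF may be worth trying. Matching the original byte for byte still isn't guaranteed, since other encoder settings and zlib versions can produce different output.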

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -c f0f40a975ef68cf6358f84515a8f103e.2.swf
    [SUMMARY] 1 SWF(s) in MD5:f0f40a975ef68cf6358f84515a8f103e:f0f40a975ef68cf6358f84515a8f103e.2.swf
        [ADDR] SWF 1 at 0x0 - FWS Header
            [FILE] Compressed SWF MD5: e9e6c13c461dc38006ff7d26c18e904e.swf

The SWF header information can be displayed by using -H or --header.
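For reference when reading the -H output, here is a rough Python 3 sketch of how the fixed part of an uncompressed SWF header can be decoded with integer bit operations. It is not the script's headerInfo() function; parse_fws_header is a hypothetical name, and the sample bytes are built from the values in this post's -H example. The sketch assumes the SWF format's conventions that RECT values are in twips (20 per pixel) and that the frame rate is an 8.8 fixed-point number, under which a raw value of 7936 (0x1F00) would correspond to 31 frames per second.

```python
import struct

TWIPS_PER_PIXEL = 20  # SWF coordinates are stored in twips (1/20 of a pixel)

def parse_fws_header(data):
    """Decode the header fields of an uncompressed (FWS) SWF byte string."""
    signature = data[:3]
    version = data[3]
    (file_size,) = struct.unpack_from("<I", data, 4)

    # RECT: 5 bits giving the field width, then four signed fields of that width.
    nbits = data[8] >> 3
    total_bits = 5 + 4 * nbits
    rect_len = (total_bits + 7) // 8               # RECT is padded to a whole byte
    bits = int.from_bytes(data[8:8 + rect_len], "big")
    bits >>= rect_len * 8 - total_bits             # discard the padding bits

    fields = []
    for index in range(4):                         # Xmin, Xmax, Ymin, Ymax
        value = (bits >> ((3 - index) * nbits)) & ((1 << nbits) - 1)
        if nbits and value >> (nbits - 1):         # sign-extend negative values
            value -= 1 << nbits
        fields.append(value)
    xmin, xmax, ymin, ymax = fields

    # Frame rate is 8.8 fixed point; frame count is a plain 16-bit integer.
    rate_raw, frame_count = struct.unpack_from("<HH", data, 8 + rect_len)
    return {
        "signature": signature, "version": version, "file_size": file_size,
        "nbits": nbits, "xmin": xmin, "xmax": xmax, "ymin": ymin, "ymax": ymax,
        "width_px": (xmax - xmin) / TWIPS_PER_PIXEL,
        "height_px": (ymax - ymin) / TWIPS_PER_PIXEL,
        "frame_rate": rate_raw / 256.0, "frame_count": frame_count,
    }

# Sample header: Nbit=15, Xmin=0, Xmax=11000, Ymin=0, Ymax=3600, rate 7936, 1 frame.
rect_bits = (15 << 60) | (11000 << 30) | 3600      # 65 bits in total
sample = (b"FWS" + bytes([7]) + struct.pack("<I", 52647)
          + (rect_bits << 7).to_bytes(9, "big")    # pad 65 bits up to 9 bytes
          + struct.pack("<HH", 7936, 1))
info = parse_fws_header(sample)
print(info)
```

Working on one big integer avoids the string slicing and the fixed 8-byte unpack, so the same code handles any Nbit value rather than only 15.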

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -H 11cc16d78597fe9999b7f6b714727ac3.10.swf
    [SUMMARY] 1 SWF(s) in MD5:11cc16d78597fe9999b7f6b714727ac3:11cc16d78597fe9999b7f6b714727ac3.10.swf
        [ADDR] SWF 1 at 0x0 - FWS Header
        [HEADER] File header: FWS
        [HEADER] File version: 7
        [HEADER] File size: 52647
        [HEADER] Rect Nbit: 15
        [HEADER] Rect Xmin: 0
        [HEADER] Rect Xmax: 11000
        [HEADER] Rect Ymin: 0
        [HEADER] Rect Ymax: 3600
        [HEADER] Frame Rate: 7936
        [HEADER] Frame Count: 1

The output for Rect is in twips (1/20 of a pixel).

The script contains the ability to scan the deflated SWF(s) with yara. The options are -y and --yara. This makes it easy to create signatures for malicious SWF files that do not have static MD5s. Because only the SWF data is scanned, the signatures can be a little more generic. Let's walk through an example using information gathered from the excellent write-up by Microsoft.
http://blogs.technet.com/b/mmpc/archive/2011/03/17/a-technical-analysis-on-the-cve-2011-0609-adobe-flash-player-vulnerability.aspx
After reading the link we know some key things. We know what is triggering the exploit (a bytecode verification error), we know there is some shellcode, and we know there is some code for creating the heap spray. The analysis gives a nice clue about the exploit and what to target. From the analysis: "The Adobe Flash file embedded inside the Excel file is another carrier for the exploit. It loads shellcode inside memory, performs heap-spraying, and loads a Flash byte stream from memory to exploit the 0-day vulnerability". If you look closely at the byte stream in the screenshot you will notice the string "43575309". What would this string or Flash byte stream look like if it was actually binary data and not a string?

    import sys
    s = "43575309"
    for i in xrange(0,len(s),2):
        sys.stdout.write(chr(int('0x'+ s[i:i+2],16)))

    CWS

As mentioned earlier, 'CWS' is the header for a compressed SWF. Nine (the final 09 byte) is the Flash version. We have an embedded SWF that is stored in a byte array and then converted from hex to binary. Let's create a yara signature targeting this. Note: the string hexToBin is the name of a function and in a way is arbitrary. It's better to go after the code or data related to triggering the exploit. This exploit is a little more difficult because the trigger is embedded in a compressed SWF stored as ASCII hex. For more information please see my poor-grammar-non-proof-read post called An Intro to Creating Anti-Virus Signatures.

    rule CVE_2011_0609
    {
        strings:
            $CWSHeader = "435753"
            $FWSHeader = "465753"
            $hex2bin = "hexToBin"
        condition:
            ($CWSHeader or $FWSHeader) and $hex2bin
    }

The rule is saved in the working directory as rules.yar.

    C:\Documents and Settings\XOR\My Documents\Projects\swfxxx>python xxxswf.py -y "CVE-2011-0609_.xls__"
    [SUMMARY] 1 SWF(s) in MD5:4bb64c1da2f73da11f331a96d55d63e2:CVE-2011-0609_.xls=__
        [ADDR] SWF 1 at 0xa18 - FWS Header
        [BAD] Yara Signature Hit: CVE_2011_0609

If you would like to import this script, there is a function called bad(). This function can be used for scanning a SWF with both the MD5 and Yara checks. An open file handle will need to be passed to the function. Because the function prints its results rather than returning them, the output will then need to be parsed for a line containing [BAD]. If you are interested in Yara and MD5 signatures, feel free to contact me. I won't be posting my signature sets but I might be able to share depending on the organization or group.
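One way to do the "parse the output for [BAD]" step from the bad() description is to capture whatever the scan function prints and filter it. The sketch below is Python 3 and uses a hypothetical stand-in, fake_bad, in place of xxxswf.bad so that it runs on its own; collect_bad_lines is also a made-up helper name. Under Python 2, which the script targets, contextlib.redirect_stdout is not available, so sys.stdout would have to be swapped for a StringIO object by hand.

```python
import contextlib
import io

def collect_bad_lines(scan, handle):
    """Run scan(handle), capture what it prints, and return the [BAD] lines."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        scan(handle)
    return [line.strip() for line in captured.getvalue().splitlines()
            if "[BAD]" in line]

def fake_bad(handle):
    # Hypothetical stand-in that mimics the kind of lines the script prints.
    handle.read()
    print(" - FWS Header")
    print("\t[BAD] MD5 Match on CVE-XXXX-XXXX")

hits = collect_bad_lines(fake_bad, io.BytesIO(b"FWS..."))
print(hits)
```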
Summary
The goal of this tool is to be able to work with embedded SWF files in an easy and quick way. This script is a work in progress. With a recent move to NYC I needed a new project. If you find any bugs or have some comments please contact me or leave a comment.
xxxswf.py - download
# xxxswf.py was created by alexander dot hanel at gmail dot com
# version 0.1
# Date - 12-07-2011
# To do list
# - Tag Parser
# - ActionScript Decompiler

import fnmatch
import hashlib
import imp
import math
import os
import re
import struct
import sys
import time
from StringIO import StringIO
from optparse import OptionParser
import zlib

def checkMD5(md5):
    # checks if MD5 has been seen in MD5 Dictionary
    # MD5Dict contains the MD5 and the CVE
    # For { 'MD5':'CVE', 'MD5-1':'CVE-1', 'MD5-2':'CVE-2'}
    MD5Dict = {'c46299a5015c6d31ad5766cb49e4ab4b':'CVE-XXXX-XXXX'}
    if MD5Dict.get(md5):
        print '\t[BAD] MD5 Match on', MD5Dict.get(md5)
    return

def bad(f):
    for idx, x in enumerate(findSWF(f)):
        tmp = verifySWF(f,x)
        if tmp != None:
            yaraScan(tmp)
            checkMD5(hashBuff(tmp))
    return

def yaraScan(d):
    # d = buffer of the read file
    # Scans SWF using Yara
    # test if yara module is installed
    # if not Yara can be downloaded from http://code.google.com/p/yara-project/
    try:
        imp.find_module('yara')
        import yara
    except ImportError:
        print '\t[ERROR] Yara module not installed - aborting scan'
        return
    # test for yara compile errors
    try:
        r = yara.compile(r'rules.yar')
    except:
        pass
        print '\t[ERROR] Yara compile error - aborting scan'
        return
    # get matches
    m = r.match(data=d)
    # print matches
    for X in m:
        print '\t[BAD] Yara Signature Hit:', X
    return

def findSWF(d):
    # d = buffer of the read file
    # Search for SWF Header Sigs in files
    return [tmp.start() for tmp in re.finditer('CWS|FWS', d.read())]

def hashBuff(d):
    # d = buffer of the read file
    # This function hashes the buffer
    # source: http://stackoverflow.com/q/5853830
    if type(d) is str:
        d = StringIO(d)
    md5 = hashlib.md5()
    while True:
        data = d.read(128)
        if not data:
            break
        md5.update(data)
    return md5.hexdigest()

def verifySWF(f,addr):
    # Start of SWF
    f.seek(addr)
    # Read Header
    header = f.read(3)
    # Read Version
    ver = struct.unpack('<b', f.read(1))[0]
    # Read SWF Size
    size = struct.unpack('<i', f.read(4))[0]
    # Start of SWF
    f.seek(addr)
    try:
        # Read SWF into buffer. If compressed read uncompressed size.
        t = f.read(size)
    except:
        pass
        # Error check for invalid SWF
        print ' - [ERROR] Invalid SWF Size'
        return None
    if type(t) is str:
        f = StringIO(t)
    # Error check for version above 20
    if ver > 20:
        print ' - [ERROR] Invalid SWF Version'
        return None
    if 'CWS' in header:
        try:
            f.read(3)
            tmp = 'FWS' + f.read(5) + zlib.decompress(f.read())
            print ' - CWS Header'
            return tmp
        except:
            pass
            print '- [ERROR]: Zlib decompression error. Invalid CWS SWF'
            return None
    elif 'FWS' in header:
        try:
            tmp = f.read(size)
            print ' - FWS Header'
            return tmp
        except:
            pass
            print ' - [ERROR] Invalid SWF Size'
            return None
    else:
        print ' - [Error] Logic Error Blame Programmer'
        return None

def headerInfo(f):
    # f is the already opened file handle
    # Yes, the format is a rip-off of SWFDump. Can you blame me? Their tool is awesome.
    # SWFDump FORMAT
    # [HEADER] File version: 8
    # [HEADER] File is zlib compressed. Ratio: 52%
    # [HEADER] File size: 37536
    # [HEADER] Frame rate: 18.000000
    # [HEADER] Frame count: 323
    # [HEADER] Movie width: 217.00
    # [HEADER] Movie height: 85.00
    if type(f) is str:
        f = StringIO(f)
    sig = f.read(3)
    print '\t[HEADER] File header:', sig
    if 'C' in sig:
        print '\t[HEADER] File is zlib compressed.'
    version = struct.unpack('<b', f.read(1))[0]
    print '\t[HEADER] File version:', version
    size = struct.unpack('<i', f.read(4))[0]
    print '\t[HEADER] File size:', size
    # deflate compressed SWF
    if 'C' in sig:
        f = verifySWF(f,0)
    if type(f) is str:
        f = StringIO(f)
    f.seek(0, 0)
    x = f.read(8)
    ta = f.tell()
    tmp = struct.unpack('<b', f.read(1))[0]
    nbit = tmp >> 3
    print '\t[HEADER] Rect Nbit:', nbit
    # Currently the nbit is static at 15. This could be modified in the
    # future. If larger than 9 this will break the struct unpack. Will have
    # to revisit; there must be a more effective way to deal with bits. Tried
    # to keep the algo but damn this is ugly...
    f.seek(ta)
    rect = struct.unpack('>Q', f.read(int(math.ceil((nbit*4)/8.0))))[0]
    tmp = struct.unpack('<b', f.read(1))[0]
    tmp = bin(tmp>>7)[2:].zfill(1)
    # bin requires Python 2.6 or higher
    # skips string '0b' and the nbit
    rect = bin(rect)[7:]
    xmin = int(rect[0:nbit-1],2)
    print '\t[HEADER] Rect Xmin:', xmin
    xmax = int(rect[nbit:(nbit*2)-1],2)
    print '\t[HEADER] Rect Xmax:', xmax
    ymin = int(rect[nbit*2:(nbit*3)-1],2)
    print '\t[HEADER] Rect Ymin:', ymin
    # one bit needs to be added, my math might be off here
    ymax = int(rect[nbit*3:(nbit*4)-1] + str(tmp) ,2)
    print '\t[HEADER] Rect Ymax:', ymax
    framerate = struct.unpack('<H', f.read(2))[0]
    print '\t[HEADER] Frame Rate:', framerate
    framecount = struct.unpack('<H', f.read(2))[0]
    print '\t[HEADER] Frame Count:', framecount

def walk4SWF(path):
    # returns a list of [folder-path, [addr1,addrw2]]
    # Don't ask, will come back to this code.
    p = ['',[]]
    r = p*0
    if os.path.isdir(path) != True and path != '':
        print '\t[ERROR] walk4SWF path must be a dir.'
        return
    for root, dirs, files in os.walk(path):
        for name in files:
            try:
                x = open(os.path.join(root, name), 'rb')
            except:
                pass
                break
            y = findSWF(x)
            if len(y) != 0:
                # Path of file SWF
                p[0] = os.path.join(root, name)
                # contains list of the file offset of SWF header
                p[1] = y
                r.insert(len(r),p)
                p = ['',[]]
                y = ''
            x.close()
    return r

def tagsInfo(f):
    return

def fileExist(n, ext):
    # Checks the working dir to see if the file is
    # already in the dir. If exists the file will
    # be named name.count.ext (n.c.ext). No more than
    # 50 matching MD5s will be written to the dir.
    if os.path.exists( n + '.' + ext):
        c = 2
        while os.path.exists(n + '.' + str(c) + '.' + ext):
            c = c + 1
            if c == 50:
                print '\t[ERROR] Skipped 50 Matching MD5 SWFs'
                break
        n = n + '.' + str(c)
    return n + '.' + ext

def CWSize(f):
    # The file size in the header is of the uncompressed SWF.
    # To estimate the size of the compressed data, we can grab
    # the length, read that amount, deflate the data, then
    # compress the data again, and then call len(). This will
    # give us the length of the compressed SWF.
    return

def compressSWF(f):
    if type(f) is str:
        f = StringIO(f)
    try:
        f.read(3)
        tmp = 'CWS' + f.read(5) + zlib.compress(f.read())
        return tmp
    except:
        pass
        print '\t[ERROR] SWF Zlib Compression Failed'
        return None

def disneyland(f,filename, options):
    # because this is where the magic happens
    # but seriously I did the recursion part last..
    retfindSWF = findSWF(f)
    f.seek(0)
    print '\n[SUMMARY] %d SWF(s) in MD5:%s:%s' % ( len(retfindSWF),hashBuff(f), filename )
    # for each SWF in file
    for idx, x in enumerate(retfindSWF):
        print '\t[ADDR] SWF %d at %s' % (idx+1, hex(x)),
        f.seek(x)
        h = f.read(1)
        f.seek(x)
        swf = verifySWF(f,x)
        if swf == None:
            continue
        if options.extract != None:
            name = fileExist(hashBuff(swf), 'swf')
            print '\t\t[FILE] Carved SWF MD5: %s' % name
            try:
                o = open(name, 'wb+')
            except IOError, e:
                print '\t[ERROR] Could Not Create %s ' % e
                continue
            o.write(swf)
            o.close()
        if options.yara != None:
            yaraScan(swf)
        if options.md5scan != None:
            checkMD5(hashBuff(swf))
        if options.decompress != None:
            name = fileExist(hashBuff(swf), 'swf')
            print '\t\t[FILE] Carved SWF MD5: %s' % name
            try:
                o = open(name, 'wb+')
            except IOError, e:
                print '\t[ERROR] Could Not Create %s ' % e
                continue
            o.write(swf)
            o.close()
        if options.header != None:
            headerInfo(swf)
        if options.compress != None:
            swf = compressSWF(swf)
            if swf == None:
                continue
            name = fileExist(hashBuff(swf), 'swf')
            print '\t\t[FILE] Compressed SWF MD5: %s' % name
            try:
                o = open(name, 'wb+')
            except IOError, e:
                print '\t[ERROR] Could Not Create %s ' % e
                continue
            o.write(swf)
            o.close()

def main():
    # Scenarios:
    # Scan file for SWF(s)
    # Scan file for SWF(s) and extract them
    # Scan file for SWF(s) and scan them with Yara
    # Scan file for SWF(s), extract them and scan with Yara
    # Scan directory recursively for files that contain SWF(s)
    # Scan directory recursively for files that contain SWF(s) and extract them
    parser = OptionParser()
    usage = 'usage: %prog [options] <file.bad>'
    parser = OptionParser(usage=usage)
    parser.add_option('-x', '--extract', action='store_true', dest='extract', help='Extracts the embedded SWF(s), names it MD5HASH.swf & saves it in the working dir. No addition args needed')
    parser.add_option('-y', '--yara', action='store_true', dest='yara', help='Scans the SWF(s) with yara. If the SWF(s) is compressed it will be deflated. No addition args needed')
    parser.add_option('-s', '--md5scan', action='store_true', dest='md5scan', help='Scans the SWF(s) for MD5 signatures. Please see func checkMD5 to define hashes. No addition args needed')
    parser.add_option('-H', '--header', action='store_true', dest='header', help='Displays the SWFs file header. No addition args needed')
    parser.add_option('-d', '--decompress', action='store_true', dest='decompress', help='Deflates compressed SWFS(s)')
    parser.add_option('-r', '--recdir', dest='PATH', type='string', help='Will recursively scan a directory for files that contain SWFs. Must provide path in quotes')
    parser.add_option('-c', '--compress', action='store_true', dest='compress', help='Compresses the SWF using Zlib')
    (options, args) = parser.parse_args()
    # Print help if no arguments are passed
    if len(sys.argv) < 2:
        parser.print_help()
        return
    # Note files can't start with '-'
    if '-' in sys.argv[len(sys.argv)-1][0] and options.PATH == None:
        parser.print_help()
        return
    # Recursive Search
    if options.PATH != None:
        paths = walk4SWF(options.PATH)
        for y in paths:
            #if sys.argv[0] not in y[0]:
            try:
                t = open(y[0], 'rb+')
                disneyland(t, y[0],options)
            except IOError:
                pass
        return
    # try to open file
    try:
        f = open(sys.argv[len(sys.argv)-1],'rb+')
        filename = sys.argv[len(sys.argv)-1]
    except Exception:
        print '[ERROR] File can not be opended/accessed'
        return
    disneyland(f,filename,options)
    f.close()
    return

if __name__ == '__main__':
    main()

Superb practical work showing through this blog and i am really glade to join this blog through this commenting.
What does it mean that the header says [SUMMARY] 0 SWF(s)?
Do you have an example? My email is in the code. Send me an email and we can check it out.
What is the license under which xxxswf.py is available?
Thanks for sharing this