dism-this.py

    dism-this.py
    Analysis:
        Info: Instructions Disassembled Count 2166
        Error: Invalid Disassembly Count 71
            * Example: ?? jna 0x129
        Invalid: Static Offset Count 97
            * Example: sub [0xd218000a], ecx
        Invalid: Segment Register Use Count 127
            * Example: fs daa
        Anomaly: Infrequent Instruction Use Count 1135
            * Example: arpl [ebp+ecx+0xa],bp

The first line of the analysis is a count of how many instructions were disassembled. The second is a count of how many lines pydasm could not disassemble because the bytes did not form a valid instruction. The third counts the use of static offsets. The fourth counts the use of segment registers. The latter two are not typically used. The FS register is an exception: it is used for traversing the PEB to get the base address of Kernel32.dll. However, the script only checks the first few characters of each disassembled line, so a segment register that appears in an operand is not counted. The last analysis checks whether the instruction is infrequent. Most executable code is made up of one of twenty-one instructions. We can run the following Python code in IDA to get the most frequent mnemonics in the code.
    instr = []
    ea = ScreenEA()
    # collect the mnemonic of every instruction in every function of the segment
    for funcea in Functions(SegStart(ea), SegEnd(ea)):
        E = list(FuncItems(funcea))
        for e in E:
            instr.append(GetMnem(e))
    # count how often each mnemonic occurs
    count = {}
    for mnem in instr:
        if mnem in count:
            count[mnem] += 1
        else:
            count[mnem] = 1
    # sort the mnemonics by count, most frequent first
    popMnem = sorted(count, key = count.get, reverse = True)
    print len(popMnem[:35])
    print popMnem[:35]

Output of the command on an IDB.
    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
    IDAPython v1.5.5 final (serial 0) (c) The IDAPython Team <idapython@googlegroups.com>
    --------------------------------------------------------------------------------------
    21
    ['push', 'call', 'mov', 'pop', 'add', 'inc', 'and', 'movzx', 'cdq', 'idiv', 'shr', 'test', 'or', 'xor', 'sub', 'jz', 'retn', 'jnz', 'jmp', 'jnb', 'cmp']

It should be noted that shellcode has instructions that are not always seen in normal executable code. These were not included because I did not want to use them as signatures. I'm trying to do this more generically.
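The counting and the infrequent-instruction check can also be tried outside of IDA. The following is a minimal sketch, not the script's actual code: it assumes the disassembly is already available as a list of strings, `top_mnemonics` and `count_infrequent` are hypothetical helper names, and the sample lines are made up for illustration. It compares the whole first token of each line against the 21 mnemonics instead of looking for a substring.

```python
from collections import Counter

# The 21 mnemonics from the IDA output above.
POPULAR_MNEMONICS = frozenset([
    'push', 'call', 'mov', 'pop', 'add', 'inc', 'and', 'movzx', 'cdq',
    'idiv', 'shr', 'test', 'or', 'xor', 'sub', 'jz', 'retn', 'jnz',
    'jmp', 'jnb', 'cmp',
])


def top_mnemonics(mnemonics, n=35):
    """Return up to n distinct mnemonics, most frequent first."""
    return [mnem for mnem, _ in Counter(mnemonics).most_common(n)]


def count_infrequent(lines, popular=POPULAR_MNEMONICS):
    """Count disassembly lines whose first token is not a popular mnemonic."""
    infrequent = 0
    for line in lines:
        tokens = line.split()
        # Empty lines are ignored; prefixes such as 'rep' count as infrequent.
        if tokens and tokens[0] not in popular:
            infrequent += 1
    return infrequent


# Hypothetical sample, not taken from a real IDB.
sample = ['push edx', 'mov esi,esp', 'lodsd', 'call edi', 'int3']
print(top_mnemonics(['mov', 'push', 'mov', 'call', 'mov', 'push']))
print(count_infrequent(sample))
```

Matching on the whole token avoids short mnemonics such as 'or' matching inside a longer, unrelated mnemonic.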
    dism-this.py -h
    Usage: dism-this.py [options] data.file

    Options:
      -h, --help            show this help message and exit
      -v, --verbose         print disassembly
      -s SKIP, --skip=SKIP  skip n input bytes
      -c COUNT, --count=COUNT
                            disassembly only n input blocks
      -a, --ascii_blob      disassembly ascii blob

dism-this.py has four options. -v or --verbose prints the disassembly from pydasm. -s or --skip skips n bytes of the input. -c or --count reads only n bytes. -a or --ascii_blob is for disassembling ASCII blobs. An example of an ASCII blob would be '9090', which translates to nop and nop once it has been converted to binary.
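The ASCII blob conversion can be illustrated on its own. This is a small sketch under the assumption that the blob is a string of hex digit pairs; `ascii_blob_to_bytes` is a hypothetical name, and like the script's asciiBlob method it stops at the first pair that is not valid hex.

```python
def ascii_blob_to_bytes(blob):
    """Convert an ASCII hex string such as '9090' to raw bytes."""
    out = bytearray()
    for i in range(0, len(blob), 2):
        pair = blob[i:i + 2]
        try:
            out.append(int(pair, 16))
        except ValueError:
            # Stop at the first pair that is not a valid byte value.
            break
    return bytes(out)


# '9090' becomes the two bytes 0x90 0x90, which disassemble to nop, nop.
nops = ascii_blob_to_bytes('9090')
```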
Hex of ASCII blob
    dism-this.py -a -s 0x14 -c 0x96 -v asm.txt
    Disassembly:
        xor edx,edx
        push edx
        push dword 0x636c6163
        mov esi,esp
        push edx
        push esi
        mov esi,fs:[edx+0x30]
        mov esi,[esi+0xc]
        mov esi,[esi+0xc]
        lodsd
        mov esi,[eax]
        mov edi,[esi+0x18]
        mov ebx,[edi+0x3c]
        mov ebx,[edi+ebx+0x78]
        mov esi,[edi+ebx+0x20]
        add esi,edi
        mov ecx,[edi+ebx+0x24]
        add ecx,edi
        inc edx
        lodsd
        cmp dword [edi+eax],0x456e6957
        jnz 0x2f
        movzx edx,[ecx+edx*2-0x2]
        mov esi,[edi+ebx+0x1c]
        add esi,edi
        add edi,[esi+edx*4]
        call edi
        int3

    Analysis:
        Info: Instructions Disassembled Count 28
        Error: Invalid Disassembly Count 0
            * Example: ?? jna 0x129
        Invalid: Static Offset Count 0
            * Example: sub [0xd218000a], ecx
        Invalid: Segment Register Use Count 0
            * Example: fs daa
        Anomaly: Infrequent Instruction Use Count 3
            * Example: arpl [ebp+ecx+0xa],bp

Please email me if you find any bugs or have any questions. My email can be found in the comments of the code. I have created a Bitbucket repo; please download the script from the repo, as the code below is not the most current.
Source Code - BitBucket Repo
    #!/usr/bin/env python
    # dism-this.py is a script that analyzes data for the possible detection of shellcode or instructions.
    # Written by alexander dot hanel at gmail dot com

    import re
    import sys
    from optparse import OptionParser

    try:
        import pydasm
    except ImportError:
        print "Error: Pydasm Can Not be Found"
        sys.exit()

    class CKASM():
        def __init__(self):
            self.brRegex = re.compile(r'\[.+?\]')
            self.registers = ['eax', 'ebx', 'ecx', 'edx', 'esi', 'edi', 'esp', 'ebp', 'ax', 'bx', 'cx', 'dx', 'ah', 'al', 'bh', 'bp', 'bl', 'ch', 'cl', 'dh', 'dl', 'di', 'si', 'sp', 'ip']
            self.popMnem = ['push', 'call', 'mov', 'pop', 'add', 'inc', 'and', 'movzx', 'cdq', 'idiv', 'shr', 'test', 'or', 'xor', 'sub', 'jz', 'retn', 'jnz', 'jmp', 'jnb', 'cmp']
            self.segment = [ 'ds', 'cs', 'ss', ' es', 'gs', 'fs']
            self.segmentCount = 0
            self.errorCount = 0
            self.skip = None
            self.count = None
            self.buffer = None
            self.ascii = False
            self.verbose = False
            self.fhandle = None
            self.parser = None
            self.callParser()
            self.checkFileArgs()
            self.getBuffer()
            self.asciiBlob()
            self.errorStaticCount = 0
            self.errorStatic = []
            self.errorInvalidInstCount = 0
            self.errorInvalidInst = []
            self.outcastInstr = 0

        def dis(self, buff):
            'disassembles buffer using pydasm, returns assembly in buffer'
            offset = 0
            outDis = []
            while offset < len(buff):
                i = pydasm.get_instruction(buff[offset:],pydasm.MODE_32)
                tmp = pydasm.get_instruction_string(i,pydasm.FORMAT_INTEL,offset)
                outDis.append(tmp)
                if not i:
                    return outDis
                offset += i.length
            return outDis

        def callParser(self):
            'parses the command line arguments'
            self.parser = OptionParser()
            usage = 'usage: %prog [options] <data.file>'
            self.parser = OptionParser(usage=usage)
            # command options
            self.parser.add_option('-v', '--verbose', action='store_true', dest='verbose', help="print disassembly")
            self.parser.add_option('-s', '--skip', type="int", dest='skip', help='skip n input bytes')
            self.parser.add_option('-c' , '--count', type="int", dest='count', help='disassembly only n input blocks')
            self.parser.add_option('-a', '--ascii_blob', action='store_true', dest='ascii', help='disassembly ascii blob')
            (options, args) = self.parser.parse_args()
            # Assigns passed variables
            if options.verbose == True:
                self.verbose = True
            if options.skip != None:
                self.skip = options.skip
            if options.count != None:
                self.count = options.count
            if options.ascii != None:
                self.ascii = options.ascii

        def analyzeInstr(self, line):
            'add instruction analysis here'
            if None == line:
                return
            elif '??' in line:
                self.errorInvalidInstCount += 1
                self.errorInvalidInst.append(line)
                return
            elif '[' in line and ']' in line:
                if self.staticOffset(line) != None:
                    self.errorStaticCount += 1
                    self.errorStatic.append(line)
                    return
            self.segmentCheck(line)
            self.outcast(line)
            return

        def checkOffsetBounds(self, line):
            if self.getOffset(line) > 0xfffff and line != None:
                print "Invalid: Offset %s" % line

        def staticOffset(self, line):
            value = re.search(self.brRegex, line).group(0)[1:-1]
            try:
                tmp = int(value,16)
                return tmp
            except:
                return None

        def segmentCheck(self,line):
            for seg in self.segment:
                if seg in line[0:3]:
                    self.segmentCount += 1

        def outcast(self,line):
            b = False
            for mnem in self.popMnem:
                if mnem in line[0:5]:
                    return
                else:
                    b = False
            if b == False:
                self.outcastInstr += 1

        def checkFileArgs(self):
            'janky way for checking file arguments'
            if len(sys.argv) == 1:
                self.parser.print_help()
                sys.exit()
            else:
                try:
                    self.fhandle = open(sys.argv[len(sys.argv)-1], 'rb')
                except:
                    print "Error: Could not access the file"
                    sys.exit()
                pass

        def asciiBlob(self):
            'converts ascii blobs to binary two bytes at a time'
            if self.ascii == False:
                return
            from StringIO import StringIO
            tmpBuff = StringIO(self.buffer)
            buff = ''
            b = tmpBuff.read(2)
            while b != '':
                try:
                    buff = buff + chr(int(b,16))
                    b = tmpBuff.read(2)
                except ValueError:
                    break
            self.buffer = buff

        def getBuffer(self):
            'checks the skip and count contents then reads the data to a buffer'
            if self.skip != None:
                self.fhandle.seek(self.skip)
            if self.count != None:
                self.buffer = self.fhandle.read(int(self.count))
                return
            self.buffer = self.fhandle.read()
            return

        def start(self):
            'disneyland'
            disO = self.dis(self.buffer)
            for assemblyLine in list(disO):
                self.analyzeInstr(assemblyLine)
            if self.verbose == True:
                self.verbosed(disO)
            self.output(disO)

        def output(self,disO):
            'print output of analysis'
            print "Analysis:"
            print "\tInfo: Instructions Disassembled Count %s" % len(disO)
            print "\tError: Invalid Disassembly Count %s" % self.errorInvalidInstCount
            print "\t\t* Example: ?? jna 0x129"
            print "\tInvalid: Static Offset Count %s " % self.errorStaticCount
            print "\t\t* Example: sub [0xd218000a], ecx"
            print "\tInvalid: Segment Register Use Count %s " % self.segmentCount
            print "\t\t* Example: fs daa"
            print "\tAnomaly: Infrequent Instruction Use Count %s " % self.outcastInstr
            print "\t\t* Example: arpl [ebp+ecx+0xa],bp"
            print

        def verbosed(self, disO):
            'print disassembly'
            print 'Disassembly:'
            for assemblyLine in list(disO):
                print '\t' + assemblyLine
            print

    def main():
        ck = CKASM()
        ck.start()

    if __name__ == "__main__":
        main()
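Because segmentCheck only looks at the first three characters of a line, the sample run above reports a segment register count of 0 even though the disassembly contains mov esi,fs:[edx+0x30]. One possible way to catch segment registers anywhere in the line is a word-boundary regex. This is only a sketch of that idea, not the version in the repo, and `segment_registers` is a hypothetical helper name.

```python
import re

# Match a segment register only when it stands alone as a token, so the
# 'ds' in 'lodsd' or the 'es' in 'test' does not count.
SEGMENT_RE = re.compile(r'\b(cs|ds|es|fs|gs|ss)\b')


def segment_registers(line):
    """Return the segment registers referenced anywhere in a disassembly line."""
    return SEGMENT_RE.findall(line)


print(segment_registers('mov esi,fs:[edx+0x30]'))  # ['fs']
print(segment_registers('fs daa'))                 # ['fs']
print(segment_registers('test eax,eax'))           # []
```

Whether an FS reference should count against the data is a separate decision, since shellcode uses FS legitimately to reach the PEB.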