Hooked on Mnemonics Worked for Me

Random Applocker Thoughts

While reading through the Windows Internals book I came across an interesting feature called AppLocker.
AppLocker provides a robust experience for IT administrators through new rule creation tools and wizards. For example, IT administrators can automatically generate rules by using a test reference computer and then importing the rules into a production environment for widespread deployment. The IT administrator can also export a policy to provide a backup of the production configuration or to provide documentation for compliance purposes. Your Group Policy infrastructure can be used to build and deploy AppLocker rules as well, saving your organization training and support costs.
Source
The paragraph above is a good description of it's functionality. The tool focuses on policy based security. For example prevent kazaa.exe from running in our enterprise environment. Applocker has three ways of blocking files from executing. The three are publisher, path/file and hash. The publisher is by far the most interesting because it is a little more generic but still targets a decent characteristic. For example if Company A is targeted by Malware B that is always signed with a stolen or legitimate certificate XXX. Company A could create a policy or rule to target the publisher, product name, file name or file version of that signed file. If that rule is ever triggered the file could be blocker or logged as an event and then feed into a SIEM.  The second useful example is in the event of a mass spam campaign. If an organization received 15k emails with a zip attachment. It's almost guaranteed 1% of that population will execute the file within the zip. This espically true if inbox cleanup could take an hour or two.  Most email spam campaign attachments contain a static hash. If an organization had a user report the spam campaign an analyst could create a Applocker rule based on the file hash and push it out as a new policy rule. Odds are the turn around time on pushing out an Applocker policy rule would be faster than getting an AV signature update.

Dear Microsoft Employees,
In future releases of Applocker could you please have an option to use the parent process as a filter for rules?  For example in Haifei Li slide 21 in Exploring in the Wild: A Big Data Approach to Application Security Research (and Exploit Detection) [link], he mentions that out of half million Microsoft Documents the only time they saw MSCOMCTL.OCX loaded was during exploitation. If a rule could be created by an Applocker user to alert when MSCOMCTL.OCX was loaded by a Microsoft Application it could give early alerting on possible exploitation. The same concept could also be applied to Adobe Reader, Java and other commonly exploited third party applications.

Disclaimer
I'm currently not working in an enterprise environment so I can't test these thoughts. Applocker is only available on premium versions of Windows 7 and up.

Kind of a cool feature. Here is a video overview. Anybody have any success stories using AppLocker?

garts.py


'''
Name:
    garts.py (Get all referenced text strings)
Version:
    1.0 
Date:
    2014/05/21
Author:
    alexander<dot>hanel<at>gmail<dot>com
Description:
    garts.py is a simple string viewer for IDA. It will iterate through
    all known functions, look for possible string refercences and then
    print them. This is super helpful for dealing with strings in Delphi
    executables, finding missed strings, having the exact location of
    where a string is being used or where data is possibly going to be
    decrypted.  

    Example Output 
    Address      String
    0x1a701045                  <- new line char. Not the best starting example..         
    0x1a7010bd   #command
    0x1a701199   SOFTWARE\Microsoft\Windows\CurrentVersion\Run
    0x1a7011be   govShell

    Xref Example
    .text:1A7010BD                 push    offset aCommand ; "#command"
    .text:1A7010C2                 lea     eax, [ebp+var_110]
    ....
    .text:1A701199                 push    offset SubKey   ; "SOFTWARE\\Microsoft\\Windows\\CurrentVersi"...
    .text:1A70119E                 push    80000001h       ; hKey
    .text:1A7011A3                 call    ds:RegOpenKeyA

    The script also calls the helpful idautils.strings and then adds all
    the found strings to the viewer window.

    Any ideas, comments, bugs, etc please send me an email or ping me on twitter @nullandnull. Cheers. 
'''

import idautils
class Viewer(idaapi.simplecustviewer_t):
    # modified version of http://dvlabs.tippingpoint.com/blog/2011/05/11/mindshare-extending-ida-custviews
    def __init__(self, data):
        self.fourccs = data
        self.Create()
        self.Show()

    def Create(self):
        title = "A Better String Viewer"
        idaapi.simplecustviewer_t.Create(self, title)
        c = "%s %11s" % ("Address", "String")
        comment = idaapi.COLSTR(c, idaapi.SCOLOR_BINPREF)
        self.AddLine(comment)
        for item in self.fourccs:
            addy = item[0]
            string_d = item[1]
            address_element = idaapi.COLSTR("0x%08x " % addy, idaapi.SCOLOR_REG)
            str_element = idaapi.COLSTR("%-1s" % string_d, idaapi.SCOLOR_REG)
            line = address_element + "  " +  str_element
            self.AddLine(line)
        return True

    def OnDblClick(self, something):
        value = self.GetCurrentWord()
        if value[:2] == '0x':
            Jump(int(value, 16))
        return True

    def OnHint(self, lineno):
        if lineno < 2: return False
        else: lineno -= 2
        line = self.GetCurrentWord()
        if line == None: return False
        if "0x" not in line: return False
        # skip COLSTR formatting, find address
        addy = int(line, 16)
        disasm = idaapi.COLSTR(GetDisasm(addy) + "\n", idaapi.SCOLOR_DREF)
        return (1, disasm)


def enumerate_strings():
    display = []
    # interate through all functions 
    for func in idautils.Functions():
        flags = GetFunctionFlags(func)
        # ignore library code 
        if flags & FUNC_LIB or flags & FUNC_THUNK:
            continue
        # get a list of the addresses in the function. Using a range of < or >
        # if flawed when the code is obfuscated. 
        dism_addr = list(FuncItems(func))
        # for each instruction in the function 
        for line in dism_addr:
            temp = None
            val_addr = 0
            if GetOpType(line,0) == 5:
                val_addr = GetOperandValue(line,0)
                temp = GetString(val_addr, -1)
            elif GetOpType(line,1) == 5:
                val_addr = GetOperandValue(line,1)
                temp = GetString(val_addr, -1)
            if temp:
                # in testing isCode() failed to accurately detect if address was code
                # decided to try something a little more generic 
                if val_addr not in dism_addr and GetFunctionName(val_addr) == '':
                    if GetStringType(val_addr) == 3:
                        temp = GetString(val_addr, -1, ASCSTR_UNICODE)
                    display.append((line, temp))

    # Get the strings already found
    # https://www.hex-rays.com/products/ida/support/idapython_docs/idautils.Strings-class.html
    s = idautils.Strings(False)
    s.setup(strtypes=Strings.STR_UNICODE | Strings.STR_C)
    for i, v in enumerate(s):
        if v is None:
            pass
        else:
            display.append((v.ea, str(v)))

    sorted_display = sorted(display, key=lambda tup:tup[0])
    return sorted_display


if __name__ == "__main__":
    ok = enumerate_strings()
    Viewer(ok)

Link to Repo

ex_pe_xor.py

For anyone else who doesn't want to manually carve out single byte XOR encoded executables.

C:\Documents and Settings\Administrator\Desktop\x>ex_pe_xor.py bad.bin
 * Encoded PE Found, Key 0x21, Offset 0x0
 * exe found at offset 0x0

C:\Documents and Settings\Administrator\Desktop\x>dir

04/30/2014  08:36 PM    <DIR>          .
04/30/2014  08:36 PM    <DIR>          ..
04/30/2014  08:36 PM            24,576 1.exe   <- carved
04/30/2014  05:44 PM            24,576 bad.bin
04/30/2014  08:06 PM             3,526 ex_pe_xor.py

Pefile must be installed.

## detects single byte xor encoding by searching for the 
## encoded MZ, lfanew and PE, then XORs the data and 
## uses pefile to extract the decoded executable. 
## written quickly/poorly by alexander hanel 

import sys
import struct
import pefile
import re
from StringIO import StringIO 

def get_xor():
    # read file into a bytearray
    byte = bytearray(open(sys.argv[1], 'rb').read())

    # for each byte in the file stream, excluding the last 256 bytes
    for i in range(0, len(byte) - 256):
            # KEY ^ VALUE ^ KEY = VALUE; Simple way to get the key 
            key = byte[i] ^ ord('M')
            # verify the two bytes contain 'M' & 'Z'
            if chr(byte[i] ^ key) == 'M' and  chr(byte[i+1] ^ key) == 'Z':
                    # skip non-XOR encoded MZ
                    if key == 0:
                            continue
                    # read four bytes into temp, offset to PE aka lfanew
                    temp = byte[(i + 0x3c) : (i + 0x3c + 4)]
                    # decode values with key 
                    lfanew = []
                    for x in temp:
                            lfanew.append( x ^ key)
                    # convert from bytearray to int value, probably a better way to do this
                    pe_offset  = struct.unpack( '<i', str(bytearray(lfanew)))[0]
                     # verify results are not negative or read is bigger than file 
                    if pe_offset < 0 or pe_offset > len(byte):
                            continue
                    # verify the two decoded bytes are 'P' & 'E'
                    if byte[pe_offset + i ] ^ key == ord('P') and byte[pe_offset + 1 + i] ^ key == ord('E'):
                            print " * Encoded PE Found, Key 0x%x, Offset 0x%x" % (key, i)
                            return (key, i)
    return (None, None)

def getExt(pe):
        if pe.is_dll() == True:
            return 'dll'
        if pe.is_driver() == True:
            return 'sys'
        if pe.is_exe() == True:
            return 'exe'
        else:
            return 'bin'
            
def writeFile(count, ext, pe):
        try:
            out  = open(str(count)+ '.' + ext, 'wb')
        except:
            print '\t[FILE ERROR] could not write file'
            sys.exit()
        # remove overlay or junk in the trunk
        out.write(pe.trim())
        out.close()
                        
def xor_data(key, offset):
        byte = bytearray(open(sys.argv[1], 'rb').read())
        temp = ''
        for x in byte:
            temp += chr(x ^ key)
        return temp
        
def carve(fileH):
        if type(fileH) is str:
            fileH = StringIO(fileH)
        c = 1
        # For each address that contains MZ
        for y in [tmp.start() for tmp in re.finditer('\x4d\x5a', fileH.read())]:
            fileH.seek(y)
            try:
                pe = pefile.PE(data=fileH.read())
            except:
                continue 
            # determine file ext
            ext = getExt(pe)
            print ' *', ext , 'found at offset', hex(y) 
            writeFile(c,ext,pe)
            c += 1
            ext = ''
            fileH.seek(0)
            pe.close

def run():
    if len(sys.argv) < 2:
        print "Usage: ex_pe_xor.py <xored_data>"
        return 
    key, offset = get_xor()
    if key == None:
        return
    data = xor_data(key, offset)
    carve(data) 
    
run()

Upatre, rtrace and XP EOL

A couple of days while reading my RSS feeds I noticed an article entitled Daily analysis note: "Upatre" is back to SSL? on the Malware Must Die! blog. During their analysis they mentioned that they didn't solve the obfuscation. I had recently wrapped up an analysis of different obfuscation techniques that Upatre uses. So I thought I'd take a crack at it. After glancing at the sample for a couple of seconds I thought to myself it was a variant of the obfuscated dword XOR sample set. I did a quick educated guess hit F9 in ollydbg and nothing happened. Which is strange because typically Upatre will write the sample to the %TEMP% directory and then try to download it's payload. Instead of the expected activity, I was looking at ExitProcess. At this point my interest was peeked and I wanted to find the anti-debugging...but I didn't care enough to read the assembly code. The code looked somewhat obfuscated.



Rather than reversing the obfuscation I decided to take a different route. Ollydbg has a useful feature called Run Trace or rtrace for short. It's great for tracking down anti-debugging or logging changes to registers. It's not a feature I would recommend for tracing over large amounts of code. Creating a PinTool would be a better choice for this type of task but rtrace is good for small jobs. Here are three excellent reads on using Pintool for similar tasks 1, 2 and 3.

rtrace log
Rtrace logs the address, thread, instructions, and the register changes. This is very useful for if you want to review how specific registers are changed.  The rtrace output can be saved to a file in Ollydbg by selecting View, Run Trace, right-click, and Log to file. Since rtrace logs can be quite large it's best to only trace code that is worth logging. Breakpoints can be used for creating the boundaries of the trace data. Rather than working with rtrace log as a text file I decided to create an ordered dictionary and save it to a JSON using Python. Once the rtrace log is converted to a JSON I can populate the IDB with interesting values. Some values that I thought were interesting were the count of how many times an address got called, the amount of unique register values for an instruction. For example if the following instruction sub     ecx, 0D733EFF3h is called X number of times; how many different values of ECX were there? If the set is 1 but the instruction was called 6,000 different times we know it's a static value and not give it another thought. Here is the Python code that I hacked together. 


'''
Name:
    rtrace_2_idb.py
Date:
    2014/04/??
Author:
    alexander<dot>hanel<at>gmail<dot>com
'''
import sys
import json
import collections

##########################################################
def run():
    'this function creates an ordered dict saved to a json from an rtrace log'
    rtrace = collections.OrderedDict()
    # rtrace log passed as argument
    with open(sys.argv[1]) as f:
        for line in f:
            addr, reg_values = parse_line(line.rstrip())
            if addr:
                try:
                    addr = int(addr,16)
                except:
                    pass
                if rtrace.has_key(addr):
                    if len(reg_values) == 0:
                        rtrace[addr].append([])
                        continue 
                    for val in reg_values:
                        rtrace[addr].append(val)
                else:
                    rtrace[addr] = []
                    if len(reg_values) == 0:
                        rtrace[addr].append([])
                        continue 
                    for val in reg_values:
                        rtrace[addr].append(val)
    save_off(rtrace)
            
def save_off(rtrace):
    'save the data to a file named rtrace.json'
    with open("rtrace.json", 'w') as f:
        json.dump(rtrace, f)

def parse_line(line):
    'parses the rtrace log' 
    temp = line.split('\t')
    try:
        return (temp[0], list(temp[3:]))
    except:
        return None, None                 

##########################################################

def show_data():
    'prints all the modified registers for an address in a rtrace log'
    try:
        print rtrace_data[str(here())]
    except:
        print "Error: never called"

def show_next():
    'prints next called address. Kind of buggy on ret'
    try:
        temp = rtrace_data.items()[rtrace_data.keys().index(str(here())) + 1]
        print hex(int(temp[0]))
    except:
        print "Error: Could not be found"
    
def populate():
    'populate the IDB with counts and set values'
    import idautils
    import idaapi
    global rtrace_data
    rtrace_data = get_json()
    idaapi.add_hotkey("Shift-A", show_data)
    idaapi.add_hotkey("Shift-S", show_next)
    for func in idautils.Functions():
        flags = GetFunctionFlags(func)
        if flags & FUNC_LIB or flags & FUNC_THUNK:
            continue  
        dism_addr = list(FuncItems(func))
        for line in dism_addr:
            com = format_data(rtrace_data, line)
            if com:
                MakeComm(line, com)

def format_data(rtrace_data, line):
    try:
        instr_data = rtrace_data[str(line)]
    except:
        return None
    count = len(instr_data)
    # Empty lists are not hashable for sets. 
    try:
        data_count = len(set(instr_data))
    except:
        data_count = None
    if data_count:
        comment = "C: 0x%x, S: 0x%x" % (count, data_count)
    else:
        comment = "C: 0x%x" % (count)
    return comment
                    
            
def get_json():
    try:
        f = open("rtrace.json")
        js = json.load(f, object_pairs_hook=collections.OrderedDict)
    except e:
        print "ERROR: %s" % e
    return js

##########################################################
    
if __name__ == '__main__':
    try:
        sys.argv[1]
        run()
    except:
        populate()
        
# the sys.arv[1] exception is a hack to guess how the script is being
# called. If there is an exception it is being run in IDA.

To execute the code above pass the output log file to the script, copy the rtrace.json to the working directory of the script and the IDB and then execute the script in IDA. This will give an output as seen below. The C is for Count and S is for Set. Glancing over other functions it is easy to see which instructions are responsible for looping and decoding data.
Since the executable only had 17 functions it was easy to identify instructions that were not called and then investigate why they weren't. Notice the fourth and fifth block was not called. This is due to the returned values of what is called at EBX.


Since rtrace logs all modified register values it is easy to access those values. Included in the script is the ability to print all those values. This can be done by selecting the address and pressing Shift-A. To print the next address called, select the address and the press Shift-S. The arrow is the address of the selected line.


Once I followed the address in the output window it lead me to what called ExitProcess. Now all I needed to do was investigate the calls of EBX in the first block of the previous to see what is being checked and patch the ZF flag results to continue execution. What makes this sample interesting (and actually worth reading the code) is Upatre is using an undocumented technique to determine if it is running on a Windows NT 6.0 or higher. I'm unaware if this techniques works on Windows Vista. I have only tested on Windows XP SP3 (NT 5.1) and Windows 7 ( NT 6.1).


The malware calls RtlAcquirePebLock and NtCurrentTeb twice. On Windows XP when RtlAcquirePebLock is called the first time ECX, EDX and EAX is over written. ECX will be the return address of RtlAcquirePebLock, EDX will be the address of PEB FastPebLoclk which is a pointer to _RTL_CRITICAL_SECTION and EAX will be zero. On the second call only EAX will be over written. On Windows 7 when RtlAcquirePebLock is called EAX will become zero and ECX will be equal to the Thread Information Block (TEB) ClientId. On the second call to RtlAcquirePebLock EAX will be zero, EDX will be the TEB CliendId but ECX will be equal the the TEB. On the second call to RtlAcquirePebLock  if  ECX is equal to TEB or the return of NtCurrentTeb the sample is running on NT 6.0 or higher.

Below is the code rewritten in C++

#include <windows.h>
#include <stdio.h>
#include <iostream>

int getECX() {
 int value = 0;
 //Moves edx into 'value'
 _asm mov value, ecx;
 return value;
}

int getEAX() {
 int value = 0;
 //Moves edx into 'value'
 _asm mov value, eax;
 return value;
}

int getEDX() {
 int value = 0;
 //Moves edx into 'value'
 _asm mov value, edx;
 return value;
}
int main()
{
   HMODULE hModule = GetModuleHandleA(("ntdll.dll"));
   FARPROC pRtlAcquirePebLock = GetProcAddress(hModule, "RtlAcquirePebLock");

   HMODULE h2Module = GetModuleHandleA(("ntdll.dll"));
   FARPROC ntcur = GetProcAddress(h2Module, "NtCurrentTeb");

   pRtlAcquirePebLock();
   // ecx, eax & edx are modifed when RtlAcquirePebLockis called
   int ecxValue = getECX();
   int eaxValue = getEAX();
   int edxValue = getEDX();

   // call current teb
   ntcur();
   eaxValue = getEAX();

   if (eaxValue == ecxValue)
    std::cout << "EAX & ECX Match - second stage" << std::endl;
   if (eaxValue == edxValue)
    std::cout << "EAX & EDX Match - second stage" << std::endl;

   // lock again 
   pRtlAcquirePebLock();
   ecxValue = getECX();
   eaxValue = getEAX();
   edxValue = getEDX();

   // current teb 
   ntcur();
   eaxValue = getEAX();

   if (eaxValue == ecxValue)
   std::cout << "EAX & ECX Match - second stage aka > XP" << std::endl;
} 

The changes in the return values are caused by the difference in how RtlAcquirePebLock calls RtlEnterCriticalSection. In Windows XP RtlEnterCriticalSection is called by being passed a pointer  from PEB FastPebLockRoutine. Since the PEB is writable from user mode the FastPebLockRoutine, it can be over written to cause a heap overflow. See Refs 1 and 2. Below we can see the difference between XP and Win7 for RtlAcquirePebLock.
_RtlAcquirePebLock XP
_RtlAcquirePebLock Win7
Pretty interesting technique for avoiding Windows XP. Thanks to Hexacorn for help and feedback. If  I wanted to continue executing the malware I would either need to patch the return of the second call to RtlAcquirePebLock or patch the comparison. Please feel free to send me any feedback, criticism or notes via email (in the python code), hit me up on Twitter,  or leave a comment. Cheers.

Hash - 891F33FDD94481E84278736CEB891D1036564C03

References
[1] http://net-ninja.net/article/2011/Sep/03/heap-overflows-for-humans-102/
[2] http://www.exploit-db.com/papers/13178/
http://www.debugwin.com/home/fastpeblock
http://msdn.moonsols.com/winxpsp3_x86/PEB.html

Upatre: Sample Set Analysis


Hi. I recently wrapped up an analysis of Upatre. My original intention was to write a generic C2 decoder/extractor for the executables. After analyzing ten samples I realized it was not feasible due to the different encoding algorithms and obfuscation. After analyzing the ten samples I became interested in how Upatre obfuscates/encodes it's executables. My analysis is based off of 94 unique Upatre samples. My sample set might be little skewed because I grabbed the first 100 files sorted by file size with a type zip. Having the files in their original zip file allowed me to have their original file names. I would like to thank VirusTotal for access to the samples and Glenn Edwards for feedback.

Google Docs (HTML)
Git Repo (PDF)

PE Skeletons

I know this topic has been beaten to death but I thought I'd share a technique for detecting single byte XOR executables in file streams. Recently while looking at a file stream I instantly knew there was an encoded XOR executable file in it. This got me thinking, why can I spot this and can I script it up? Since executable files are a defined structure they have a standard skeleton to them. It's not always easy to see the skeleton if we just look at the hex bytes.


We can add some color via the following Python code. 

import matplotlib.pyplot as plt
import numpy as np
import sys

def main():
        data = open(sys.argv[1], 'rb').read()[:512]
        dlist = bytearray(data)
        print len(dlist)
        plotters = np.array(dlist)
        plotters.shape = (32,16)
        #plt.bone()
        plt.flag()
        plt.axis([0,16,0, 32])
        plt.pcolormesh(plotters)
        plt.show()
        plt.savefig('test.png')
        plt.close()

if __name__ == '__main__':
        main()
                   

The code reads the first 512 bytes of a file, puts each byte into a bytearray and then plots the color in the same structure as the hex dump. If we were to pass an executable file to this script we would get the following pretty picture. 
The bottom left hand corner byte is 0x4d 'M' the second is 0x5a 'Z' and so on. If we were to XOR the executable with  0x88 we would get the following image.
XOR 0x88 Key
If we were to think about the red in the first image and blue in the second image as negative space we would see the PE skeleton. Okay, that was cute, now let's see if we can detect this in a file stream. Since the Portable Executable have a standard structure. The beginning starts with 'MZ', jump 0x3C bytes, read four bytes to get the address of the PE, then check if "PE" is at the read offset. This is a highly dumb down version. Check out PE101 by Ange Albertini for an awesome introduction if my definition is unclear. Since these are standard steps all we have to do is check for the same structure but with XORed data.


import sys
import struct

# read file into a bytearray
byte = bytearray(open(sys.argv[1], 'rb').read())

# for each byte in the file stream, excluding the last 256 bytes
for i in range(0, len(byte) - 256):
        # KEY ^ VALUE ^ KEY = VALUE; Simple way to get the key 
        key = byte[i] ^ ord('M')
        # verify the two bytes contain 'M' & 'Z'
        if chr(byte[i] ^ key) == 'M' and  chr(byte[i+1] ^ key) == 'Z':
                # skip non-XOR encoded MZ
                if key == 0:
                        continue
                # read four bytes into temp, offset to PE aka lfanew
                temp = byte[(i + 0x3c) : (i + 0x3c + 4)]
                # decode values with key 
                lfanew = []
                for x in temp:
                        lfanew.append( x ^ key)
                # convert from bytearray to int value, probably a better way to do this
                pe_offset  = struct.unpack( '<i', str(bytearray(lfanew)))[0]
                # verify results are not negative or read is bigger than file 
                if pe_offset < 0 or pe_offset > len(byte):
                        continue
                # verify the two decoded bytes are 'P' & 'E'
                if byte[pe_offset + i ] ^ key == ord('P') and byte[pe_offset + i + 1] ^ key == ord('E'):
                        print "Encoded PE Found, Key %x, Offset %x" % (key, i)
~                                                                                 
Speed, false postives testing, etc are all probably areas of improvement for the code.

If we were to run this on the executable XORed with 0x88 we would be present with the following output Encoded PE Found, Key 88, Offset 0. Here is a script that will automatically find a XOR encoded executable and carve it out using the above code and Pefile.

Kind of a cool technique to use the Portable Executable structure to find XOR exes. It only works on single byte executables. Could be modified for 2 or 4 bytes. Not sure about anything higher. A brute force approach would probably be better for key byte size of anything higher than 4. The skeleton is prevalent when an executable is XORed with a key of five bytes in size. Using gray tones can help show the skeleton because the contrast is dulled.

Useful Links
http://tomchop.me/2012/12/yo-dawg-i-heard-you-like-xoring/
http://hiddenillusion.blogspot.com/2013/01/nomorexor.html
http://playingwithothers.com/2012/12/20/decoding-xor-shellcode-without-a-key/
http://www.cloudshield.com/blog/advanced-malware/how-to-efficiently-detect-xor-encoded-content-part-1-of-2/
http://digital-forensics.sans.org/blog/2013/05/14/tools-for-examining-xor-obfuscation-for-malware-analysis


Side Note:
I have to admit I'm a huge fan of using ByteArrays now. I wish I could have of learned of them sooner. They are very useful for writing decoders. It remove a lot of the four play of checking the computed size ( value & 0xFF), using ord() and using chr().

xxxswf.py updates

A new version of xxxswf.py has been pushed to it's repo. The current build handles the extracting and decompressing of LZMA compressed SWFs. In order to decompress ZWS SWFs pylzma will need to be installed.

__@____:~/projects/swfs/ZWS$ python xxxswf.py -x c026ebfa3a191d4f27ee72f34fa0d97656113be368369f605e7845a30bc19f6a 

[SUMMARY] Potentially 1 SWF(s) in MD5 d41d8cd98f00b204e9800998ecf8427e:c026ebfa3a191d4f27ee72f34fa0d97656113be368369f605e7845a30bc19f6a
 [ADDR] SWF 1 at 0x4008 - ZWS Header
  [FILE] Carved SWF MD5: 14c29705d5239690ce9d359dccd41fa7.swf

At least ninety percent of the code has been rewritten. A lot of bugs were fixed. The current build is 1.9.1. I'm still wanting to add more features and test newly added functionality but due to the increase use of malicious ZWS I decided to push this out. I have tested it against a couple of hundred malicious files and I have found no errors when running xxxswf.py from the command line. All of the traditional features when invoked from the command line have been tested and our stable. New features include being able to create an xxxswf instance and calling functions for prepping or converting the file stream and scanning the extracted SWF(s). If we wanted to decompress a Microsoft Zip+XML file and then extract embedded SWF(s), this would be a prepping function. I need to write more functions to figure out a good work flow. Once I get this sorted out I'll push out the 2.0 version. If you find any errors please send me an email, leave a comment or ping me on twitter.