Hooked on Mnemonics Worked for Me


I have been reversing Dyre in my spare time, and I'm hoping to have a full analysis out in the next week or two. Something kind of annoying about Dyre is that it uses what looks like a massive structure to store its data and function pointers. For example, in the image below we can see it passing a handle stored at [eax+0x130] to WaitForSingleObject.
Manually tracing the code or searching through all the cross-references to find what populated the value is kind of painful. Since the displacement of 0x130 (304) is fairly unique, it can be targeted very easily in IDAPython.

import idautils
import idaapi
import idc

displace = {}
for func in idautils.Functions():
    flags = GetFunctionFlags(func)
    # skip library code and thunks
    if flags & FUNC_LIB or flags & FUNC_THUNK:
        continue
    dism_addr = list(FuncItems(func))
    for curr_addr in dism_addr:
        op = None
        index = None
        # decode the instruction so idaapi.cmd describes curr_addr
        idaapi.decode_insn(curr_addr)
        if idaapi.cmd.Op1.type == idaapi.o_displ:
            op = 1
        if idaapi.cmd.Op2.type == idaapi.o_displ:
            op = 2
        # no displacement operand, nothing to record
        if op == None:
            continue
        if "bp" in idc.GetOpnd(curr_addr, 0):
            # ebp will return a negative number
            if op == 1:
                index = (~(int(idaapi.cmd.Op1.addr) - 1) & 0xFFFFFFFF)
            else:
                index = (~(int(idaapi.cmd.Op2.addr) - 1) & 0xFFFFFFFF)
        else:
            if op == 1:
                index = int(idaapi.cmd.Op1.addr)
            else:
                index = int(idaapi.cmd.Op2.addr)
        if index:
            if displace.has_key(index) == False:
                displace[index] = []
            # remember every address that uses this displacement
            displace[index].append(curr_addr)
The above code will create a dictionary of all the displacement values in known functions. A simple for loop can be used to find the address and disassembly of all uses for the defined displacement value.

Python>for x in displace[0x130]: print hex(x), GetDisasm(x)
0x10004f12 mov     [esi+130h], eax
0x10004f68 mov     [esi+130h], eax
0x10004fda push    dword ptr [esi+130h]  ; hObject
0x10005260 push    dword ptr [esi+130h]  ; hObject
0x10005293 push    dword ptr [eax+130h]  ; hHandle
0x100056be push    dword ptr [esi+130h]  ; hEvent
0x10005ac7 push    dword ptr [esi+130h]  ; hEvent
With the addresses in hand it is easy to find where the value is populated: the two mov instructions at 0x10004f12 and 0x10004f68 are the ones writing to [esi+130h].

The dictionary created by the script is named displace. It will contain all displaced values.  Not super 1337 but still useful. Cheers.
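
The ebp branch in the script relies on a small two's complement trick: a negative displacement comes back as a large unsigned number, and `~(x - 1) & 0xFFFFFFFF` recovers its magnitude. Here is a minimal sketch of that arithmetic outside of IDA. The function names are mine and it assumes 32-bit displacements.

```python
def displacement_magnitude(raw, bits=32):
    """Magnitude of a negative displacement that was read as unsigned."""
    mask = (1 << bits) - 1
    return ~(raw - 1) & mask


def to_signed(raw, bits=32):
    """Interpret an unsigned value as a signed two's complement number."""
    if raw & (1 << (bits - 1)):
        return raw - (1 << bits)
    return raw


# [ebp-0Ch] is stored as 0xFFFFFFF4 when read as an unsigned 32-bit value
print(hex(displacement_magnitude(0xFFFFFFF4)))
print(to_signed(0xFFFFFFF4))
```

One thing to keep in mind: the same formula applied to a positive ebp offset (an argument such as [ebp+8]) produces a huge value (0xFFFFFFF8), so those entries land under odd-looking keys in the dictionary.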

Backtrace POC - Stack Strings

Example 1 Hex View
There are a number of tools that cover char strings in IDA. If you are not familiar with char strings, they are a low-hanging obfuscation technique used to thwart analysts from viewing the strings inside of an executable. Some notable tools and posts on this topic are [1] & [2]. In the image above you can see the string DBG. Odds are if we were viewing the executable in a hex editor or using strings this wouldn't stick out.

Example 1 Assembly View
If we were watching the stack of the executable at run time we would see something constructed similar to the string/comment above.
Example 2
The code can be run in two modes. The first is by selecting the code and then double-clicking the script in IDA (ALT+F9). In the example above we can see the string "W32Time". My code attempts to reconstruct the stack memory. The buffer can be accessed via the list object.str_buff. In the Output window above you can see the content of the buffer dumped to standard out. This makes it easy to format the data and access it via an index. The commented data is an example of how the string would look on the stack in Ollydbg. The second way to execute the code is to pass an address within a function to object.run( address ). This will try to rebuild the stack for the whole function. All of this is done statically. Char strings that are populated via registers (such as mov [ebp+var_c], bl when bl is 0x4f in the Example 1 image) are traced back using backtrace.py. For more details on backtrace please see the following link.
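
To make the idea concrete without IDA, here is a rough sketch of the rebuild step. It is not the Frame2Buff code: it assumes the immediate writes have already been collected as hypothetical (offset, value, width) tuples for esp-relative mov instructions, and it packs them little-endian into a zeroed buffer.

```python
import struct

# struct format for each store width in bytes
WIDTH_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def rebuild_stack_buffer(writes, frame_size):
    """Replay (offset, value, width) stores into a zero-filled frame."""
    buff = bytearray(frame_size)
    for offset, value, width in writes:
        data = struct.pack(WIDTH_FORMATS[width], value)
        buff[offset:offset + width] = data
    return buff


def format_buffer(buff):
    """Join the non-empty NUL-separated runs with a single space."""
    parts = [p for p in bytes(buff).split(b"\x00") if p]
    return " ".join(p.decode("latin-1") for p in parts)


# mov byte ptr [esp+0], 'D' ... and two dword stores that spell W32Time
writes = [
    (0x00, 0x44, 1), (0x01, 0x42, 1), (0x02, 0x47, 1),
    (0x10, 0x54323357, 4), (0x14, 0x00656D69, 4),
]
print(format_buffer(rebuild_stack_buffer(writes, 0x20)))
```

Keeping the raw buffer separate from the formatted string is the same split the script makes between str_buff and formatted_buff, which leaves room for a different formatter (wide chars, for example).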

As previously mentioned this topic has already been covered. I'm posting this code because it's a good example of using backtrace.py. I had fun working on this one. The code handles all examples I have found so far. There is an issue with formatting constructed wide char strings. Not exactly sure of the best approach. I tried to keep the data flexible so it should be easy to write a function to format the data.

[1]. Automatic Recovery of Constructed Strings in Malware by Jay Smith of FireEye - link
[2]. Finding Byte Strings using IDAPython by Jason Jones of Arbor Networks - link 

Repo - Link

Code for reviewing

    Alexander Hanel 
    1  - should be good to go.
    Examples of using the backtrace library to rebuild strings

    * How to deal with printing wide char strings?
    * What is the size of the frame buffer if GetFrameSize returns something
      smaller than the frame/stack index or the IDA does not recognize the function?

    idaapi.o_phrase # Memory Ref [Base Reg + Index Reg]
    o_phrase   =  idaapi.o_phrase    #  Memory Ref [Base Reg + Index Reg]    phrase
    o_displ    =  idaapi.o_displ     #  Memory Reg [Base Reg + Index Reg + Displacement] phrase+addr

Useful Reads
import sys, os, logging, copy
from binascii import unhexlify
# Add the parent directory to Python Path
sys.path.append(os.path.realpath(__file__ + "/../../"))
# import the backtrace module
from backtrace import *

class Frame2Buff:
    def __init__(self):
        self.verbose = False
        self.func_start = idc.SelStart()
        # SelEnd() returns the following selected instruction
        self.func_end = SelEnd()
        self.esp = False
        self.ebp = False
        self.comment = True
        self.frame_size = None
        self.bt = None
        self.str_buff = None
        self.comment = True
        self.formatted_buff = ""
        self.format = True

    def run(self, func_addr=None):
        """ run and create Frame2Buff"""
        # check if code is selected or if using the whole function
        if self.func_start == BADADDR or self.func_end == BADADDR:
            if func_addr == None:
                if self.verbose:
                    print "ERROR: No addresses selected or passed"
                return None
        if func_addr:
            self.func_start = idc.GetFunctionAttr(func_addr, FUNCATTR_START)
            self.func_end = idc.GetFunctionAttr(func_addr, FUNCATTR_END)
        if self.func_start == BADADDR:
            if self.verbose:
                print "ERROR: Invalid address"
            return None
        self.frame_size = GetFrameSize(self.func_start)
        try:
            self.bt = Backtrace()
            self.bt.verbose = False
        except ImportError:
            print "ERROR: Could not import Backtrace - aborting"
            return None
        # the end address points past the last instruction, step back one
        self.func_end = PrevHead(self.func_end)
        self.populate_buffer()
        if self.format:
            self.format_buff()
        if self.comment:
            self.comment_func()

    def populate_buffer(self):
        curr_addr = self.func_start
        self.str_buff = list('\x00' * self.frame_size)
        while curr_addr <= self.func_end:
            index = None
            data = None
            temp = None
            # decode the instruction so idaapi.cmd describes curr_addr
            idaapi.decode_insn(curr_addr)
            # check if instr is MOV, [esp|ebp + index], variable
            if idaapi.cmd.itype == idaapi.NN_mov and idaapi.cmd.Op1.type == idaapi.o_displ:
                if "bp" in idc.GetOpnd(curr_addr, 0):
                    # ebp will return a negative number
                    index = (~(int(idaapi.cmd.Op1.addr) - 1) & 0xFFFFFFFF)
                    self.ebp = True
                else:
                    index = int(idaapi.cmd.Op1.addr)
                    self.esp = True
                if idaapi.cmd.Op2.type == idaapi.o_reg:
                    # value needs to be traced back
                    self.bt.backtrace(curr_addr, 1)
                    # tainted means the reg was xor reg, reg
                    # odds are being used to init var.
                    if self.bt.tainted != True:
                        # assumes refsLog holds addresses; decode the last
                        # reference to read the value moved into the register
                        last_ref = self.bt.refsLog[-1]
                        idaapi.decode_insn(last_ref)
                        data = idaapi.cmd.Op2.value
                    else:
                        # tracked variable has been set to zero by xor reg, reg
                        curr_addr = idc.NextHead(curr_addr)
                        continue
                elif idaapi.cmd.Op2.type != idaapi.o_imm:
                    curr_addr = idc.NextHead(curr_addr)
                    continue
                else:
                    data = idaapi.cmd.Op2.value
                if data:
                    try:
                        hex_values = hex(data)[2:]
                        # strip the trailing "L" of Python 2 longs
                        if hex_values[-1] == "L":
                            hex_values = hex_values[:-1]
                        if len(hex_values) % 2:
                            hex_values = "0" + hex_values
                        temp = unhexlify(hex_values)
                    except Exception:
                        if self.verbose:
                            print "ERROR: Unhexlify Issue at %x %s (not added)" % (curr_addr, idc.GetDisasm(curr_addr))
                        curr_addr = idc.NextHead(curr_addr)
                        continue
                else:
                    curr_addr = idc.NextHead(curr_addr)
                    continue
                # GetFrameSize is not a reliable buffer size.
                # If the write lands past the frame, grow the buffer as long
                # as the index is less than 2 * frame size. Beyond that it is
                # more likely an error.
                if self.ebp or self.esp:
                    cal_index = index + len(temp)
                    if cal_index > self.frame_size:
                        if cal_index < (self.frame_size * 2):
                            for a in range(cal_index - self.frame_size):
                                self.str_buff.append("\x00")
                            if self.verbose:
                                print "ERROR: Frame size incorrect, appending"
                if self.ebp:
                    # reverse the buffer
                    temp = temp[::-1]
                    for c, ch in enumerate(temp):
                        try:
                            self.str_buff[index - c] = ch
                        except IndexError:
                            if self.verbose:
                                print "ERROR: Frame EBP index invalid: at %x" % (curr_addr)
                if self.esp:
                    for c, ch in enumerate(temp):
                        try:
                            self.str_buff[index + c] = ch
                        except IndexError:
                            if self.verbose:
                                print "ERROR: Frame ESP index invalid: at %x" % (curr_addr)
            curr_addr = idc.NextHead(curr_addr)
        # reverse the buffer to match index
        if self.ebp == True:
            self.str_buff = self.str_buff[::-1]

    def format_buff(self):
        self.formatted_buff = ""
        temp_buff = copy.copy(self.str_buff)

        if self.ebp == True:
            temp_buff = temp_buff[::-1]

        if self.str_buff:
            for index, ch in enumerate(temp_buff):
                try:
                    # a NUL followed by data marks the gap between two strings
                    if ch == "\x00" and temp_buff[index + 1] != "\x00":
                        self.formatted_buff += " "
                except IndexError:
                    # last byte of the buffer, nothing follows it
                    pass
                if ch != "\x00":
                    self.formatted_buff += ch

    def comment_func(self):
        idc.MakeComm(self.func_end, self.formatted_buff)

# Create a buffer of the whole function
x = Frame2Buff()
x.run(here())  # func addr

# Create a buffer from the selected code
x = Frame2Buff()
x.run() # select data

Renaming Simple Functions

Simple Function

The above function is very simple. Let's ignore the actual code and think about the code's functionality from a generic standpoint. The code pushes arguments onto the stack, calls APIs, compares return values from the APIs and then returns one or zero. In most instances these simple functions do not need to be analyzed. By reading the API names most of the functionality can be inferred and the function easily renamed to something like "RegCreateAndSetValue". After seeing these simple functions many times I realized that many of them could be renamed automatically. If broken down into steps it would look like this.
  1. API names from a function are extracted
  2. Sub-strings from the APIs are extracted
  3. Search for a common sub-string throughout all API names. 
  4. If a sub-string is common throughout all, create a name from the sub-strings. 
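
The four steps above can be sketched outside of IDA in a few lines. This is only an illustration under my own assumptions: the API names are passed in as a plain list, a regex CamelCase splitter stands in for the TT&SS tokenizer, and ties between sub-strings are broken alphabetically. The function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(api_name):
    """Split a CamelCase API name, e.g. RegCloseKey -> Reg, Close, Key."""
    return re.findall(r"[A-Z][a-z0-9]+|[A-Z]+(?![a-z])", api_name)


def suggest_name(api_names):
    """Return a name built from the sub-strings, or None if none is shared."""
    unique = set(api_names)
    if len(unique) < 2:
        return None
    counts = Counter()
    for name in unique:
        # count each sub-string once per API and ignore the A/W suffix
        tokens = set(t for t in tokenize(name) if t not in ("A", "W"))
        counts.update(tokens)
    # step 4 only applies if some sub-string appears in every API name
    if not any(c == len(unique) for c in counts.values()):
        return None
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return "".join(ordered)


print(suggest_name(["RegCreateKeyExA", "RegSetValueExA", "RegCloseKey"]))
print(suggest_name(["CreateFileA", "CloseHandle"]))
```

The second call returns None because "CreateFileA" and "CloseHandle" share no sub-string, which is exactly the case that makes the similarity test fail on many real functions.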
Step 1

    def get_apis(self, func_addr):
        flags = GetFunctionFlags(func_addr)
        # ignore library functions
        if flags & FUNC_LIB or flags & FUNC_THUNK:
            logging.debug("get_apis: Library code or thunk")
            return None
        # list of addresses
        dism_addr = list(FuncItems(func_addr))
        for instr in dism_addr:
            tmp_api_address = ""
            if idaapi.is_call_insn(instr):
                # In theory an API address should only have one xrefs
                # The xrefs approach was used because I could not find how to
                # get the API name by address.
                for xref in XrefsFrom(instr, idaapi.XREF_FAR):
                    if xref.to == None:
                        self.calls += 1
                        continue
                    tmp_api_address = xref.to
                    logging.debug("get_apis: xref to %x found", tmp_api_address)
                    break
                # get next instr since api address could not be found
                if tmp_api_address == "":
                    self.calls += 1
                    continue
                api_flags = GetFunctionFlags(tmp_api_address)
                # check for lib code (api)
                if api_flags & idaapi.FUNC_LIB or api_flags & idaapi.FUNC_THUNK:
                    tmp_api_name = NameEx(0, tmp_api_address)
                    if tmp_api_name:
                        self.apis.append(tmp_api_name)
                else:
                    # call to a non-API function
                    self.calls += 1

Step 2 & 3

    def match_apis(self):
        self.matched = False
        api_set = set(self.apis)
        # Optional Threshold. Only check functions with more than one API
        if self.calls <= self.threshold and len(self.apis) > 1:
            api_tokend  = []
            # for each api in function
            for api_name in api_set:
                # for each tokenized string in API name
                for item in self.tok.tokenizer(api_name):
                    # skip empty tokens and the Ascii/Wide suffix
                    if item is None or item == "A" or item == "W":
                        continue
                    api_tokend.append(item)
            # Count occurrence of strings.
            count_tmp = Counter(api_tokend)
            # if a common string is found in all APIs
            # set matched and keep the counted strings
            for string, count in count_tmp.items():
                if count == len(set(self.apis)):
                    self.matched = True
                    self.count_strings = count_tmp
                    return
            logging.debug("match_apis: API count and API sub-string don't match")
        else:
            logging.debug("match_apis: calls above threshold or API count is 1")

A lot of the heavy lifting for parsing out the sub-strings is built into the tokenizer module in TT&SS. For more information and usage I'd recommend the following post

Step 4

    def create_string(self):
        if self.count_strings == "" or self.matched is False:
            return
        # Sort strings by highest occurrence
        sort = sorted(self.count_strings, key=self.count_strings.get, reverse=True)
        name = ""
        # if a function contains all the same API multiple times
        # might be possible to modify to deal with wrapper code also
        if self.calls == 0 and len(set(self.apis)) == 1 and len(self.apis) > 1:
            self.func_name = self.apis[0] + str(len(self.apis))
            return
        for each in sort:
            # ignore Wide or Ascii
            if each.upper() == "A" or each.upper() == "W":
                continue
            # Convert to CamelCase for easier reading and space
            tmp = each[0].upper() + each[1:]
            name += str(tmp)
        # replace white space with underscore
        name = name.replace(" ", "_")
        logging.debug("create_string: string created %s", name)
        self.func_name = name

If we were to apply that logic plus some other random stuff we would have the following..

I think this is pretty cool. I like the idea of applying other domains of knowledge, such as natural language text processing, to reversing. Sadly, simple functions whose APIs all contain a similar sub-string are rare. The rarity happens because a lot of functions that share similar functionality also use generic APIs such as "CloseHandle" to close out a process. This API does not contain any of the sub-strings so it will fail the similarity test. I'm currently toying with the idea of using thresholds on matches or whitelisting certain APIs. Creating API sets, as was done for the generic renaming of functions in IDAScope, is another option. The main issue with that approach is categorizing the APIs by functionality. There are a lot of little things left for this project, hence why I'm releasing it as a POC. Below is the output of attempting to rename 456 functions in a Zeus IDB.

The VirtualProtect2 contains a "2" because the API was called twice from a function. The API names that end with an underscore and a value are for calculated names that happen multiple times.

The source POC is named w_sims.py and can be found in the POCS dir in the repo. The source also contains some code to identify wrapper functions. The code is currently set up to run the SimilarFunctions and the wrapper class on all the known functions. If you would like to run the wrapper class or experiment on other functions, tweak the execute options at the bottom of the code. The code is still being tweaked and fixed. I have been using this code off and on for a couple of weeks. I have seen some issues while importing the modules but I think I got those ironed out. If anything breaks or you have feedback, please leave a comment.

Random Applocker Thoughts

While reading through the Windows Internals book I came across an interesting feature called AppLocker.
AppLocker provides a robust experience for IT administrators through new rule creation tools and wizards. For example, IT administrators can automatically generate rules by using a test reference computer and then importing the rules into a production environment for widespread deployment. The IT administrator can also export a policy to provide a backup of the production configuration or to provide documentation for compliance purposes. Your Group Policy infrastructure can be used to build and deploy AppLocker rules as well, saving your organization training and support costs.
The paragraph above is a good description of its functionality. The tool focuses on policy-based security, for example preventing kazaa.exe from running in an enterprise environment. AppLocker has three ways of blocking files from executing: publisher, path/file and hash. The publisher rule is by far the most interesting because it is a little more generic but still targets a decent characteristic. For example, say Company A is targeted by Malware B that is always signed with a stolen or legitimate certificate XXX. Company A could create a policy or rule to target the publisher, product name, file name or file version of that signed file. If that rule is ever triggered the file could be blocked, or logged as an event and then fed into a SIEM. The second useful example is in the event of a mass spam campaign. Say an organization received 15k emails with a zip attachment. It's almost guaranteed 1% of that population will execute the file within the zip. This is especially true if inbox cleanup could take an hour or two. Most email spam campaign attachments have a static hash. If a user reported the spam campaign, an analyst could create an AppLocker rule based on the file hash and push it out as a new policy rule. Odds are the turnaround time on pushing out an AppLocker policy rule would be faster than getting an AV signature update.

Dear Microsoft Employees,
In future releases of AppLocker could you please add an option to use the parent process as a filter for rules? For example, on slide 21 of Haifei Li's Exploring in the Wild: A Big Data Approach to Application Security Research (and Exploit Detection) [link], he mentions that out of half a million Microsoft documents the only time they saw MSCOMCTL.OCX loaded was during exploitation. If an AppLocker user could create a rule to alert when MSCOMCTL.OCX was loaded by a Microsoft application, it could give early warning of possible exploitation. The same concept could also be applied to Adobe Reader, Java and other commonly exploited third party applications.

I'm currently not working in an enterprise environment so I can't test these thoughts. Applocker is only available on premium versions of Windows 7 and up.

Kind of a cool feature. Here is a video overview. Anybody have any success stories using AppLocker?


    garts.py (Get all referenced text strings)
    garts.py is a simple string viewer for IDA. It will iterate through
    all known functions, look for possible string references and then
    print them. This is super helpful for dealing with strings in Delphi
    executables, finding missed strings, having the exact location of
    where a string is being used or where data is possibly going to be

    Example Output 
    Address      String
    0x1a701045                  <- new line char. Not the best starting example..         
    0x1a7010bd   #command
    0x1a701199   SOFTWARE\Microsoft\Windows\CurrentVersion\Run
    0x1a7011be   govShell

    Xref Example
    .text:1A7010BD                 push    offset aCommand ; "#command"
    .text:1A7010C2                 lea     eax, [ebp+var_110]
    .text:1A701199                 push    offset SubKey   ; "SOFTWARE\\Microsoft\\Windows\\CurrentVersi"...
    .text:1A70119E                 push    80000001h       ; hKey
    .text:1A7011A3                 call    ds:RegOpenKeyA

    The script also calls the helpful idautils.strings and then adds all
    the found strings to the viewer window.

    Any ideas, comments, bugs, etc please send me an email. Cheers. 

import idautils
import idaapi

class Viewer(idaapi.simplecustviewer_t):
    # modified version of http://dvlabs.tippingpoint.com/blog/2011/05/11/mindshare-extending-ida-custviews
    def __init__(self, data):
        self.fourccs = data

    def Create(self):
        title = "A Better String Viewer"
        idaapi.simplecustviewer_t.Create(self, title)
        # header line
        c = "%s %11s" % ("Address", "String")
        comment = idaapi.COLSTR(c, idaapi.SCOLOR_BINPREF)
        self.AddLine(comment)
        # one line per (address, string) tuple
        for item in self.fourccs:
            addy = item[0]
            string_d = item[1]
            address_element = idaapi.COLSTR("0x%08x " % addy, idaapi.SCOLOR_REG)
            str_element = idaapi.COLSTR("%-1s" % string_d, idaapi.SCOLOR_REG)
            line = address_element + "  " +  str_element
            self.AddLine(line)
        return True

    def OnDblClick(self, something):
        value = self.GetCurrentWord()
        if value[:2] == '0x':
            Jump(int(value, 16))
        return True

    def OnHint(self, lineno):
        if lineno < 2: return False
        else: lineno -= 2
        line = self.GetCurrentWord()
        if line == None: return False
        if "0x" not in line: return False
        # skip COLSTR formatting, find address
        addy = int(line, 16)
        disasm = idaapi.COLSTR(GetDisasm(addy) + "\n", idaapi.SCOLOR_DREF)
        return (1, disasm)

def enumerate_strings():
    display = []
    # iterate through all functions
    for func in idautils.Functions():
        flags = GetFunctionFlags(func)
        # ignore library code
        if flags & FUNC_LIB or flags & FUNC_THUNK:
            continue
        # get a list of the addresses in the function. Using a range of < or >
        # is flawed when the code is obfuscated.
        dism_addr = list(FuncItems(func))
        # for each instruction in the function 
        for line in dism_addr:
            temp = None
            val_addr = 0
            if GetOpType(line,0) == 5:
                val_addr = GetOperandValue(line,0)
                temp = GetString(val_addr, -1)
            elif GetOpType(line,1) == 5:
                val_addr = GetOperandValue(line,1)
                temp = GetString(val_addr, -1)
            if temp:
                # in testing isCode() failed to accurately detect if address was code
                # decided to try something a little more generic 
                if val_addr not in dism_addr and GetFunctionName(val_addr) == '':
                    if GetStringType(val_addr) == 3:
                        temp = GetString(val_addr, -1, ASCSTR_UNICODE)
                    display.append((line, temp))

    # Get the strings already found
    # https://www.hex-rays.com/products/ida/support/idapython_docs/idautils.Strings-class.html
    s = idautils.Strings(False)
    s.setup(strtypes=Strings.STR_UNICODE | Strings.STR_C)
    for i, v in enumerate(s):
        # skip entries that could not be read
        if v is None:
            continue
        display.append((v.ea, str(v)))

    sorted_display = sorted(display, key=lambda tup:tup[0])
    return sorted_display

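if __name__ == "__main__":
    ok = enumerate_strings()
    # open the custom viewer with the collected strings
    viewer = Viewer(ok)
    viewer.Create()
    viewer.Show()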

Link to Repo


For anyone else who doesn't want to manually carve out single byte XOR encoded executables.

C:\Documents and Settings\Administrator\Desktop\x>ex_pe_xor.py bad.bin
 * Encoded PE Found, Key 0x21, Offset 0x0
 * exe found at offset 0x0

C:\Documents and Settings\Administrator\Desktop\x>dir

04/30/2014  08:36 PM    <DIR>          .
04/30/2014  08:36 PM    <DIR>          ..
04/30/2014  08:36 PM            24,576 1.exe   <- carved
04/30/2014  05:44 PM            24,576 bad.bin
04/30/2014  08:06 PM             3,526 ex_pe_xor.py

Pefile must be installed.
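
Before the full script, here is a minimal sketch of just the key-recovery idea, run against a synthetic buffer so it can be tried without pefile or a real sample. The helper names and the fake header layout (e_lfanew set to 0x80) are my own assumptions for the demo.

```python
import struct


def find_xor_pe(data):
    """Return (key, offset) of a single-byte XOR encoded PE, else (None, None)."""
    data = bytearray(data)
    for i in range(len(data) - 0x40):
        # KEY ^ VALUE ^ KEY = VALUE, so guess the key from an assumed 'M'
        key = data[i] ^ ord("M")
        if key == 0 or data[i + 1] ^ key != ord("Z"):
            continue
        # decode e_lfanew, the offset to the PE signature
        raw = bytearray(b ^ key for b in data[i + 0x3C:i + 0x40])
        (lfanew,) = struct.unpack("<i", bytes(raw))
        pe = i + lfanew
        if lfanew < 0 or pe + 1 >= len(data):
            continue
        if data[pe] ^ key == ord("P") and data[pe + 1] ^ key == ord("E"):
            return key, i
    return None, None


def make_sample(key, junk=b"\x00\x11\x22"):
    """Build a fake MZ/PE header, XOR it with key and prepend junk bytes."""
    pe = bytearray(0x100)
    pe[0:2] = b"MZ"
    pe[0x3C:0x40] = struct.pack("<i", 0x80)
    pe[0x80:0x84] = b"PE\x00\x00"
    return bytearray(junk) + bytearray(b ^ key for b in pe)


key, offset = find_xor_pe(make_sample(0x21))
print("key 0x%x, offset 0x%x" % (key, offset))
```

Checking MZ, lfanew and PE together is what keeps the false positive rate low: any two bytes can decode to MZ under some key, but the decoded lfanew also has to point at a decoded PE.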

## detects single byte xor encoding by searching for the 
## encoded MZ, lfanew and PE, then XORs the data and 
## uses pefile to extract the decoded executable. 
## written quickly/poorly by alexander hanel 

import sys
import struct
import pefile
import re
from StringIO import StringIO 

def get_xor():
    # read file into a bytearray
    byte = bytearray(open(sys.argv[1], 'rb').read())

    # for each byte in the file stream, excluding the last 256 bytes
    for i in range(0, len(byte) - 256):
            # KEY ^ VALUE ^ KEY = VALUE; Simple way to get the key 
            key = byte[i] ^ ord('M')
            # verify the two bytes contain 'M' & 'Z'
            if chr(byte[i] ^ key) == 'M' and  chr(byte[i+1] ^ key) == 'Z':
                    # skip non-XOR encoded MZ
                    if key == 0:
                            continue
                    # read four bytes into temp, offset to PE aka lfanew
                    temp = byte[(i + 0x3c) : (i + 0x3c + 4)]
                    # decode values with key 
                    lfanew = []
                    for x in temp:
                            lfanew.append( x ^ key)
                    # convert from bytearray to int value, probably a better way to do this
                    pe_offset  = struct.unpack( '<i', str(bytearray(lfanew)))[0]
                    # verify results are not negative or read is bigger than file 
                    if pe_offset < 0 or pe_offset + i + 1 >= len(byte):
                            continue
                    # verify the two decoded bytes are 'P' & 'E'
                    if byte[pe_offset + i ] ^ key == ord('P') and byte[pe_offset + 1 + i] ^ key == ord('E'):
                            print " * Encoded PE Found, Key 0x%x, Offset 0x%x" % (key, i)
                            return (key, i)
    return (None, None)

def getExt(pe):
        if pe.is_dll() == True:
            return 'dll'
        if pe.is_driver() == True:
            return 'sys'
        if pe.is_exe() == True:
            return 'exe'
        else:
            return 'bin'

def writeFile(count, ext, pe):
        try:
            out  = open(str(count)+ '.' + ext, 'wb')
        except IOError:
            print '\t[FILE ERROR] could not write file'
            sys.exit()
        # remove overlay or junk in the trunk
        out.write(pe.trim())
        out.close()

def xor_data(key, offset):
        byte = bytearray(open(sys.argv[1], 'rb').read())
        temp = ''
        for x in byte:
            temp += chr(x ^ key)
        return temp

def carve(fileH):
        if type(fileH) is str:
            fileH = StringIO(fileH)
        c = 1
        # For each address that contains MZ
        for y in [tmp.start() for tmp in re.finditer('\x4d\x5a', fileH.read())]:
            # parse from the MZ offset to the end of the data
            fileH.seek(y)
            try:
                pe = pefile.PE(data=fileH.read())
            except pefile.PEFormatError:
                # not a valid PE, try the next MZ
                continue
            # determine file ext
            ext = getExt(pe)
            print ' *', ext , 'found at offset', hex(y) 
            writeFile(c, ext, pe)
            c += 1
            ext = ''
            pe.close()

def run():
    if len(sys.argv) < 2:
        print "Usage: ex_pe_xor.py <xored_data>"
        return
    key, offset = get_xor()
    if key == None:
        return
    data = xor_data(key, offset)
    carve(data)

if __name__ == '__main__':
    run()

Upatre, rtrace and XP EOL

A couple of days ago while reading my RSS feeds I noticed an article entitled Daily analysis note: "Upatre" is back to SSL? on the Malware Must Die! blog. During their analysis they mentioned that they didn't solve the obfuscation. I had recently wrapped up an analysis of different obfuscation techniques that Upatre uses, so I thought I'd take a crack at it. After glancing at the sample for a couple of seconds I thought to myself that it was a variant of the obfuscated dword XOR sample set. I made a quick educated guess, hit F9 in Ollydbg and nothing happened. That was strange because typically Upatre will write the sample to the %TEMP% directory and then try to download its payload. Instead of the expected activity, I was looking at ExitProcess. At this point my interest was piqued and I wanted to find the anti-debugging...but I didn't care enough to read the assembly code. The code looked somewhat obfuscated.

Rather than reversing the obfuscation I decided to take a different route. Ollydbg has a useful feature called Run Trace, or rtrace for short. It's great for tracking down anti-debugging or logging changes to registers. It's not a feature I would recommend for tracing over large amounts of code. Creating a PinTool would be a better choice for that type of task, but rtrace is good for small jobs. Here are three excellent reads on using Pintool for similar tasks: 1, 2 and 3.

rtrace log
Rtrace logs the address, thread, instruction, and the register changes. This is very useful if you want to review how specific registers are changed. The rtrace output can be saved to a file in Ollydbg by selecting View, Run Trace, right-click, and Log to file. Since rtrace logs can be quite large it's best to only trace code that is worth logging. Breakpoints can be used to create the boundaries of the trace data. Rather than working with the rtrace log as a text file I decided to create an ordered dictionary and save it to a JSON file using Python. Once the rtrace log is converted to JSON I can populate the IDB with interesting values. Some values that I thought were interesting were the count of how many times an address got executed and the number of unique register values for an instruction. For example, if the instruction sub     ecx, 0D733EFF3h is executed X number of times, how many different values of ECX were there? If the set is 1 but the instruction was executed 6,000 times we know it's a static value and need not give it another thought. Here is the Python code that I hacked together.

import sys
import json
import collections

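def run():
    'this function creates an ordered dict saved to a json from an rtrace log'
    rtrace = collections.OrderedDict()
    # rtrace log passed as argument
    with open(sys.argv[1]) as f:
        for line in f:
            addr, reg_values = parse_line(line.rstrip())
            if addr:
                try:
                    addr = int(addr,16)
                except ValueError:
                    # header or other non-address line
                    continue
                if rtrace.has_key(addr):
                    # no modified registers, still record the hit
                    if len(reg_values) == 0:
                        rtrace[addr].append([])
                    for val in reg_values:
                        rtrace[addr].append(val)
                else:
                    rtrace[addr] = []
                    if len(reg_values) == 0:
                        rtrace[addr].append([])
                    for val in reg_values:
                        rtrace[addr].append(val)
    save_off(rtrace)

def save_off(rtrace):
    'save the data to a file named rtrace.json'
    with open("rtrace.json", 'w') as f:
        json.dump(rtrace, f)

def parse_line(line):
    'parses the rtrace log' 
    temp = line.split('\t')
    # address, thread, instruction, then zero or more register changes
    if len(temp) >= 3:
        return (temp[0], list(temp[3:]))
    else:
        return None, None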


def show_data():
    'prints all the modified registers for an address in a rtrace log'
    try:
        print rtrace_data[str(here())]
    except KeyError:
        print "Error: never called"

def show_next():
    'prints next called address. Kind of buggy on ret'
    try:
        temp = rtrace_data.items()[rtrace_data.keys().index(str(here())) + 1]
        print hex(int(temp[0]))
    except (ValueError, IndexError):
        print "Error: Could not be found"

def populate():
    'populate the IDB with counts and set values'
    import idautils
    import idaapi
    global rtrace_data
    rtrace_data = get_json()
    idaapi.add_hotkey("Shift-A", show_data)
    idaapi.add_hotkey("Shift-S", show_next)
    for func in idautils.Functions():
        flags = GetFunctionFlags(func)
        # skip library and thunk functions
        if flags & FUNC_LIB or flags & FUNC_THUNK:
            continue
        dism_addr = list(FuncItems(func))
        for line in dism_addr:
            com = format_data(rtrace_data, line)
            if com:
                MakeComm(line, com)

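def format_data(rtrace_data, line):
    'builds the comment string for an address, or None if it was never called'
    try:
        instr_data = rtrace_data[str(line)]
    except KeyError:
        return None
    count = len(instr_data)
    # Empty lists are not hashable for sets.
    try:
        data_count = len(set(instr_data))
    except TypeError:
        data_count = None
    if data_count:
        comment = "C: 0x%x, S: 0x%x" % (count, data_count)
    else:
        comment = "C: 0x%x" % (count)
    return comment

def get_json():
    'loads rtrace.json, keeping the order of the trace'
    # fall back to an empty trace if the file cannot be read
    js = collections.OrderedDict()
    try:
        with open("rtrace.json") as f:
            js = json.load(f, object_pairs_hook=collections.OrderedDict)
    except Exception as e:
        print "ERROR: %s" % e
    return js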

if __name__ == '__main__':
    # the sys.argv[1] exception is a hack to guess how the script is being
    # called. If there is an exception it is being run in IDA.
    try:
        sys.argv[1]
    except IndexError:
        populate()
    else:
        run()

To execute the code above, pass the rtrace log file to the script on the command line, copy the resulting rtrace.json to the working directory of the script and the IDB, and then execute the script in IDA. This will give an output as seen below. The C is for Count and S is for Set. Glancing over other functions it is easy to see which instructions are responsible for looping and decoding data.
Since the executable only had 17 functions it was easy to identify instructions that were not called and then investigate why they weren't. Notice the fourth and fifth blocks were not called. This is due to the values returned by whatever is called through EBX.
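
To make the C and S comments a little more concrete, here is a rough sketch that runs outside of IDA on a few made-up trace lines. The sample lines, the summarize helper, and the exact tab-separated layout (address, thread, instruction, then modified registers) are assumptions for illustration and not real rtrace output. Unlike the script above, it stores one entry per executed line, so the count is the number of executions.

```python
import collections

# Hypothetical rtrace-style lines: address, thread, instruction, then any
# modified registers, separated by tabs. This is not real OllyDbg output.
SAMPLE_LOG = [
    "00401000\tMain\tsub ecx, 0D733EFF3h\tECX=00000010",
    "00401006\tMain\tinc eax\tEAX=00000001",
    "00401000\tMain\tsub ecx, 0D733EFF3h\tECX=00000010",
    "00401006\tMain\tinc eax\tEAX=00000002",
    "00401008\tMain\tnop",
]

def summarize(lines):
    """Return {address: (count, unique_count)} for the given trace lines."""
    hits = collections.OrderedDict()
    for line in lines:
        fields = line.rstrip().split("\t")
        if len(fields) < 3:
            continue
        addr = int(fields[0], 16)
        # Keep the modified registers of one execution as a tuple so that
        # executions can be counted and de-duplicated with a set.
        hits.setdefault(addr, []).append(tuple(fields[3:]))
    return collections.OrderedDict(
        (addr, (len(regs), len(set(regs)))) for addr, regs in hits.items()
    )

for addr, (count, unique) in summarize(SAMPLE_LOG).items():
    print("0x%x C: 0x%x, S: 0x%x" % (addr, count, unique))
```

An address with a high count but a set of 1 always saw the same register values, which is the "static value" case described earlier.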

Since rtrace logs all modified register values it is easy to access those values. The script includes the ability to print them: select the address and press Shift-A. To print the next address called, select the address and then press Shift-S. The arrow is the address of the selected line.
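
The Shift-S lookup is just "find my key in the ordered dictionary and return the one after it". The sketch below shows that idea without IDA. The addresses and the next_seen helper are hypothetical. One thing worth knowing is that an ordered dictionary remembers the order in which addresses were first seen, so the "next" address is the next newly seen address and not necessarily the instruction that executed right after every hit. That may be one reason the lookup can look odd around loops and returns.

```python
import collections

def next_seen(trace, addr):
    """Return the key stored right after addr, or None if there is none."""
    keys = list(trace.keys())
    try:
        return keys[keys.index(addr) + 1]
    except (ValueError, IndexError):
        return None

# Hypothetical execution order: a two-instruction loop runs twice, then exits.
executed = ["401000", "401006", "401000", "401006", "401008"]
trace = collections.OrderedDict()
for addr in executed:
    trace.setdefault(addr, []).append("...")

# 401006 jumped back to 401000 on its first pass, but the dictionary
# order says the next newly seen address is 401008.
print(next_seen(trace, "401006"))
```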

Once I followed the address in the output window it led me to what called ExitProcess. Now all I needed to do was investigate the calls to EBX in the first block of the previous image to see what is being checked and patch the ZF flag results to continue execution. What makes this sample interesting (and actually worth reading the code) is that Upatre is using an undocumented technique to determine if it is running on Windows NT 6.0 or higher. I'm unaware if this technique works on Windows Vista. I have only tested on Windows XP SP3 (NT 5.1) and Windows 7 (NT 6.1).

The malware calls RtlAcquirePebLock and NtCurrentTeb twice. On Windows XP, when RtlAcquirePebLock is called the first time, ECX, EDX and EAX are overwritten. ECX will be the return address of RtlAcquirePebLock, EDX will be the address of the PEB FastPebLock, which is a pointer to an _RTL_CRITICAL_SECTION, and EAX will be zero. On the second call only EAX will be overwritten. On Windows 7, when RtlAcquirePebLock is called, EAX will become zero and ECX will be equal to the Thread Environment Block (TEB) ClientId. On the second call to RtlAcquirePebLock EAX will be zero, EDX will be the TEB ClientId, and ECX will be equal to the TEB. If ECX is equal to the TEB (the return value of NtCurrentTeb) after the second call to RtlAcquirePebLock, the sample is running on NT 6.0 or higher.
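
Stripped of the register juggling, the check boils down to a single comparison. The sketch below only models that decision. The running_on_nt6 name and the register values are hypothetical and chosen for illustration; it does not call into ntdll.

```python
def running_on_nt6(ecx_after_second_lock, teb_address):
    """Mirror the sample's comparison: ECX == TEB means NT 6.0 or higher."""
    return ecx_after_second_lock == teb_address

# Hypothetical register captures, for illustration only.
TEB = 0x7FFDF000
captures = [
    ("XP-like", 0x7C90104B),  # ECX holds something other than the TEB
    ("Win7-like", TEB),       # ECX equals the TEB after the second call
]

for label, ecx in captures:
    verdict = "NT 6.0 or higher" if running_on_nt6(ecx, TEB) else "older than NT 6.0"
    print("%s: %s" % (label, verdict))
```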

Below is the code rewritten in C++:

#include <windows.h>
#include <stdio.h>
#include <iostream>

int getECX() {
 int value = 0;
 //Moves ecx into 'value'
 _asm mov value, ecx;
 return value;
}

int getEAX() {
 int value = 0;
 //Moves eax into 'value'
 _asm mov value, eax;
 return value;
}

int getEDX() {
 int value = 0;
 //Moves edx into 'value'
 _asm mov value, edx;
 return value;
}

// Note: _asm is 32-bit MSVC inline assembly. The register reads are only
// meaningful if the compiler leaves the registers untouched after each call.
// The pRtlAcquirePebLock() and ntcur() calls are reconstructed from the comments.
int main()
{
   HMODULE hModule = GetModuleHandleA(("ntdll.dll"));
   FARPROC pRtlAcquirePebLock = GetProcAddress(hModule, "RtlAcquirePebLock");

   HMODULE h2Module = GetModuleHandleA(("ntdll.dll"));
   FARPROC ntcur = GetProcAddress(h2Module, "NtCurrentTeb");

   // ecx, eax & edx are modified when RtlAcquirePebLock is called
   pRtlAcquirePebLock();
   int ecxValue = getECX();
   int eaxValue = getEAX();
   int edxValue = getEDX();

   // call current teb
   ntcur();
   eaxValue = getEAX();

   if (eaxValue == ecxValue)
    std::cout << "EAX & ECX Match - second stage" << std::endl;
   if (eaxValue == edxValue)
    std::cout << "EAX & EDX Match - second stage" << std::endl;

   // lock again
   pRtlAcquirePebLock();
   ecxValue = getECX();
   eaxValue = getEAX();
   edxValue = getEDX();

   // current teb
   ntcur();
   eaxValue = getEAX();

   if (eaxValue == ecxValue)
    std::cout << "EAX & ECX Match - second stage aka > XP" << std::endl;

   return 0;
}

The changes in the return values are caused by the difference in how RtlAcquirePebLock calls RtlEnterCriticalSection. In Windows XP, RtlEnterCriticalSection is called through a pointer stored in the PEB FastPebLockRoutine field. Since the PEB is writable from user mode, the FastPebLockRoutine pointer can be overwritten, which makes it a known target when exploiting heap overflows. See Refs 1 and 2. Below we can see the difference between XP and Win7 for RtlAcquirePebLock.
_RtlAcquirePebLock XP
_RtlAcquirePebLock Win7
Pretty interesting technique for avoiding Windows XP. Thanks to Hexacorn for help and feedback. If I wanted to continue executing the malware I would either need to patch the return of the second call to RtlAcquirePebLock or patch the comparison. Please feel free to send me any feedback, criticism or notes via email (in the Python code), hit me up on Twitter, or leave a comment. Cheers.

Hash - 891F33FDD94481E84278736CEB891D1036564C03

[1] http://net-ninja.net/article/2011/Sep/03/heap-overflows-for-humans-102/
[2] http://www.exploit-db.com/papers/13178/