Hooked on Mnemonics Worked for Me

bkdump version 2

I have pushed the second version of bkdump to my BitBucket Repo. This version uses the same techniques described in a previous post but contains two new options to scan processes. The first new feature can scan a process by its process ID (PID) and the second will scan processes that are commonly injected into. These processes have historically been firefox.exe, iexplore.exe, chrome.exe and explorer.exe. To invoke the original feature of bkdump, which was to scan a dummy process using iexplore.exe, we can pass "0" as an argument. This can be used to detect malware that injects into processes on process creation. If we had a machine infected with Citadel we would see the following.
 The memory will be dumped to the working directory. In this example the memory is dumped to a file named 0.0x00140000.bin. Usually the first item in the file name is the PID but since we created a dummy process it is 0. If we wanted to scan a specific PID we would pass the integer (non-hexadecimal) value as an argument. If we had a machine infected with Poison Ivy that is injected into iexplore.exe with a PID of 3268 we would pass that integer as an argument.

If we were to open these dumps up in bintext we would see some common strings related to Poison Ivy.

In the last scan we were kind of lucky with the results. Since we are dumping all memory marked as RWX we can also dump non-malicious memory. iexplore.exe likes memory sizes of 4,096 bytes while firefox loves memory sizes of 65,536 bytes. The last scan option takes no arguments. It will scan processes that are commonly injected into. If we had a machine infected with Citadel, Ramnit, Poison Ivy and Tilon we would see the following results. No need to look at it all...

C:\Documents and Settings\Administrator\Desktop\xxx>bkdump.exe
bkdump - simple RWX process dumper for commonly injected processes
         bkdump.exe 0 - to open a dummy iexplorer.exe
         bkdump.exe PID - to dump RWX Memory in a process
         bkdump.exe - to dump RWX memory of running firefox.exe, ie,
         explorer.exe, chrome.exe
         created by alexander.hanel

Dumping Process IEXPLORE.EXE with PID 3356
Suspicious Memory Block:
Addr: 0x20030000 Size:4096
Dumping Memory at 0x20030000 to 3356.0x20030000.bin   
Suspicious Memory Block:
Addr: 0x20010000 Size:4096
Dumping Memory at 0x20010000 to 3356.0x20010000.bin
Suspicious Memory Block:
Addr: 0x00aa0000 Size:4096
Dumping Memory at 0x00aa0000 to 3356.0x00aa0000.bin
Suspicious Memory Block:
Addr: 0x00a90000 Size:4096
Dumping Memory at 0x00a90000 to 3356.0x00a90000.bin
Suspicious Memory Block:
Addr: 0x00210000 Size:4096
Dumping Memory at 0x00210000 to 3356.0x00210000.bin
Suspicious Memory Block:
Addr: 0x00200000 Size:4096
Dumping Memory at 0x00200000 to 3356.0x00200000.bin
Suspicious Memory Block:
Addr: 0x001f0000 Size:4096
Dumping Memory at 0x001f0000 to 3356.0x001f0000.bin
Suspicious Memory Block:
Addr: 0x001e0000 Size:4096
Dumping Memory at 0x001e0000 to 3356.0x001e0000.bin
Suspicious Memory Block:
Addr: 0x00180000 Size:270336
Dumping Memory at 0x00180000 to 3356.0x00180000.bin
Suspicious Memory Block:
Addr: 0x00050000 Size:4096
Dumping Memory at 0x00050000 to 3356.0x00050000.bin
Suspicious Memory Block:
Addr: 0x00040000 Size:4096
Dumping Memory at 0x00040000 to 3356.0x00040000.bin
Suspicious Memory Block:
Addr: 0x00030000 Size:4096
Dumping Memory at 0x00030000 to 3356.0x00030000.bin
Suspicious Memory Block:
Addr: 0x00020000 Size:4096
Dumping Memory at 0x00020000 to 3356.0x00020000.bin

Dumping Process IEXPLORE.EXE with PID 3268
Suspicious Memory Block:
Addr: 0x20020000 Size:4096
Dumping Memory at 0x20020000 to 3268.0x20020000.bin
Suspicious Memory Block:
Addr: 0x00c30000 Size:4096
Dumping Memory at 0x00c30000 to 3268.0x00c30000.bin
Suspicious Memory Block:
Addr: 0x00c20000 Size:4096
Dumping Memory at 0x00c20000 to 3268.0x00c20000.bin
Suspicious Memory Block:
Addr: 0x00aa0000 Size:4096
Dumping Memory at 0x00aa0000 to 3268.0x00aa0000.bin
Suspicious Memory Block:
Addr: 0x00a90000 Size:4096
Dumping Memory at 0x00a90000 to 3268.0x00a90000.bin
Suspicious Memory Block:
Addr: 0x00a40000 Size:270336
Dumping Memory at 0x00a40000 to 3268.0x00a40000.bin
Suspicious Memory Block:
Addr: 0x00160000 Size:4096
Dumping Memory at 0x00160000 to 3268.0x00160000.bin
Suspicious Memory Block:
Addr: 0x00150000 Size:4096
Dumping Memory at 0x00150000 to 3268.0x00150000.bin

Dumping Process explorer.exe with PID 1688
Suspicious Memory Block:
Addr: 0x20020000 Size:4096
Dumping Memory at 0x20020000 to 1688.0x20020000.bin
Suspicious Memory Block:
Addr: 0x02af0000 Size:4096
Dumping Memory at 0x02af0000 to 1688.0x02af0000.bin
Suspicious Memory Block:
Addr: 0x02ac0000 Size:4096
Dumping Memory at 0x02ac0000 to 1688.0x02ac0000.bin
Suspicious Memory Block:
Addr: 0x02560000 Size:270336
Dumping Memory at 0x02560000 to 1688.0x02560000.bin
Suspicious Memory Block:
Addr: 0x02420000 Size:4096
Dumping Memory at 0x02420000 to 1688.0x02420000.bin
Suspicious Memory Block:
Addr: 0x01d90000 Size:4096
Dumping Memory at 0x01d90000 to 1688.0x01d90000.bin
Suspicious Memory Block:
Addr: 0x01750000 Size:4096
Dumping Memory at 0x01750000 to 1688.0x01750000.bin
Suspicious Memory Block:
Addr: 0x016b0000 Size:4096
Dumping Memory at 0x016b0000 to 1688.0x016b0000.bin
Suspicious Memory Block:
Addr: 0x016a0000 Size:4096
Dumping Memory at 0x016a0000 to 1688.0x016a0000.bin
Suspicious Memory Block:
Addr: 0x01620000 Size:4096
Dumping Memory at 0x01620000 to 1688.0x01620000.bin
Suspicious Memory Block:
Addr: 0x01600000 Size:4096
Dumping Memory at 0x01600000 to 1688.0x01600000.bin
Suspicious Memory Block:
Addr: 0x01500000 Size:4096
Dumping Memory at 0x01500000 to 1688.0x01500000.bin
Suspicious Memory Block:
Addr: 0x01470000 Size:4096
Dumping Memory at 0x01470000 to 1688.0x01470000.bin
Suspicious Memory Block:
Addr: 0x01460000 Size:4096
Dumping Memory at 0x01460000 to 1688.0x01460000.bin

Yes, a lot of files have been dumped but a lot of things can be done to parse out the good and the bad. The simplest would be checking for strings and MZ headers. The next version will have features to flag the bad ones. I have come up with some good ideas and found interesting anomalies during my research. I don't know exactly how long it will take until I get the next version out. Odds are I will have to do a rewrite of the current code. I'm still learning C but I have learned that statically typed languages are much more difficult to modify than Python. Now I understand why teachers discussed program design so much; well, that and linked lists. The source code can be found here. The repo contains a compiled executable.
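The "strings and MZ headers" check mentioned above can be sketched in a few lines of Python. This is only a rough illustration and not part of bkdump. The helper name triage_dump and the minimum string length of 6 are my own assumptions; the only thing taken from the tool is the PID.0xAddress.bin file name format.

```python
import os
import re

# Dump names follow the PID.0xAddress.bin pattern seen in the bkdump output.
DUMP_NAME = re.compile(r"^(\d+)\.0x([0-9a-fA-F]{8})\.bin$")


def triage_dump(path, min_len=6):
    """Return basic triage facts for one dump (hypothetical helper)."""
    match = DUMP_NAME.match(os.path.basename(path))
    if match is None:
        return None  # not a bkdump-style file name
    with open(path, "rb") as handle:
        data = handle.read()
    # Runs of printable ASCII, similar to what a basic strings tool reports.
    # min_len is an arbitrary cut-off chosen for this sketch.
    strings = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    return {
        "pid": int(match.group(1)),
        "address": int(match.group(2), 16),
        "size": len(data),
        # A PE image copied into memory usually still starts with "MZ".
        "starts_with_mz": data[:2] == b"MZ",
        "strings": [s.decode("ascii") for s in strings],
    }
```

Running this over the working directory and sorting on starts_with_mz would be a cheap first pass. Injected code that has wiped its header will not be caught by the MZ check, so the strings are still worth a look.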

Autoit Script VM Detection

While looking at one of the samples reported to be a part of Norman's excellent "Unveiling an Indian Cyberattack Infrastructure - a special report" (LINK), I saw something that caught my eye. I haven't looked at AutoIt scripting in a while and during dynamic analysis I saw nothing of interest. I recalled reading an analysis by malware-lu that used EXE2Aut to extract the AutoIt script. After extracting the script the following routine stood out. The function speaks for itself. I have removed lines from the script so it won't compile, to prevent cutting and pasting.

Func _checkvm()
 $strcomputer = "."
        __REMOVED
 $vmhit_count = 0
 $vmhit_details = ""
 If ProcessExists("VBoxService.exe") OR ProcessExists("VBoxTray.exe") OR ProcessExists("VMwareTray.exe") OR ProcessExists("VMwareUser.exe") Then _addvmhit($vmhit_count, $vmhit_details, "RUNNING SOFTWARE", "Found a Vbox or VMware guest OS service or tray process")
 If NOT IsObj($objwmiservice) Then
  Return ""
        __REMOVED
 $colitems = $objwmiservice.execquery("SELECT * FROM Win32_DiskDrive", "WQL", 16 + 32)
 If IsObj($colitems) Then
  For $objitem In $colitems
   $vreturn = $objitem.model
   Select 
    Case StringInStr($vreturn, "VBOX HARDDISK")
     _addvmhit($vmhit_count, $vmhit_details, "DISKS", 'Found device "VBOX HARDDISK"')
    Case StringInStr($vreturn, "QEMU HARDDISK")
     _addvmhit($vmhit_count, $vmhit_details, "DISKS", 'Found device "QEMU HARDDISK"')
    Case StringInStr($vreturn, "VMWARE VIRTUAL IDE HARD DRIVE")
     _addvmhit($vmhit_count, $vmhit_details, "DISKS", 'Found device "VMWARE VIRTUAL IDE HARD DRIVE"')
    Case StringInStr($vreturn, "VMware Virtual S SCSI Disk Device")
     _addvmhit($vmhit_count, $vmhit_details, "DISKS", 'Found device "VMware Virtual S SCSI Disk Device"')
   EndSelect
  Next
 EndIf
 $colitems = $objwmiservice.execquery("SELECT * FROM Win32_BIOS", "WQL", 16 + 32)
 If IsObj($colitems) Then
  For $objitem In $colitems
   Select 
                     __REMOVED
     _addvmhit($vmhit_count, $vmhit_details, "BIOS", "Found Vbox BIOS version")
    Case StringInStr($objitem.smbiosbiosversion, "virt")
     _addvmhit($vmhit_count, $vmhit_details, "BIOS", "Found Vbox BIOS version")
   EndSelect
  Next
 EndIf
 $colitems = $objwmiservice.execquery("SELECT * FROM Win32_Baseboard", "WQL", 16 + 32)
        __REMOVED
  For $objitem In $colitems
   Select 
    Case StringInStr($objitem.name, "Base Board") AND StringInStr($objitem.product, "440BX Desktop Reference Platform")
     _addvmhit($vmhit_count, $vmhit_details, "MOTHERBOARD", 'Found VMware-style motherboard, "440BX Desktop Reference Platform" / Name="Base Board"')
   EndSelect
  Next
 EndIf
 If $vmhit_count >= 2 Then
        __REMOVED
 Else
  Return ""
 EndIf
EndFunc

Interesting to see AutoIt scripts/executables being used as disposable installers for the first round of an attack. The script is 2500 lines long. I'm kind of surprised how much can be done by attackers using AutoIt scripts. Might be worth looking for UserAgents of "AutoItScript/".
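If you want to know whether your own analysis VM would trip this routine, the visible checks can be re-expressed in Python. This is a sketch built only from the lines shown above; the removed lines are not reconstructed. The function names and the way the inputs are passed in are hypothetical. You would have to collect the process list and WMI values yourself.

```python
# Strings copied from the visible (non-removed) parts of _checkvm above.
VM_PROCESSES = ("vboxservice.exe", "vboxtray.exe", "vmwaretray.exe", "vmwareuser.exe")
VM_DISK_MODELS = (
    "VBOX HARDDISK",
    "QEMU HARDDISK",
    "VMWARE VIRTUAL IDE HARD DRIVE",
    "VMware Virtual S SCSI Disk Device",
)
HIT_THRESHOLD = 2  # the script only acts when $vmhit_count >= 2


def contains(haystack, needle):
    """Case-insensitive substring test, like AutoIt's default StringInStr."""
    return needle.lower() in haystack.lower()


def vm_hits(processes, disk_models, bios_versions, baseboards):
    """Return the categories this host would trip (hypothetical helper)."""
    hits = []
    # Assumption: process names are compared without regard to case.
    running = {name.lower() for name in processes}
    if any(name in running for name in VM_PROCESSES):
        hits.append("RUNNING SOFTWARE")
    for model in disk_models:
        # Select/Case stops at the first match, so one hit per disk at most.
        if any(contains(model, needle) for needle in VM_DISK_MODELS):
            hits.append("DISKS")
    for version in bios_versions:
        if contains(version, "virt"):
            hits.append("BIOS")
    for name, product in baseboards:
        if contains(name, "Base Board") and contains(product, "440BX Desktop Reference Platform"):
            hits.append("MOTHERBOARD")
    return hits


def would_be_flagged(hits):
    """True when the number of hits reaches the script's threshold."""
    return len(hits) >= HIT_THRESHOLD
```

The point of the threshold is that a single indicator is not enough. Killing the guest tray process alone would not help on a VirtualBox guest if the disk model and BIOS strings still match.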

Ramnit Analysis V1

I recently wrapped up an analysis of a Ramnit sample. Here is a download link for the PDF version and a link for the Google Docs version. The analysis is kind of a hybrid of an incident response report and a description of interesting low level features. The analysis does not focus on functionality or commands of the malware but on how the malware interacts with parts of the operating system. The sample is from 2010. I was hoping to have downloaded one of the newer samples featured on Microsoft's Malware Protection blog but that didn't happen on my first sample download. Sadly by the time I noticed the compile date it was too late. I was already way too curious about how the sample worked to quit. If anyone has a hash of one of the newer samples featured on MS's blog please shoot me an email or leave a comment. I'd like to look at the MITB injection features.

Disclaimer: The PDF was exported from Google Docs. I have noticed sometimes the images won't render properly in Firefox pdf.js.

Static Addresses, VirtualAlloc and Injected Processes

I have recently been working on updating bkdump.exe. The tool is a simple memory scanner that looks for memory marked as Read, Write and Execute (RWX) in processes. Version one would open up a dummy instance of iexplore.exe using CreateProcess and then scan the memory of iexplore.exe for any memory marked as RWX. If such memory was found it would be dumped to a file. The new version can scan a specific process ID (PID) and scan running instances of commonly injected processes such as firefox.exe, iexplore.exe, chrome.exe and explorer.exe. As I have previously stated most of these concepts are not new. Many of these techniques have been implemented for detecting injected processes in memory dumps for the Volatility project. I'll be releasing bkdump.exe in the next couple of days. I'd like to do some more testing. I'm new to C so I'm sure there will be some random quirks in my code but so far it's running and working.
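To make the scanning idea concrete, here is the selection logic as a small Python sketch. It is not the bkdump source. It assumes the memory regions were already enumerated (on Windows that would be with something like VirtualQueryEx) and only shows the filter and the file naming. The Region tuple and both helper names are made up for the example, and checking for MEM_COMMIT is my assumption. The two constants are the documented Windows values.

```python
from collections import namedtuple

# Windows memory constants (values from the Windows headers).
PAGE_EXECUTE_READWRITE = 0x40
MEM_COMMIT = 0x1000

# Hypothetical container for one enumerated region.
Region = namedtuple("Region", "base size state protect")


def find_rwx(regions):
    """Keep committed regions whose protection is exactly RWX."""
    return [
        region
        for region in regions
        if region.state == MEM_COMMIT and region.protect == PAGE_EXECUTE_READWRITE
    ]


def dump_name(pid, base):
    """Build a file name in the PID.0xAddress.bin format."""
    return "%d.0x%08x.bin" % (pid, base)
```

For example, a region at 0x00a40000 in PID 3268 would map to 3268.0x00a40000.bin, which matches the naming seen in the tool output.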

While testing bkdump against families of malware I noticed something kind of interesting. A couple of families of malware can be detected by the size and allocated memory address of the injected process. Let's take for example Ramnit. I don't know how many versions of Ramnit there are but I was able to identify five different variants of the injected process from 45 installers (Yes, the sample set is kind of small but my research machine is still packed up). Let's check out a sample set of dumps from Ramnit installers that I was able to retrieve from VirusTotal Intelligence. The new version of bkdump can dump a process and save it to the working directory using the naming format PID.0xAddress.bin. Below are the size and the file name of each dump.

Size          File Name                
36,864        1480.0x20020000.bin            
53,248        804.0x20030000.bin       
53,248        3944.0x20020000.bin       
53,248        3872.0x20030000.bin       
53,248        3788.0x20030000.bin       
53,248        3672.0x20030000.bin       
53,248        3276.0x20030000.bin       
53,248        3124.0x20030000.bin       
53,248        3000.0x20030000.bin       
53,248        2864.0x20030000.bin       
53,248        2804.0x20030000.bin       
53,248        2796.0x20030000.bin       
53,248        2776.0x20020000.bin       
53,248        2696.0x20030000.bin       
53,248        2600.0x20020000.bin       
53,248        2592.0x20020000.bin       
53,248        2520.0x20030000.bin       
53,248        2432.0x20030000.bin       
53,248        2268.0x20030000.bin       
53,248        1788.0x20030000.bin       
53,248        1476.0x20030000.bin       
53,248        1028.0x20030000.bin

57,344        528.0x20020000.bin       
57,344        3820.0x20020000.bin       
57,344        3452.0x20020000.bin       
57,344        3228.0x20020000.bin       
57,344        3068.0x20020000.bin       
57,344        2420.0x20020000.bin       
57,344        2148.0x20020000.bin       
57,344        156.0x20030000.bin   
       
135,168        3100.0x20020000.bin       
102,400        552.0x20020000.bin       
102,400        444.0x20020000.bin       
102,400        3832.0x20020000.bin       
102,400        3316.0x20020000.bin       
102,400        3220.0x20020000.bin       
102,400        3052.0x20020000.bin       
102,400        2816.0x20020000.bin       
102,400        2644.0x20020000.bin       
102,400        252.0x20020000.bin       
102,400        2456.0x20020000.bin       
102,400        2436.0x20020000.bin       
102,400        2020.0x20020000.bin       
102,400        1792.0x20020000.bin       
102,400        1496.0x20020000.bin   
   

Notice a recurring pattern in the size and address? If we open up the installer in Ollydbg and set a breakpoint on VirtualAlloc we can see why.


Ramnit is allocating memory at that address. Let's change that value from 20020000 to 200F0000. Now we can see it allocating memory at that specific address in the injected process.



That didn't go very well. Basically what is happening is the programmer designed this variant to have a static base address of 0x20020000. Here are the sizes and the allocated addresses of the ones I found.

Size     Address
36,864  0x20020000
53,248  0x20020000, 0x20030000
57,344  0x20020000
135,168 0x20020000
102,400 0x20020000
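A summary like the one above can be produced from the size / file name listing with a short script. This is a sketch of how I would tally it, not something bkdump does. The function name and the regular expression are assumptions based on the listing layout shown earlier.

```python
import re
from collections import defaultdict

# One row of the listing: "53,248        804.0x20030000.bin"
ROW = re.compile(r"^\s*([\d,]+)\s+(\d+)\.0x([0-9a-fA-F]+)\.bin\s*$")


def group_by_size(listing):
    """Map each dump size to the sorted set of base addresses seen with it."""
    groups = defaultdict(set)
    for line in listing.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # skip headers and blank lines
        size = int(match.group(1).replace(",", ""))
        groups[size].add("0x" + match.group(3).lower())
    return {size: sorted(addresses) for size, addresses in groups.items()}
```

Feeding it the full listing gives one line per size, which makes a recurring size and address pair easy to spot.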


Let's check out stats from some Zeus samples. Sorry for the different layout of the data. Old set.

Count - Size
12 - 159744
02 - 139264
01 - 102400
01 - 106496
02 - 208896
01 - 163840
03 - 090112
01 - 249856
01 - 217088
01 - 155648


Count - Address
20 - 0x024a0000
01 - 0x024d0000
04 - 0x016a1000
01 - 0x016a0000
01 - 0x025c0000


Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:159744
Addr: 0x024d0000 Size:139264
Addr: 0x024a0000 Size:155648
Addr: 0x024a0000 Size:139264
Addr: 0x016a1000 Size:102400
Addr: 0x016a0000 Size:106496
Addr: 0x024a0000 Size:208896
Addr: 0x024a0000 Size:159744
Addr: 0x025c0000 Size:159744
Addr: 0x024a0000 Size:208896
Addr: 0x024a0000 Size:163840
Addr: 0x024a0000 Size:159744
Addr: 0x016a1000 Size:90112
Addr: 0x024a0000 Size:159744
Addr: 0x016a1000 Size:90112
Addr: 0x016a1000 Size:90112
Addr: 0x024a0000 Size:249856
Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:159744
Addr: 0x024a0000 Size:217088
Addr: 0x024a0000 Size:159744


Kind of interesting.  I should have bkdump version 2 out in the next couple of days. Feel free to shoot me an email, leave a comment or ping me on twitter if you have any questions, comments or job offers :)

reiat.py - Using Data-Flow to Track Dynamically Loaded APIs


NOTE: Hello, I'm looking for a great employer to work remotely doing malware analysis & reverse engineering. I have recently moved back to Colorado to be closer to my family. Sorry but due to the move I will not be willing to relocate. If you work for an excellent company that has competitive pay and has a cool product or service please contact me via email (in the source code below). For a brief synopsis of my background please see the following link. Thanks.

Run-time dynamic linking is commonly seen when reverse engineering malware. It is useful when the programmer wants to load libraries at run time rather than having them resolved through the import table when the executable is loaded. This technique can be used to save memory. A side effect of this approach is that the libraries and the API names will not be in the import table of the portable executable file. There are three steps in run-time dynamic linking. The first step is loading the library into memory, done via LoadLibrary. The second step is to get the address of the exported function via GetProcAddress and the last step is to call it.

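#include <stdio.h>
#include <windows.h>

/* MessageBoxA returns an int, so the function pointer type does too */
typedef int (WINAPI *PGNSI)(HWND, LPCSTR, LPCSTR, UINT);

int main(void)
{
    PGNSI pGNSI;
    /* step 1: load the library into memory */
    HMODULE hdll = LoadLibraryA("User32.dll");
    if (NULL != hdll)
    {
        printf("User32 loaded at %p\n", (void *)hdll);
        /* step 2: get the address of the exported function */
        pGNSI = (PGNSI)GetProcAddress(hdll, "MessageBoxA");
        /* step 3: call it through the pointer */
        if (NULL != pGNSI)
            pGNSI(NULL, "MessageBoxA Was Called", "Yep", 0);
    }
    return 0;
}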
 
 
The above C code loads the library User32.dll, gets the address of MessageBoxA using GetProcAddress and then calls the function. If we were to compile the above C code in Visual Studio and view it in assembly we would get the following code.

Calling Locally Allocated Variables
In this simple example we can see that the return value (eax) of GetProcAddress is saved into [ebp+var_8] and then is called. All the code is contained in a single function and called via local variable. In most situations the API address is stored in a global variable. Take for example the following code.  The main thing to notice is the use of the API address getting passed to a dword.

Calling Global Variables
In the xrefs to dword_1000F1E8 window we can see that we do not have any context as to what the dword is. In order to find what the dword value is we would need to trace back to when the variable was populated (mov dword_1000F1E8, eax). Any malware analyst that has spent time reversing has come across this problem before. The first couple of times we rename all the dwords manually. Later we realize this sucks and we should probably learn scripting in IDA. After that we make a bunch of one-off scripts to populate the value for us. This works well if the pattern is consistent and we want to rename the dwords. What if we didn't know the pattern? What if rather than calling a dword the code was calling a local variable and we wanted to comment it with the API name? Or better yet, what if we didn't want to write another script for renaming exported run-time functions? I have my doubts the latter can be completed but we might as well try.

APIs loaded via run-time dynamic linking can typically be recovered in one of three ways. The first method uses global variables. We first locate where the dword that is being called is the destination (mov dword_00XXXX, eax). From there we trace the source register (eax) back to where it was populated. If the register we are interested in becomes the destination, we then start tracing its source register. This continues until our source register is eax and the previous call is GetProcAddress. Once we have found GetProcAddress we read the second argument (lpProcName) and now we can label or comment the call with the name of the API. An example of this can be seen in the Calling Global Variables image above. The second method is similar to the first except the call is to a local variable rather than a dword. An example of this can be seen in the Calling Locally Allocated Variables image from the MessageBoxA example. The third method is when LoadLibrary and GetProcAddress are handled within a function and the address is passed or returned back to the caller. This method is not always static. Each programmer can decide to implement this in their own way. The programmer could have the function return the address, save the address in a buffer passed as an argument, and so on. This method gets more complicated because we have to reverse the function and understand how the address is being returned or saved off.

In regards to these methods we only care about two things: the second argument of GetProcAddress and where the return value of GetProcAddress is called. Since the returned address is not being calculated or manipulated (xor, mul, add, etc) all we need to do is track when the data is moved around. The process of tracking data being used is called data-flow analysis and is a fundamental concept in compilers. Note: I learned about this concept last week. I found it by asking myself that great question of "What would a person smarter than myself call this?". Wikipedia for the win. Data-flow analysis is an extremely powerful tool in static analysis. We can use it to estimate/infer how data would be used at run time and to trace where data originated from. Okay, time for some examples and code.
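Before getting to IDA, here is the idea on a toy instruction list. This is a simplified sketch that runs outside of IDA and is not reiat.py itself. Instructions are modelled as (mnemonic, operand 0, operand 1) tuples and the function name trace_forward is made up for the example. It starts with eax right after the call to GetProcAddress and follows each mov until the value is either called or saved to a dword.

```python
def trace_forward(instructions, start_reg="eax"):
    """Follow the value in start_reg through mov instructions.

    Returns ("rename", index, dword) when the value is stored in a global,
    ("comment", index, operand) when it is called, or None when it is lost.
    """
    var = start_reg
    for index, (mnem, dst, src) in enumerate(instructions):
        if mnem == "mov" and src == var:
            if dst.startswith("dword_"):
                # Global variable: this is where we would rename the dword.
                return ("rename", index, dst)
            # The tracked value now lives in the destination operand.
            var = dst
            continue
        if mnem == "call" and dst == var:
            # Calling the tracked value: this is where we would add a comment.
            return ("comment", index, var)
        if mnem == "call" and var == "eax":
            # Any other call overwrites eax with its own return value.
            return None
    return None
```

A quick usage example: for the sequence mov [ebp+var_8], eax / push 0 / mov ecx, [ebp+var_8] / call ecx the tracked operand changes from eax to [ebp+var_8] to ecx, and the call at index 3 is the one to comment. Operands are compared as plain strings here, which is also why real disassembly needs more care than this toy.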

reiat.py is a script that will use data-flow to rename or add comments to calls to APIs that were loaded via run-time dynamic linking. If we were to run reiat.py on the MessageBoxA we would see the following comment to where the function is called.
MessageBoxA Comment Added by reiat.py
It will track the use of the variable all the way to the end of the function. One example tracked the API after it had been moved five times and called a hundred bytes away. Personally I think this is pretty cool. Here is the change on the global variables.

dwords renamed to the APIs
One tricky example is when the second argument is pushed before a call to another function such as GetModuleHandleA. This technique is common in Zeus.

Time for an example of where the script fails. If the script fails the output will contain an error message with the API name and location. This should help with tracking it down. Here are two examples.

Odds are I will be continuing working on data-flow analysis in IDA. If you know of any good papers, code or have some comments please send me an email (source code below), ping me on twitter or leave a comment.  I'm still working on parts of the code but I'm hopeful it will be useful to others.

BitBucket Repository - LINK

Please download code from the repo link. New versions will not be updated below.

'''
Name:
        reiat.py

Version:
        0.2

Description:
        renames and adds comments to APIs that are called via run-time dynamic linking in IDA.
        To execute the script just call it in IDA

Author:
        alexander<dot>hanel<at>gmail<dot>com

License:
reiat.py is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see
<http://www.gnu.org/licenses/>.

'''

from idaapi import *
import idautils
import idc

class getProcAddresser():
    def __init__(self):
        self.getProcAddressRefs = []
        self.registers = ['eax', 'ebx', 'ecx', 'edx', 'esi', 'edi', 'esp', 'ebp']

    def getRefs(self):
        'get all addresses of GetProcAddress'
        for addr in CodeRefsTo(LocByName("GetProcAddress"), 0):
            self.getProcAddressRefs.append(addr)

    def getlpProcName(self, GetProcAddress):
        'returns the address of the 2nd argument to GetProcAddress'
        pushcount = 0
        argPlacement = 2
        instructionMax = 10 + argPlacement
        currAddress = PrevHead(GetProcAddress,minea=0)
        while pushcount <= argPlacement and instructionMax != 0:
            if 'push' in GetDisasm(currAddress):
                pushcount += 1
                if pushcount == argPlacement:
                    return currAddress
            if 'GetModuleHandle' in GetDisasm(currAddress):
                    pushcount -= 1
            instructionMax -= 1
            currAddress = PrevHead(currAddress,minea=0)
        return None

    def getString(self, address):
        'reads the string value that is the second push'
        # note it will be useful to include back tracing code for variable reference.
        # operand 0 of the push holds the address of the string;
        # GetString returns None when no C string is found there
        return GetString(GetOperandValue(address, 0), -1, ASCSTR_C)

    def traceBack(self, address):
        funcStart = GetFunctionAttr(address, FUNCATTR_START)
        var = GetOpnd(address, 0)
        # return if digit is being pushed, likely an error on parsing
        if var.isdigit():
            return None
        # return, value is not being passed as a register. already checked
        # if offset in calling function
        if var not in self.registers:
            return None
        # get next address
        currentAddress = PrevHead(address)
        # get dism 
        dism = GetDisasm(currentAddress)
        # until end of function
        # Example:
        # mov ebp, offset aInternetconnec ; "InternetConnectA"
        # push    ebp
        while(currentAddress >= funcStart):
            # var = 'ebp', 
            if var in dism:
                # if operand == ebp, our tracked var is the destination
                if GetOpnd(currentAddress,0) == var:
                    mnem = GetMnem(currentAddress)
                    # if our tracked var is having something moved into it
                    if 'mov' in mnem or 'lea' in mnem:
                        # 4 scenarios on mov: string, digit, register unknown..
                        # 1. Check if destination is a string
                        # read operand 1 value, get address of "offset aInternetconnec"
                        value = GetOperandValue(currentAddress,1)
                        if value != None:
                            api = GetString(value, -1)
                            if api != None:
                                return api
                        # 2. Check if register
                        var = GetOpnd(currentAddress,1)
                        # 3. Check if digit
                        if var.isdigit() == True:
                            return None
                        # 4. Unknown
                        if var == None:
                            return None
                        
            currentAddress = PrevHead(currentAddress)
            dism = GetDisasm(currentAddress)
        return None

    def traceForwardRename(self, address, apiString):
        'address is call GetProcAddress, apiString is the API name'
        currentAddress = NextHead(address)
        funcEnd = GetFunctionAttr(address,  FUNCATTR_END)
        var = 'eax'
        lastref = ''
        lastrefAddress = None
        while currentAddress < funcEnd:
            dism = GetDisasm(currentAddress)
            # if we are not referencing the return from GetProcAddress
            # continue to next instuction
            if var not in dism:
                currentAddress = NextHead(currentAddress)
                continue
            #   mov     dword_1000F224, eax
            #   call    esi ; GetProcAddress
            #   push    offset aHttpaddreque_0 ; "HttpAddRequestHeadersW"
            #   push    dword_1000FD08  ; hModule
            #   mov     dword_1000F228, eax 
            # if we have the above instructions after GetProcAddress the code
            # is saving off the address of HttpAddRequestHeadersW.  
            if GetMnem(currentAddress) == 'mov' and GetOpnd(currentAddress,1) == var and GetOpType(currentAddress,0) == 2:
                # rename dword address
                status = MakeNameEx(GetOperandValue(currentAddress,0), apiString, SN_NOWARN)
                if status == False:
                    # some api names are already in use. Will need to be renamed to something generic.
                    # IDA will typically add a number to the function or api name. GetProcAddress_0
                    status = MakeNameEx(GetOperandValue(currentAddress,0), str("__" + apiString), SN_NOWARN)
                    if status == False:
                        return None
                return currentAddress
            # tracked data is being moved into another destination
            if GetMnem(currentAddress) == 'mov' and GetOpnd(currentAddress,1) == var:
                lastref = var
                lastrefAddress = currentAddress
                var = GetOpnd(currentAddress,0)
            # add comments for call var
            # example:
            # call    ds:GetProcAddress
            # ...
            # call    eax
            if GetMnem(currentAddress) == 'call' and GetOpnd(currentAddress,0) == var:
                cmt = GetFunctionCmt(currentAddress,1)
                if apiString not in cmt:
                    cmt = cmt + ' ' + apiString
                    MakeComm(currentAddress, cmt)
                    return currentAddress
            
            # eax is usually overwritten by the return value of the next call
            if GetMnem(currentAddress) == 'call' and var == 'eax':
                return None
            currentAddress = NextHead(currentAddress)
        return None
    
    def rename(self):
        self.getRefs()
        for addr in self.getProcAddressRefs:
            lpProcNameAddr = self.getlpProcName(addr)
            if lpProcNameAddr == None:
                print "ERROR: Address of lpProcName at %s was not found" % hex(addr)
                continue
            lpProcName =  self.getString(lpProcNameAddr)
            if lpProcName == None:
                lpProcName = self.traceBack(lpProcNameAddr)
            if lpProcName == None:
                print "ERROR: String of lpProcName at %s was not found" % hex(addr)
                continue
            status = self.traceForwardRename(addr, lpProcName)
            if status == None:
                print "ERROR: Could not rename address at %s " % hex(addr)
                continue
            else:
                print "RENAMED %s at %s" % ( lpProcName, hex(status))
 

if __name__ == "__main__":
    ok = getProcAddresser()
    ok.rename()
        

Working with and Detecting Injected Processes

Occasionally it is nice to walk through code in a debugger. I love spending time in IDA but I have found that focusing on 100% static analysis slows down my learning process. If you read this blog you might have noticed that I spend a good amount of time reversing banking malware. Along with portable based document exploit analysis (not vulnerability) I find banking malware fascinating. The reason for this is because of all the different techniques and code that can reside in one executable. Take for example Cridex. It's code contains for XML parsing, form grabbing, monitoring registry traces, code injection and all sorts of interesting features. It's a lot of code to cover and it is easy to get lost. Debugging usually helps me to focus on one path of code execution. Sadly the authors of banking malware do not make debugging their software very easy, especially the injected process. Luckily there are tricks to help with debugging the injected process. Let's do a quick walk through of process injection. The malware will first find the process it wants to inject to, the malware will call OpenProcess, adjust it's token for the correct rights, call VirtualAlloc to allocate memory into the process, write into the process memory using WriteProcessMemory, call CreateRemoteThread and then we have code execution of the injected process. Finding the process to inject into can be done in a couple of ways. The malware can enumerate all processes then inject into selected processes or the malware can hook APIs related to the creation of new processes creation and then inject once those APIs get called. Cridex hooks NtCreateThread to ensure it injects into newly started processes.  A side effect of injecting into the creation of all new processes is the malware injects into all new processes. Let's use this side effect to find the entry point of the injected code of Cridex in Ollydbg. Firstly infect the machine with Cridex. 
Then open up OllyDbg, go to Options > Debugging Options and select System breakpoint. Then open iexplore.exe; the path on Windows XP is C:\Program Files\Internet Explorer\IEXPLORE.EXE. Then click View > Memory or press ALT-M.

Memory of Injected Process
We will notice a block of memory that has Read, Write and Execute rights. If we set a breakpoint (F2) on this block of memory, OllyDbg will break on the entry point of the injected code. Once it breaks, remove the breakpoint.
Start of Injected Code
In the image above we can see string references to LoadLibraryA and GetProcAddress, which are used to rebuild the import table. At this point if we wanted to know where the Cridex executable and registry keys were stored, we could use Search for all referenced text strings in OllyDbg. This technique also works for Citadel but the memory won't be highlighted in red. If we search for a memory block with no owner and RWE access we should be able to find it.
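
The same string hunt can be done outside the debugger once a block has been dumped to disk. Below is a minimal sketch of that idea. The function name `ascii_strings` and the sample bytes are made up for illustration, and this is not how OllyDbg finds referenced strings; it only pulls printable ASCII runs out of a buffer.

```python
import re

def ascii_strings(data, min_len=4):
    """Return printable ASCII runs of at least min_len bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical bytes standing in for the start of an injected block.
sample = b"\x90\x90LoadLibraryA\x00\x00GetProcAddress\x00\xcc\xccab\x00"
found = ascii_strings(sample)
print(found)
```

For a real dump we would read the .bin file in binary mode and pass its contents to `ascii_strings`.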

Disclaimer Time:
Horrible C code is ahead. I have not coded in C in years and even then it was not much.  I should probably remove that skill from my resume. I tried to include all links or references I used. The following code was for me to learn and get back into C. Nothing original. More of me just sketching out ideas. I would not apply these techniques for incident response. It would be much much better to use Volatility for something like this.

If we were to use a tool like Process Explorer we would not see the malware running on the machine. A lot of the time the watcher process is running in the memory space of explorer.exe. Usually the malware wants to inject into iexplore.exe, firefox.exe or chrome.exe. If we wanted to detect or dump the malware, the best choice would be to help the malware out by starting a browser. We then enumerate all memory sections with a protection of RWE. If we find a block of memory with this protection we dump it to a file. After that we close down the browser and exit.
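
The filtering step is simple enough to sketch in a few lines of Python. This is only a model of the decision, not the tool itself: the `Region` record is a hypothetical stand-in for the MEMORY_BASIC_INFORMATION fields that VirtualQueryEx fills in, and the addresses are invented. The constants are the values from winnt.h.

```python
from collections import namedtuple

# Constants from the Windows SDK (winnt.h).
MEM_COMMIT = 0x1000
PAGE_EXECUTE_READWRITE = 0x40
# PAGE_GUARD | PAGE_NOCACHE | PAGE_WRITECOMBINE can be OR'ed onto a protection.
PAGE_MODIFIERS = 0x100 | 0x200 | 0x400

# Hypothetical record holding the fields we care about.
Region = namedtuple("Region", "base size state protect")

def is_rwx(region):
    """True for committed memory whose base protection is exactly RWX."""
    base_protect = region.protect & ~PAGE_MODIFIERS
    return region.state == MEM_COMMIT and base_protect == PAGE_EXECUTE_READWRITE

def suspicious(regions):
    """Keep only the regions worth dumping."""
    return [r for r in regions if is_rwx(r)]

# Invented memory map for illustration.
regions = [
    Region(0x00010000, 0x1000, MEM_COMMIT, 0x04),   # PAGE_READWRITE
    Region(0x00140000, 0x9000, MEM_COMMIT, 0x40),   # RWX
    Region(0x00150000, 0x1000, 0x2000, 0x00),       # MEM_RESERVE, not committed
    Region(0x00160000, 0x1000, MEM_COMMIT, 0x240),  # RWX | PAGE_NOCACHE
    Region(0x00400000, 0x1000, MEM_COMMIT, 0x20),   # PAGE_EXECUTE_READ
]
hits = suspicious(regions)
for r in hits:
    print("0x%08x.bin size=%d" % (r.base, r.size))
```

Checking the state as well as the protection skips reserved or free ranges, which have nothing to read.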

Let's walk through that process. We can get the path of iexplore.exe on Windows XP and Windows 7 by reading the default value of the registry key HKEY_CLASSES_ROOT\applications\iexplore.exe\shell\open\command. Once we have the file path we open iexplore.exe in debug mode using CreateProcessA. This will ensure that iexplore.exe doesn't start messing with the memory protections. At this point the malware will see the new process and inject into it. We can now use VirtualQueryEx to query the memory protection. If the protection is PAGE_EXECUTE_READWRITE we use ReadProcessMemory to read the contents and finally dump the contents to a file named after the memory address. What would this look like on a machine infected with Citadel and Cridex?

Scan results on a machine infected with Cridex and Citadel
Cool. Now we have dumps of the files and we can do whatever we usually do with memory dumps. Below is the code. Warning: it's only a POC. Which has only been tested on Windows XP. Which was compiled with Microsoft Visual C++ 2010. Which has dependencies. And it's a debug build. I know, it just keeps getting worse. Please do not download code from my blog. I don't update the code here, only on the repo. The repo contains an executable in the bin folder. The needed dependencies are also included in a zip.

BitBucket Repo - LINK


    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <tchar.h>
    #include <string.h>
    /*
    Name:
        bkdump.exe
    Version:
        0.1
    Description:
        Starts iexplore.exe and then dumps any memory that has RWE rights. POC only tested on Win XP.
    Author:
        alexander<dot>hanel<at>gmail<dot>com
    License:
    bkdump is free software: you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
    General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program. If not, see
    <http://www.gnu.org/licenses/>.
    Notes:
        * Best technique for finding dynamically loaded dlls in a process space?
    To Do:
    Useful Links:
        http://www.youtube.com/watch?NR=1&v=lwFIC7It3Fc&feature=endscreen <- most of the code came from here. Awesome Video!
        http://www.catch22.net/tuts/undocumented-createprocess
        http://www.blizzhackers.cc/viewtopic.php?p=2483118
        http://cboard.cprogramming.com/windows-programming/102965-help-mbi-baseaddress-loop.html
    */
    typedef struct _MEMBLOCK
    {
        HANDLE hProc;
        unsigned char *addr;
        int size;
        unsigned char *buffer;
        struct _MEMBLOCK *next;
        
    } MEMBLOCK;
    MEMBLOCK* create_memblock (HANDLE hProc, MEMORY_BASIC_INFORMATION *meminfo)
    {    // used to create the memblock
        MEMBLOCK *mb = malloc(sizeof(MEMBLOCK));
        if (mb)
        {
            mb->hProc = hProc;
            mb->addr = meminfo->BaseAddress;
            mb->size = meminfo->RegionSize;
            mb->buffer = malloc(meminfo->RegionSize);
            mb->next = NULL;
        }
        return mb;
    }
    void free_memblock (MEMBLOCK *mb)
    {
        if (mb)
        {
            if (mb->buffer)
            {
                free (mb->buffer);
            }
            free (mb);
        }
    }
    unsigned int peek (HANDLE hProc, int data_size, unsigned int addr)
    {
        unsigned int val = 0;
        if (ReadProcessMemory (hProc, (void*)addr, &val, data_size, NULL) == 0)
        {
            printf ("peek failed\r\n");
        }
        return val;
    }
    char * getIePath()
    {
        // example of reading registry http://www.codersource.net/Win32/Win32Registry/RegistryOperationsusingWin32.aspx
        // static so the returned pointer is still valid after the function returns
        static char lszValue[MAX_PATH];
        // the value starts with a quote, so the path begins one char in. I miss python
        char *newValue = lszValue + 1;
        char *strkey;
        HKEY hKey;
        LONG returnStatus;
        DWORD dwType = REG_SZ;
        DWORD dwSize = sizeof(lszValue) - 1; // leave room for a terminating NUL
        ZeroMemory(lszValue, sizeof(lszValue));
        returnStatus = RegOpenKeyExA(HKEY_CLASSES_ROOT, "applications\\iexplore.exe\\shell\\open\\command", 0L, KEY_READ, &hKey);
        if (returnStatus != ERROR_SUCCESS)
        {
            // the key was never opened, so there is nothing to close
            printf("ERROR: Registry IE Path not Found");
            return NULL;
        }
        returnStatus = RegQueryValueExA(hKey, NULL, NULL, &dwType, (LPBYTE)lszValue, &dwSize);
        RegCloseKey(hKey);
        if (returnStatus != ERROR_SUCCESS)
        {
            printf("ERROR: Registry IE Path not Found");
            return NULL;
        }
        // cut the string at the closing quote, which sits two chars before "%1"
        if ((strkey = strstr(lszValue, "%1")) != NULL && strkey - lszValue >= 2)
            *(strkey - 2) = '\0';
        printf("iexplore.exe path is %s", newValue);
        return newValue;
    }
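    MEMBLOCK* create_scan ( unsigned int pid)
    {
        // pid is unused in this version, the scan always targets a dummy iexplore.exe
        char path[MAX_PATH];
        char *iePath;
        MEMBLOCK *mb_list = NULL;
        MEMORY_BASIC_INFORMATION meminfo;
        unsigned char *addr = 0;
        STARTUPINFOA si; // the A variant matches CreateProcessA
        PROCESS_INFORMATION pi;
        ZeroMemory(&si, sizeof(si));
        ZeroMemory(&pi, sizeof(pi)); // so pi.hProcess is NULL if CreateProcessA fails
        si.cb = sizeof(si);
        iePath = getIePath();
        if (iePath == NULL)
            return NULL;
        strcpy(path, iePath);
        if(!CreateProcessA(path , NULL, NULL, NULL, FALSE, DEBUG_PROCESS, NULL, NULL, &si, &pi))
        {
            printf("\nSorry! Broke on CreateProcess()\n\n");
            return NULL;
        }
        printf("\nDummy Process Started");
        CloseHandle(pi.hThread); // only the process handle is needed
        while (1)
        {
            if (VirtualQueryEx(pi.hProcess, addr, &meminfo, sizeof(meminfo)) == 0)
            { // query addresses, walks all memory including non-committed
                break;
            }
            // only committed RWX memory is worth dumping
            if (meminfo.State == MEM_COMMIT && (meminfo.Protect & PAGE_EXECUTE_READWRITE))
            {
                MEMBLOCK *mb = create_memblock (pi.hProcess, &meminfo);
                if (mb)
                {
                    mb->next = mb_list;
                    mb_list = mb;
                }
            }
            addr = ( unsigned char*)meminfo.BaseAddress + meminfo.RegionSize;
        }
        if (mb_list == NULL)
            CloseHandle(pi.hProcess); // nothing found, so free_scan will never close it
        return mb_list;
    }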
    void free_scan (MEMBLOCK *mb_list)
    {
        if (mb_list)
            CloseHandle(mb_list->hProc); // every block shares the same process handle
        while ( mb_list)
        {
            MEMBLOCK *mb = mb_list;
            mb_list = mb_list->next;
            free_memblock (mb);
        }
    }
    void dump_scan_info ( MEMBLOCK *mb_list)
    {
        MEMBLOCK *mb = mb_list;
        while (mb)
        {
            char *buffer = (char*) malloc(mb->size);
            FILE *fp;
            char filename[32];
            SIZE_T bytesRead = 0;
            // the casts assume a 32-bit build, which is all this POC was tested on
            sprintf(filename, "0x%08x.bin", (unsigned int)mb->addr);
            printf ("\nSuspicious Memory Block:\nAddr: 0x%08x Size:%d\r\n", (unsigned int)mb->addr, mb->size);
            if (buffer && ReadProcessMemory(mb->hProc,(void*)mb->addr, buffer, mb->size, &bytesRead) != 0)
            {
                fp=fopen(filename, "wb");
                if (fp)
                {
                    printf ("Dumping Memory at 0x%08x", (unsigned int)mb->addr);
                    fwrite(buffer,1, bytesRead, fp);
                    fclose(fp);
                }
                else
                    printf("Error Could Not Open %s", filename);
            }
            else
                printf("Error Could Not Dump Memory");
            free(buffer); // free(NULL) is a no-op
            mb = mb->next;
        }
    }
    int main(int argc, char *argv[])
    {
        
        MEMBLOCK *scan = create_scan(0);
        if (scan)
        {
            dump_scan_info (scan);
            free_scan (scan);
        }
        /*
        
        */
        return 0;
    }
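
The string clean-up in getIePath is the part I missed Python for. Here is a rough sketch of the same step in Python. It assumes the registry value looks like a quoted path followed by %1, which is what the C code above expects. The function name `exe_path_from_command` and the sample value are made up for illustration.

```python
def exe_path_from_command(value):
    """Pull the executable path out of a shell open command string."""
    value = value.strip()
    if value.startswith('"'):
        # Quoted path: take everything up to the closing quote.
        end = value.find('"', 1)
        if end == -1:
            return None
        return value[1:end]
    # Unquoted: assume the path runs up to the first space.
    return value.split(" ", 1)[0] or None

# Assumed form of the default value under shell\open\command.
sample = r'"C:\Program Files\Internet Explorer\iexplore.exe" %1'
print(exe_path_from_command(sample))
```

The unquoted branch is a guess that breaks on paths containing spaces, so treat it as a fallback only.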

pe-carv.py - ASCII Hex and Overlays

Last month there was an interesting forum discussion on OpenRCE about carving an ASCII hex encoded executable and converting it to binary.
ASCII HEX Encoded Executable
We can see the start of the MZ header at offset 0x16. 'M' = 0x4d, 'Z' = 0x5a. This encoding technique is commonly used by malicious Microsoft Office documents (Excel, PowerPoint, etc.) for encoding an executable payload. It can be a little annoying because we have to manually carve out the data and then convert it from ASCII hex to binary. Along with a couple of other options, pe-carv.py has been updated to help with carving out executables encoded in the ASCII hex format.
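
To make the encoding concrete, here is a small sketch of finding an ASCII hex MZ header and converting the hex run back to bytes. It is not the code from pe-carv.py. The function names `find_hex_mz` and `hex_to_bin` and the sample blob are invented for this example.

```python
import binascii
import re

def find_hex_mz(blob):
    """Offsets where the ASCII text '4d5a' (any case) appears in blob."""
    return [m.start() for m in re.finditer(rb"4d5a", blob, re.IGNORECASE)]

def hex_to_bin(blob, offset):
    """Decode the run of hex digits starting at offset into raw bytes."""
    match = re.match(rb"[0-9a-fA-F]+", blob[offset:])
    if match is None:
        return b""
    run = match.group()
    if len(run) % 2:
        run = run[:-1]  # drop a dangling half byte
    return binascii.unhexlify(run)

# Invented blob: some binary junk followed by an ASCII hex encoded header.
blob = b"\x00\x01junk:4D5A90000300\xff"
offsets = find_hex_mz(blob)
decoded = hex_to_bin(blob, offsets[0])
print(offsets, decoded)
```

A match on the text 4d5a is only a candidate. The decoded bytes still need to be validated as a PE file before they are written out.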

pe-carv.py
Usage: pe-carv.py [options] <carving.file>

Options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output=OUTPUT
                        output file name
  -a, --ascii_blob      read as hex ascii blob
  -v, --verbose         print MZ location
  -l, --overlay         get overlay, default 1024 bytes
  -s SIZE, --size=SIZE  size of overlay

Quick overview. -o, --output can be used to name the output file. -a, --ascii_blob converts the whole file from ASCII hex to binary (if not ascii hex the byte will be ' '). -v, --verbose prints the offset where each executable was found. -l, --overlay reads 1024 bytes past what pefile reports as the file size. -s, --size is used along with --overlay to define a custom size. For example, let's say we have a binary file that contains an ASCII hex executable, an embedded executable and then another ASCII hex executable. Different encodings have to be handled separately.

pe-carv.py -o hello -v -a -l -s 10 test.bin
        * exe found at offset 0xb
        * exe found at offset 0x8ed83
dir

03/09/2013  10:52 PM    <DIR>          .
03/09/2013  10:52 PM    <DIR>          ..
03/09/2013  10:52 PM           525,834 hello-1.exe
03/09/2013  10:52 PM           525,834 hello-2.exe
03/09/2013  10:17 PM             5,483 pe-carv.py
03/09/2013  10:17 PM         2,270,982 test.bin

-o hello is the output file name. A count is added in case there are multiple files. -v is for verbose, to show where the files are located. -a is for treating the file as an ASCII blob. -l gets the overlay, and -s 10 sets its size to 10 bytes (an integer). The last argument is the file name. The -a option is a little slower than the other options because the file is read and converted two bytes at a time. Not the most efficient way. Now for a simple example of grabbing an embedded executable with no options.

pe-carv.py test.bin
dir

03/09/2013  10:57 PM    <DIR>          .
03/09/2013  10:57 PM    <DIR>          ..
03/09/2013  10:57 PM            69,120 1.exe
03/09/2013  10:17 PM             5,483 pe-carv.py
03/09/2013  10:17 PM         2,270,982 test.bin
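
For anyone curious where an overlay starts, below is a stdlib-only sketch. It is not what pe-carv.py does, since the script relies on pefile for the file size. Instead it walks the section table by hand and treats the end of the last section's raw data as the end of the executable. The function names `pe_end_offset` and `read_overlay` are hypothetical.

```python
import struct

def pe_end_offset(data, mz=0):
    """Return the offset just past the last section's raw data, or None."""
    if len(data) < mz + 0x40 or data[mz:mz + 2] != b"MZ":
        return None
    # e_lfanew at 0x3C points from the MZ header to the PE signature.
    e_lfanew = struct.unpack_from("<I", data, mz + 0x3C)[0]
    pe = mz + e_lfanew
    if len(data) < pe + 24 or data[pe:pe + 4] != b"PE\x00\x00":
        return None
    num_sections = struct.unpack_from("<H", data, pe + 6)[0]
    opt_size = struct.unpack_from("<H", data, pe + 20)[0]
    # The section table follows the optional header; each entry is 40 bytes.
    table = pe + 24 + opt_size
    if len(data) < table + 40 * num_sections:
        return None
    end = table + 40 * num_sections
    for i in range(num_sections):
        # SizeOfRawData and PointerToRawData sit at +16 and +20 in an entry.
        raw_size, raw_ptr = struct.unpack_from("<II", data, table + 40 * i + 16)
        if raw_size:
            end = max(end, mz + raw_ptr + raw_size)
    return end

def read_overlay(data, mz=0, size=1024):
    """Return up to size bytes that follow the executable found at mz."""
    end = pe_end_offset(data, mz)
    if end is None:
        return b""
    return data[end:end + size]
```

Calling `read_overlay(data, mz=0xb, size=10)` would roughly mirror the -l -s 10 run above for an executable found at offset 0xb, assuming the data has already been converted to binary.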

 If there are any suggestions or feedback please shoot me an email. My email address is in the code. Please download the code from the repo.

BitBucket Repo - LINK

Thank you gN3mes1s, legola, anonymouse and others on OpenRCE.