A couple of days ago the EFF published a blog post about malware used by the Syrian government. The malware is a common remote access trojan (RAT) named DarkComet. I didn't care to analyze the RAT itself; I was more interested in the steps the attackers took to obfuscate it. Plus, out of morbid curiosity, I wanted to know whether I could write a script to extract the C2. For a more detailed analysis of DarkComet I recommend the following link by @quequero. This post won't cover indicators of compromise or the behavior of the malware; those can be found in the provided links. Instead, I will cover the steps needed to carve out the files, decode them, and then something kind of random to extract the C2.

The original file name was "اسماء بعض المسلحين في سورية والخارج المطلوبين لدى النظام السوري2012_m-fdp.scr,". The VirusTotal results can be found here. The executable has a PDF icon to trick the user into running it. It is a RAR self-extracting installer, so rather than executing the file we can use 7-Zip to extract the embedded files.

Three files were extracted. StartMenu.dll is a blob of binary data, system.exe is a UPX-packed executable, and the PDF is a valid PDF. system.exe is too small to be a remote access trojan, so it's safe to assume that it decrypts and executes StartMenu.dll. Might as well try the standard upx -d on system.exe first: success. Once system.exe is unpacked, PEiD identifies it as written in Microsoft Visual Basic. Opening it in IDA shows the typical ThunRTMain, the usual VB crap functions, and two non-standard functions. One is related to loading a DLL, while the second contains a bunch of bit-shift and other bitwise operations. It's safe to assume the second one is our decryption routine for StartMenu.dll. The assembly and my notes can be seen below.

.text:00403094
.text:00403094 ; =============== S U B R O U T I N E =======================================
.text:00403094
.text:00403094 ; Attributes: bp-based frame
.text:00403094
.text:00403094 _decrypt proc near ; CODE XREF: sub_404417+9Ep
.text:00403094
.text:00403094 cp_sizeOfBuffer = dword ptr -60h
.text:00403094 cp_keylen = dword ptr -5Ch
.text:00403094 var_58 = dword ptr -58h
.text:00403094 var_54 = dword ptr -54h
.text:00403094 var_40 = dword ptr -40h
.text:00403094 var_38 = dword ptr -38h
.text:00403094 var_30 = byte ptr -30h
.text:00403094 keylen = dword ptr -2Ch
.text:00403094 count = dword ptr -28h
.text:00403094 sizeOfBuffer = dword ptr -24h
.text:00403094 _key? = dword ptr -20h
.text:00403094 _key_index = dword ptr -1Ch
.text:00403094 var_18 = byte ptr -18h
.text:00403094 index = dword ptr -14h
.text:00403094 var_10 = dword ptr -10h
.text:00403094 var_8 = dword ptr -8
.text:00403094 var_4 = dword ptr -4
.text:00403094 addr = dword ptr 8
.text:00403094 arg_4 = dword ptr 0Ch
.text:00403094
.text:00403094     push ebp
.text:00403095     mov ebp, esp
.text:00403097     push ecx
.text:00403098     push ecx
.text:00403099     push offset __vbaExceptHandler
.text:0040309E     mov eax, large fs:0
.text:004030A4     push eax
.text:004030A5     mov large fs:0, esp
.text:004030AC     push 54h
.text:004030AE     pop eax
.text:004030AF     call __vbaChkstk
.text:004030B4     push ebx
.text:004030B5     push esi
.text:004030B6     push edi
.text:004030B7     mov [ebp+var_8], esp
.text:004030BA     mov [ebp+var_4], offset dword_4010F8
.text:004030C1     mov edx, [ebp+arg_4]
.text:004030C4     lea ecx, [ebp+_key?]
.text:004030C7     call __vbaStrCopy
.text:004030CC     push [ebp+_key?]
.text:004030CF     call __vbaLenBstr
.text:004030D4     mov [ebp+keylen], eax
.text:004030D7     mov eax, [ebp+addr]
.text:004030DA     push dword ptr [eax]
.text:004030DC     push 1
.text:004030DE     call __vbaLbound
.text:004030E3     mov [ebp+count], eax
.text:004030E6     mov eax, [ebp+addr]
.text:004030E9     push dword ptr [eax]
.text:004030EB     push 1
.text:004030ED     call __vbaUbound ; get size of the data
.text:004030F2     mov [ebp+sizeOfBuffer], eax
.text:004030F5     mov eax, [ebp+keylen]
.text:004030F8     mov [ebp+var_58], eax
.text:004030FB     mov [ebp+var_54], 1
.text:00403102     mov [ebp+_key_index], 1
.text:00403109     jmp short loc_403114
.text:0040310B ; ---------------------------------------------------------------------------
.text:0040310B
.text:0040310B _inter_key: ; CODE XREF: _decrypt+149j
.text:0040310B     mov eax, [ebp+_key_index]
.text:0040310E     add eax, [ebp+var_54] ; increment by 1
.text:00403111     mov [ebp+_key_index], eax
.text:00403114
.text:00403114 loc_403114: ; CODE XREF: _decrypt+75j
.text:00403114     mov eax, [ebp+_key_index]
.text:00403117     cmp eax, [ebp+var_58] ; length of the key
.text:0040311A     jg loc_4031E2
.text:00403120     mov [ebp+var_38], 1
.text:00403127     mov [ebp+var_40], 2
.text:0040312E     lea eax, [ebp+var_40]
.text:00403131     push eax
.text:00403132     push [ebp+_key_index]
.text:00403135     push [ebp+_key?] ; "E8rBCjUuQrmWX6h2fqPfZtLeWp2sAbzCwadHxIG0cOHHRmB11d19lrTg0SYI"
.text:00403138     call rtcMidCharBstr
.text:0040313D     mov edx, eax
.text:0040313F     lea ecx, [ebp+var_30]
.text:00403142     call __vbaStrMove
.text:00403147     push eax
.text:00403148     call rtcAnsiValueBstr ; convert string to ascii
.text:0040314D     xor ax, 0FFh ; not an and...
.text:00403151     mov [ebp+var_18], al
.text:00403154     lea ecx, [ebp+var_30]
.text:00403157     call __vbaFreeStr
.text:0040315C     lea ecx, [ebp+var_40]
.text:0040315F     call __vbaFreeVar
.text:00403164     mov eax, [ebp+sizeOfBuffer]
.text:00403167     mov [ebp+cp_sizeOfBuffer], eax ; size
.text:0040316A     mov eax, [ebp+keylen]
.text:0040316D     mov [ebp+cp_keylen], eax
.text:00403170     mov eax, [ebp+count]
.text:00403173     mov [ebp+index], eax ; label incorrect..but applicable later
.text:00403176     jmp short loc_403181
.text:00403178 ; ---------------------------------------------------------------------------
.text:00403178
.text:00403178 _loop: ; CODE XREF: _decrypt+140j
.text:00403178     mov eax, [ebp+index]
.text:0040317B     add eax, [ebp+cp_keylen]
.text:0040317E     mov [ebp+index], eax ; cur_byte = data_index+len(key)
.text:00403181
.text:00403181 loc_403181: ; CODE XREF: _decrypt+E2j
.text:00403181     mov eax, [ebp+cp_keylen]
.text:00403184     sar eax, 31
.text:00403187     xor eax, [ebp+index]
.text:0040318A     mov ecx, [ebp+cp_keylen]
.text:0040318D     sar ecx, 31
.text:00403190     xor ecx, [ebp+cp_sizeOfBuffer]
.text:00403193     cmp eax, ecx ; eax = address start of buffer
.text:00403195     jg short break_out ; if cur_byte > sizeofBuffer: JMP
.text:00403197     mov eax, [ebp+addr]
.text:0040319A     mov eax, [eax] ; eax = address of buffer
.text:0040319C     mov ecx, [ebp+addr]
.text:0040319F     mov ecx, [ecx] ; ecx = address of buffer
.text:004031A1     mov edx, [ebp+index] ; edx = index ...[ecx+14] = 0
.text:004031A4     sub edx, [ecx+14h] ; edx = index
.text:004031A7     mov eax, [eax+0Ch] ; eax = address of data buffer
.text:004031AA     mov al, [eax+edx] ; al = [bufferaddr+index]
.text:004031AD     xor al, [ebp+var_18]
.text:004031B0     movzx eax, al
.text:004031B3     mov ecx, [ebp+index]
.text:004031B6     and ecx, 0FFh ; ecx = index & 0xff
.text:004031BC     xor eax, ecx ; _
.text:004031BC ; Summary:
.text:004031BC ; (buffer[buff+index] ^ (count%len(key) ^ 0xff)) ^ (index & 0xff)
.text:004031BE     mov ecx, [ebp+addr]
.text:004031C1     mov ecx, [ecx]
.text:004031C3     mov edx, [ebp+addr]
.text:004031C6     mov edx, [edx]
.text:004031C8     mov esi, [ebp+index]
.text:004031CB     sub esi, [edx+14h]
.text:004031CE     mov ecx, [ecx+0Ch] ; pointer address
.text:004031D1     mov [ecx+esi], al ; save off byte, ecx = addr, esi = index
.text:004031D4     jmp short _loop
.text:004031D6 ; ---------------------------------------------------------------------------
.text:004031D6
.text:004031D6 break_out: ; CODE XREF: _decrypt+101j
.text:004031D6     mov eax, [ebp+count]
.text:004031D9     inc eax
.text:004031DA     mov [ebp+count], eax
.text:004031DD     jmp _inter_key
.text:004031E2 ; ---------------------------------------------------------------------------
.text:004031E2
.text:004031E2 loc_4031E2: ; CODE XREF: _decrypt+86j
.text:004031E2     push offset loc_403203
.text:004031E7     jmp short loc_4031FA
.text:004031E9 ; ---------------------------------------------------------------------------
.text:004031E9
.text:004031E9 loc_4031E9: ; DATA XREF: .text:00401104o
.text:004031E9     lea ecx, [ebp+var_30]
.text:004031EC     call __vbaFreeStr
.text:004031F1     lea ecx, [ebp+var_40]
.text:004031F4     call __vbaFreeVar
.text:004031F9     retn
.text:004031FA ; ---------------------------------------------------------------------------
.text:004031FA
.text:004031FA loc_4031FA: ; CODE XREF: _decrypt+153j
.text:004031FA ; DATA XREF: .text:00401100o
.text:004031FA     lea ecx, [ebp+_key?]
.text:004031FD     call __vbaFreeStr
.text:00403202     retn
.text:00403203 ; ---------------------------------------------------------------------------
.text:00403203
.text:00403203 loc_403203: ; DATA XREF: _decrypt:loc_4031E2o
.text:00403203     mov ecx, [ebp+var_10]
.text:00403206     mov large fs:0, ecx
.text:0040320D     pop edi
.text:0040320E     pop esi
.text:0040320F     pop ebx
.text:00403210     leave
.text:00403211     retn 8
.text:00403211 _decrypt endp ; sp-analysis failed
.text:00403211
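
The loop structure in the listing is a little unusual: the outer loop walks the key one character at a time, and the inner loop visits every len(key)-th byte of the buffer. A minimal Python 3 sketch of that shape follows. It assumes the array's lower bound is 0, and the function names are mine, not anything from the binary. The second function is the flat per-byte form of the same arithmetic, included only to show the two agree.

```python
def decrypt_strided(buf, key):
    """Mirror the nested loops of _decrypt (assumes LBound == 0)."""
    out = bytearray(buf)
    keylen = len(key)
    start = 0                                   # 'count' in the listing
    for ch in key:                              # outer loop: _inter_key
        k = ord(ch) ^ 0xFF                      # xor ax, 0FFh
        for index in range(start, len(out), keylen):   # inner loop: _loop
            out[index] = out[index] ^ k ^ (index & 0xFF)
        start += 1                              # break_out: inc count
    return bytes(out)


def decrypt_linear(buf, key):
    """Same transform, one pass over the buffer."""
    return bytes(
        b ^ (ord(key[i % len(key)]) ^ 0xFF) ^ (i & 0xFF)
        for i, b in enumerate(buf)
    )
```

Because every step is an XOR, running either function twice with the same key gives back the original bytes.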

The next step is to create a decoder in Python. A quick summary of the code above: we have a blob of data and a password of "E8rBCjUuQrmWX6h2fqPfZtLeWp2sAbzCwadHxIG0cOHHRmB11d19lrTg0SYI". Each byte of the blob is XORed with a key byte (itself XORed with 0xFF) and then with the low byte of its position (index AND 0xFF). Sorry for the brevity, but describing code isn't my strong point. The Python code below should help illustrate how it's decoded.
import sys
from StringIO import StringIO
if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    
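def decode(data, key):
    'XOR-decode a file-like object (or a str) and write the result to stdout'
    if type(data) is str:
        # wrap raw strings so they can be read like a file
        data = StringIO(data)
    l = len(key)
    idx = 0
    byte = data.read(1)
    while byte:
        # key byte for this position, inverted
        k = ord(key[idx % l]) ^ 0xff
        stage1 = ord(byte) ^ k
        # mix in the low byte of the position
        stage2 = stage1 ^ (idx & 0xff)
        sys.stdout.write(chr(stage2))
        idx += 1
        byte = data.read(1)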
                   
k = "E8rBCjUuQrmWX6h2fqPfZtLeWp2sAbzCwadHxIG0cOHHRmB11d19lrTg0SYI"
f = open(sys.argv[1],'rb')
decode(f,k)
f.close()
Note: the password can be arbitrarily chosen by the attacker. The script needs StartMenu.dll passed as its argument, with its output redirected to a file.
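
A quick sanity check on the decoder output is to confirm that it parses as a PE file and to list its section names. UPX-packed files typically carry sections named UPX0 and UPX1. The following is a rough Python 3 sketch, not part of the original tooling; the helper names are made up for illustration.

```python
import struct


def pe_section_names(data):
    """Return the section names of a PE image, or [] if it does not parse."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    try:
        (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
        if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
            return []
        # COFF header starts right after the 4-byte signature
        (num_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
        (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    except struct.error:
        return []
    table = e_lfanew + 24 + opt_size            # start of the section table
    names = []
    for i in range(num_sections):
        raw = data[table + 40 * i: table + 40 * i + 8]   # 8-byte name field
        names.append(raw.rstrip(b"\x00").decode("ascii", "replace"))
    return names


def looks_upx_packed(data):
    """Heuristic only: any section whose name starts with 'UPX'."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

Section names are trivial to rename, so treat a negative result as "unknown" rather than "not packed".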


As seen in the image, decoded.exe is packed and can be unpacked using UPX. A quick upx -d decoded.exe and now we have the original executable. A quick view of the strings reveals no IP. We could sit down and sort through the 3k functions trying to find the C2 routine, but we already know the C2 is an IP. It's going to be loaded into memory at some point after execution, so let's cheat. What's the easiest way to get the C2? Yep: double click, dump, strings. The double Ds.

Let's implement a generic approach. We have an unpacked exe, so we grab some strings, create a Yara signature, run the executable, and scan all running processes. If the signature hits, we scan that process's memory for an IP using a regular expression.
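
For the "grab some strings" step, any strings utility will do. As a rough Python 3 stand-in (the function name and the minimum length of 6 are my own choices), printable ASCII runs can be pulled out of a binary with a single regular expression:

```python
import re


def ascii_strings(data, min_len=6):
    """Return printable ASCII runs of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]
```

Something like ascii_strings(open("decoded.exe", "rb").read()) then gives a list to skim for distinctive signature candidates.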

Here are some nice dumb strings for the yara signature. 

rule DarkComet : RAT
{
    strings: 
        $a = "#botCommand%Respond"
    condition:
        $a 
}
Now we just need a tool to dump the memory of each process. For this task we can use mdmp by Vlad Ioan Topan. The Python bindings for mdmp are super simple. The following code enumerates all processes using psutil, dumps them to a buffer, scans them with Yara, and runs a regex against the memory for the pattern of an IP. Sure, we could monitor the network traffic or something similar; Scapy would be a good solution. The pedants will probably point out issues around speed and processing cycles... which is correct, but this is a simple POC.
# Created by alexander<dot>hanel<at>gmail<dot>com
# License: Free game, do whatever you wish with it. I don't care. 
import imp
import sys
import os
import re
import socket
sys.path.append(os.getcwd())
import warnings
# ignore the warning from psutil 
warnings.filterwarnings("ignore", category=DeprecationWarning)

# the following modules will need to be installed.
try:
    imp.find_module('yara')
    import yara
    imp.find_module('pymdmp')
    import pymdmp
    imp.find_module('psutil')
    import psutil
except ImportError as error :
    print '[IMPORT ERROR] %s - aborting' % error
    sys.exit()
    
class POORSCAN():
    def __init__(self):
        self.error = False
        self.importTest()
        self.ip = []
        self.psList = psutil.get_process_list()
        # add white listed processes here 
        self.whitelist = ['System', 'system',  'python']
        self.ps = self.removeWhiteListedPS()
        
    def importTest(self):
        'validate non-standard libraries are installed'
        try:
            imp.find_module('yara')
            import yara
            imp.find_module('pymdmp')
            import pymdmp
            imp.find_module('psutil')
            import psutil
        except ImportError as error :
            print '[IMPORT ERROR] %s - aborting' % error
            # flag the failure so callers can check self.error
            self.error = True
            return


    def removeWhiteListedPS(self):
        'remove whitelisted processes'
        tmp = []
        for x in self.psList:
            b = True
            for w in self.whitelist:
                # compare case-insensitively on both sides
                if w.lower() in x.name.lower():
                    b = False
            if b:
                tmp.append(x)
        return tmp

    def yaraScan(self,dump):
        'scan section memory using yara'
        m = ''
        try:
            r = yara.compile(r'rules.yara')
        except Exception:
            print '\t[ERROR] Yara compile error - aborting scan'
            return None
        try:
            m = r.match(data=dump)
        except Exception:
            # treat a failed match as no match
            pass
        return m

    def checkforIP(self,data):
        'simple regex search for an ip'
        # Using socket to check for valid ips saves time over using a complicated regex 
        ip = re.compile(r'[0-9]+(?:\.[0-9]+){3}')
        match = ip.findall(str(data))
        for x in match:
            try:
                socket.inet_aton(x)
            except socket.error:
                continue
            self.ip.append(x)

    def scanMem(self):
        print 'Scanning Started...'
        for x in self.ps:
            print 'Scanning Process: %s' % x.name
            # probably should add some error stuff here..not very speedy either...
            dumped = pymdmp.dump(pymdmp.SEL_BY_PID, pymdmp.DUMP_ALL_MEM, 0, processID=x.pid)
            hit = False
            for section in dumped:
                match = self.yaraScan(section[4])
                if match:
                    hit = True
                    print "MATCH: in %s pid %s" % (x.name, x.pid)
            if hit:
                for section in dumped:
                    self.checkforIP(section[4])
                self.ip.sort()
                print 'Found IPs:'
                for i in self.ip:
                    print i
            self.ip = []
        print "Scanning Complete..."
        
if __name__ == '__main__':
    x = POORSCAN()
    x.scanMem()

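Note: if we were concerned about multi-byte or Unicode characters we would need to change the regular expression, for example with the re.UNICODE option or with a pattern that tolerates the null bytes between characters in UTF-16 strings.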
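
To make the note above concrete, here is a hedged Python 3 sketch that looks for dotted quads both as plain ASCII and as UTF-16LE ("wide") strings, where each character is followed by a null byte. It is not the scanner from the listing: the names are invented, and it validates with the standard ipaddress module instead of socket.inet_aton.

```python
import ipaddress
import re

# Plain ASCII dotted quad, not glued to other digits or dots.
ASCII_IP = re.compile(rb"(?<![0-9.])[0-9]{1,3}(?:\.[0-9]{1,3}){3}(?![0-9.])")
# Same shape in UTF-16LE: every character is followed by a NUL byte.
WIDE_IP = re.compile(rb"(?:[0-9]\x00){1,3}(?:\.\x00(?:[0-9]\x00){1,3}){3}")


def find_ips(buf):
    """Return sorted, de-duplicated IPv4 strings found in a bytes buffer."""
    candidates = {m.decode("ascii") for m in ASCII_IP.findall(buf)}
    candidates |= {m.decode("utf-16-le") for m in WIDE_IP.findall(buf)}
    valid = []
    for text in sorted(candidates):
        try:
            ipaddress.IPv4Address(text)   # rejects octets above 255
        except ValueError:
            continue
        valid.append(text)
    return valid
```

Working on bytes rather than str(data) also avoids depending on how the memory dump happens to be stringified.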

Here is the output.
Scanning Started...
Scanning Process: procexp.exe
Scanning Process: cmd.exe
Scanning Process: smss.exe
Scanning Process: csrss.exe
Scanning Process: winlogon.exe
Scanning Process: services.exe
Scanning Process: lsass.exe
Scanning Process: explorer.exe
Scanning Process: vmacthlp.exe
Scanning Process: svchost.exe
Scanning Process: svchost.exe
Scanning Process: svchost.exe
Scanning Process: svchost.exe
Scanning Process: TPAutoConnect.exe
Scanning Process: svchost.exe
Scanning Process: spoolsv.exe
Scanning Process: VMwareTray.exe
Scanning Process: vmtoolsd.exe
Scanning Process: jqs.exe
Scanning Process: vmtoolsd.exe
Scanning Process: TPAutoConnSvc.exe
Scanning Process: IEXPLORE.EXE
MATCH: in IEXPLORE.EXE pid 1796
Found IPs:
0.0.0.0
...
2.5.4.9
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
216.6.0.28
255.255.255.255
255.255.255.255
255.255.255.255
5.1.0.0
5.1.0.0
...
6.0.0.0
6.0.0.0
6.0.0.0
Scanning Process: alg.exe
Scanning Complete...

Notice that DarkComet injects its code into IEXPLORE.EXE. The C2 is the IP 216.6.0.28. Here are the full scan results. There are a good number of false positives, but they are easy to sort out. A more productive approach would be to do all this from the command line and spend the time to figure out how DarkComet stores the C2 in the executable. Still kind of fun though.
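
One possible way to sort out the false positives, sketched in Python 3: drop addresses that are not globally routable (0.0.0.0, 255.255.255.255 and friends) and rank what is left by how often it appears. The frequency ranking is purely my assumption about what makes a good candidate, and rank_candidates is a made-up name, so read the result as a shortlist rather than an answer.

```python
import ipaddress
from collections import Counter


def rank_candidates(ips):
    """Count globally routable IPv4 strings, most frequent first."""
    counts = Counter()
    for text in ips:
        try:
            addr = ipaddress.IPv4Address(text)
        except ValueError:
            continue                      # skip anything that is not an IPv4 address
        if addr.is_global:                # filters unspecified, broadcast, private, ...
            counts[text] += 1
    return counts.most_common()
```

Fed the hits printed above, 216.6.0.28 comes out on top with 12 occurrences; version-like strings such as 5.1.0.0 and 6.0.0.0 still pass the filter and have to be ruled out by eye.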

6 comments:

  1. Just a simple question, how much time you have spent on this work? (i mean all of the analysis and scripting only).

    Thanks.

    Replies
    1. Off and on total, maybe five or six hours.

  2. This comment has been removed by the author.

  3. It's a slim article, but notice the StartMenu.DLL as no relation with darkcomet

    Replies
    1. Agreed on both parts. Also, I guess the term "authors" is a little ambiguous. I'll change that to the "attackers". We don't want it to sound like you (the author) is directly involved in the Syrian Attacks.

  4. Nice work Alexander. I saw this on twitter and thought I'd drop you a note just in case you haven't heard of Volatility and its yarascan plugin [1]. By just suspending your VM after running the malware, you can scan anywhere in process or kernel memory for yara signatures (also dump processes, dlls, kernel drivers, hidden or unhidden, etc).

    [1]. http://code.google.com/p/volatility/wiki/CommandReferenceMal22#yarascan
