.text:10004E99 mov byte_1003B03C, al
.text:10004E9E movsx ecx, byte_1003B03C
.text:10004EA5 imul ecx, 0A2h
.text:10004EAB mov byte_1003B03C, cl
.text:10004EB1 movsx edx, byte_1003B03C
.text:10004EB8 xor edx, 0A4h
.text:10004EBE mov byte_1003B03C, dl
.text:10004EC4 movsx eax, byte_1003B03C
.text:10004ECB cdq
.text:10004ECC mov ecx, 0C8h
.text:10004ED1 idiv ecx
.text:10004ED3 mov byte_1003B03C, al
.text:10004ED8 xor eax, eax
.text:10004EDA jmp short loc_10004F01
.text:10004EDC ; ---------------------------------------------------------------------------
.text:10004EDC movsx edx, byte_1003B03C
.text:10004EE3 or edx, 0D2h
.text:10004EE9 mov byte_1003B03C, dl
.text:10004EEF movsx eax, byte_1003B03C
.text:10004EF6 imul eax, 0C1h
.text:10004EFC mov byte_1003B03C, al
The old version did not know that AL is the low byte of EAX, because it relied on string comparison. The new version does a simple check of the register name and its purpose. Note: there will be some issues if AH is moved into AL, or with other similar operations. I didn't code that logic in. If we were to back trace the code above, we would have the following output.
Python>s.backtrace(here(),1)
0x10004efc mov byte_1003B03C, al
0x10004ef6 imul eax, 0C1h
0x10004eef movsx eax, byte_1003B03C
0x10004ee9 mov byte_1003B03C, dl
0x10004ee3 or edx, 0D2h
0x10004edc movsx edx, byte_1003B03C
0x10004ed3 mov byte_1003B03C, al
0x10004ec4 movsx eax, byte_1003B03C
0x10004ebe mov byte_1003B03C, dl
0x10004eb8 xor edx, 0A4h
0x10004eb1 movsx edx, byte_1003B03C
0x10004eab mov byte_1003B03C, cl
0x10004ea5 imul ecx, 0A2h
0x10004e9e movsx ecx, byte_1003B03C
0x10004e99 mov byte_1003B03C, al
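To make the register-name check mentioned above concrete, here is a minimal sketch of one way to do it. This is not the code from backtrace.py; the names `REGISTER_FAMILIES`, `register_family` and `same_register` are made up for illustration, and it only covers the 32-bit general purpose registers.

```python
# Hypothetical helper, not taken from backtrace.py.
# Each 32-bit register is grouped with its 16-bit and 8-bit aliases.
REGISTER_FAMILIES = {
    "eax": ("eax", "ax", "ah", "al"),
    "ebx": ("ebx", "bx", "bh", "bl"),
    "ecx": ("ecx", "cx", "ch", "cl"),
    "edx": ("edx", "dx", "dh", "dl"),
    "esi": ("esi", "si"),
    "edi": ("edi", "di"),
    "ebp": ("ebp", "bp"),
    "esp": ("esp", "sp"),
}

# Reverse lookup: alias name -> name of the full 32-bit register.
ALIAS_TO_FAMILY = {
    alias: family
    for family, aliases in REGISTER_FAMILIES.items()
    for alias in aliases
}


def register_family(operand):
    """Return the 32-bit register an operand belongs to, or None."""
    return ALIAS_TO_FAMILY.get(operand.strip().lower())


def same_register(a, b):
    """True if both operands are names for (part of) the same register."""
    family = register_family(a)
    return family is not None and family == register_family(b)
```

With this, `same_register("al", "eax")` is true where a plain string comparison would fail. It also shows the AH/AL problem: `same_register("ah", "al")` is true as well, even though the two names refer to different bytes of EAX.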
The code also tracks how some general purpose instructions manipulate different registers. Most of them are simple because x86 instructions (in Intel syntax) follow a destination, source operand format. Not all of them do, though. I spent a good amount of time wondering which variables to back trace when following instructions such as DIV. Is EAX or the DIV operand more important to back trace? I went with the operand, but in the future I plan on adding a split back trace that will track both EAX and the operand passed to DIV. Odds are there are still more general purpose instructions I need to check for. XADD is a pretty cool instruction. The shortest Fibonacci can be written using XADD.
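The DIV question comes down to implicit operands. Below is a rough sketch of how a split trace could decide what to follow. It is an assumption about a possible design, not what backtrace.py does today; `IMPLICIT_OPERANDS` and `operands_read` are hypothetical names, and only the 32-bit forms of DIV, IDIV and CDQ are listed.

```python
# Hypothetical table, not taken from backtrace.py.
# Registers an instruction touches without naming them (32-bit forms only).
IMPLICIT_OPERANDS = {
    # div/idiv r/m32: divides EDX:EAX by the operand,
    # quotient goes to EAX, remainder to EDX.
    "div": {"reads": ("eax", "edx"), "writes": ("eax", "edx")},
    "idiv": {"reads": ("eax", "edx"), "writes": ("eax", "edx")},
    # cdq: sign-extends EAX into EDX.
    "cdq": {"reads": ("eax",), "writes": ("edx",)},
}

# Instructions whose destination is overwritten without being read first.
OVERWRITE_ONLY = ("mov", "movsx", "movzx", "lea")


def operands_read(mnemonic, operands):
    """Return the operands an instruction reads, implicit ones included."""
    mnemonic = mnemonic.lower()
    ops = [op.lower() for op in operands]
    implicit = IMPLICIT_OPERANDS.get(mnemonic)
    if implicit is not None:
        # Explicit operand plus the implicit registers.
        return sorted(set(ops) | set(implicit["reads"]))
    if mnemonic in OVERWRITE_ONLY:
        # Only the source side is read.
        return ops[1:]
    # Default for "op dest, src" (add, xor, imul, xadd, ...):
    # both destination and source are read.
    return ops
```

For `idiv ecx` this returns `["eax", "ecx", "edx"]`, which is the set a split trace would need to follow. XADD falls into the default case because it reads both of its operands, but it also writes both, so it would need its own entry once writes are tracked too.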
This version was written so I could crack an obfuscation technique that I have seen lately. Using backtrace.py and the last line of the dead code blocks, I'm able to identify most of the junk code and variables. I'm sure there are flaws (like not tracing pushes or pops; that is for a future release) but so far it is working well for me. I hope the code is of use to others. If you have any recommendations, thoughts, etc., please shoot me an email (line 20 of the source code) or ping me on Twitter.