CYS 525: Introduction to Reverse Engineering
Introduction part
Prepared by Dr. Fahd Alhaidari
Contents
❑ What is Reverse Engineering (RE)?
❑ Why do we need Reverse Engineering?
❑ Scope and Tasks of Reverse Engineering
❑ Legality and Ethics
❑ Different Approaches
❑ Reversing Tools
❑ Tutorials
What is Reverse Engineering?
❑ You have an unexpected case:
o You finished one course project using Java
o Your program runs OK
o But, by accident, you delete the java file
o How to hand in your project?
❑ Reverse Engineering
What is Reverse Engineering?
Forward Engineering: Requirements → Design → Source Code → Behavior
Reverse Engineering: Behavior → Source Code → Design → Requirements
What is Reverse Engineering?
❑ The process of extracting the knowledge or design blueprints from
anything man-made.
❑ Reverse engineering is the process of discovering the technological
principles of a device, object, or system through analysis of its
structure, function, and operation.
❑ Software Reverse Engineering
o About opening up a program’s “box” and looking inside
o No screwdrivers needed, but integrates several arts of
▪ Code breaking
▪ Puzzle solving
▪ Programming
▪ Logical analysis
“Software Cracking is a scientific process as
much as it is an art,” by Greg Hoglund
❑ Describe Phenomenon, Formulate Hypothesis, Validate the
Hypothesis, Take measurements, Analyze
❑ Repetitive process until it converges to a solution
❑ Using the proper tools skillfully
Why Reverse Engineering?
“Sometimes, the best way to advance
is in reverse,”
By Eldad Eilam
Why do it?
❑ Find vulnerabilities
❑ Discover trade secrets
❑ Academic research
❑ [Copy] protection
❑ Analyse protocols
❑ Patch binary and alter behavior
Can be used for good...
❑ Understand malware
❑ Understand legacy code (design recovery)
…or not-so-good
❑ Remove usage restrictions from software
❑ Find and exploit flaws in software
❑ Cheat at games, etc.
Scope and Task of Reverse Engineering
❑ Redocumentation and/or document generation
❑ Recovery of design approach and design
details at any level of abstraction
❑ Identifying reusable components and
components that need restructuring
❑ Recovering business rules
❑ Understanding high-level system description.
History
❑ Started as analyzing hardware in an attempt to gain an
advantage.
❑ Laws of Software Evolution (Lehman, 1980)
❑ Fundamental strategies for program comprehension (Brooks,
1983)
❑ Taxonomy of Reverse Engineering (Chikofsky & Cross, 1990)
❑ WCRE (Working Conference on Reverse Engineering, first held in 1993)
❑ Reverse engineering was first applied to a piece of malware in 1987,
when the Charlie virus was disassembled and neutralized.
Is reversing legal?
Is RE legal?
❑ Bottom line
o What is “reverse engineering” used for?
❑ What social and economic impact does
RE have on society?
o Depends on what RE is used for …
❑ Whether a RE scenario is legal or not
depends on many factors.
US/EU DMCA Laws Exceptions
❑ Interoperability
❑ Security auditing and testing
❑ Educational purposes
❑ Government investigation
❑ Regulation compliance
❑ Protection of privacy
❑ Evaluation of encryption technology
What is Software Cracking?
Challenges
So where’s the catch?
❑ Low-level is, well, low level…
o High-level source (C):
for (Serial = 0, i = 0; i < strlen(UserName); i++)
{
    CurChar = (int) UserName[i];
    Serial += CurChar;
    /* rotate Serial left by one bit */
    Serial = (((Serial << 1) & 0xFFFFFFFE) | ((Serial >> 31) & 1));
    Serial = (((Serial * CurChar) + CurChar) ^ CurChar);
}
UserSerial = ~((UserSerial ^ 0x1337C0DE) - 0xBADC0DE5);
o The corresponding compiled x86 code:
00401000 push ebp
00401001 mov ebp, esp
00401003 push ecx
00401004 push ecx
00401005 and dword ptr [ebp-4], 0
00401009 push esi
0040100A mov esi, [ebp+8]
0040100D push edi
0040100E push esi
0040100F call ds:[00402008h]
00401015 mov edi, eax
00401017 xor edx, edx
00401019 test edi, edi
0040101B jle 00401047h
0040101D movsx ecx, byte ptr [edx+esi]
00401021 add [ebp-4], ecx
00401024 mov [ebp-8], ecx
00401027 rol dword ptr [ebp-4], 1
0040102A mov eax, ecx
0040102C imul eax, [ebp-4]
00401030 mov [ebp-4], eax
00401033 mov eax, [ebp-8]
00401036 add [ebp-4], eax
00401039 xor [ebp-4], ecx
0040103C inc edx
0040103D cmp edx, edi
0040103F jl 0040101Dh
00401041 cmp dword ptr [ebp-4], 0
00401045 jnz 00401063h
00401047 push 0
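To see what the listing computes, the loop can be re-expressed at a high level. The following is a minimal Python sketch, not the original program. It assumes that Serial is an unsigned 32-bit value, that the user name is plain ASCII (so the sign extension performed by movsx does not matter), and that the rol instruction is the intended rotate-left by one bit. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits, like a register


def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits (what `rol` does)."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def compute_serial(user_name: str) -> int:
    """Mirror the loop in the listing: add, rotate, multiply-add, xor."""
    serial = 0
    for cur_char in user_name.encode("ascii"):
        serial = (serial + cur_char) & MASK32              # add  [ebp-4], ecx
        serial = rol32(serial, 1)                          # rol  dword ptr [ebp-4], 1
        serial = (serial * cur_char + cur_char) & MASK32   # imul, then add
        serial ^= cur_char                                 # xor  [ebp-4], ecx
    return serial


def transform_user_serial(user_serial: int) -> int:
    """Apply ~((UserSerial ^ 0x1337C0DE) - 0xBADC0DE5) in 32-bit arithmetic."""
    return ~((user_serial ^ 0x1337C0DE) - 0xBADC0DE5) & MASK32
```

For example, `compute_serial("A")` evaluates to `0x2102`. How the program finally compares the two values is not shown in the listing, so the sketch stops at computing them.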
So where’s the catch?
❑ Low-level is, well, low level…
❑ Needle in a haystack
o Average opcode size: 3 bytes
o Average executable size: 500KB
o There are executables, libraries, drivers…
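A rough back-of-the-envelope estimate shows the size of the haystack. The sketch below assumes, unrealistically, that the whole 500 KB file is code and that 1 KB is 1024 bytes. Real executables also contain data, headers, and resources, so the true instruction count is lower.

```python
AVERAGE_INSTRUCTION_BYTES = 3      # figure quoted on the slide
EXECUTABLE_BYTES = 500 * 1024      # 500 KB, assuming 1 KB = 1024 bytes


def estimate_instruction_count(file_bytes: int, avg_bytes: int) -> int:
    """Upper-bound estimate: treat every byte of the file as code."""
    return file_bytes // avg_bytes


instruction_estimate = estimate_instruction_count(
    EXECUTABLE_BYTES, AVERAGE_INSTRUCTION_BYTES
)
print(instruction_estimate)  # 170666
```

Even as an upper bound, this is on the order of a hundred thousand instructions for a single average-sized executable, before counting its libraries and drivers.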
So where’s the catch?
❑ Low-level is, well, low level…
❑ Sometimes, the code resists
o Packers and compressors
o Obfuscators
❑ Sometimes, the code fights back
o Detect reversing tools
o Detect VMs and emulators
A Battle of Wits
❑ Author writes code
❑ Reverser reverses it
❑ Author creates an anti-reversing technique
❑ Reverser bypasses it
❑ And so on…
So what do you need in order to be a good reverser?
What makes a good reverser?
Qualities
• Patient
• Outside-the-Box Thinking
Knowledge
• Assembly Language
• Some High-Level programming
  • Best: the language the binary originated from
• Operating System Internals
  • API
  • Data Structures
  • File Structures
• Good scripting skills
• Anti-Debugging Tricks
SRE Necessary Skills
❑ Working knowledge of target assembly code
❑ Experience with the tools
o IDA Pro: sophisticated and complex
o OllyDbg: limited compared to IDA Pro
❑ Knowledge of Windows Portable Executable
(PE) file format
❑ Boundless patience and optimism
❑ SRE is a tedious, labor-intensive process!
Required knowledge of CPU and assembly
Compilation Process
Executable Format
❑ Windows: PE (the format of EXE and DLL files)
❑ Unix/Linux: a.out, COFF, and ELF
❑ Specifies a standard format for mapping code on disk
into a complete executable image in the memory that
consists of code, data, stack, heap (for malloc), and
all the libraries
❑ http://www.wotsit.org/ is a great resource for all types
of file formats
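Each format announces itself in its first bytes, which is usually the first thing a reverser checks. The sketch below is a minimal illustration, assuming only the well-known signatures: ELF files begin with 0x7F 'E' 'L' 'F', and PE files begin with a DOS header ('MZ') whose field at offset 0x3C points to the 'PE\0\0' signature. The function name is hypothetical, and a.out and COFF are deliberately not handled.

```python
import struct


def identify_executable_format(data: bytes) -> str:
    """Classify a file image as 'ELF', 'PE', or 'unknown' from its magic bytes."""
    # ELF files start with the four bytes 0x7F 'E' 'L' 'F'.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE files start with a DOS header: 'MZ', with the offset of the
    # PE header stored as a little-endian 32-bit value at 0x3C.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"
```

In practice the bytes would come from reading the first few hundred bytes of the file in binary mode. A full parser would go on to read the section tables that describe how code and data are mapped into memory.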
ELF Layout
PE Layout
The CPU
Registers – storage locations, values move around
here
Processor Status/Flag Register – keeps track of
flags that are set during calculations
Program Counter – address of current instruction
Stack Pointer – keeps track of call stack
Assembly language is a human-readable abstraction
of machine code (usually displayed in hexadecimal)
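The mapping between bytes and mnemonics can be made concrete with a toy lookup table. This is a sketch under strong assumptions: it knows only four 32-bit x86 encodings (one common encoding of each instruction), whereas a real disassembler must decode the full instruction set, including operands and prefixes. The table and function names are hypothetical.

```python
# Opcode bytes -> mnemonic for a handful of 32-bit x86 instructions.
OPCODE_TABLE = {
    bytes([0x55]): "push ebp",
    bytes([0x8B, 0xEC]): "mov ebp, esp",
    bytes([0x85, 0xC0]): "test eax, eax",
    bytes([0x33, 0xC0]): "xor eax, eax",
}


def toy_disassemble(code: bytes) -> list:
    """Return (offset, mnemonic) pairs; unknown bytes are shown as data."""
    listing = []
    offset = 0
    while offset < len(code):
        for pattern, mnemonic in OPCODE_TABLE.items():
            if code.startswith(pattern, offset):
                listing.append((offset, mnemonic))
                offset += len(pattern)
                break
        else:
            # No known encoding starts here: emit the raw byte.
            listing.append((offset, "db 0x%02X" % code[offset]))
            offset += 1
    return listing
```

Calling `toy_disassemble(bytes([0x55, 0x8B, 0xEC]))` yields `push ebp` at offset 0 and `mov ebp, esp` at offset 1, which is how a disassembler arrives at the addresses printed beside each instruction.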
Reversing Tools
RE Tools
❑ System monitoring tools:
o Network activity, file access, registry access.
❑ Debuggers:
o Reversers use debuggers in disassembly mode to set breakpoints and step through a
program’s execution
❑ Disassemblers:
o Translate binary code to assembly language code.
❑ Decompilers:
o Attempt to produce high-level code (e.g., C) from an executable binary file. A reverse
compiler!
o Perfect decompilation is impossible for most platforms.
❑ Reverse engineering prevention tools
RE Tools
Examples:
o Hex editors: WinHex, Tsearch
o Decompilers: REC, DJ
o Disassemblers/Debuggers: IDA Pro, OllyDbg,
Win32Dasm, BORG
o RE prevention tools
▪ Code obfuscators: Y0da’s Cryptor, NFO
Debuggers
Why is a Debugger Needed?
❑ Disassembler gives static results
o Good overview of program logic
o User must “mentally execute” program
o Difficult to jump to specific place in the code
❑ Debugger is dynamic
o Can set break points
o Can treat complex code as “black box”
o Not all code disassembles correctly
❑ Disassembler and debugger both required for
any serious SRE task
GUI and much more: Turbo Debugger
OllyDbg
Disassemblers
Disassemblers/Debuggers
•Convert binary code into its assembly
equivalent.
•Extract ASCII strings and used libraries.
•View memory, stack and CPU registers.
•Run the program (with breakpoints).
•Edit the assembly code at runtime.
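String extraction is simple enough to sketch in a few lines. The example below is a minimal illustration, not how any particular tool implements it: it assumes that a "string" is a run of at least four printable ASCII bytes, and the function name is hypothetical.

```python
import re


def extract_ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least `min_length` bytes long."""
    # 0x20..0x7E is the printable ASCII range (space through tilde).
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

Run over the raw bytes of an executable, a scan like this surfaces imported library names, messages, and any constants stored in the clear, such as a hard-coded serial number.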
Disassemblers/Debuggers: OllyDbg
IDA-Pro
❑ Started as an Interactive Dis-Assembler, enabling user interaction with the
disassembler’s decisions.
❑ Slowly evolved into an automatic RE tool:
o Built-in full-control script language
o Library recognition (including user-generated)
o Function prototype information
▪ Display
▪ Propagate throughout the code
o Support for plug-ins
o Support for Python scripting
o Multi-architecture, cross-platform support
o Full integration with built-in and external debuggers
IDA-generated function interactions at high level
Hex-Editor
Hex Editors: WinHex
Decompilers
• Decompile a binary program into
readable source code.
• Replace all binary code that could not
be decompiled with assembly code.
Executable → Decompiler → Source Code
Resource Editor
Code Obfuscator Tools
• Encrypts the code of a program so that it
cannot be viewed as assembly.
• Prevents the program from running if it detects a
known disassembler or debugger.
Obfuscators       Obfuscation techniques   Anti-debugging   GUI
Y0da's Cryptor    x                        x                x
NFO x x
Diagram of Code Obfuscators
Executable → Code Obfuscator → Executable (Encrypted)
INT 21 → ♣↓↨☻¶╩♥•◘▲
Tutorials…
serial number!
SRE Example
❑ We consider a very simple example
❑ This example only requires a disassembler (IDA Pro)
and a hex editor
o Trudy disassembles to understand the code
o Trudy also wants to patch the code
❑ Most real-world code also requires a
debugger (OllyDbg)
SRE Example
❑ Program requires serial number
❑ But Trudy doesn’t know the serial number!
❑ Can Trudy get the serial number from exe?
SRE Example
❑ IDA Pro disassembly
❑ Looks like serial number is S123N456
SRE Example
❑ Try the serial number S123N456
❑ It works!
❑ Can Trudy do better?
SRE Example
❑ Again, IDA Pro disassembly
❑ And hex view…
SRE Example
❑ test eax,eax computes the AND of eax with itself
o The result is 0, and the zero flag is set, only if eax is 0
o If the result is 0, then jz is true (the jump is taken)
❑ Trudy wants jz to always be true!
❑ Can Trudy patch the exe so that jz always holds?
SRE Example
❑ Can Trudy patch the exe so that jz is always true?
Assembly        Hex
test eax,eax    85 C0 …
xor eax,eax     33 C0 …   ← jz always true!!!
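The difference between the two instructions is easiest to see by modelling what each does to eax and to the zero flag (ZF), which is what jz actually tests. This is a simplified sketch: it tracks only eax and ZF and ignores the other flags, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def emulate_test_eax_eax(eax: int) -> tuple:
    """test eax,eax: AND eax with itself, leave eax unchanged, set ZF if the result is 0."""
    result = (eax & eax) & MASK32
    return eax, result == 0


def emulate_xor_eax_eax(eax: int) -> tuple:
    """xor eax,eax: eax becomes 0 whatever it held, so ZF is always set."""
    result = (eax ^ eax) & MASK32
    return result, result == 0


def jz_is_taken(zero_flag: bool) -> bool:
    """jz jumps exactly when the zero flag is set."""
    return zero_flag
```

With `test`, the jump depends on the value left in eax by the serial number comparison. With `xor`, that value is discarded, so the jump is taken for every input.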
SRE Example
❑ Edit serial.exe with hex editor
serial.exe
serialPatch.exe
❑ Save as serialPatch.exe
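The same edit can be scripted instead of made by hand in a hex editor. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the file offset of the test instruction is not given in these slides, so it must be read from the disassembler's hex view and passed in. The helper refuses to patch unless the expected bytes are actually at that offset.

```python
TEST_EAX_EAX = bytes([0x85, 0xC0])
XOR_EAX_EAX = bytes([0x33, 0xC0])


def patch_bytes(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Return a copy of `image` with `expected` at `offset` replaced by `replacement`."""
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the file size")
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at offset; refusing to patch")
    return image[:offset] + replacement + image[offset + len(expected):]


def patch_file(src_path: str, dst_path: str, offset: int) -> None:
    """Read src_path, swap test eax,eax for xor eax,eax at `offset`, write dst_path."""
    with open(src_path, "rb") as src:
        image = src.read()
    patched = patch_bytes(image, offset, TEST_EAX_EAX, XOR_EAX_EAX)
    with open(dst_path, "wb") as dst:
        dst.write(patched)
```

A call such as `patch_file("serial.exe", "serialPatch.exe", offset)` would reproduce the manual edit, with `offset` supplied by the analyst. Keeping the patch the same length as the original bytes matters because every later instruction and address must stay where it was.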
SRE Example
❑ Any “serial number” now works!
SRE Example
❑ Back to IDA Pro disassembly…
serial.exe
serialPatch.exe
SRE Attack Mitigation
❑ Can make attacks more difficult
o Anti-disassembly techniques
▪ To confuse static view of code
o Anti-debugging techniques
▪ To confuse dynamic view of code
o Tamper-resistance
▪ Code checks itself to detect tampering
o Code obfuscation
▪ Make code more difficult to understand
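Tamper-resistance can be sketched as a program comparing a digest of its own code against a reference value recorded at build time. The example below is a minimal illustration, assuming SHA-256 as the digest and using a short byte string to stand in for the protected code; the names are hypothetical. A real scheme also has to protect the check itself, since an attacker can patch the comparison just like any other branch.

```python
import hashlib


def code_digest(code: bytes) -> str:
    """Return the SHA-256 digest of a block of code as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, reference_digest: str) -> bool:
    """True only if the code still matches the digest recorded earlier."""
    return code_digest(code) == reference_digest


# Stand-in for protected code: test eax,eax followed by a short jz.
ORIGINAL_CODE = bytes([0x85, 0xC0, 0x74, 0x05])
REFERENCE_DIGEST = code_digest(ORIGINAL_CODE)

# The patch from the serial number example changes a single byte.
PATCHED_CODE = bytes([0x33, 0xC0, 0x74, 0x05])
```

Here `is_untampered(ORIGINAL_CODE, REFERENCE_DIGEST)` holds, while the one-byte patch makes the check fail.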
Anti-disassembly Example
❑ Suppose actual code instructions are
inst 1 jmp junk inst 3 inst 4 …
❑ What the disassembler sees
inst 1 inst 2 inst 3 inst 4 inst 5 inst 6 …
❑ This is an example of “false disassembly”
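The effect can be reproduced with a toy decoder. The sketch below is an illustration under assumptions: it knows only five x86 opcodes and their lengths, and the byte sequence is constructed for the demonstration. A junk byte 0xE8 (the opcode of a 5-byte call) is placed right after a jump that skips it. A decoder that walks the bytes in order swallows the real instructions as the call's operand, while one that follows the jump does not.

```python
# Toy instruction set: opcode -> (mnemonic, total length in bytes).
# These lengths match the real x86 encodings, but nothing else is decoded.
TOY_ISA = {
    0x90: ("nop", 1),
    0x55: ("push ebp", 1),
    0xC3: ("ret", 1),
    0xEB: ("jmp short", 2),   # opcode + 8-bit relative offset
    0xE8: ("call", 5),        # opcode + 32-bit relative offset
}


def linear_sweep(code: bytes) -> list:
    """Decode byte after byte, never following jumps (a naive disassembler)."""
    listing, pc = [], 0
    while pc < len(code):
        mnemonic, length = TOY_ISA.get(code[pc], ("db", 1))
        listing.append((pc, mnemonic))
        pc += length
    return listing


def follow_control_flow(code: bytes) -> list:
    """Decode along the path the CPU would actually execute."""
    listing, pc, seen = [], 0, set()
    while 0 <= pc < len(code) and pc not in seen:
        seen.add(pc)
        mnemonic, length = TOY_ISA.get(code[pc], ("db", 1))
        listing.append((pc, mnemonic))
        if mnemonic == "ret":
            break
        if mnemonic == "jmp short":
            if pc + 1 >= len(code):
                break
            offset = code[pc + 1]
            if offset >= 0x80:          # sign-extend the 8-bit offset
                offset -= 0x100
            pc += length + offset
        else:
            pc += length
    return listing


# inst 1, jmp over one junk byte, the junk byte, then the real instructions.
SAMPLE = bytes([0x90, 0xEB, 0x01, 0xE8, 0x55, 0x90, 0x90, 0xC3])
```

On `SAMPLE`, `linear_sweep` reports a `call` at offset 3 and never shows the `push ebp`, `nop`, `nop`, `ret` that `follow_control_flow` finds at offsets 4 to 7.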
Code Obfuscation
❑ Goal is to make code hard to understand
❑ Opposite of good software engineering!
o For example, spaghetti code
❑ Much research into more robust obfuscation
o Example: opaque predicate
int x, y;
:
if ((x - y) * (x - y) > (x * x - 2 * x * y + y * y)) { … }
o The if() conditional is always false
❑ Attacker wastes time analyzing dead code
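The predicate is opaque because both sides are the same quantity: expanding (x − y)² gives x² − 2xy + y², so the strict inequality can never hold. The sketch below checks this in Python, whose integers do not overflow. In C with fixed-width unsigned arithmetic both sides would wrap around identically, while signed overflow is undefined behavior, so treat that case as an assumption. The function names are hypothetical.

```python
def opaque_predicate(x: int, y: int) -> bool:
    """Always False: the two sides are algebraically identical."""
    return (x - y) * (x - y) > (x * x - 2 * x * y + y * y)


def guarded_routine(x: int, y: int) -> str:
    """The branch below is dead code that only exists to mislead an analyst."""
    if opaque_predicate(x, y):
        return "dead code"
    return "real code"


# Exhaustive check over a small range of inputs.
never_true = not any(
    opaque_predicate(x, y) for x in range(-50, 51) for y in range(-50, 51)
)
```

An obfuscator knows the outcome at build time, but a reverser reading the compiled branch has to prove the identity, or analyze the dead code, before discarding it.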
Packing Files
❑ The code is compressed, like a Zip file
❑ This makes the strings and instructions
unreadable
❑ All you'll see is the wrapper – small code that
unpacks the file when it is run
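The effect of packing on visible strings can be demonstrated with ordinary compression. This is a minimal sketch, assuming zlib as the compressor and an invented payload standing in for a program's data; real packers use their own formats and prepend a native unpacking stub. The function names are hypothetical.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Compress the payload, as a packer would before wrapping it."""
    return zlib.compress(payload, 9)


def unpack(packed: bytes) -> bytes:
    """What the wrapper does at run time: restore the original bytes."""
    return zlib.decompress(packed)


# Invented stand-in for a program's string data.
ORIGINAL = b"Enter serial number: \x00S123N456\x00Serial number is correct.\x00" * 8
PACKED = pack(ORIGINAL)

serial_visible_before = b"S123N456" in ORIGINAL
serial_visible_after = b"S123N456" in PACKED
```

The serial number is visible in `ORIGINAL` but no longer appears as plain bytes in `PACKED`, while `unpack(PACKED)` restores the original exactly. This is why packed files are usually unpacked, or dumped from memory after the wrapper has run, before static analysis.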
Thanks for Listening!