Assembly Language
Intel and AMD 32-bit
Architecture (x86)
Things I don't intend to cover
Yeah, sorry, folks, I don't have a lot of time.
Privileged instructions
Standalone source files and PWB
Vector instructions (MMX, SSE[2], 3DNow!)
Instruction encodings
How to write code for processors prior to 386
A Brief History of VLSI
4004 ('71), 8008 ('72)
8086 ('78), 8088 ('79)
80186/88 ('82)
80286 ('82), 80386 ('85)
80486 ('89)
Pentium class/586 ('93)
The Daily Register
32 bits    8/16 bits
EAX        AX = AH | AL
EBX        BX = BH | BL
ECX        CX = CH | CL
EDX        DX = DH | DL
ESI        SI
EDI        DI
ESP        SP
EBP        BP
           SS, CS, DS, ES, FS, GS
EIP        IP
EFLAGS     FLAGS
Moving On
mov <dest>, <src>
mov eax, dwMyVar
mov eax, 65h
mov eax, 0FFFFFFFFh
mov eax, [ebx]
mov eax, [eax+4]
mov dwMyVar, esi
The Meaning of Brackets
On a variable, brackets have no effect
mov eax, [dwMyVar]
On a register, brackets dereference a pointer
mov eax, [eax]
A displacement can be indicated in two ways
mov eax, [eax+8]
mov eax, [eax]8
There are more things that can be done with brackets, which I'll illustrate when we get to the instruction LEA (Load Effective Address)
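If you think in C, a bracketed register plus displacement is just a pointer dereference at a byte offset. Here is a minimal sketch of that idea; load_dword is a made-up helper name, not anything from an assembler or library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models `mov eax, [base+disp]` for a dword load. */
uint32_t load_dword(const void *base, int32_t disp)
{
    uint32_t value;
    /* memcpy sidesteps alignment and aliasing problems in portable C. */
    memcpy(&value, (const unsigned char *)base + disp, sizeof value);
    return value;
}
```

With a table of three dwords, load_dword(table, 8) reads the third element, the same way mov eax, [eax+8] would if eax held the table's address.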
Arithmetic
add eax, ebx     eax += ebx;
sub eax, ebx     eax -= ebx;
mul edx          eax *= edx;  (high half of the product goes to edx)
imul edx         (signed version)
inc eax          eax++;
dec eax          eax--;
adc, sbb, neg
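One detail worth seeing in C: one-operand mul and imul produce a 64-bit product split across edx (high half) and eax (low half). The sketch below models that; the struct and function names are mine and purely illustrative.

```c
#include <stdint.h>

/* Hypothetical model of the register pair written by mul/imul. */
typedef struct {
    uint32_t eax;  /* low 32 bits of the product  */
    uint32_t edx;  /* high 32 bits of the product */
} mul_result;

/* Models `mul src`: unsigned EAX * src -> EDX:EAX. */
mul_result model_mul(uint32_t eax, uint32_t src)
{
    uint64_t product = (uint64_t)eax * src;
    mul_result r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}

/* Models one-operand `imul src`: signed EAX * src -> EDX:EAX. */
mul_result model_imul(int32_t eax, int32_t src)
{
    uint64_t product = (uint64_t)((int64_t)eax * src);
    mul_result r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}
```

Multiplying 0FFFFFFFFh by 2 shows the difference: the unsigned version leaves 1 in edx, while the signed version treats the input as -1 and leaves 0FFFFFFFFh there.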
A House Divided
[i]div <divisor>
Dividend    Divisor    Quotient    Remainder
AX          8 bits     AL          AH
DX:AX       16 bits    AX          DX
EDX:EAX     32 bits    EAX         EDX
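The 32-bit row of that table can be sketched in C as follows. The names are hypothetical, and the return code stands in for the divide-error fault the processor raises when the divisor is zero or the quotient does not fit in eax.

```c
#include <stdint.h>

/* Hypothetical model of the registers written by a 32-bit div. */
typedef struct {
    uint32_t eax;  /* quotient  */
    uint32_t edx;  /* remainder */
} div_result;

/* Models `div src` with a 32-bit divisor: EDX:EAX / src.
   Returns 1 on success, 0 where the CPU would raise a divide error. */
int model_div32(uint32_t edx, uint32_t eax, uint32_t src, div_result *out)
{
    if (src == 0)
        return 0;                      /* division by zero */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t quotient = dividend / src;
    if (quotient > UINT32_MAX)
        return 0;                      /* quotient too big for EAX */
    out->eax = (uint32_t)quotient;
    out->edx = (uint32_t)(dividend % src);
    return 1;
}
```

This is why edx is usually zeroed (or filled by cdq for idiv) before dividing: leftover junk in edx becomes the top half of the dividend.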
A Lil Bit of Bit Manipulation
and eax, ebx     eax &= ebx;
or  eax, 3       eax |= 3;
xor ecx, 69h     ecx ^= 0x69;
not ebx          ebx = ~ebx;
or  ah, ah
jz  lbl_AHIsZero
Shifting Things Around
shl/sal eax, 8    eax <<= 8;
shr eax, 6        eax >>= 6;
sar ecx, 7        replicate sign bit
rol esi, 11       esi = (esi >> 21) | (esi << 11)
ror esi, 21       esi = (esi >> 21) | (esi << 11)
rcl, rcr          rotate through CF
shl eax, cl       eax <<= cl;
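C has no rotate or arithmetic-shift operators you can fully rely on, so here is a rough sketch of rol, ror, and sar. The function names are mine. The count is masked to 5 bits, which is what 386 and later processors do for 32-bit operands.

```c
#include <stdint.h>

/* Models `rol x, n` on a 32-bit operand. */
uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31;  /* the CPU only looks at the low 5 bits of the count */
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Models `ror x, n` on a 32-bit operand. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Models `sar x, n`: shift right, filling with copies of the sign bit. */
int32_t sar32(int32_t x, unsigned n)
{
    n &= 31;
    uint32_t u = (uint32_t)x >> n;      /* logical shift first */
    if (x < 0 && n)
        u |= ~(UINT32_MAX >> n);        /* then replicate the sign bit */
    return (int32_t)u;                  /* assumes two's complement conversion */
}
```

Note that rol32(x, 11) and ror32(x, 21) always give the same answer, which is why both rotate lines above have the same C expression.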
Being Effective
lea eax, MyPtr    (mov eax, OFFSET MyPtr)
lea edi, [ebx+edi]
lea eax, [esp+10]
lea ecx, [eax*2+eax+6]
lea eax, MyPtr[eax+4][esi*2]
[base][+index[*scale]][+displacement], where scale is 1, 2, 4, or 8
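The address arithmetic that lea exposes boils down to one expression. A minimal C sketch, with a made-up function name, assuming 32-bit wraparound and a scale of 1, 2, 4, or 8:

```c
#include <stdint.h>

/* Hypothetical model of what `lea dest, [base + index*scale + disp]` computes.
   Nothing is read from memory; the result is just the address itself. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* Unsigned arithmetic wraps modulo 2^32, like the real thing. */
    return base + index * scale + (uint32_t)disp;
}
```

So lea ecx, [eax*2+eax+6] with eax = 5 is effective_address(5, 5, 2, 6), which is 21: a multiply by three plus an add, without touching the flags.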
Sizing Things Up
movzx/movsx eax, bh
mov ax, WORD PTR [MyPtr+6]
inc BYTE PTR [eax]
cbw          (al->ax)
cwd, cwde    (ax->dx:ax, ax->eax)
cdq          (eax->edx:eax)
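Zero extension versus sign extension is easy to mix up, so here is a small C sketch of what these instructions compute. The function names just echo the mnemonics and are otherwise hypothetical.

```c
#include <stdint.h>

/* Models `movzx eax, <8-bit reg>`: upper bits become zero. */
uint32_t movzx8(uint8_t v)
{
    return v;
}

/* Models `movsx eax, <8-bit reg>`: upper bits copy bit 7. */
uint32_t movsx8(uint8_t v)
{
    return (v & 0x80u) ? (v | 0xFFFFFF00u) : v;
}

/* Models `cbw`: sign-extends AL into AX. */
uint16_t cbw_ax(uint8_t al)
{
    return (uint16_t)((al & 0x80u) ? (al | 0xFF00u) : al);
}

/* Models `cdq`: returns the value EDX receives for a given EAX. */
uint32_t cdq_edx(uint32_t eax)
{
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}
```

For the byte 80h, movzx8 gives 00000080h while movsx8 gives 0FFFFFF80h, which is the whole difference between treating it as 128 and as -128.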
Flags
cmp and test work like sub and and, just without changing dest
There are dozens of flags; you only need to know a few.
Carry: set if there's a carry or borrow
Parity: set if the low-order 8 bits have even parity
Zero: set if the result is zero
Sign: set if the result is negative
Overflow: set if the signed result is too large or too small
Direction: set if string operations should go down
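To make the arithmetic flags concrete, here is a rough C sketch of how they come out of a 32-bit sub (or cmp, which only differs in throwing the result away). The struct and function are my own illustration, not an existing API.

```c
#include <stdint.h>

/* Hypothetical snapshot of the arithmetic flags after an instruction. */
typedef struct {
    int cf;  /* carry    */
    int pf;  /* parity   */
    int zf;  /* zero     */
    int sf;  /* sign     */
    int of;  /* overflow */
} alu_flags;

/* Models the flags produced by `sub a, b` / `cmp a, b` on 32-bit operands. */
alu_flags flags_after_sub(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    alu_flags f;

    f.cf = a < b;                            /* unsigned borrow */
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1u);            /* top bit of the result */
    /* Signed overflow: operands had different signs and the result's
       sign differs from the first operand's. */
    f.of = (int)((((a ^ b) & (a ^ r)) >> 31) & 1u);

    /* Parity looks only at the low 8 bits; PF = 1 for an even bit count. */
    uint32_t low = r & 0xFFu;
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    f.pf = !(low & 1u);

    return f;
}
```

Subtracting 1 from 80000000h is the classic overflow case: no borrow, so CF stays clear, but the most negative signed value wraps to the most positive, so OF is set.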
Getting Around
Unconditional:
JMP dest
Conditional (165) :
JCXZ, JECXZ, LOOP
JC/JB/JNAE, JNC/JNB/JAE, JBE/JNA, JA/JNBE
JE/JZ, JNE/JNZ, JS, JNS
JL/JNGE, JGE/JNL, JLE/JNG, JG/JNLE
JO, JNO, JP/JPE, JNP/JPO
Interrupts:
int 2Eh
into
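The conditional jumps come in unsigned flavors (above/below) and signed flavors (greater/less), and they test different flags. A minimal C sketch of the conditions after a cmp, using helper names I made up:

```c
#include <stdint.h>

/* Hypothetical flag snapshot left behind by `cmp a, b`. */
typedef struct {
    int cf, zf, sf, of;
} cmp_flags;

/* Models `cmp a, b` on 32-bit operands. */
cmp_flags model_cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    cmp_flags f;
    f.cf = a < b;
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1u);
    f.of = (int)((((a ^ b) & (a ^ r)) >> 31) & 1u);
    return f;
}

/* Unsigned conditions look at CF and ZF. */
int takes_jb(cmp_flags f)  { return f.cf; }
int takes_jbe(cmp_flags f) { return f.cf || f.zf; }

/* Signed conditions look at SF, OF, and ZF. */
int takes_jl(cmp_flags f)  { return f.sf != f.of; }
int takes_jle(cmp_flags f) { return f.zf || (f.sf != f.of); }

/* Equality is the same either way. */
int takes_je(cmp_flags f)  { return f.zf; }
```

Comparing 0FFFFFFFFh with 1 shows why picking the right family matters: jb is not taken (4 billion is not below 1), but jl is taken (-1 is less than 1).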
Addressing Modes
Segment overrides and related issues will be ignored
Register:
eax, ecx, ebp
Immediate:
5, 0x78
Direct memory:
MyVar, [MyVar+2]
Indirect memory:
[eax], [eax+esi+7]
Direct:
jmp label
Register Indirect:
jmp ebx
Memory Indirect:
jmp [ebx]
Relative:
jmp short $+2
Stacking Up
esp, ebp, ss are used to reference the stack
esp points to the top of the stack (last pushed value), while
ebp points to whatever you want, but usually the frame
pointer
The stack grows downwards in memory
The call instruction automatically pushes the return address
ret alone pops the return address and jumps to it
ret with an immediate operand also pops X bytes of
arguments
The Stack Continues to Grow
push and pop perform the typical ADT operations
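In 32-bit code, push and pop normally change esp by 4 bytes; an 8-bit
immediate is sign-extended to 32 bits first. (A 16-bit operand such as
push ax moves esp by only 2 bytes, which misaligns the stack.)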
pushfd and popfd will push and pop the eflags register; this
is very useful for directly manipulating flags
(you can use lahf and sahf to transfer directly between AH
and the low byte of eflags, if that's all you want)
pushad and popad will save and restore the 8 GP registers
The stack can be used to effectively mov between segment
registers
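The downward-growing stack is easy to model in C. This is a toy sketch only: toy_stack and its functions are hypothetical names, and there is no overflow or underflow checking.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };

/* Hypothetical toy stack: a byte array plus an offset playing the role of esp. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;  /* offset of the last pushed value */
} toy_stack;

/* An empty stack has esp just past the end of the memory. */
void stack_init(toy_stack *s)
{
    s->esp = STACK_BYTES;
}

/* Models `push <dword>`: move esp down by 4, then store. */
void push32(toy_stack *s, uint32_t value)
{
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, sizeof value);
}

/* Models `pop <dword>`: load, then move esp back up by 4. */
uint32_t pop32(toy_stack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], sizeof value);
    s->esp += 4;
    return value;
}
```

Pushing two values and popping them gives them back in reverse order, with esp ending up where it started.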
Calling Conventions
Today, arguments are almost universally pushed last-argument-first;
this accommodates varargs. (If you remember Windows 3.1, the
PASCAL calling convention was first-argument-pushed-first.)
Return values are in eax for most data types
_stdcall and _thiscall (except with varargs) let the called function
clean up the stack at the end of a call
_cdecl lets the caller clean up the stack after a function call returns
_fastcall is something that's used to mimic the speed of pure assembly
programs, and therefore is generally irrelevant to real assembly
programs. I don't have any details on it.
All calling conventions engage in some degree of name-mangling
when going from source code to object code.
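Here is a small, portable C illustration of the varargs point; sum_ints is a made-up example function. With last-argument-first pushing, the fixed argument (count) ends up at a known spot next to the return address, so the callee can always find it. Only the caller knows how many arguments it actually pushed, which is why the caller does the cleanup. On modern 64-bit ABIs arguments mostly travel in registers, so treat this as an illustration of the idea rather than of the exact stack layout.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments passed after `count`. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);   /* start right after the last fixed argument */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);

    return total;
}
```

A call like sum_ints(3, 1, 2, 3) returns 6; the function itself has no way to know how many bytes of arguments to pop, so a ret with an immediate operand is not an option for it.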
Prologue and Epilogue
Typical prologue:
push ebp
mov ebp,esp
sub esp,LOCALSIZE
Typical epilogue:
mov esp,ebp
pop ebp
ret
<or> ret x, where x is an immediate specifying bytes to pop
In MS VC++, you can tell the compiler to omit prologue and epilogue code
(almost always because you want to write it yourself in assembly) by
specifying the attribute __declspec(naked)
Generally, temporary registers are saved and restored in these areas too
If you omit the frame pointer, a lot of this goes away
SEH adds a bunch of additional lines, but I'm still researching it.
String Instructions
stosb/stosw/stosd stores identical data to a buffer
cmps{b/w/d} compares two buffers
scas{b/w/d} scans a buffer for a particular byte, word, or dword
movs{b/w/d} copies a buffer
ins{b/w/d} and outs{b/w/d} involve I/O ports and are only listed here because
they're considered string instructions
lods{b/w/d} loads data from memory
All string instructions except lods* can be, and usually are, used with repeat
prefixes.
The direction flag determines which way the pointers are moved.
edi is always the destination pointer and esi is always the source pointer
eax/ax/al are used with stos*, lods*, and scas* for single data items
flags can be set by cmps*, of course
Prefixes
lock is useful for multiprocessor systems, but will not be
discussed here.
rep* is generally used with string instructions, to repeat an
instruction a maximum of ecx times
rep is unconditional
repe/repz and repnz/repne are conditional, based, of course,
on the zero flag
stos*, movs*, ins*, and outs* can use unconditional repeats
scas* and cmps* can use conditional repeats
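In C terms, the repeated string instructions are small loops. The sketch below covers rep stosb, rep movsb, and repne scasb with the direction flag clear (pointers moving up). The function names are mine, and the models only return what the examples need.

```c
#include <stddef.h>
#include <stdint.h>

/* Models `rep stosb`: store al at edi, ecx times. */
void rep_stosb(uint8_t *edi, uint8_t al, size_t ecx)
{
    while (ecx--)
        *edi++ = al;
}

/* Models `rep movsb`: copy ecx bytes from esi to edi. */
void rep_movsb(uint8_t *edi, const uint8_t *esi, size_t ecx)
{
    while (ecx--)
        *edi++ = *esi++;
}

/* Models `repne scasb`: compare al with bytes at edi until a match
   or until ecx runs out. Returns the remaining ecx and reports ZF
   through *zf (1 if the last comparison matched). If ecx starts at
   zero the real instruction leaves the flags alone; this sketch
   simply reports 0. */
size_t repne_scasb(const uint8_t *edi, uint8_t al, size_t ecx, int *zf)
{
    *zf = 0;
    while (ecx != 0) {
        ecx--;
        *zf = (*edi++ == al);
        if (*zf)
            break;
    }
    return ecx;
}
```

Scanning the five bytes of "hello" for 'l' stops after three comparisons with ecx = 2 and ZF set; scanning for 'z' runs ecx down to 0 with ZF clear.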
Instruction Set 8086/88
AAA     AAD     AAM     AAS     ADC     ADD     AND     CALL
CBW     CLC     CLD     CLI     CMC     CMP     CMPSB   CMPSW
CWD     DAA     DAS     DEC     DIV     ESC     HLT     IDIV
IMUL    IN      INC     INT     INTO    IRET    JA      JAE
JB      JBE     JC      JCXZ    JE      JG      JGE     JL
JLE     JMP     JNA     JNAE    JNB     JNBE    JNC     JNE
JNG     JNGE    JNL     JNLE    JNO     JNP     JNS     JNZ
JO      JP      JPE     JPO     JS      JZ      LAHF    LDS
LEA     LES     LOCK    LODSB   LODSW   LOOP    LOOPE   LOOPNE
LOOPNZ  LOOPZ   MOV     MOVSB   MOVSW   MUL     NEG     NOP
NOT     OR      OUT     POP     POPF    PUSH    PUSHF   RCL
RCR     REP     REPE    REPNE   REPNZ   REPZ    RET     ROL
ROR     SAHF    SAL     SAR     SBB     SCASB   SCASW   SHL
SHR     STC     STD     STOSB   STOSW   SUB     TEST    WAIT
XCHG    XLAT    XOR
Instruction Set (p. 2)
80186/88:
BOUND   ENTER   INS     INSB    INSW    LEAVE   OUTS    OUTSB
OUTSW   POPA    PUSHA
80286:
ARPL    CLTS    LAR     LGDT    LIDT    LLDT    LMSW    LSL
LTR     SGDT    SIDT    SLDT    SMSW    STR     VERR    VERW
Instruction Set 80386
BSF     BSR     BT      BTC     BTR     BTS     CDQ     CMPSD
CWDE    INSD    JECXZ   LFS     LGS     LODSD   LSS     MOVSD
MOVSX   MOVZX   OUTSD   POPAD   POPFD   PUSHAD  PUSHFD  SCASD
SETA    SETAE   SETB    SETBE   SETC    SETE    SETG    SETGE
SETL    SETLE   SETNA   SETNAE  SETNB   SETNBE  SETNC   SETNE
SETNG   SETNGE  SETNL   SETNLE  SETNO   SETNP   SETNS   SETNZ
SETO    SETP    SETPE   SETPO   SETS    SETZ    SHLD    SHRD
STOSD
Instruction Set (p. 4)
80486:
BSWAP   CMPXCHG INVD    INVLPG  WBINVD  XADD
Pentium I:
CMPXCHG8B       CPUID   RDMSR   RDTSC   RSM     WRMSR
Other Stuff:
CLFLUSH CMOV*   CR0     CR2     CR3     CR4     DR0-7   LDMXCSR
LFENCE  MFENCE  PAUSE   PREFETCH*       SFENCE  STMXCSR SYSENTER
SYSEXIT UD2
The Road Ahead
Floating-point instructions
Vector instructions
Standalone assembly file directives?
Structured exception handling?
Disassembly techniques?