KAIST - CS230
Introduction to
System Programming
Spring, 2022
Topics:
Staff, text, and policies
Overview
A Tour of Computer Systems
Basic Computer Systems
1 CS230 S’22
class01.ppt
Carnegie Mello
Course Theme:
Abstraction Is Good But Don’t Forget Reality
Many CS courses emphasize abstraction
Abstract data types
Asymptotic analysis
These abstractions have limits
Especially in the presence of bugs
Need to understand details of underlying implementations
Useful outcomes from taking CS230
Become more effective programmers
Able to find and eliminate bugs efficiently
Able to understand and tune for program performance
Prepare for later “systems” classes in CS & EE
Compilers, Operating Systems, Networks, Computer
Architecture, Embedded Systems, etc.
Great Reality #1:
Ints are not Integers, Floats are not Reals
Example 1: Is x² ≥ 0?
Float’s: Yes!
Int’s:
40000 * 40000 --> 1600000000
50000 * 50000 --> ??
Example 2: Is (x + y) + z = x + (y + z)?
Unsigned & Signed Int’s: Yes!
Float’s:
(1e20 + -1e20) + 3.14 --> 3.14
1e20 + (-1e20 + 3.14) --> ??
Source: xkcd.com/571
Great Reality #2:
You’ve Got to Know Assembly
Chances are, you’ll never write programs in assembly
Compilers are much better & more patient than you are
But: Understanding assembly is key to machine-level
execution model
Behavior of programs in presence of bugs
High-level language models break down
Tuning program performance
Understand optimizations done / not done by the compiler
Understanding sources of program inefficiency
Implementing system software
Compiler has machine code as target
Operating systems must manage process state
Creating / fighting malware
x86 assembly is the language of choice!
Great Reality #3: Memory Matters
Random Access Memory Is an Unphysical
Abstraction
Memory is not unbounded
It must be allocated and managed
Many applications are memory dominated
Memory referencing bugs especially pernicious
Effects are distant in both time and space
Memory performance is not uniform
Cache and virtual memory effects can greatly affect program
performance
Adapting program to characteristics of memory system can
lead to major speed improvements
Memory Referencing Bug Example
typedef struct {
    int a[2];
    double d;
} struct_t;

double fun(int i) {
    volatile struct_t s;
    s.d = 3.14;
    s.a[i] = 1073741824; /* Possibly out of bounds */
    return s.d;
}

fun(0) --> 3.14
fun(1) --> 3.14
fun(2) --> 3.1399998664856
fun(3) --> 2.00000061035156
fun(4) --> 3.14
fun(6) --> Segmentation fault
Result is system specific
Memory Referencing Bug Example
Explanation:
Location accessed by fun(i), listed from the highest index down:
6: Critical State
5: ?
4: ?
3: d7 ... d4
2: d3 ... d0
1: a[1]
0: a[0]
Locations 0 to 3 make up struct_t
Memory Referencing Errors
C and C++ do not provide any memory protection
Out of bounds array references
Invalid pointer values
Abuses of malloc/free
Can lead to nasty bugs
Whether or not bug has any effect depends on system and
compiler
Action at a distance
Corrupted object logically unrelated to one being accessed
Effect of bug may be first observed long after it is generated
How can I deal with this?
Program in Java, Ruby, Python, ML, …
Understand what possible interactions may occur
Use or develop tools to detect referencing errors (e.g. Valgrind, http://valgrind.org/)
Great Reality #4: There’s more to
performance than asymptotic complexity
Constant factors matter too!
And even exact op count does not predict performance
Easily see 10:1 performance range depending on how code
written
Must optimize at multiple levels: algorithm, data
representations, procedures, and loops
Must understand system to optimize performance
How programs compiled and executed
How to measure program performance and identify
bottlenecks
How to improve performance without destroying code
modularity and generality
Memory System Performance
Example
void copyij(int src[2048][2048],
            int dst[2048][2048])
{
    int i, j;
    for (i = 0; i < 2048; i++)
        for (j = 0; j < 2048; j++)
            dst[i][j] = src[i][j];
}

void copyji(int src[2048][2048],
            int dst[2048][2048])
{
    int i, j;
    for (j = 0; j < 2048; j++)
        for (i = 0; i < 2048; i++)
            dst[i][j] = src[i][j];
}

copyij: 4.3ms
copyji: 81.8ms
2.0 GHz Intel Core i7 Haswell
Hierarchical memory organization
Performance depends on access patterns
Including how you step through a multi-dimensional array
A Tour of Computer
Systems
Computer Systems
Computer Systems
Hardware and systems software to run application programs
Affect correctness and performance of programmer’s
programs
You will learn
How to avoid strange numerical errors caused by number
representation systems
How to optimize your C code
How procedure calls are implemented and how to avoid security
holes
How to write your own Unix shell, own web server, etc.
Begin the journey
with hello.c
http://en.wikipedia.org/wiki/ASCII
Information is Bits + Context
The hello program
#include <stdio.h>

int main()
{
    printf("hello, world\n");
}
The ASCII text representation of hello.c
# i n c l u d e <sp> < s t d i o . h > \n \n i n t <sp>
35 105 110 99 108 117 100 101 32 60 115 116 100 105 111 46 104 62 10 10 105 110 116 32
m a i n ( ) \n { \n <sp> <sp> <sp> <sp> p r i n t f ( " h e
109 97 105 110 40 41 10 123 10 32 32 32 32 112 114 105 110 116 102 40 34 104 101
l l o , <sp> w o r l d \ n " ) ; \n }
108 108 111 44 32 119 111 114 108 100 92 110 34 41 59 10 125
All information is represented as a bunch of bits
Disk files, programs, user data, data transferred across
network
Different context – the same sequence of bytes
Integer, floating-point, character string, or machine instruction
Programs are translated by other programs into different forms
[1] unix> gcc -o hello hello.c
    unix> ./hello
    hello, world
    unix>
Translation pipeline:
hello.c: source program (text)
Pre-processor (cpp) --> hello.i: modified source program (text)
Compiler (cc1) --> hello.s: assembly program (text)
Assembler (as) --> hello.o: relocatable object programs (binary)
Linker (ld), which also takes printf.o --> hello: executable object program (binary)
[2] *insert stdio.h into program text, hello.i
    unix> gcc -E -o hello.i hello.c
    unix> vi hello.i
[3] unix> gcc -S hello.c
    unix> vi hello.s
[4] unix> as -o hello.o hello.s
[5] *merge hello.o with printf.o
[6] try gdb
    unix> gcc -g -o hello hello.c
    unix> gdb hello
    gdb > r
    Starting program:
    hello, world
    gdb > disassemble main
    Dump of assembler code for function main:
    0x0000000000400498 <main+0>: push %rbp
    0x0000000000400499 <main+1>: mov %rsp,%rbp
    0x000000000040049c <main+4>: mov $0x400598,%edi
    0x00000000004004a1 <main+9>: callq 0x4003c0 <puts@plt>
    0x00000000004004a6 <main+14>: leaveq
    0x00000000004004a7 <main+15>: retq
    End of assembler dump.
Processors read and interpret instructions stored in memory
Hardware Organization
[Figure: the CPU (register file, PC, ALU, bus interface) connects over the system bus to the I/O bridge. The I/O bridge connects over the memory bus to main memory and over the I/O bus to the USB controller (mouse, keyboard), the graphics adapter (display), the disk controller (disk, with the hello executable stored on disk), and expansion slots for other devices such as network adapters.]
Reading hello command from keyboard
[Figure: the hardware organization diagram. The user types "hello" at the keyboard and the "hello" string ends up in main memory.]
Loading executable from disk into memory
[Figure: the hardware organization diagram. The hello code and the "hello,world\n" string are copied from the hello executable stored on disk into main memory.]
Writing "hello, world" from memory to display
[Figure: the hardware organization diagram. Main memory holds the hello code and the "hello,world\n" string; the string is copied from main memory to the display. The hello executable remains stored on disk.]
Caches matter
Processor-memory gap increases
A few hundred bytes of register file vs. millions of bytes in
main memory (but the register file is about 100 times faster than main memory)
[Figure: the CPU chip contains the register file, ALU, L1 cache (SRAM), and bus interface. The L2 cache (SRAM) attaches over the cache bus; main memory (DRAM) is reached over the system bus, memory bridge, and memory bus.]
Cache memories
Programmers can exploit caches to improve the
performance of their programs
Memory hierarchy
Toward the top: smaller, faster, and costlier (per byte) storage devices
Toward the bottom: larger, slower, and cheaper (per byte) storage devices
L0: Registers. CPU registers hold words retrieved from cache memory.
L1: On-chip L1 cache (SRAM). L1 cache holds cache lines retrieved from the L2 cache.
L2: Off-chip L2 cache (SRAM). L2 cache holds cache lines retrieved from memory.
L3: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
L4: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
L5: Remote secondary storage (distributed file systems, Web servers)
The operating system manages the hardware
Who accesses keyboard, display, disk?
Operating system: protection and hardware interface
Layered view of a computer system
Software: Application programs, Operating system
Hardware: Processor, Main memory, I/O devices
Abstractions provided by an operating system
Processes: Processor, Main memory, I/O devices
Virtual memory: Main memory, I/O devices
Files: I/O devices
The operating system manages the
hardware
Process (later Thread)
operating system’s abstraction for a running program
multiple processes can run concurrently on the same system
each process appears to have exclusive use of the hardware
Process Context Switching
[Figure: a timeline with a shell process and a hello process. Execution alternates between application code and OS code; each transfer from one process to the other passes through OS code, which performs the context switch.]
The operating system manages the hardware
Virtual memory
Process virtual address space in Linux, from the highest address down:
0xffffffff
Kernel virtual memory (invisible to user code)
0xc0000000
User stack (created at runtime)
Memory mapped region for shared libraries (e.g. the printf() function)
0x40000000
Run-time heap (created at runtime by malloc)
Read/write data (loaded from the hello executable file)
Read-only code and data (loaded from the hello executable file)
0x08048000
Unused
0
Files
sequence of bytes
every I/O device is modeled as a file
keyboard, network, etc.
read/write
Systems communicate with other
systems using networks
Network applications
Email, instant messaging, WWW, FTP, telnet, etc.
A network is another I/O device
Run hello in remote machine
1. User types "hello" at the keyboard
2. Local telnet client sends "hello" string to remote telnet server
3. Server sends "hello" string to the shell, which runs the hello program, and sends the output to the telnet server
4. Telnet server sends "hello, world\n" string to client
5. Client prints "hello, world\n" string on display
Intel Core i7 64 bit Processor
QuickPath Interconnect
Direct Media Interface (DMI)
Intel Core i7 Motherboard
NVIDIA GPU (Graphics Processing Unit)
Good Luck!
Computer Systems
In Practice
- Smartphones -
What is Android
A software stack for mobile devices
Developed by Google and Open Handset Alliance (OHA)
Includes a set of software components: Linux kernel,
libraries, application framework, applications
[Figure: Open Handset Alliance]
Google Android – H/W
(Galaxy S20 Ultra)
https://ko.ifixit.com/Teardown/Samsung+Galaxy+S20+Ultra+Teardown/131607
Qualcomm Snapdragon SIP
Block Diagram
https://www.qualcomm.com/media/documents/files/snapdragon-855-mobile-platform-product-brief.pdf
Google Android – S/W Platform
https://developer.android.com/guide/platform
Google Android – Service Creation
Android Applications & Development
Android Studio
https://developer.android.com/studio
Evolution of ARM Architecture
Samsung Exynos 5 Dual
Mobile Device Hardware (Arndale)
Mobile Device Hardware
Hardware Schematics
Arm Assembly Language
ARM Programming Model
Arm Assembly Language
ARM Instruction Set
Bootloader – U-Boot
[Figure: boot flow diagram. Labels: Reset, CPU, BMODE 00, ByPass PROM; ROM at 0x20000000 holding the U-Boot bootloader, the kernel, and an optional compressed root file system image; SDRAM memory at 0x1000 holding the kernel and the root file system.]
(Embedded) Linux
Android Linux Kernel
Kernel: Linux Kernel 2.6.x + patches
Patch Features and Description
Alarm: android.app.AlarmManager
Low Memory Killer: Android Out of Memory Killer
Ashmem: Android / Anonymous Shared Memory Subsystem, android.os.MemoryFile
Kernel Debugger
Binder:
  1. Driver to facilitate inter-process communication (IPC).
  2. High performance through shared memory.
  3. Per-process thread pool for processing requests.
  4. Reference counting, and mapping of object references across processes.
  5. Synchronous calls between processes.
Power Management:
  1. Built on top of standard Linux Power Management (PM).
  2. More aggressive power management policy.
  3. Components make requests to keep the power on through "wake locks".
  4. Supports different types of wake locks.
Logger: Supports the Android Logcat utility (using logcat commands, filtering log output, controlling log output format, viewing alternative log buffers, listing of logcat command options)
EABI: Embedded Application Binary Interface
Android 101
Activity, Service, Content Provider
Example: an Activity issues the command to play music, but the
program that plays it runs in a Service
Android Internals
Dalvik
Virtual machine on Android mobile devices
Runs applications which have been converted into a
compact Dalvik Executable (.dex) format suitable for
systems that are constrained in terms of memory and
processor speed
Unlike most virtual machines and true Java VMs, which are
stack machines, the Dalvik VM is a register-based architecture
Android Internals
WebKit Project
WebKit
Open Source Web Engine
1.8M lines of code, mostly C++
Chrome, Safari, iPhone, S60, Android…
~10% of worldwide browser market
150 committers, 80 active
Main Feature
Display Web Content
Following Links
Managing a back-forward list
Managing history of Pages recently visited
Structure
WebKit: Thin library, OS interaction layer; Rendering, History, Public API
WebCore: All the logic, most of the code; Rendering, Layout, Styling, Parsing, Network logic, Painting, DOM Bindings, everything you think of…
Platform / JS Engine: JavaScriptCore / V8; JavaScript Engine, JS Execution
Android Internals
SQLite
• An open source embedded relational database management system
• Widely used on memory-constrained devices like cellphones, PDAs, MP3 players, etc.
• Android also uses SQLite as a database management library
Android Internals
SGL/OpenGL-ES
Android Internals
Media Framework