Data Representation in Computers
Instructor: Ch. Bilal Ahmad Khan
Data Forms
Human communication
Includes language, images and sounds
Computers
Process and store all forms of data in binary format
Conversion to computer-usable representation using data formats
Define the different ways human data may be represented, stored and processed by a computer
Data conversion and representation
Common Data Representations
Type of Data                 Standard(s)
Alphanumeric                 BCD, ASCII, EBCDIC, Unicode
Image (bitmapped)            GIF (Graphics Interchange Format), TIF (tagged image file format), PNG (portable network graphics), JPEG
Image (object)               PostScript, SWF (Macromedia Flash), SVG
Outline graphics and fonts   PostScript, TrueType
Sound                        WAV, AVI, MP3, MIDI, WMA
Page description             PDF (Adobe Portable Document Format), HTML, XML
Video and Sound              QuickTime, MPEG-2, MPEG-4, RealVideo, WMV
Alphanumeric Data
Groups of data:
Characters: A, B, ..., Z and a, b, ..., z
Numbers/digits: 0-9
Punctuation: !, ;, :, ?, etc.
Special-purpose characters: $, @, #, *, ..., &
Four coding systems/standards represent the above types:
BCD (Binary-Coded Decimal)
ASCII (American Standard Code for Information Interchange)
EBCDIC (Extended Binary Coded Decimal Interchange Code)
Unicode
Standard Alphanumeric Formats
BCD ASCII EBCDIC Unicode
Binary-Coded Decimal (BCD)
Four bits per digit
Digit  Bit pattern
0      0000
1      0001
2      0010
3      0011
4      0100
5      0101
6      0110
7      0111
8      1000
9      1001
Note: the following six bit patterns are not used: 1010, 1011, 1100, 1101, 1110, 1111
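The digit table can be generated rather than memorised. The sketch below is only an illustration: Python is an assumption (the slides name no language), and the names BCD_TABLE and UNUSED_PATTERNS are made up for this example.

```python
# Build the BCD digit table: each decimal digit maps to its 4-bit pattern.
BCD_TABLE = {digit: format(digit, "04b") for digit in range(10)}

# Four bits give 16 patterns; the six above 1001 are not valid BCD digits.
UNUSED_PATTERNS = [format(value, "04b") for value in range(10, 16)]

for digit, pattern in BCD_TABLE.items():
    print(digit, pattern)
print("unused:", " ".join(UNUSED_PATTERNS))
```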
BCD: Example
7093 (base 10) = ? (in BCD)
7    0    9    3
0111 0000 1001 0011
Or: 0111000010010011
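The digit-by-digit conversion in this example can be sketched in Python. The helper names to_bcd and from_bcd are hypothetical, chosen for this illustration; the sketch assumes non-negative integers and one 4-bit group per digit.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative integer as BCD, one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bits: str) -> int:
    """Decode a BCD bit string (spaces optional) back to an integer."""
    packed = bits.replace(" ", "")
    if len(packed) % 4 != 0:
        raise ValueError("BCD strings must contain a multiple of four bits")
    digits = []
    for start in range(0, len(packed), 4):
        group = packed[start:start + 4]
        value = int(group, 2)
        if value > 9:
            # 1010 through 1111 are the six unused patterns.
            raise ValueError(f"{group} is not a valid BCD digit")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(7093))                  # 0111 0000 1001 0011
print(from_bcd("0111000010010011"))  # 7093
```

Rejecting groups above 1001 in from_bcd mirrors the note that six of the sixteen 4-bit patterns are unused.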
ASCII Features
Developed by ANSI (American National Standards Institute)
Defined in ANSI document X3.4-1977
7-bit code
8th bit is unused (or used for a parity bit or to indicate an extended character set)
2^7 = 128 different codes
Two general types of codes:
95 are printing codes (displayable on a console)
33 are control codes (control features of the console or communications channel)
Represents
Latin alphabet, Arabic numerals, standard punctuation characters
Plus a small set of accents and other European special characters (Latin-1 ASCII)
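The code counts above can be checked with a short Python sketch. The helper with_even_parity is hypothetical and assumes even parity; the slide says only that the 8th bit may carry a parity bit, not which kind.

```python
# All 2**7 = 128 seven-bit codes.
ALL_CODES = range(2 ** 7)

# Printing codes run from space (32) to tilde (126); the rest are control codes.
printing = [code for code in ALL_CODES if 32 <= code <= 126]
control = [code for code in ALL_CODES if code < 32 or code == 127]

print(len(ALL_CODES), len(printing), len(control))  # 128 95 33


def with_even_parity(code: int) -> int:
    """Return an 8-bit value whose 8th bit makes the total count of 1 bits even."""
    ones = bin(code).count("1")
    return code | ((ones % 2) << 7)


# 'a' is 1100001 (three 1 bits), so the parity bit is set.
print(format(with_even_parity(ord("a")), "08b"))  # 11100001
```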
ASCII Table
MSB\LSB  0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
000      NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
001      DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
010      SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
011      0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
100      @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
101      P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
110      `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
111      p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL
The three most significant bits select the row; the four least significant bits select the column.
e.g., a = 110 0001 (row 110, column 0001)
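The lookup for "a" can be reproduced in Python, whose built-in ord returns a character's code. The function name ascii_bits is hypothetical and used only for this sketch.

```python
def ascii_bits(char: str) -> tuple[str, str]:
    """Split a character's 7-bit ASCII code into (3 high bits, 4 low bits)."""
    code = ord(char)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    bits = format(code, "07b")
    return bits[:3], bits[3:]


high, low = ascii_bits("a")
print(high, low)   # 110 0001
print(high + low)  # 1100001
```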
95 printing codes: 0100000 (SP) through 1111110 (~), i.e. rows 010 to 111 except DEL
33 control codes: rows 000 and 001 (0000000 through 0011111), plus DEL (1111111)
Alphabetic codes: A to Z are 1000001 through 1011010; a to z are 1100001 through 1111010
Numeric codes: 0 to 9 are 0110000 through 0111001 (row 011, columns 0000 to 1001)
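Because the digit characters occupy consecutive codes, a digit's value can be recovered by subtraction. The sketch below (Python assumed; digit_value is a hypothetical helper) also prints the low four bits of each digit code, which match the BCD pattern for that digit.

```python
def digit_value(char: str) -> int:
    """Convert a single ASCII digit character to its numeric value."""
    if len(char) != 1 or not "0" <= char <= "9":
        raise ValueError("expected a single character between '0' and '9'")
    return ord(char) - ord("0")


for char in "0123456789":
    code = ord(char)
    # The high three bits are always 011; the low four bits equal the digit's BCD pattern.
    print(char, format(code >> 4, "03b"), format(code & 0b1111, "04b"))

print(digit_value("7"))  # 7
```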
Punctuation, etc.: the remaining printing codes, i.e. all of row 010, columns 1010 to 1111 of row 011, @, [ \ ] ^ _, `, and { | } ~
ASCII Table (hexadecimal form)
MSD\LSD  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0        NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
1        DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
2        SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3        0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4        @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5        P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6        `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7        p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL
Example: 74 (hex) = 111 0100 (binary) = t
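Reading the hexadecimal table amounts to splitting a code into its most significant and least significant hex digits. A minimal Python sketch follows, with hypothetical helper names msd_lsd and from_msd_lsd.

```python
def msd_lsd(char: str) -> tuple[str, str]:
    """Return the (row, column) hex digits that locate a character in the table."""
    msd, lsd = divmod(ord(char), 16)
    return format(msd, "X"), format(lsd, "X")


def from_msd_lsd(msd: str, lsd: str) -> str:
    """Look up the character at hex row `msd`, hex column `lsd`."""
    return chr(int(msd + lsd, 16))


print(msd_lsd("t"))            # ('7', '4')
print(from_msd_lsd("7", "4"))  # t
print(format(0x74, "07b"))     # 1110100
```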
Example: Hello, world
Character  Binary   Hexadecimal  Decimal
H          1001000  48           72
e          1100101  65           101
l          1101100  6C           108
l          1101100  6C           108
o          1101111  6F           111
,          0101100  2C           44
SP         0100000  20           32
w          1110111  77           119
o          1101111  6F           111
r          1110010  72           114
l          1101100  6C           108
d          1100100  64           100
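The per-character codes for this string can be computed directly. The sketch below assumes Python and uses the illustrative variable names text, hex_codes and decimal_codes.

```python
text = "Hello, world"

for char in text:
    code = ord(char)
    # Label the space as SP so its row is not blank.
    label = "SP" if char == " " else char
    print(f"{label:<3}{code:07b}  {code:02X}  {code:3d}")

# The same codes collected as lists.
hex_codes = [format(ord(char), "02X") for char in text]
decimal_codes = [ord(char) for char in text]
```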
EASCII
EBCDIC
8-bit code
Developed by IBM for mainframe computers
Rarely used today, common in archival data
Character codes differ from ASCII:
Character  ASCII (hex)  EBCDIC (hex)
Space      20           40
A          41           C1
b          62           82
Conversion software to/from ASCII available
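As one illustration of conversion software, Python's standard codecs include the EBCDIC code page cp037. Choosing cp037 is an assumption of this sketch: several EBCDIC variants exist and they do not agree on every character. The helper ebcdic_to_ascii is hypothetical.

```python
# cp037 is one EBCDIC code page shipped with Python; other EBCDIC variants
# may assign some characters differently. Its choice here is an assumption.
EBCDIC_CODEC = "cp037"

for char in (" ", "A", "b"):
    ascii_byte = char.encode("ascii")[0]
    ebcdic_byte = char.encode(EBCDIC_CODEC)[0]
    label = "Space" if char == " " else char
    print(f"{label:<6} ASCII {ascii_byte:02X}  EBCDIC {ebcdic_byte:02X}")


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Convert EBCDIC (cp037) bytes to ASCII bytes; raises on non-ASCII characters."""
    return data.decode(EBCDIC_CODEC).encode("ascii")


print(ebcdic_to_ascii(b"\xC8\x85\x93\x93\x96"))  # b'Hello'
```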
EBCDIC Table (1 of 2)
EBCDIC Table (2 of 2)
Unicode
Most common 16-bit form represents 2^16 = 65,536 characters
EASCII is a subset of Unicode
Occupies values 0 to 255 in the Unicode table
Multilingual: defines codes for
Nearly every character-based alphabet
Chinese, Japanese and Korean character sets
Allows software to be modified for local-language representations
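These points can be checked in Python, whose strings are sequences of Unicode code points. The sketch rests on two assumptions: "EASCII" is taken to mean the Latin-1 (ISO 8859-1) set, and the 16-bit form is shown as big-endian UTF-16.

```python
# A 16-bit code has 2**16 distinct values.
print(2 ** 16)  # 65536

# Decoding each single byte as Latin-1 yields the Unicode code point with the
# same number, so values 0 to 255 coincide.
latin1_matches = all(
    bytes([value]).decode("latin-1") == chr(value) for value in range(256)
)
print(latin1_matches)  # True

# A, e with acute accent, and a CJK ideograph, written as escapes to keep the
# source ASCII-only. Each fits in one 16-bit unit.
for char in ("A", "\u00e9", "\u4e2d"):
    print(f"U+{ord(char):04X}", char.encode("utf-16-be").hex())
```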
Two-byte Unicode Assignment Table