C Programming Language
- first session -
C Programming Language First Part
What is C?
The evolution of C
Strengths and weaknesses
Overview of C programming
Lexical elements
The C preprocessor
Q&A
What is C?
The C programming language was originally developed for, and implemented on, the UNIX operating system by Dennis Ritchie in the early 1970s.
One of the best features of C is that it is not tied to any particular hardware or system.
This makes it easy to write programs that run with little or no change on a wide range of machines.
C is often called a middle-level language because it combines elements of high-level languages with the low-level control of assembly language.
The evolution of C
When | Who, where | Comments
Late 1960s | Ken Thompson, Bell Laboratories | Developed the B language (based on BCPL, which in turn is based on Algol 60) for UNIX operating system development. A higher-level programming language was needed for further development of the system (previously, assembly language was used)
1971 | Dennis Ritchie, Bell Laboratories | Developed an extended version of B called NB (New B), later renamed C
1973 | | UNIX was rewritten in C
1978 | Brian Kernighan and Dennis Ritchie | The C Programming Language, 1st edition, referred to as K&R; it was the de facto standard
The evolution of C
When | Comments
1983 | ANSI (American National Standards Institute) starts the standardization process of the C programming language
1989 | Standard adopted as ANSI Standard X3.159-1989 (ANSI C)
1990 | Standard adopted by ISO as the international standard ISO/IEC 9899:1990 (C89 or C90)
1995 | Amendment 1 extends C89 (C89 with Amendment 1, or C95)
1999 | New C standard adopted as ISO/IEC 9899:1999 (C99)
Changes are in the spirit of C: they do not change the fundamental nature of the language
C99 isn't yet universal, and it will take time before all compilers are C99 compliant
Latest information at http://www.open-std.org
Strengths and weaknesses
Strengths
Efficiency: programs execute fast and use limited amounts of memory
Portability: programs may run on different computing systems
Power: C has a large collection of types and operators
Flexibility: C may be used to write programs for embedded systems or for commercial data processing systems
Standard library: has plenty of functions for I/O, string processing, storage allocation, etc.
Weaknesses
Programs can be error-prone: C allows some programming mistakes to go undetected (e.g. misuse of the & operator)
Programs can be difficult to understand: C allows programs to be written in a quite cryptic manner (e.g. a[(i>>1) + j] += i>>j;)
Programs can be difficult to modify: large programs can be hard to modify if they haven't been designed with maintenance in mind
Overview of C programming
A C program is composed of one or more source files, each of which contains some part of the entire C program: typically some number of external functions
Common declarations are often collected into header files and are included into the source files with a special #include command
One external function must be named main and this function is where the program starts
EXAMPLE
#include <stdio.h>
#define SIZE = 10
int size(int a[SIZE])
{
    int ret;
    ret=printf("size of array is:%d\n", sizeof(a));
    return ret;
}
int main()
{
    int a[SIZE];
    (void)size(a);
    return 0;
}
Q: Will the program compile?
Q: What would be the output of the program?
Q: What would be the value returned by the function size?
Overview of C programming: Embedded vs. Desktop Programming
Overview of C programming
A C compiler independently processes each
source file and translates the C program text
into instructions understood by the computer
The output of the compiler is usually called
object code or an object module
When all source files are compiled, the object
modules are given to a program called the
linker
The linker resolves references between the modules and adds functions from the standard run-time library
The linker produces a single executable
program which can then be invoked or run
[Figure: each C source file is compiled into an object file; the object files and the library are linked into one executable module]
Overview of C programming A C Program
Lexical elements
In C source programs the blank (space), end-of-line, vertical tab, form feed, and horizontal tab (if present) are known collectively as whitespace characters.
Comments are also whitespace
EXAMPLE
#define nine (3*3)
Is equivalent to
#define nine /* this
is nine
*/ (3*3)
The end-of-line character or character sequence marks the end of source program lines. In some implementations, the formatting characters carriage return, form feed, and (or) vertical tab additionally terminate source lines and are called line break characters
A source line can be continued onto the next line by ending the first line with a reverse solidus or backslash (\) character. The backslash and end-of-line marker are removed to create a longer, logical source line
EXAMPLE
if (a==b) X=1; el\
se X=2;
Is equivalent to the single line
if (a == b) X=1; else X=2;
Lexical elements
Comments:
Traditionally, a comment begins with an occurrence of
the two characters /* and ends with the first
subsequent occurrence of the two characters */
Beginning with C99, a comment also begins with the
characters // and extends up to (but does not include)
the next line break
Comments are not recognized inside string or
character constants or within other comments
Comments are removed by the compiler before
preprocessing
Standard C specifies that all comments are to be
replaced by a single space
EXAMPLE
// Program to compute the squares of
// the first 10 integers
#include <stdio.h>
void Squares ( /* no arguments */ )
{
    int i;
    /*
    Loop from 1 to 10,
    printing out the squares
    */
    for (i=1; i<=10; i++)
        printf("%d //squared// is %d\n", i, i*i);
}
EXAMPLE
To cause the compiler to ignore large parts
of a C program, it is best to enclose the
parts to be removed with the preprocessor
commands
#if 0
#endif
rather than insert /* before and */ after the
text.
Lexical elements
An identifier, or name, is a sequence of Latin capital and small letters, digits, and the underscore character
An identifier must not begin with a digit, and it must not have the same spelling as a keyword. C
is case sensitive
Standard C further reserves all identifiers beginning with an underscore and followed by either
an uppercase letter or another underscore
C89 requires implementations to permit a minimum of 31 significant characters in identifiers, and
C99 raises this minimum to 63 characters
External identifiers (those declared with storage class extern) may have additional spelling restrictions: C89 requires a minimum capacity of only six characters, not counting letter case.
C99 raises this to 31 characters
The C preprocessor - overview
The C preprocessor is a simple macro processor that conceptually
processes the source text of a C program before the compiler proper reads the
source program
The preprocessor is controlled by special preprocessor command lines, which
are lines of the source file beginning with the character #
The preprocessor typically removes all preprocessor command lines from the
source file and makes additional transformations on the source file as
directed by the commands
The syntax of preprocessor commands is completely independent of
(although in some ways similar to) the syntax of the rest of the C language
The C preprocessor - overview
The preprocessor does not parse the source text, but it does break it up into tokens for the purpose of locating macro calls
Standard C permits whitespace to precede and follow the # character on the same source line
Preprocessor lines are recognized before macro expansion
[Figure: C source file -> Preprocess -> Modified C source file -> Compile -> Object code]
The C preprocessor
Command | Meaning
#define | Define a preprocessor macro.
#undef | Remove a preprocessor macro definition.
#include | Insert text from another source file.
#if | Conditionally include some text based on the value of a constant expression.
#ifdef | Conditionally include some text based on whether a macro name is defined.
#ifndef | Conditionally include some text with the sense of the test opposite to that of #ifdef.
#else | Alternatively include some text if the previous #if, #ifdef, #ifndef, or #elif test failed.
#endif | Terminate conditional text.
#line | Supply a line number for compiler messages.
defined | Preprocessor function that yields 1 if a name is defined as a preprocessor macro and 0 otherwise; used in #if and #elif.
# operator | Replace a macro parameter with a string constant containing the parameter's value.
## operator | Create a single token out of two adjacent tokens.
#pragma | Specify implementation-dependent information to the compiler.
#error | Produce a compile-time error with a designated message.
The C preprocessor - #define
The #define preprocessor command causes a name (identifier) to become defined as a macro to the preprocessor
A sequence of tokens, called the body of the macro, is associated with the name
The #define command has two forms:
An object-like macro takes no arguments. It is invoked by mentioning its name
A function-like macro declares the names of formal parameters within parentheses, separated by commas
The left parenthesis must immediately follow the name of the macro with no intervening whitespace
EXAMPLES
#define BLOCK_SIZE 0x100
#define TRACK_SIZE (16*BLOCK_SIZE)
#define product(x,y) ((x)*(y))
#define incr(v,low,high) \
    for ((v) = (low); (v) <= (high); (v)++)
#ifndef MAXTABLESIZE
#define MAXTABLESIZE 1000
#endif
The C preprocessor - #define
Once a macro call has been expanded, the scan for macro calls resumes at the beginning of the expansion so that names of macros may be recognized within the expansion for the purpose of further macro replacement
EXAMPLE
#define plus(x,y) add(y,x)
#define add(x,y) ((x)+(y))
the invocation plus(plus(a,b),c) is expanded as shown next
Step | Result
1 | plus(plus(a,b),c)
2 | add(c,plus(a,b))
3 | ((c)+(plus(a,b)))
4 | ((c)+(add(b,a)))
5 | ((c)+(((b)+(a))))
Macros appearing in their own expansion, either immediately or through some intermediate sequence of nested macro expansions, are not re-expanded in Standard C
EXAMPLE
#define sqrt(x) ((x) < 0 ? sqrt(-(x)) : sqrt(x))
Here the sqrt names inside the body are not expanded again, so they refer to the library function
The C preprocessor - #define
Macros operate purely by textual substitution of tokens. This can lead to surprising results if care is not taken:
EXAMPLES
#define SQUARE(x) x*x
The invocation
SQUARE(z+1)
will be expanded into: z+1*z+1
WHICH IS NOT WHAT WAS INTENDED
SOLUTION:
As a rule, it is safest to always parenthesize each parameter appearing in the macro body
The entire body, if it is syntactically an expression, should also be parenthesized.
#define SQUARE(x) ((x)*(x))
The invocation
SQUARE(z++)
will be expanded into: ((z++)*(z++))
WHICH HAS THE SIDE EFFECT OF INCREMENTING z TWICE (the behavior is undefined in Standard C)
Function-like macros do not allow the debugger to trace or step into them, so errors inside them are hard to find by debugging.
SOLUTION: USE A TRUE FUNCTION, NOT A FUNCTION-LIKE MACRO
int square(int x)
{
    return x*x;
}
Q: What would be the disadvantage of using a function instead of a function-like macro?
The C preprocessor - #include
The #include preprocessor command causes the entire contents of a specified source text file
to be processed as if those contents had appeared in place of the #include command
The #include command has the following forms in Standard C:
#include <char-sequence>
searches for the file in certain standard places according to implementation-defined search
rules
#include "char-sequence"
will also search in the standard places, but usually after searching some local places, such
as the programmer's current directory
The C preprocessor - #if,#endif usecase
The preprocessor conditional commands allow lines of source text to be passed through or eliminated by the preprocessor on the basis of a computed condition
The preprocessor replaces any name in the #if expression that is not defined as a macro with the constant 0
The expressions that may be used in #if and #elif commands include integer constants and all the integer arithmetic, relational, bitwise and logical operators
EXAMPLE
#define X86 0
#define ARM 0
#define PPC 1
#if X86
X86-dependent code
#endif
#if ARM
ARM-dependent code
#endif
#if PPC
PPC-dependent code
#endif
EXAMPLE
#define X86 1
#undef ARM
#undef PPC
#ifdef X86
X86-dependent code
#endif
#ifdef ARM
ARM-dependent code
#endif
#ifdef PPC
PPC-dependent code
#endif
C Programming Language Second Part
Data representation
Types
Conversions
Q&A
Data representation
All data objects in C except bit fields are represented at run time in the computer's memory in an
integral number of abstract storage units
Each storage unit is made up of some fixed number of bits, each of which can assume either of
two values, denoted 0 and 1
Each storage unit must be uniquely addressable and is the same size as type char
The C Standard also calls storage units bytes; a byte has at least eight bits (CHAR_BIT in <limits.h>), and on most machines exactly eight
The size of a data object is the number of storage units occupied by that data object
Data representation
The addressing model most natural for C is one in which each character (byte) in the computer's memory can be individually addressed
Computers using this model are called byte-addressable computers
Byte order establishes which byte of storage is considered to be the "first" one in a larger piece:
"little-endian" architectures: the address of a 32-bit integer is also the address of the low-order byte of the integer (Intel convention)
"big-endian" architectures: the address of a 32-bit integer is the address of the high-order byte of the integer (Motorola convention)
EXAMPLE
i = 305419896 = 0x12345678 (4 bytes)
Little endian
Memory address | Memory location
0x1200 | 0x78
0x1201 | 0x56
0x1202 | 0x34
0x1203 | 0x12
Big endian
Memory address | Memory location
0x1200 | 0x12
0x1201 | 0x34
0x1202 | 0x56
0x1203 | 0x78
Data representation
Some computers impose alignment restrictions on certain data types, requiring that objects of those types occupy only certain addresses
It is not unusual for a byte-addressed computer, for example, to require that 32-bit (4-byte) integers be located on addresses that are a multiple of four
The C programmer is not normally aware of alignment restrictions because the compiler takes care to place data on the appropriate address boundaries
EXAMPLE
i = 305419896 = 0x12345678 (4 bytes)
Aligned on word boundary
Memory address | Memory location
0x1200 | 0x78
0x1201 | 0x56
0x1202 | 0x34
0x1203 | 0x12
Not aligned on word boundary
Memory address | Memory location
0x1201 | 0x78
0x1202 | 0x56
0x1203 | 0x34
0x1204 | 0x12
Data Types
Data Types & sizes
Standard C specifies the minimum precision for most integer types:
Type char must be at least 8 bits wide
Type short must be at least 16 bits wide
Type long must be at least 32 bits wide
Type long long must be at least 64 bits wide
Standard C requires that int not be shorter than short int and long int not be shorter than int
Integer types on a 16-bit machine
Type | Smallest value | Largest value
short int | -32768 | 32767
unsigned short int | 0 | 65535
int | -32768 | 32767
unsigned int | 0 | 65535
long int | -2147483648 | 2147483647
unsigned long int | 0 | 4294967295
Integer types on a 32-bit machine
Type | Smallest value | Largest value
short int | -32768 | 32767
unsigned short int | 0 | 65535
int | -2147483648 | 2147483647
unsigned int | 0 | 4294967295
long int | -2147483648 | 2147483647
unsigned long int | 0 | 4294967295
Data Types
The character type in C is an integral type
Character type specified:
char
signed char
unsigned char
For reasons of efficiency, C compilers are free to treat type char in either of two ways:
Type char may be a signed integral type equivalent to signed char
Type char may be an unsigned integral type equivalent to unsigned char
EXAMPLES
unsigned char uc = -1;
signed char sc = -1;
char c = -1;
int i = uc, j = sc, k = c;
i must have the value 255 (assuming 8-bit characters)
j must have the value -1
it is implementation-defined whether k has the value 255 or -1
Data Types
Floating-point type specifiers:
float
double
long double (C89)
C does not dictate the sizes to be used for the floating-point types or even that they be different
The programmer can assume that the values representable in type float are a subset of those in type
double, which in turn are a subset of those in type long double
EXAMPLES
double d;
static double pi;
float coefficients[8];
long double epsilon;
Types - pointers
For any type T, a pointer type "pointer to T" may be formed
Pointer types are referred to as object pointers or function pointers depending on whether T is an object
type or a function type.
A value of pointer type is the address of an object or function of type T.
The two most important operators used in conjunction with pointers are:
the address operator, &, which creates pointer values
indirection operator, *, which dereferences pointers to access the object pointed to
The size of a pointer is implementation-dependent and in some cases varies depending on the type of the
object pointed to
Types - pointers
EXAMPLES
int i, j, *ip;
ip = &i;
i = 22;
j = *ip; /* j now has the value 22 */
*ip = 17; /* i now has the value 17 */
Memory locations after each statement (two bytes per variable, low-order byte first, as drawn on the slides):
After statement | i (address 0x1200) | j (address 0x1220) | ip (address 0x1240)
int i, j, *ip; | 0x00 0x00 | 0x00 0x00 | 0x00 0x00
ip = &i; | 0x00 0x00 | 0x00 0x00 | 0x00 0x12
i = 22; | 0x16 0x00 | 0x00 0x00 | 0x00 0x12
j = *ip; | 0x16 0x00 | 0x16 0x00 | 0x00 0x12
*ip = 17; | 0x11 0x00 | 0x16 0x00 | 0x00 0x12
Types - pointers
The need for a generic data pointer that can be converted to any object pointer type arises
occasionally in low-level programming.
Standard C introduced the type void* as a "generic pointer"
Type void* is considered to be neither an object pointer nor a function pointer
EXAMPLES
void *generic_ptr;
int *int_ptr;
char *char_ptr;
generic_ptr = int_ptr; /* OK */
int_ptr = generic_ptr; /* OK */
int_ptr = char_ptr; /* invalid */
int_ptr = (int*) char_ptr; /* OK */
void *memcpy(void *s1, const void *s2, size_t n);
Types - Arrays
If T is any C type except void or a function type, then the type "array of T" may be declared
The length of the array may be specified by any integer constant expression
Values of this type are sequences of elements of type T. All arrays are 0-origin.
These values, known as elements, can be individually selected by their position within the array
Subscripting or indexing is used to access a particular element
EXAMPLE
int A[3];
A[0] = 100;
A[1] = 200;
A[2] = 300;
is equivalent to
int A[3] = { 0x0064, 0x00C8, 0x012C };
Variable | Memory address | Memory locations
A[0] | 0x1200 | 0x64 0x00
A[1] | 0x1202 | 0xC8 0x00
A[2] | 0x1204 | 0x2C 0x01
Types - Arrays
C doesn't require subscript bounds to be checked; if a subscript goes out of range, the program's behavior is undefined.
An array subscript may be any integer expression
The sizeof operator can determine the size of an array (in bytes); it returns a value equal to the product of the length of the array and the size of one element.
EXAMPLE
int a[10], i;
for ( i = 0; i <= 10 ; i++ )
    a[ i ] = 0;
Q: What is wrong with the code above?
Q: What is the value of the sizeof operator applied to array a?
Types - Arrays
In C there is a close correspondence between types "array of T" and "pointer to T":
When an array identifier appears in an expression, the type of the identifier is converted from "array of T" to "pointer to T", and the value of the identifier is converted to a pointer to the first element of the array
The only exceptions to this conversion rule are when the array identifier is used as an operand of the sizeof or address (&) operators, in which case sizeof returns the size of the entire array and & returns a pointer to the array (not a pointer to a pointer to the first element)
EXAMPLE
int a[10], *ip;
ip = a;
It is exactly the same as:
ip = &a[0];
Variable | Memory address | Memory locations
a[0] | 0x1200 | 0x00 0x00
ip | 0x1214 | 0x00 0x12
Types - Arrays
In C there is a close correspondence between types "array of T" and "pointer to T":
Array subscripting is defined in terms of pointer arithmetic. That is, the expression a[i] is defined to be the same as *((a) + (i)), where a is converted to &a[0]
When ip points to a[0], a[i] is the same as ip[i]; so, any pointer may be subscripted just like an array
EXAMPLE
int a[10], *ip, i;
ip = a;
i = ip[1]; /* i is equal to a[1] */
i = *(ip+2); /* i is equal to a[2] */
Variable | Memory address | Memory locations
a[0] | 0x1200 | 0x00 0x00
ip | 0x1214 | 0x00 0x12
Types - Arrays
Multidimensional arrays are declared as arrays of arrays
The language places no limit on the number of dimensions an array may have
C stores arrays in row-major order, with row 0 first, then row 1, and so forth
The first dimension of an array may be left empty (for example, when an initializer determines it)
EXAMPLE
int a[2][3] = { { 1, 2, 3 },
                { 4, 5, 6 } };
a is a pointer to the first 3-element subarray
a+1 is a pointer to the second 3-element subarray
*(a+1) is a pointer to the first integer in that subarray
*(a+1)+2 is a pointer to the third integer in the second 3-element subarray
*(*(a+1)+2) is the third integer in the second 3-element subarray
a[1][2] is the same as *(*(a+1)+2)
Variable | Memory address | Memory locations
a[0][0] (row 0) | 0x1200 | 0x01 0x00
a[0][1] | 0x1202 | 0x02 0x00
a[0][2] | 0x1204 | 0x03 0x00
a[1][0] (row 1) | 0x1206 | 0x04 0x00
a[1][1] | 0x1208 | 0x05 0x00
a[1][2] | 0x120A | 0x06 0x00
Types
The type "function returning T" is a function type, where T may be any type except "array of ..." or "function returning ..."
Functions may not return arrays or other functions, although they can return pointers to arrays and functions
Functions may be introduced in only two ways:
A function definition can create a function, define its parameters and return value, and supply the body of the function
A function declaration can introduce a reference to a function object defined elsewhere
The only operations on an expression of function type are converting it to a function pointer and calling it
EXAMPLE
extern int f(), (*fp)(int, int), (*apf[])(double);
int i, j, k;
i = f(14);
i = (*fp)(j, k);
i = (*apf[j])(k);
EXAMPLE
extern int f();
int (*fp1)(), (*fp2)();
fp1 = f; /* implicit conversion to pointer */
fp2 = &f; /* explicit manufacture of a pointer */
Types - void
The type void has no values and no operations
Type void is used:
as the return type of a function, signifying that the function returns no value
in a cast expression when it is desired to explicitly discard a value
to form the type void *, a "universal" data pointer
in place of the parameter list in a function declarator to indicate that the function takes no arguments
EXAMPLE
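int main( void ) /* void as an empty parameter list; Standard C expects main to return int */
void func( int i ) /* function returning no value */
int func( void ) /* function taking no arguments */
(void)(x++ || y--); /* value explicitly discarded */
void *memcpy(void *s1, const void *s2, size_t n); /* "universal" data pointers */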
Conversions
The C language provides for values of one type to be converted to values of other types under several
circumstances:
A cast expression may be used to explicitly convert a value to another type
An operand may be implicitly converted to another type in preparation for performing some arithmetic or
logical operation
An object of one type may be assigned to a location (lvalue) of another type, causing an implicit type
conversion
An actual argument to a function may be implicitly converted to another type prior to the function call
A return value from a function may be implicitly converted to another type prior to the function return
Conversions
When two values must be operated on in combination, they are first converted
according to the usual binary conversions to a single common type, which is
also typically the type of the result
Usual binary conversions (choose first that applies)
If either operand has type | And the other operand has type | Standard C converts both to
long double | any real type | long double
double | any real type | double
float | any real type | float
any unsigned type | any unsigned type | the unsigned type with the greater rank
any signed type | any signed type | the signed type with the greater rank
any unsigned type | any signed type of less or equal rank | the unsigned type
any unsigned type | a signed type of greater rank that can represent all values of the unsigned type | the signed type
any unsigned type | a signed type of greater rank that cannot represent all values of the unsigned type | the unsigned version of the signed type
any other type | any other type | (no conversion)
C Keywords