(Ebook - PDF) - Programming - C and C++ in Five Days
Philip Machanick
Computer Science Department
University of the Witwatersrand
2050 Wits
South Africa
[email protected]
please contact the author for commercial use
copyright Philip Machanick 1994
contents
Preface
Part 1: Overview
  file structure
  simple program
  a few details
  hands-on: enter the program on page 2
Part 2: Language Elements
  functions
  types
  statements
  hands-on: a larger program
Part 3: Style and Idioms
  switch
  loops
  arguments
  pointers and returning values
  arrays, pointer arithmetic and array arguments
  hands-on: sorting strings
Part 4: Structured Types
  struct
  typedef
  putting it together: array of struct
  hands-on: sorting employee records
Part 5: Advanced Topics
  preprocessor
  function pointers
  traps and pitfalls
  hands-on: generalizing a sort
Part 6: Programming in the Large
  file structure revisited
  maintainability
  portability
  hiding the risky parts
  performance vs. maintainability
  hands-on: porting a program from Unix
Part 7: Object-Oriented Design
  identifying objects
  object relationships
  entities vs. actions
  example: event-driven program
  design task: simple event-driven user interface
Part 8: OOD and C
  language elements
  example
  hands-on: implementation
Part 9: Object-Oriented Design and C++
  OOD summary
  objects in C++
  stream I/O
  differences from C
  hands-on: simple example
Part 10: Classes in More Detail
  constructors and destructors
  inheritance and virtual functions
  information hiding
  static members
  hands-on: adding to a class
Part 11: Style and Idioms
  access functions
  protected vs. private
  usage of constructors
  hands-on: implementing a simple design
Part 12: Advanced Features
  mixing C and C++
  overloading operators
  memory management
  multiple inheritance
  cloning
  hands-on: 3-D array class
Part 13: Design Trade-Offs
  case study: vector class
  defining operators vs. functions
  when to inline
  the temporary problem
  hands-on: vector class using operators
Part 14: More Advanced Features and Concepts
  templates
  exceptions
  virtual base classes
  future feature: name spaces
  libraries vs. frameworks
index
Preface
C was developed in the 1970s to solve the problem of implementing the UNIX operating
system in a maintainable way. An unexpected consequence was that UNIX also became
relatively portable. Consequently, some think of UNIX as the first computer virus, but this
is erroneous. There are major technical differences between UNIX and a virus.
C was designed at a time when computer memories were small, especially on the low-end
computers for which UNIX was originally designed. At the same time, compiler-writing
techniques were not as well developed as they are today. Most of the code
optimization technology of the time was oriented towards making FORTRAN floating-point
programs as fast as possible, and tricks used in modern compilers to keep variables in
registers as long as possible, and to minimize the number of times an array index must
be computed (to give two examples), were still to be developed.
As a consequence, to make C viable for operating system development, the language
has many features which are unsafe and, with today's compiler technology, unnecessary.
Even so, only the best compilers, typically found on UNIX systems, implement really
good code generation, and typical PC compilers are not as good. Part of the reason is to
be found in the instruction set of the Intel 80x86 processor line, which has very few
general-purpose registers, and a large range of equivalent instructions to choose from in
some circumstances.
These notes introduce C with a modern style of programming, emphasizing avoidance
of the most risky features, while explaining where their use may still be appropriate. The
intended audience is experienced programmers who may be used to a more structured
language, such as Pascal, Ada or Modula-2; differences from such languages are noted
where appropriate or useful.
As a bridge to C++, object-oriented design is introduced with C as a vehicle. This
illustrates how object-oriented design is a separate concept from object-oriented
languages, even though an object-oriented language is clearly the better implementation
medium for an object-oriented design.
The notes are divided into 14 parts, each of which is followed by a hands-on or
discussion session. The first half is about C, concluding with object-oriented design and
how it relates to C. C++ is introduced as a better way of implementing object-oriented
designs.
The notes are intended to supply enough material to absorb in a week; some sources of
further information include:
Brian W Kernighan and Dennis M Ritchie. The C Programming Language (2nd edition), Prentice-Hall,
Englewood Cliffs, NJ, 1988. ISBN 0-13-110362-8.
Margaret A Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual, Addison-Wesley, Reading,
MA, 1990. ISBN 0-201-51459-1.
Stanley B Lippman. C++ Primer (2nd edition), Addison-Wesley, Reading, MA, 1989. ISBN 0-201-17928-8.
Grady Booch. Object-Oriented Design with Applications, Addison-Wesley, Reading, MA, 1991.
ISBN 0-201-56527-7.
acknowledgement
András Salamon proofread this document and suggested some clarifications.
Part 1: Overview
file structure
C source files are organized as compilable files and headers. A header file contains
declarations; a compilable file imports these declarations, and contains definitions.
A definition tells the compiler what code to generate, or to allocate storage for a
variable, whereas a declaration merely tells the compiler the type associated with a name.
Headers are generally used for publishing the interface of separately compiled files.
In UNIX it's usual to end compilable file names with .c and headers with .h.
A compilable file imports a header by a line such as (usually for system headers):
    #include <stdio.h>
or (usually for your own headers):
    #include "employees.h"
The difference between the use of <> and "" will be explained later.
When the header is imported, it's as if the #include line had been replaced by the
contents of the named file.
caution: in C, this is only a convention, but one that should not be
broken: the header could contain anything, but it should only contain
declarations and comments if your code is to be maintainable
simple program
Only a few more points are needed to write a program, so here's an example:
    #include <stdio.h>
    int main(int argc, char *argv[])
    {   int i;
        for (i=0; i < argc; i++)
            printf("command line argument[%d] = %s\n", i, argv[i]);
        return 0;
    }
The first line imports a system header, for standard input and output. The second line
is the standard way of declaring a main program. A main program returns a result of
type int to the environment that started it; this one returns 0, which by convention
signals success.
The main program has two arguments, the first of which is a count of command-line
arguments from the command that started the program. The second is a pointer to an array
of strings each of which is a separate command-line argument. By convention, the first
string is the name of the program. Note the syntax: a * is used to declare a pointer, and
an empty pair of square brackets is used to denote a variable-sized array.
The next thing to notice is the use of curly brackets for a begin-end block.
The main program declares a variable i, and uses it as a loop control variable.
Note the convention of counting from zero, and using a < test to terminate the loop.
This convention is useful to adhere to because C arrays are indexed from zero.
The for loop control actions must be in parentheses, and the initialization, test and
increment are separated by semicolons.
The body of the loop is a single statement in this case, so no {} are needed to group
statements into the body.
The body of the loop uses library function printf() to produce output. The first
string is used to format the arguments that follow. The %d in the format string causes the
next argument to be printed as a decimal integer, and the %s causes the final argument to
be printed as a string. A \n terminates a line of output.
What does the program do? No prizes for the answer.
a few details
In C, there is no distinction between functions and procedures. There is a distinction
between statements and expressions, but it is usually possible to use an expression as a
statement. For example, the following is legal C code (note use of /* */ for
comments):
    int main(int argc, char *argv[])
    {   int i = 0;
        i+1;    /* this is an expression */
        return 0;
    }
This is a silly example, but there are cases where the result of an expression is not
needed, just its side effect (i.e., what it changes in the global environment).
Functions that do not return a result are declared to return type void, which is a
general non-type also used to specify pointers that can't be dereferenced, among other
things.
A few examples of expressions that might be considered statements in other
languages:
- assignment: done with = in C, so for example, it's possible to do a string of
  initializations in one go (comparison for equality uses ==):
      int i, j;
      i = j = 0;  /* j = 0 is an expression: returns new value of j */
- procedure call: always a function call in C, even if no value is returned (on the other
  hand a value-returning function can be called as if it were a procedure call, in which
  case the value is thrown away)
- increment (var_name++) and decrement (var_name--): the exact behaviour of these
  constructs is too complex for an introduction; they are explained more fully later
Unlike some languages (e.g., LISP, Algol-68) that treat everything as expressions,
most other statements in C cannot be used to return a value. These include selection (if
and switch), loops (while, for and do-while, the latter like a Pascal repeat) and {}
blocks.
hands-on: enter the program on page 2
aims: learn to use the editor and compiler; get a feel for C syntax.
caution: C is case-sensitive, so be careful to observe capitalization (for
example: r and R are different variable names, but someone
maintaining your code won't thank you if you exploit this feature)
Part 2: Language Elements
functions
Functions in C do not have to have a return type specified: the default is int (an ANSI C
rule that the later C99 standard removed). It is a good convention however to put the type
in even in this case. Functions that are called as procedures (i.e., return no value) are
declared as returning void.
A function (as we shall see later) can be stored as a variable or passed as a
parameter, so a function has a type like any other value.
The complete specification of the function's type is given by a prototype, specifying
the function's return type, name and argument types, for example:
    void sort(int data[], int n);
It is permissible to leave out the names of the arguments:
    void sort(int [], int);
This is not good practice: names make the purpose of the arguments more obvious.
Prototypes are usually used in headers, but can also be used in a compilable file if the
function is called before it's defined. As we shall see later, prototypes are also used in
C++ class declarations.
In C, parameter passing is by value: values of arguments are copied to the function.
To pass by reference (Pascal var parameters), you create a pointer to the parameter in the
call. This is done using the & operator, which creates a pointer to its operand. For
example:
    void swap(int *a, int *b)
    {   int temp;
        temp = *a;
        *a = *b;
        *b = temp;
    }
    /* called somewhere: */
    int first, second;
    /* give them values, then: */
    swap(&first, &second);
Inside the function, a and b are pointers to ints (int*). To access their values in the
function they must be dereferenced. The Pascal dereference operator is ^; C's is *. A
notational convenience: you write a variable in a declaration the same way as you write it
when you dereference it. This makes it easy to remember where to put the *.
In the call, the variables first and second are not of a pointer type, so a pointer to
the values they hold has to be created explicitly using the & operator.
What would happen if the example changed as follows?
    void swap(int a, int b)
    {   int temp;
        temp = a;
        a = b;
        b = temp;
    }
    /* called somewhere: */
    int first, second;
    /* give them values */
    swap(first, second);  /* what does this actually do? */
To return a value from a function:
    return value;  /* immediately exits function */
Functions returning void can use return with no value to exit immediately (e.g. on
discovering an error condition).
Unlike in Pascal, functions can only be global, though their names can be restricted to
file scope by declaring them static (see Part 6 for more on static).
types
A few C types have sneaked in without introduction.
Now it's time to be more explicit.
We've already seen the types int (integer) and int* (pointer to integer).
Type int corresponds to the integer type of most other languages, though as usual it is
possible for different compilers on the same machine to have different conventions as to
the size of a given type.
In C it is often assumed that an int is the size of a machine address, though when we
reach the portability section in Part 6 we shall see that this can be a problem.
Type char is a single byte integer type.
Integer types can be qualified by short, long or unsigned (or any combinations of
these that make sense and are supported by the compiler; e.g., short long doesn't
make sense). You can leave int out if a qualifier is used. The Standard fixes only
minimum sizes (at least 16 bits for short and int, at least 32 bits for long) and leaves
the rest to the compiler. Often, long and int are the same.
(Originally, int was 16 bits and long 32 bits; when the transition to 32 bits was made,
many compiler writers left long as 32 bits.)
There is no boolean type: any integer value other than zero tests as true.
exercise: look up the sizes for your compiler
even better exercise: use sizeof(type_name) to find out
note: sizeof has to have parentheses if called on a type name, but not if
called on an expression, e.g., sizeof sortdata
Reals are represented by type float. Inconsistently, the type for a double-precision
float is double (and not long float). Extended precision reals are long double.
To tell the compiler an integer constant is a long, put an L at the end of the number
(not necessarily capital).
Footnote: If you ask me it makes more sense for variables to be case insensitive and to
insist on a capital L in this case rather than vice-versa.
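[Figure: a char array holding the string "Fred": the characters F, r, e, d, then the terminating \0, then unused space]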
caution: if you declare an array of char to store a string, make sure it's
1 bigger than the longest string you need to store to allow for the
terminating null character
statements
C has the usual collection of assignments, selection, iteration and procedure call
statements. We will not dwell on details of syntax here; more detail will be considered in
Part 3.
caution: it's easy to write if (i=0) instead of if (i==0). What's the
effect of this error? C has no boolean type. If i is an int, the compiler
won't report an error: i=0 is a valid int expression. Good compilers
issue a warning: send yours back if it doesn't
There are two selection statements, if and switch:
    if (n==0)                /* note "()" but no "then" */
        printf("no data\n");
    else                     /* else is optional */
    {                        /* use {} for more than one statement */
        average = total / n;
        printf("Average = %d\n", average);
    }
or the same thing with a switch:
    switch (n)               /* again the () is needed */
    {                        /* the cases must be enclosed in {} */
    case 0:                  /* can be any constant int expression */
        printf("no data\n");
        break;
    default:                 /* in effect case "anything" */
        average = total / n; /* note no {} needed */
        printf("Average = %d\n", average);
        break;               /* not strictly necessary */
    }
The switch is quite complicated. The break is used to quit; without it you fall
through to the next case. More than one case label can select the same code, e.g.,
    case 'a': case 'e': case 'i': case 'o': case 'u':
        printf("vowel\n");
        break;
This is a degenerate case of falling through with no break.
A break after the last case (default in the first example) isn't needed, but ending
every case with a break makes it easier to avoid errors and to modify the code later.
caution: the rules of always putting in a break and never falling through
from one case to another derive from many years of experience of
maintaining code: don't break them
Loops present slightly fewer possibilities for getting into trouble.
A while is reasonably straightforward:
    while (i > 0)
        i--;  /* decrement i */
As is a do-while:
    do
    {   i--;
    } while (i>0);
exercise: how does the behaviour of the above two loops differ?
The for loop is a touch more complicated. Its control consists of initialization, test and
increment:
    for (i=first(); i<last() && i > cut_off(); i++)
        ;  /* do something */
is one example (notice how parameterless functions are called with empty
parentheses).
aside: in C and is written &&; or is ||; single versions & and | are
bitwise operations; ! is not (1 if operand is 0, 0 otherwise)
All loops can be broken by break (exits the innermost loop enclosing the break) or
continue (goes to the control computation, skipping the rest of the innermost loop body).
These constructs are a bit better than unrestricted goto; see hints on usage later.
hands-on: a larger program
Given the partially written program below, fix the indicated bug, and add
code to count the number of negative, zero and positive numbers in the
data read in. You should use switch, if and at least one loop construct.
    #include <stdio.h>
    #include <stdlib.h>   /* for abs() */
    /* bug: should check if s is zero */
    int sign(int s)
    {   return abs(s) / s;
    }
    int main(void)
    {   int data[10],
            i, n,
            negatives, zeros, positives;
        n = sizeof data / sizeof(int);
        negatives = zeros = positives = 0;
        printf("Enter %d numbers : ", n);
        /* need loop on i from 0 to n-1 around this */
        /* read in the data */
        scanf("%d", &data[i]);
        /* now count negatives, zeros, positives */
        printf("negatives=%d, zeros=%d, positives=%d\n",
               negatives, zeros, positives);
        return 0;
    }
Part 3: Style and Idioms
switch
We've been through most of the important features of the switch already. Perhaps the
most important point to emphasize is that this statement can easily become very large and
clumsy, and a disciplined approach is necessary.
If you find yourself programming with large complex switch statements, it's time to
clean up your design: look for a simpler way of expressing the problem.
When we look at object-oriented programming, we shall see an alternative: the use of
dynamic dispatch, in which the type of the object determines what kind of action should
be carried out. Remember this point for now: if you are using C++ and end up with a lot
of big clumsy switch statements, reconsider your design.
loops
When we look at arrays and pointers, some interesting strategies for writing loops to go
through an array quickly will come up. In this section we'll stick to kinds of loop
behaviour that differ from languages such as Pascal.
In situations such as operating systems, event-driven user interfaces and simulations,
where termination is unusual (system shutdown, quitting application, or special action to
clean up the simulation) it's useful to have a loop which goes on forever, with termination
almost an exception (error) condition.
C doesn't have an explicit construct for this but many C programmers simply write
    while (1)
    {   /* do all kinds of things */
        if (good_reason_to_quit())
            break;
    }
This kind of use of break is acceptable programming style. However you should take
care not to abuse this feature, as loops with multiple exit points are hard to debug and
maintain. Ideally the loop should either have exactly one exit point, or any additional exit
points should only handle very rare conditions. Otherwise, you are back to unstructured
programming as if you had used an unrestricted goto construct.
The continue statement is less commonly used, and on the whole it is probably better
to use an if to skip over unwanted parts of a loop. I have never used continue and have
never encountered a situation where I felt it could be useful. If you find a need for it,
rethink the problem. It's not good style to use a rarely used feature: you will confuse
maintainers.
arguments
Parameter passing in C seems superficially very simple but there are some serious traps
and pitfalls. The worst of these is the way the pass by value rule interacts with the way
arrays are implemented.
In C, an array behaves in most expressions as a pointer to its first element, one which
(usually) happens to have had memory allocated for it automatically. Consequently, when an
array is passed as a parameter, what is actually copied on the function call is not the
whole array but just a pointer to its first element. This is very efficient compared with
copying the whole array but it also means it's easy to write a function that alters an
array, forgetting that the original is being altered, and not a copy.
caution: there's no such thing in C as passing a whole array by value:
only a pointer is copied and the actual elements are overwritten if the
function changes them
double caution: I lied: if an array is a field in a struct (see Part 4), it is
copied when the struct is passed as a parameter
pointers and returning values
When you want to change the value of an actual parameter (the one in the call), you must
send a pointer to the parameter, rather than its value. This wouldn't be a problem if C had
strict type checking, but since it doesn't things can go wrong.
If you are using an older C compiler that doesn't conform to the ANSI standard, it
may take a very lax view of what is or isn't a pointer, or of mixing different types of
pointers. Luckily recent compilers conforming to the ANSI standard of 1989 do better
type checking, and C++ is even more strict.
Even with newer compilers however, there are cases where type checking isn't done.
The worst case is with functions designed to take a variable number of arguments, such as
scanf(), which does formatted input. For example, in the code
    int i;
    scanf("%d", i);  /* should be &i */
scanf() should read an integer in decimal format into i from standard input, but i
should in fact be &i. As a result of this error, scanf() uses the (uninitialized) value of
i as an address, with unpredictable consequences. A compiler cannot in general detect such
errors. C++ has another I/O mechanism, so these routines don't have to be used, but in C
it's vital to check parameters in functions taking a variable number of arguments very
carefully.
It's a good strategy to put all input (where this problem is worst) in one place to make
checking easier.
arrays, pointer arithmetic and array arguments
Brief mention has been made of how C treats arrays as pointers. The array definition
    int a[100];
has almost the same effect as defining a pointer and allocating space for 100 ints:
    int *a = (int*) malloc(sizeof(int)*100);
A few points about malloc():
- it returns void*, which has to be converted to the correct pointer type, hence the
  (int*) which converts the type of the expression to pointer to integer (ANSI C does this
  conversion implicitly on assignment, but C++ insists on the cast)
- it must be given the correct size in bytes (remember: a string needs one extra byte for
  the terminating null character)
- the resulting memory cannot be assumed to be initialized
- if memory cannot be allocated, the NULL pointer is returned; in principle every call to
  malloc() should check for this. It may not always be necessary (e.g. if you have
  carefully calculated how much free memory you have), but this is not something to
  be casual about
- memory allocated by malloc() can be deallocated by free(ptr_name): be careful to
  call free() only on a valid pointer which has been allocated by malloc()
Back to arrays: an array access a[i] is equivalent to pointer arithmetic. C defines
addition of a pointer and an int as returning an address that integral number of units from
the original address (the unit is the size of the object pointed to). For example:
    int *a;     /* discussion below assumes sizeof(int) == 4 */
    double *d;  /* and sizeof(double) == 8 */
    a = (int*) malloc(sizeof(int)*100);
    d = (double*) malloc(sizeof(double)*20);
    a++;        /* short for a = a + 1 */
    d += 2;     /* short for d = d + 2 */
results in a's value changing by 4 (to point to the next int), while d's changes by 16
(2 doubles further on in memory).
Footnote: the main difference between an array definition and a pointer with allocated
space is that the array's value, while a pointer, isn't an l-value (more on l-values in
Part 13).
In terms of pointer arithmetic, since an array's value is a pointer to its first element,
a[i] is equivalent to
    *(a+i)
Because addition of a pointer and an integer is commutative, the above can be rewritten as
    *(i+a)
or, bizarrely:
    i[a]
What use is this? If you have a loop accessing successive array elements, such as
    int i;
    double data[1000];
    for (i=0; i<1000; i++)
        data[i] = 0.0;
an inefficient compiler generates an array index operation once every time through the
loop. An array index operation requires multiplying the index by the size of each element,
and adding the result to the start address of the array.
Consider the following alternative code:
    int i;
    double data[1000], *copy, *end;
    end = data+1000;
    for (copy = data; copy < end; copy++)
        *copy = 0.0;
On one compiler I tried, the latter code executed 40% fewer instructions in the loop,
and did none of the multiplies that the first version needed for array indexing. But this
was with no optimization. I turned optimization on, and the array version was two
instructions shorter!
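For comparing the two styles on your own compiler, here is a sketch of both loops wrapped as functions. The names zero_by_index and zero_by_pointer are assumptions made for this example; the loops themselves follow the two fragments above.

```c
#include <stddef.h>

/* Array-index version of the loop. */
void zero_by_index(double data[], size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] = 0.0;
}

/* Pointer version: end points one past the last element, which is
   legal to compute and compare against but not to dereference. */
void zero_by_pointer(double data[], size_t n)
{
    double *copy, *end = data + n;
    for (copy = data; copy < end; copy++)
        *copy = 0.0;
}
```

Both functions leave the array in the same state, so any difference you measure is down to the code the compiler generates.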
caution: this could lead you astray. Here's a hacker version of the loop
(it has no body) for which my optimizing compiler generates exactly the
same number of instructions as the readable version:
    /* declarations and end as before */
    for (copy=data; copy<end; *(copy++)=0.0) ;
Why then does pointer arithmetic persist as a feature?
Sometimes it's necessary for performance reasons to write your own low-level code,
for example, a memory manager. In such a situation, you may have to write very general
code in which the size of units you are dealing with is not known in advance. In such
cases the ability to do pointer arithmetic is very useful.
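As a rough illustration of code that does not know the unit size in advance, the sketch below steps through a block of memory given only the element size in bytes. The helper names element_at and copy_element are hypothetical; the arithmetic is done on unsigned char *, whose unit is one byte.

```c
#include <stddef.h>
#include <string.h>

/* Address of element i in a block whose elements are elem_size bytes each. */
void *element_at(void *base, size_t i, size_t elem_size)
{
    return (unsigned char *)base + i * elem_size;
}

/* Copy element src over element dst without knowing the element type. */
void copy_element(void *base, size_t dst, size_t src, size_t elem_size)
{
    memmove(element_at(base, dst, elem_size),
            element_at(base, src, elem_size), elem_size);
}
```

This is the explicit form of the multiply-and-add that array indexing does for you when the element type is known.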
However, if you find yourself doing pointer arithmetic because your compiler doesn't
generate good code for array indexing, look for a better compiler.
This feature is in the language for reasons of efficiency that have been superseded by
better compiler technology, but there are still rare occasions where it is useful.
caution: pointer arithmetic is hard to debug and hard to maintain. Make
sure you really do have good reason to use it and there is no alternative
before committing to it, and once you have decided to use it, isolate it to
as small a part of your code as possible: preferably a single file
containing all your low-level hard-to-understand code
hands-on: sorting strings
Here is some code to sort integers. Modify it to sort strings. Use the
following:
    #include <string.h>
    /* from which use
         int strcmp(const char *s, const char *t) returns
           <0 if s < t, 0 if s == t, >0 if s > t
       you may also need
         char *strcpy(char *s, const char *t) copies t to s
    */
declare your string array as follows:
    char strings [10][255]; /* 10 strings of up to 254 chars each, plus the '\0' */
and read them in as follows
    char strings [10][255]; /* NB not [10,255] */
    int i;
    printf("Enter 10 strings, max 254 chars each:\n");
    for (i=0; i < 10; i++)
        scanf("%254s", strings[i]);
    #include <stdio.h>
    void swap (int data[], int i, int j)
    {   int temp;
        temp = data[i];
        data[i] = data[j];
        data[j] = temp;
    }
    void sort (int data[], int n)
    {   int i, j;
        for (i = 0; i < n-1; i++)
            for (j = i + 1; j > 0; j--)
                if (data[j-1] > data[j])
                    swap (data, j-1, j);
    }
    int main(void)
    {   int sort_data [10],
            i, n;
        n = sizeof(sort_data)/sizeof(int);
        printf ("Enter %d integers to sort :", n);
        for (i=0; i<n; i++)
            scanf ("%d", &sort_data[i]);
        sort (sort_data, n);
        printf ("Sorted data:\n\n");
        for (i=0; i<n; i++)
            printf("%d ", sort_data[i]);
        printf(".\n");
        return 0;
    }
caution: this isn't as easy as it looks: to do this as specified requires a
good understanding of the way 2-dimensional arrays are implemented
in C; it's actually much easier if the strings are allocated using pointers,
so you can swap the pointers much as the ints are swapped above.
Here's how to initialize the pointers:
    char *strings [10];
    int i, n = 10, str_size = 255; /* in practice read n from file */
    for (i = 0; i < n; i++)
        strings [i] = (char*) malloc (str_size);
part 4: Structured Types
struct
The C struct is essentially the same as a record type in languages like Pascal, but there
are some syntactic oddities.
A struct is declared as in the following example:
    struct Employee
    {   char *name;
        int employee_no;
        float salary, tax_to_date;
    };
Variables of this type can be defined as follows:
    struct Employee secretary, MD, software_engineer;
Note that the name of the type is struct Employee, not just Employee.
If you need to define a mutually recursive type (two structs that refer to each other),
you can do a forward declaration leaving out the detail, as in a tree in which the root holds
no data node, and each other node can find the root directly:
    struct Tree;
    struct Root
    {   struct Tree *left, *right;
    };
    struct Tree
    {   struct Tree *left, *right;
        char *data;
        struct Root *tree_root;
    };
caution: the semicolon after the closing bracket of a struct is essential
even though it isn't needed when similar brackets are used to group
statements
double caution: a * must be put in for each pointer variable or field; if
left out the variable is not a pointer (e.g., right in both cases above must
be written as *right if a pointer type is wanted)
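To make the double caution concrete, here is a small sketch with two made-up structs (Mixed and BothPointers are names invented for this example). In the first, the * was left off the second field, so it is a plain int.

```c
/* Only left is a pointer here; right is a plain int. */
struct Mixed
{   int *left, right;
};

/* Both fields are pointers. */
struct BothPointers
{   int *left, *right;
};

/* Hypothetical helper: right can be used directly as an int,
   while left has to be dereferenced. */
int right_plus_pointed(struct Mixed m)
{
    return *m.left + m.right;
}
```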
typedef
C is not a strongly typed language. Aside from generous conversion rules between
various integer types, C's named types (declared by typedef) are essentially shorthand
for the full description of the type. This is by contrast with more strongly typed languages
like Modula-2 and Ada, where a new named type is a new type, even if it looks the same
as an existing type.
This is also a useful opportunity to introduce the enum, roughly equivalent to
enumerated types of Pascal, Ada and Modula-2, but with a looser distinction from integer
types.
A typedef is written in the same order as if a variable of that type were declared, but
with the variable name replaced by the new type name:
    typedef int cents;
introduces a new type name that is exactly equivalent to int, which for reasons of
maintainability has been given a new name.
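A short sketch of what "exactly equivalent" means in practice. The second alias, metres, and the two functions are assumptions added for the example; only cents comes from the text.

```c
typedef int cents;
typedef int metres;   /* hypothetical second alias for the example */

/* Declared in terms of cents, but the compiler sees plain int. */
cents add_tax(cents price, cents tax)
{
    return price + tax;
}

/* Passing a metres value where cents is expected compiles without
   complaint, because both names mean int. */
cents mix_units(void)
{
    metres distance = 500;
    cents price = 250;
    return add_tax(price, distance);
}
```

A language with stricter named types would reject the call in mix_units; C accepts it.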
Back to enums now that we have the typedef mechanism. An enum introduces a
symbolic name for an integer constant:
    enum Boolean {FALSE, TRUE};
establishes symbolic names for the values 0 and 1, which can give programmers
accustomed to typed languages a greater sense of security. Unfortunately this is not a
default in system headers, where #define is usually used to define TRUE and FALSE
(using the preprocessor). This can lead to a conflict of incompatible definitions. More on
the preprocessor in Part 5.
It is also possible to specify the values of the enum names, for example,
    enum vowels {A='a', E='e', I='i', O='o', U='u'};
Once you have specified values for an initial group of names, if you stop supplying
values, the rest continue from the last one specified:
    enum digit_enum {ZERO='0', ONE, TWO, THREE /* etc. */};
You can now make a variable of one of these types, or better still, make a type so you
don't have to keep writing enum:
    enum digit_enum number = TWO; /* initializer must be a constant */
    typedef enum digit_enum Digits;
    Digits a_digit; /* look: no enum */
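The following sketch shows the "continue from the last one specified" rule at work. The function digit_value is a hypothetical helper; it relies on the C guarantee that the characters '0' to '9' have consecutive codes.

```c
enum digit_enum {ZERO='0', ONE, TWO, THREE};
typedef enum digit_enum Digits;

/* Convert a Digits value back to the number it names. */
int digit_value(Digits d)
{
    return d - ZERO;
}
```

Here ONE has the same value as '1', TWO the same as '2', and so on, because each unnumbered name is one more than its predecessor.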
For a more complicated example of something we'll look at in more detail later in Part
5, here is how to declare a type for a function pointer:
    typedef int (*Compare) (char *, char *);
This declares a type called Compare, which can be used to declare variables or
arguments which point to a function returning int, and taking two pointers to char
(probably strings) as arguments. More detail in Part 5.
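As a preview, here is one way the Compare type might be used. The three functions are invented for this sketch; only the typedef comes from the text.

```c
#include <string.h>

typedef int (*Compare) (char *, char *);

/* Two functions with the signature Compare expects. */
int by_alphabet(char *s, char *t)
{
    return strcmp(s, t);
}

int by_length(char *s, char *t)
{
    return (int)strlen(s) - (int)strlen(t);
}

/* Returns whichever string comes first under the given ordering. */
char *smaller(char *s, char *t, Compare cmp)
{
    return cmp(s, t) <= 0 ? s : t;
}
```

The caller chooses the ordering by passing a different function; smaller itself never changes.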
Back to something more immediately useful: to reduce typing, it is common practice to
supply a struct with a type name so that instead of writing
    struct Tree sort_data;
it's possible to write
    typedef struct Tree Sort_tree;
    Sort_tree sort_data;
In fact, it is common practice when declaring a struct to give it a type name
immediately:
    struct tree_struct;
    typedef struct root_struct
    {   struct tree_struct *left, *right;
    } Root;
    typedef struct tree_struct
    {   struct tree_struct *left, *right;
        char name[100];
        Root *tree_root;
    } Tree;
From here on, it's possible to declare variables of type Root or Tree, without having
to put in the annoying extra word struct.
As we shall see, C++ classes are a much cleaner way of defining structured types.
Some notation for accessing fields of structs:
    Root family_tree;
    family_tree.left = (Tree*) malloc(sizeof(Tree));
    strcpy(family_tree.left->name, "Mum");
    (*family_tree.left).left = NULL; /* -> better */
Note the use of -> to dereference a pointer to a struct and access a field in one go
instead of the more cumbersome (*name).name.
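The two notations can be seen side by side in the sketch below. It uses a simplified Tree without the tree_root field, and new_leaf is a hypothetical helper, not something from the original notes.

```c
#include <stdlib.h>
#include <string.h>

typedef struct tree_struct
{   struct tree_struct *left, *right;
    char name[100];
} Tree;

/* Allocate a leaf node; returns NULL if malloc fails. */
Tree *new_leaf(const char *name)
{
    Tree *node = (Tree*) malloc(sizeof(Tree));
    if (node == NULL)
        return NULL;
    strncpy(node->name, name, sizeof(node->name) - 1);
    node->name[sizeof(node->name) - 1] = '\0';
    node->left = NULL;        /* same as (*node).left = NULL */
    (*node).right = NULL;     /* same as node->right = NULL */
    return node;
}
```

The caller is responsible for passing the returned pointer to free() when the node is no longer needed.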
putting it together: array of struct
Using structured types, pointers, and arrays, we can create data of arbitrary complexity.
For example, we can make a mini-employee database using the Employee struct at
the start of this Part. If we assume the number of employees is fixed at 10, we can store
the whole database in memory in an array.
Using typedefs to clean things up a bit:
    typedef struct emp_struct
    {   char *name;
        int employee_no;
        float salary, tax_to_date;
    } Employee;
    typedef Employee Database [10];
    Database people = /* initializer: real DB would read from disk */
    {   {"Fred", 10, 10000, 3000},
        {"Jim", 9, 12000, 3100.5},
        {"Fred", 13, 1000000, 30},
        {"Mary", 11, 170000, 40000},
        {"Judith", 45, 130000, 50000},
        {"Nigel", 10, 5000, 1200},
        {"Trevor", 10, 20000, 6000},
        {"Karen", 10, 120000, 34000},
        {"Marianne", 10, 50000, 12000},
        {"Mildred", 10, 100000, 30000}
    };
We'll now use this example for an exercise in putting together a more complicated
program.
hands-on: sorting employee records
Starting from the toy employee record database, rewrite the sorting code
from the string sort to sort database records instead, sorting in
ascending order on the employee name. If any employees have the
same name, sort in ascending order of employee number.
part 5: Advanced Topics
preprocessor
It's now time to look in more detail at what #include does, and introduce a few more
features of the C preprocessor. As mentioned before, the effect of #include is to include
the named file as if its text had appeared where the #include appears.
Two major questions remain to be answered:
• the difference between names enclosed in <> and ""
• how to avoid including the same header twice if it's used in another header
Usually, <> is for system or library includes, whereas "" is for your own headers.
The reason for this is the order of searching for files: files in <> are searched for among
system or library headers first, before looking where the current source file was found,
whereas the opposite applies when "" is used.
caution: the current source file may not be the one you think it is: it's
the file containing the #include that brought this file in. If that file is a
header file that isn't in the directory of the original compilable file, you
could end up bringing in a header with the right name but in the wrong
directory. In cases of mysterious or bizarre errors, check if your compiler
has the option of stopping after preprocessing, so you can examine the
preprocessor output
Avoiding including the same header twice is usually a matter of efficiency: since
headers shouldn't cause memory to be allocated or code to be generated (declarations not
definitions), bringing them in more than once shouldn't matter.
The mechanism for preventing the whole header being compiled a second time
requires introducing more features of the preprocessor.
Before the C compiler sees any compilable file, it's passed through the preprocessor.
The preprocessor expands #includes and macros. Also, it decides whether some parts of
code should be compiled, and (usually) strips out comments.
Macros and conditional compilation are the key to achieving an include-once effect.
A preprocessor macro is introduced by #define, which names it and associates text
with the name. If the name appears after that, it's expanded by the preprocessor. If the
macro has parameters, they're substituted in. This happens before the compiler starts: the
macro's text is substituted in as if you replaced it using a text editor.
The next important idea is conditional compilation using #if or #ifdef, as in
    #define PC 1 /* PC, a preprocessor symbol, expands as 1 */
    #if PC
    #include <pc.h>
    #else
    #include <unix.h>
    #endif
This sequence defines a preprocessor macro, PC, which expands to 1. Then, the value
of PC is tested to decide which header to use. You can usually specify preprocessor
symbols at compile time, and most compilers have built-in symbols (specifying which
compiler it is, whether it's C or C++, etc.).
The usual way of doing this is to use #ifdef, which checks if a preprocessor symbol
is defined without using its value, or #ifndef, which does the opposite. This is typically
the way header files are set up so they are only compiled once, even if (through being
used in other headers) they may be included more than once:
    /* File: employees.h */
    #ifndef employees_h /* first thing in the file */
    #define employees_h
    /* declarations, includes etc. */
    #endif /* last thing in the file */
The effect of this is to ensure that for any compilable file importing employees.h, it
will only be compiled once. Using the name of the file with the "." replaced by an
underscore to make it a legal preprocessor name is a common convention, worth adhering
to so as to avoid name clashes.
This is not as good as one would like because the preprocessor must still read the
whole file to find the #endif, so some compilers have mechanisms to specify that a
specific include file should only be looked at once.
caution: it seems a good idea to put the test for inclusion around the
#include instead of inside the header. This saves the preprocessor from having to
read the whole file, but you end up with a big mess because you must
put the test around every #include
Here's an example of the other use of macros, to define commonly used text:
    #define times10(n) 10*n
As we'll see later, C++ largely does away with the need for preprocessor macros of
this kind by providing inline functions.
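Because a macro is pure text substitution, times10 as written gives a surprising answer when its argument is an expression. The sketch below shows the effect; times10_safe and the two wrapper functions are hypothetical names added for the illustration.

```c
#define times10(n) 10*n            /* as in the text */
#define times10_safe(n) (10*(n))   /* hypothetical parenthesized variant */

int unparenthesized(void)
{
    return times10(1+2);           /* expands to 10*1+2 */
}

int parenthesized(void)
{
    return times10_safe(1+2);      /* expands to (10*(1+2)) */
}
```

The first function returns 12, the second 30. Putting parentheses around each parameter and around the whole macro body is the usual defence.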
function pointers
Function pointers derive from the idea of functions as a data type, with variables able to refer
to different functions, as long as argument and return types match. Modula-2 also has this
feature; Pascal and Ada don't (some extended Pascals do, and you can pass procedures as
parameters in Pascal, but not Ada). As we'll see later, C++ classes are a cleaner way of
achieving the generality offered by function pointers, but they are important to understand
because they're often used in C library and system routines.
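One standard library routine that takes a function pointer is qsort() from <stdlib.h>. The sketch below sorts an array of string pointers with it; compare_strings and sort_strings are names made up for this example, and this is not the sort developed in these notes.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison function in the form qsort expects: it receives pointers
   to two array elements. Here each element is itself a char pointer. */
int compare_strings(const void *a, const void *b)
{
    const char *s = *(const char * const *)a;
    const char *t = *(const char * const *)b;
    return strcmp(s, t);
}

/* Sort an array of n string pointers into ascending order. */
void sort_strings(const char *strings[], size_t n)
{
    qsort(strings, n, sizeof strings[0], compare_strings);
}
```

qsort knows nothing about strings: everything type-specific is in the function it is handed.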
To see why this is a useful feature, we'll look at how the sort routine we used in the
last two exercises could be generalized to handle both data types we used.
To do this, we need to reconstruct the array we used into an array of pointers, so the
sort routine can index the array without knowing the size of each element.
This introduces the need to dynamically allocate memory, which we'll do using the
system routine malloc(), which had a brief guest appearance in Part 3, where we looked
at arrays and pointer arithmetic.
To make malloc() return a pointer of the correct type, a type cast (type name in
parentheses) of its result is needed. malloc() has one argument: the number of bytes
required. Special operator sizeof (special because it operates on a type) is useful to get
the size. If you call it with an expression rather than a type, this expression isn't evaluated:
the compiler determines its type and hence its size.
caution: be sure you are asking for sizeof() of the object being
pointed to, not the size of the pointer, which will usually be way too
small; one exception: sizeof an array type is the size of the whole array
even though an array is normally equivalent to a pointer
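A small sketch of both halves of this caution. The function names make_doubles and table_length are assumptions for the example.

```c
#include <stdlib.h>

/* Allocate n doubles: the size asked for is that of the object
   pointed to (*p), not of the pointer p itself. */
double *make_doubles(size_t n)
{
    double *p = (double*) malloc(n * sizeof(*p));
    return p;   /* may be NULL if allocation failed */
}

/* sizeof applied to an array gives the whole array's size,
   so dividing by one element's size gives the element count. */
size_t table_length(void)
{
    double table[20];
    return sizeof(table) / sizeof(table[0]);
}
```

Writing sizeof(p) in make_doubles instead of sizeof(*p) would compile but would ask for the size of a pointer per element, which is the mistake the caution warns about.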
Back to the problem: we need a comparison function the sort can use to check the
order of two employee records, strings etc. The sort also needs to swap array elements.
Prototypes for functions for this purpose look like this:
    /* -1 if data[s]<data[t]; 0 if equal, else +1 */
    int compare (int *data[], int s, int t);
    /* swap two elements in the given array */
    void swap (int *data[], int s, int t);
You have to supply functions conforming to the types of these prototypes for the sort.
Within the functions, the type of data can be cast to the required type. Here are typedefs
for the function pointers (type names comp_ptr and swap_ptr):
    typedef int (*comp_ptr) (int *data[], int s, int t);
    typedef void (*swap_ptr) (int *data[], int s, int t);
Parentheses around the type name are needed to stop the * from applying to the return
type (making it int*).
The sort can then be defined by the prototype:
    void sort (int *data[], int n, comp_ptr compare,
               swap_ptr swap);
and can call the function pointers as follows:
    if (compare(data, j-1, j) > 0)
        swap(data, j-1, j);
Older compilers required dereferencing the function pointer, as in
    if ((*compare) (data, j-1, j) > 0)
but newer ones allow the simpler call notation.
To sort employee records, you have to supply swap and compare routines:
    int comp_employee (int *database[], int i, int j);
    void swap_employee (int *data[], int i, int j);
    /* names of arguments don't have to be the same */
and call the sort as follows:
    sort ((int**) my_data, no_employees, comp_employee,
          swap_employee);
    /* arrays are pointers: int**, int*[] same type */
traps and pitfalls
Something not fully explained in previous sections is the C increment and decrement
operators. These are in 2 forms: pre- and post-increment (or decrement in both cases).
Preincrement is done before the variable is used: an expression with ++i in it uses the
value of i+1, and i is updated at the same time. An expression with postincrement i++
uses the value of i before it changes, and afterwards i's value is replaced by i+1.
Perfectly clear? No. The problem comes in expressions with more than one increment on
the same variable. Does "afterwards" (or "before" in the case of prefix versions) mean after
that place in the expression, or after the whole expression?
What is the result of the following, and what value do i and j have at the end?
    i = 0;
    j = i++ + i++;
Under one interpretation, i is incremented immediately after its value is used, so the
expression for j evaluates in the following sequence:
    i's value 0 used for j; i++ makes i's value 1
    i's value 1 used for j: j's final value is 1; i++ makes i's value 2
Under another interpretation, i is incremented at the end of the statement:
    i's value 0 used for j; no i++ yet so i's value stays on 0
    i's value 0 used for j: j's final value is 0; both i++ push i to 2
The C standard does not pick either interpretation: modifying i twice with no sequence
point in between is undefined behaviour, so different compilers may give different answers.
caution: if you write code like this, immediately after you are fired the
person assigned to maintaining your code after you leave will resign
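If you need one of the two behaviours described above, the safe course is to say so explicitly, with at most one change to i per statement. The sketch below spells out each interpretation; the function names are invented for the example.

```c
/* Mirrors the first interpretation: increment right after each use. */
int sum_increment_between(void)
{
    int i = 0, j;
    j = i;      /* uses 0 */
    i++;
    j += i;     /* uses 1 */
    i++;
    return j;
}

/* Mirrors the second interpretation: both increments at the end. */
int sum_increment_after(void)
{
    int i = 0, j;
    j = i + i;  /* uses 0 twice */
    i += 2;
    return j;
}
```

Each function has one well-defined result (1 and 0 respectively), and a maintainer can see which was intended.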
hands-on: generalizing a sort
Here's a start towards the sort code, with a main program at the end:
    /* file sort.h */
    #ifndef sort_h
    #define sort_h
    typedef int (*comp_ptr) (int *data[], int s, int t);
    typedef void (*swap_ptr) (int *data[], int s, int t);
    void sort (int *data[], int n, comp_ptr compare, swap_ptr swap);
    #endif /* sort_h */

    /* file employee.h */
    #ifndef employee_h
    #define employee_h
    typedef struct emp_struct
    {   char name[100];
        int employee_no;
        float salary, tax_to_date;
    } Employee;
    typedef Employee *Database[10];
    int comp_employee (int *database[], int i, int j);
    void swap_employee (int *data[], int i, int j);
    /* read in database (for this exercise fake it) */
    void init_database (Database employees,
                        int no_employees);
    /* print out the database */
    void print_database (Database people, int no_employees);
    #endif /* employee_h */

    /* file main.c */
    #include "sort.h"
    #include "employee.h"
    int main(int argc, char *argv[])
    {   const int no_employees = 10;
        Database people;
        init_database (people, no_employees);
        print_database (people, no_employees);
        sort ((int**) people, no_employees, comp_employee, swap_employee);
        print_database (people, no_employees);
        return 0;
    }
part 6: Programming in the Large
The distinction between programming in the small and programming in the large arises
from a desire to avoid having implementation detail interfere with understanding how
various (possibly separately written) parts of a program fit together and interact.
file structure revisited
Having seen the way the preprocessor works we are now in a better position to look at
how multi-file programs are put together.
Header files are the glue used to tie independently compiled files together. These files
may be files you are responsible for, library files, or files other programmers are
responsible for on a large project.
Unlike some later languages such as Ada or Modula-2, there is no mechanism in C to
enforce using equivalent interfaces for all separately compiled files. Also, there is no
mechanism in the language to ensure that separately compiled files are recompiled if any
header they import is changed.
There are tools external to the language to help overcome these limitations.
In UNIX, the make program is usually used to rebuild a program: a makefile contains
a specification of dependencies between the files that are compiled, headers, libraries, etc.
Writing a makefile is beyond the scope of these notes. However it is worth noting that
there is a short cut in most versions of UNIX: the makedepend program can be used to
create most of the dependencies in the makefile. Most PC-based interactive environments
automate the make process, but command-line driven programming tools often still have
this problem. A related problem is that C does not require a type-sensitive linker, so it's
possible (e.g. as a result of a bug in your includes) to link files which have an erroneous
expectation of function arguments, or types of global data.
C++ goes a long way towards fixing the type-safe linkage problem, so there is a case
for using a C++ compiler even for plain C programs.
For large, multi-programmer projects, many platforms have tools to manage access to
source files, so only one programmer has write permission on a file at a time. Such tools
are not part of C, and can usually be used with other languages and tools available on the
system. UNIX's make is also general-purpose, not only for C and C++.
maintainability
C, through its lax type checking, permissive file structure and support for low-level
hacking, has much potential for producing unmaintainable code.
If include files are used purely as module interfaces, with as few global declarations as
possible, module level maintainability can approach that of Modula-2 or Ada.
C allows variables to be declared global to a file, but not visible in other compiled
files. Such a variable is declared outside functions, with the keyword static. If global
state information is needed, it should ideally be restricted to a single file in this way.
For the opposite effect (a globally visible variable) leave out static; to make it
visible elsewhere, declare it as extern in a header imported by files that access it.
An extern is a declaration, not definition: it doesn't cause memory allocation.
caution: global variables are reasonably safe within one file but
making them global to the entire program is bad for maintainability.
Preferably use access functions to read or update the variables from
another file if global state must be shared across files. You can enforce
this convention by always declaring a global using static. As noted in
part 2, functions can also be static
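The convention in this caution might look as follows in one source file. The variable event_count and the two access functions are hypothetical names chosen for the sketch.

```c
/* File-level state, hidden from other files by static. */
static int event_count = 0;

/* Access functions: the only way other files can reach event_count. */
void record_event(void)
{
    event_count++;
}

int events_so_far(void)
{
    return event_count;
}

/* A header shared with other files would declare only:
       void record_event(void);
       int events_so_far(void);
   and would not mention event_count at all. */
```

Because event_count is static, a stray assignment to it from another file is a link-time error rather than a silent bug.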
portability
Related to maintainability is portability.
If machine-specific features of your program are isolated to one file, and that file kept
as small as possible, portability is not too difficult to achieve.
Some key problems:
• many C programmers assume ints are the same size as pointers, and both are 32 bits
  or 4 bytes; this is causing problems for example in porting UNIX to new processors
  with 64-bit addresses
• some operating systems have case-sensitive file names. The UNIX file system is case
  sensitive, while those of DOS, Windows and Macintosh aren't; this can cause
  problems if you are not careful about typing header file names using all lower-case,
  and try to move a program to UNIX
• integer types, especially char and int, can be different sizes even across different
  compilers on the same machine; if you rely on their size, use sizeof to check they are
  what you expect (ideally embed this in your code to make it general)
• path name conventions differ: on UNIX, the separator is a /, on DOS, a \, on
  Macintoshes, a :; if portability across these operating systems is an issue, it may be
  useful to separate out #includes that must pick up files from a different directory, and
  put them in a file of their own:
    /* file data.h */
    #ifndef data_h /* contraction of if !defined(data_h) */
    #define data_h
    # ifdef UNIX /* whatever symbol predefined by your compiler */
    #  include "../data.h"
    # elif defined(DOS) /* again */
    #  include "..\data.h"
    # else /* fall through to Macintosh */
    #  include "::data.h"
    # endif /* only one endif needed when elif used */
    #endif /* data_h */
In general: creating one such file for each file that must be found in another directory is
a reasonable strategy if you expect to need to port your program to other operating
systems. Note my indentation to highlight nesting of the #ifs; in general it's bad practice
to deeply nest conditional compilation.
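The advice above about using sizeof to check sizes can be embedded in the code itself. The sketch below shows one compile-time and one run-time form; the typedef name and the function name are assumptions, and the 16-bit threshold is only an example of an assumption a program might make.

```c
#include <limits.h>
#include <stddef.h>

/* Compile-time check: the array size is negative, and compilation
   fails, if the program's assumption about int does not hold. */
typedef char assume_int_at_least_16_bits[(sizeof(int) * CHAR_BIT >= 16) ? 1 : -1];

/* Run-time check of an assumption code should NOT rely on:
   returns 1 if an int happens to be as wide as a pointer here. */
int int_matches_pointer(void)
{
    return sizeof(int) == sizeof(void *);
}
```

On a machine with 32-bit ints and 64-bit addresses int_matches_pointer() returns 0, which is exactly the case the first problem in the list describes.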
hiding the risky parts
Another important point related to both portability and maintainability is avoiding using
risky features such as pointer arithmetic and type casts throughout your code. Ideally, they
should be isolated into one file as noted before.
As we shall see, C++ offers better mechanisms for hiding details, but a disciplined
approach to C can pay off.
Putting together the strategies for portability and maintainability, putting
machine-dependent or otherwise potentially troublesome code in only one place is a good start. If
this is taken a step further and global variables are never exported (always use static),
potential trouble can be isolated.
performance vs. maintainability
In a performance-critical program, it is tempting to ignore these rules and sprinkle the code
liberally with clever tricks to attempt to wring out every last bit of performance.
A good book on algorithm analysis will reveal that this is a futile effort: most
programs spend a very large proportion of their time in a very small part of their code.
Finding a more efficient algorithm is usually much more worthwhile than hacking at the
code and making it unmaintainable.
example: the sort we have been using in our examples takes
approximately $n^2$ operations to sort $n$ data items. A more efficient
algorithm, such as quicksort, takes roughly $n \log_2 n$ operations. If
$n = 1000$, $n^2$ = 1 million; $n \log_2 n$ is about 100 times less. The detail of
quicksort is more complex than our simple sort, so the actual speedup is
less than a factor of 100, but you'd do a lot better if you are sorting 1000
items to start with quicksort and optimize it, while still keeping it
maintainable, than use all kinds of tricks to speed up our original sort
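The arithmetic behind the "about 100 times" figure, worked through for $n = 1000$ (the rounding is mine):

```latex
\begin{align*}
n^2 &= 1000^2 = 10^6 \\
n \log_2 n &= 1000 \times \log_2 1000 \approx 1000 \times 9.97 \approx 9.97 \times 10^3 \\
\frac{n^2}{n \log_2 n} &= \frac{n}{\log_2 n} \approx \frac{1000}{9.97} \approx 100
\end{align*}
```

The ratio $n / \log_2 n$ keeps growing with $n$, so the advantage of the better algorithm increases for larger inputs.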
Also recall the lesson of the optimizing compiler: there are some very good compilers
around, and before you attempt to use some of C's less maintainable programming
idioms, try the optimizer, or switch to a better compiler.
caution: if you must do strange, hard-to-understand things for good
performance, make sure that you really are attacking parts of the code
that contribute significantly to run time, and preferably isolate that code
to one file, with good documentation to aid maintenance
hands-on: porting a program from UNIX
The following is a simple floating point benchmark written by a student a
few years ago for a UNIX platform. See how easily you can get it working
on your compiler
    #include <time.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <math.h>
    #define N 1000
    float first[N], second[N], result[N];
    int i, j, iterations = 1000;
    clock_t start, end, elapsed;
    void main ()
    {   for (i=0; i<N; i++) /* initialize */
        {   first[i] = random();
            second[i] = random();
        }
        start = clock (); /* start timer */
        for (i=0; i<iterations; i++)
            for (j=0; j < N; j++)
                result[j] = first[j] * second[j];
        end = clock ();
        printf ("Timing Ended.\n\n");
        elapsed = end - start;
        printf ("Time : %fs\n", (float)(elapsed)/CLOCKS_PER_SEC);
    }
Alternatively, if you are working on a UNIX platform, try porting the
following which compiles and runs on a Macintosh compiler:
    #include <time.h>
    #include <stdio.h>
    /* print current date and time */
    void main ()
    {   clock_t now;
        struct tm * mac_time;
        char *time_str;
        now = time (NULL); /* time now */
        mac_time = localtime(&now);
        time_str = asctime(mac_time);
        printf ("Now : %s\n", time_str);
    }
part 7: Object-Oriented Design
identifying objects
The most important element of design is abstraction. Most design methods in some or
other form include ways of layering detail, so as little as possible needs be dealt with at a
time.
A key aspect of achieving abstraction in an object-oriented design is encapsulating
detail in classes. The idea of encapsulation is that a unit of some sort puts a wall around
implementation detail, leaving only a public interface visible. This is the general idea called
information hiding. Where classes are useful is in allowing information hiding to be
implemented in a hierarchical fashion, using inheritance. Inheritance is the mechanism by
which objects are derived from others, sharing attributes they have in common, while
adding new ones, or overriding existing ones.
In the design phase, it is useful to try to find objects which can be related by
similarity, or differences. Inheritance makes it possible to start with a very general class,
and specialize it to specific purposes. If this kind of decomposition can be found, an
object-oriented program is relatively easy to construct, and even non-object-oriented
languages such as C can benefit from such a design. In particular, an object-oriented
design can guide in the use of header files in a disciplined way, as well as discouraging
the use of global variables.
For example, in a simulation of n bodies interacting through gravitational attraction, it
turns out that groups of bodies far away can be grouped together and treated as a single
body. Such a clustered body has most of the properties of a normal body, except it
contains a list of other bodies which have to be updated at each round of the simulation. A
possible decomposition: make a class for ordinary bodies, while extending it for the
clustered body.
object relationships
Once having identified the major objects and which ones can be described in terms of
specialization from a more general one, other relationships can be shown using a suitable
notation to distinguish them. Some objects are contained in composite objects, or some are
clients of others that provide a service. For example, as illustrated below, a drawing object
could provide a service to the bodies in the n-body simulation.
Once the first pass of decomposition is complete, the next step is to look in more detail
at each object. In terms of the design model, it's necessary to identify state (history, or
data the object stores), and behaviours the object is responsible for. By contrast, most
older design models treat procedural and data decomposition separately.
A good strategy as each stage of the design is complete is to implement a toy version
of the program with as much detail as has been designed so far; this makes it easier to be
sure the design is correct. If any problems are found in implementation, it's possible to go
back and correct them at an early stage. This kind of iterative approach is not unique to
object-oriented design.
[Figure: object relationships in the n-body simulation. Node labels: general_body, draw_galaxy, body, cluster body. Edge labels: contains, inherits-from, client-of.]
As with other design methodologies, it's important not to fill in too much detail at
once: you should proceed from an overall design to designing individual components,
always keeping the level of detail under control so you can understand everything you are
working on.
entities vs. actions
Some other decomposition techniques emphasize either data or program decomposition. In
an object-oriented design, depending on the context, either starting with real-world entities
or with actions carried out by them, can be a natural approach.
The n-body simulation is an example of an entity-based decomposition.
An example of an action-based decomposition is an event-driven simulation. For
example, if a new computer architecture is being designed, it is common to run extensive
simulations of programs executing on a simulator of the new design. The simulation is
typically run by scheduling events, each of which represents an action that would have
taken place in a real execution on the still-to-be-built hardware. Examples of events:
a memory read
an instruction execution
an interrupt
Such a simulation also has a lot of global state, each part of which is of interest only to
specific events (e.g., the state of memory is interesting to a memory read).
An object-oriented design based on events helps to focus on the common features
across events, and which data is common to which events, leading to an effective
approach to decomposition and information hiding.
example: event-driven program
A typical event-driven user interface such as Microsoft Windows or Macintosh responds
to events from the user.
Such events are conceptually very similar to events in an event-driven simulation.
The typical style of program for such an interface consists of a loop containing a call to
a system event manager to find out if there is an event to process. If there is an event, it's
dispatched using a large case or switch statement (with a case for each type of event the
program handles).
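The loop-and-dispatch style just described can be sketched as follows. This is a toy under stated assumptions: the event codes, next_event() and the scripted event source are all hypothetical stand-ins for a real system event manager.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical event codes, one character per kind of event.
enum Event_kind { ev_null = 'n', ev_mouse = 'm', ev_key = 'k', ev_quit = 'q' };

// Stand-in for the system event manager: hands out events from a
// script, then quit events once the script is exhausted.
static const char *script = "";
char next_event ()
{   if (*script == '\0')
        return ev_quit;
    return *script++;
}

// The main loop: fetch an event, dispatch on its type, stop on quit.
// Returns the number of events handled (not counting the quit).
int event_loop (const char *events)
{   int handled = 0;
    assert (events != 0);
    script = events;
    for (;;)
    {   char event = next_event ();
        switch (event)
        {   case ev_mouse: std::printf ("mouse down\n");    break;
            case ev_key:   std::printf ("keystroke\n");     break;
            case ev_null:  /* nothing to do */              break;
            case ev_quit:  return handled;
            default:       std::printf ("unknown event\n"); break;
        }
        handled++;
    }
}
```

Calling event_loop("mkn") handles a mouse down, a keystroke and a null event, then stops when the script runs out.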
In terms of object-oriented design, events which have features in common should be
grouped together and common parts used as a basis for designing a higher-level
abstraction. Specific events are then derived from these general ones. Attention can also be
paid to which events need which parts of the global state of an application. A similar
approach is needed to define entities, such as documents, fonts, etc. Finally, interactions
between different types of events must be defined, and related to entities.
An event-driven application is interesting as an example where insisting on doing the
entire design using either entity-based or action-based decomposition is not helpful.
Entities such as documents are useful to model as objects. In fact the application can be
thought of as breaking down into two major components: views and manipulators. Views
are such things as documents (in both their internal representation and their display on the
screen), and contents of clipboards and inter-application sharing mechanisms. On the
other hand, as we have already seen, events, which are a kind of action, are also useful
to model. A natural description of the behaviour of the application is in terms of interaction
between entities and actions.
design task: simple event-driven user interface
For purposes of this design we will restrict the problem to user events
consisting purely of mouse downs (only one button), keystrokes, update
and null events (sent at regular intervals if the user does nothing).
Rather than look at a specific application, the design is for a generic
framework, in which the detail of each kind of event handler is left out.
The design should include: mouse down in a window (front most or not),
mouse down in a menu item, with consequences: printing the front most
document, saving the front most document, quitting or other action
added by the application.
An update event results in redrawing a window.
Think about the global state that's needed, which objects can be related,
and where to hide data.
An idea to consider: have an internal representation of the application
data, which can be displayed via various views: this is a useful
abstraction for sharing code between printing and drawing windows
part 8: OOD and C
language elements
We have already seen the features of C we need to translate an object-oriented design to
code. To encapsulate data, we can use a struct, though it does not enforce information
hiding. To include actions on the data, we can store function pointers in the struct.
Implementing inheritance is more of a problem, since there is no mechanism in C to
add new data fields to a struct. Nonetheless, we can get a fair part of the way
without this, since the ability to change function pointers at least allows behaviour to be
changed. To extend a data representation, though, there's no mechanism in the language:
we can use an editor to duplicate common parts of structs, and use type casts to allow
pointers to point to more than one type of struct, at the risk of errors not detectable by
the compiler. Another option (not recommended) is to become a preprocessor macro
hacker.
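The struct-plus-cast approach can be sketched as below. Note one swapped-in variation: rather than duplicating the common fields with an editor, the common part is embedded as the first member of the extended struct, which gives the same layout. The names Body, Cluster and the describe functions are hypothetical, and the code sticks to C-style constructs although it is compiled here as C++.

```cpp
#include <cassert>
#include <cstdio>

// Common part shared by every kind of body.
struct Body
{   double mass;
    void (*describe) (struct Body *self);   // behaviour, replaceable
};

// "Derived" struct: the common part comes first, new fields follow.
struct Cluster
{   struct Body base;    // must be the first member for the cast below
    int member_count;
};

void describe_body (struct Body *self)
{   std::printf ("body of mass %g\n", self->mass);
}

void describe_cluster (struct Body *self)
{   // Cast back: valid only if self really points into a Cluster.
    // The compiler cannot check this for us.
    struct Cluster *cluster = (struct Cluster *) self;
    std::printf ("cluster of %d bodies\n", cluster->member_count);
}

// Returns the member count seen through a pointer to the common part.
int demo_struct_extension ()
{   struct Cluster group;
    group.base.mass = 10.0;
    group.base.describe = describe_cluster;
    group.member_count = 3;
    struct Body *general = &group.base;   // treat the cluster as a body
    assert (general->describe != 0);
    general->describe (general);          // calls describe_cluster
    return ((struct Cluster *) general)->member_count;
}
```

If describe_cluster were ever stored in a plain Body, the cast would read memory that is not there, which is exactly the kind of error the compiler cannot detect.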
One additional language feature can in principle be used as well: the ability to define
static data within a function. The effect of this is to preserve the value of the data
between function calls (this is really the same as a static variable at file scope, except it's
only visible inside the function). This feature would allow us to store information global
to a class (not stored with each object). However, there is a problem: since we can
change any of the functions stored in the function pointers, we would have to be careful to
keep track of such global-to-the-class data as a static in a function when we change a
function pointer.
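A minimal sketch of the idea, assuming a hypothetical view_count() function that holds a count shared by all objects of one "class" (the struct View_record and its init function are likewise invented for illustration):

```cpp
#include <cassert>

// Hypothetical "class-wide" counter kept as a static inside a function.
// delta is added to the count; passing 0 just reads it.
int view_count (int delta)
{   static int count = 0;   // initialized once, survives between calls
    count += delta;
    assert (count >= 0);
    return count;
}

struct View_record { int id; };

// Every "constructor" for View_record must remember to go through
// view_count(); nothing in C enforces this.
void View_record_init (struct View_record *v)
{   v->id = view_count (+1);
}
```

The count is invisible outside view_count(), but any replacement function stored in a function pointer has to keep calling it, which is the bookkeeping problem noted above.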
This last feature is not necessary in the example to follow.
example
The example follows from the design developed in Part 7.
Let's consider the relationship between just two parts of the specification: printing and
redrawing windows.
These both involve essentially similar operations: iterating over the internal
representation and rendering a view. Therefore it makes sense to have a general approach
to rendering a view that can be extended to handle either case, with as much re-use
between the two cases as possible.
We can make a general View class which renders by iterating over a representation.
For our purposes, we will fake the rendering action to keep the example simple. In C, this
can all be implemented along the following lines:
#include <stdio.h>
#include <stdlib.h> /* for malloc() */
typedef struct view_struct
{ char data[100]; /* real thing would have something useful */
} View;
/* function pointer type */
typedef void (*Renderer) (View current_view);
typedef struct render_struct
{ Renderer render_function;
} Render;
void Renderer_print (View current_view)
{ printf("printing\n");
}
void Renderer_window (View current_view)
{ printf("redraw\n");
}
A View could call one of these as follows (highly simplified to illustrate the
principles):
Render *print_view = (Render*) malloc (sizeof (Render));
Render *window_view = (Render*) malloc (sizeof (Render));
View with_a_room;
print_view->render_function = Renderer_print;
window_view->render_function = Renderer_window;
print_view->render_function(with_a_room);
Of course the actual code would be much more complex, since the View would have to
contain a detailed representation of the objects to render, and the renderers would have to
have detailed code for drawing or printing.
group session: finalize the design
Having seen how part of the design could be implemented in C, refine
the design making sure you have a clean abstraction.
To avoid too many complications in implementation, make sure your
object hierarchy is not very deep, and try not to have too much variation
in data representation (state) between objects in the same hierarchy.
Also try to define a small subset of the design which you can be
reasonably sure of implementing in an afternoon
hands-on: implementation
Now implement the design in C. For simplicity, rather than the actual
event generation mechanism, whenever the program requires another
event, read a char from the keyboard:
char new_event;
new_event = getchar();
Take q as quit, p as print, r as redraw, k as keydown and m as
mousedown.
Doing this will save the effort of implementing an event queue (needed
in the real thing since many events are generated as interrupts, outside
the control of the user program): your toy program will consume the
event as soon as it's generated
part 9: Object-Oriented Design and C++
OOD summary
Object-oriented design requires information hiding through encapsulation in objects and
sharing common features through inheritance.
In a language like C, some aspects of the design can be implemented directly, while
others can be implemented through programmer discipline.
Inheritance in particular would be useful to implement with compiler support.
objects in C++
In C++, a new concept is introduced: the class. For compatibility with C, a struct can
be used as a class, with some minor differences. From now on, however, we shall use
classes since the syntax is more convenient.
Classes support information hiding, and allow encapsulation of data with functions
that operate on the data. Inheritance is supported, as is redefining built-in operators for
new classes. Data and functions in classes are called members. Functions (including
member functions) can be overloaded, i.e., it's possible to have more than one function
with the same name as long as the compiler can tell them apart by the types of their
arguments, or the class of which they are a member. Overloaded functions may not be
distinguished by the type they return.
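The overloading rules can be illustrated with a short sketch. The names twice and Pair are invented for the example, and the code uses current standard C++ headers.

```cpp
#include <cassert>

// Two functions with one name, told apart by their argument types.
int twice (int n) { return 2 * n; }
double twice (double x) { return 2.0 * x; }

// A member function can reuse the name: its class sets it apart.
class Pair
{public:
    Pair (int new_value) : value (new_value) {}
    int twice () const { return 2 * value; }
private:
    int value;
};

// Not allowed: differs from twice (int) only in the type it returns.
// long twice (int n);

int demo_overloading ()
{   Pair pair (3);
    assert (twice (1.5) == 3.0);        // picks the double version
    return twice (4) + pair.twice ();   // 8 + 6
}
```

Uncommenting the long version should make the compiler reject the file, since only the return type distinguishes it from twice (int).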
A simple class declaration looks like this:
class Point
{public:
  Point (int new_x, int new_y);
  ~Point ();
  void draw ();
private:
  int x, y;
};
This defines a new type, class Point. Unlike with C structs, it isn't necessary to
use the word class when declaring or defining a variable of the new type, so there's no
need to do a typedef to give a class a single-word name. The keyword public: means
following members are visible outside the class. The member with the same name as the
class, Point(), is a constructor, which is called when a new variable is created, by a
definition, or after allocation through a pointer by the new operator. ~Point() is a
destructor, which is automatically called when a variable goes out of scope, or, if allocated
through a pointer, when the delete operator is called on it.
Keyword private: is used to make following members invisible to the rest of the
program, even classes derived from Point. Parts of a class can be made accessible only to
other classes derived from it by preceding them with protected:.
Here is an example of two definitions of points, with their position:
Point origin(0,0), of_no_return(1000000,1000000);
When to use private:, public: and protected:? Only member functions that are
part of the external specification (or interface) of the class should be made public. These
usually include the constructor (which can be overloaded with different arguments) and
destructor, access functions to allow setting the internal state or reading it, without
making the internal representation known, and operations on the class.
Private members should be anything else, except secrets of the class that are shared
with derived classes (these should be protected). When in doubt, use private: this isolates
information to one class, reducing possibilities for things to go wrong.
caution: C++ allows you to break the principle of encapsulation. Don't
be tempted. Making data members public is a sure way of writing
unmaintainable code. Instead, use access functions if other classes
need to see the data; this can be done efficiently using inlines (see
differences from C below)
The interaction between storage allocation and constructors is important to understand.
When a new instance of a class is created (either as a result of a definition or of a new),
initially all that is created is raw uninitialized memory. Then, the constructor is called
and only after the constructor finishes executing can the object properly be said to exist.
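The points at which constructors and destructors run can be traced with a small sketch. Tracer and the trace string are hypothetical, introduced only to record the order of calls.

```cpp
#include <cassert>
#include <string>

// Log of constructor and destructor calls, oldest first.
static std::string trace;

class Tracer
{public:
    Tracer (char new_tag) : tag (new_tag) { trace += tag; trace += '+'; }
    ~Tracer () { trace += tag; trace += '-'; }
private:
    char tag;
};

std::string demo_lifetimes ()
{   trace = "";
    {   Tracer local ('a');              // constructor runs at the definition
        Tracer *heap = new Tracer ('b'); // ... or after new allocates memory
        assert (trace == "a+b+");
        delete heap;                     // destructor runs, then memory is freed
    }                                    // local goes out of scope here
    return trace;
}
```

The returned trace is "a+b+b-a-": each '+' marks a constructor call and each '-' a destructor call.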
Another key feature of C++ classes is inheritance. A derived class is defined by
extension from a base class (in fact possibly more than one through multiple inheritance).
The notation is illustrated with a simple example:
class Shape
{public:
  Shape (Point new_origin);
  ~Shape ();
  virtual void draw () = 0;
private:
  Point origin;
};
class Circle : public Shape
{public:
  Circle (Point new_origin, int new_radius);
  ~Circle ();
  virtual void draw ();
private:
  int radius, area;
};
A few points (we'll save the detail for Part 10): class Circle is declared as a public
derived class of Shape: that means the public and protected members of Shape are also
public or protected respectively in Circle. In a private derived class, public and protected
members of its base class become private, i.e., they can't be seen by any derived class
(unless given explicit permission with a friend declaration). A virtual function is one that
can be replaced dynamically if a pointer to a specific class in fact points to one of its
derived classes. If a class has virtual functions, it has a virtual function table to look up the
correct member function. The line
virtual void draw () = 0;
in class Shape means Shape is an abstract class, and creating objects of that class is an
error that the compiler should trap. In such a case, the function is called a pure virtual
function. Only classes that are derived from it that define draw (or are derived from others
that define draw) may be instantiated. (This is a bit like having a function pointer in a C
struct, and setting it to the NULL pointer, but in such a case the C compiler won't detect
the error of trying to call a function through a NULL pointer.)
Since Circle is not an abstract class, we can do the following:
Circle letter(origin, 100);
Shape *i_m_in = new Circle(of_no_return, 22);
i_m_in->draw (); /* calls Circle::draw */
The double colon is the C++ scope operator, which is used to qualify a member explicitly if
it isn't clear which class it belongs to, or to force it to belong to a specific class. Notice
how member function draw() is called through an object: in a member function, the
current object can be accessed through a pointer called this (you seldom need to use this
explicitly: a member function can use members of its class directly).
Note that the compiler should complain if you try something like:
Shape blob(origin);
with a message along the lines of
Error: cannot create instance of abstract class 'Shape'
stream I/O
One additional point is worth explaining now: input and output using the iostream
library. It defines three widely used standard streams: cout, cin and cerr, based on the
UNIX convention of standard out, standard in and standard error. Output to a stream is
strung together with the << operator, while input is strung together with >> (also used,
as in C, as the left and right shift bit operators respectively).
Use of the default streams (C++ has end-of-line comments, started by //):
#include <iostream.h>
void main ()
{ int i; // the number to be read
  cout << "Enter a number : "; // no line break
  cin >> i;
  cerr << "Number out of range" << endl; // endl ends line
} /* can also use C-style comment */
If you need to use files, use
#include <fstream.h>
A stream can be associated with a file for output (ios::out is an enum value):
ofstream my_out ("file.txt", ios::out);
and used to write to the file:
my_out << "A line of text ended by a number : " << 100 << endl;
To read the file:
ifstream my_in ("file.txt", ios::in);
char data[100];
my_in >> data;
You can also explicitly open the file, if you didn't connect the ifstream or ofstream
object to a file when you defined it:
#include <stdlib.h> /* for exit() */
ifstream my_in;
my_in.open ("file.txt", ios::in);
if (!my_in)
{ cerr << "open failed" << endl;
  exit (-1); // kill program returning error code
}
// use the file ... then finally:
my_in.close ();
If you need to do both input and output on a file, declare it as class fstream; open
with ios::in|ios::out which combines the two modes using a bitwise or.
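A minimal sketch of combined input and output on one fstream follows. It makes several assumptions beyond the text: it uses the current standard header <fstream> and the std:: prefix, the scratch file name is hypothetical, and it adds ios::trunc (not mentioned above) so that the file is created if it does not yet exist.

```cpp
#include <cassert>
#include <cstdio>    // std::remove
#include <fstream>
#include <string>

// Writes a line to a scratch file, then reads the first word back
// through the same fstream object.
std::string demo_fstream ()
{   const char *name = "scratch_example.txt";   // hypothetical file name
    std::fstream file (name, std::ios::in | std::ios::out | std::ios::trunc);
    if (!file)
        return "";                       // open failed
    file << "hello " << 100 << std::endl;
    file.seekg (0);                      // go back to the start to read
    std::string word;
    int number = 0;
    file >> word >> number;
    assert (!file.fail ());
    file.close ();
    std::remove (name);                  // tidy up the scratch file
    return (number == 100) ? word : "";
}
```

The seek between writing and reading matters: a file stream needs a positioning call when switching from output to input.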
caution: I/O is one of the most system-dependent features of any
language. Streams should work on any C++ but file names are system-
specific (e.g., DOS's \ path separator vs. UNIX's /)
differences from C
Classes, aside from supporting object-oriented programming, are a major step towards
taking types seriously. Some see C++ as a better C: if you use a C++ compiler on C
code and fix everything it doesn't like, the chances are you will unearth many bugs.
Classes bring C++ much closer to having types as in a modern language such as Ada
or Modula-2, while adding features such as inheritance that both languages lack.
Classes, templates (done briefly in Part 14) and inline functions reduce the need to
define arcane code using the preprocessor. Here's an example of an inline function:
inline int times10 (int n)
{ return 10 * n;
}
The inline directive asks the compiler to attempt to substitute the function in directly,
rather than to generate all the overhead of a procedure call. The following two lines should
cause the same code to be generated:
a = times10 (b);
a = 10 * b;
It's particularly useful to use an inline as an access function (e.g. for private
members of a class). Compilers don't always honour an inline: it's only a directive.
Early compilers didn't inline complex functions where the procedure call overhead was
minor compared with the function; this is less common now.
Since an inline doesn't generate code unless (and wherever) it's used, it should be in
a header file. I prefer to keep inlines separate from the .h file, since they are
implementation detail rather than part of the interface of a class. My file structure is
File part_of_program.h: contains classes, ends with
#include "part_of_program.inl"
File part_of_program.inl: contains inlines
File part_of_program.c++: starts with
#include "part_of_program.h"
Make sure the compiler has seen the inline directive before you call the function.
This can be a problem if an inline calls another, if you use my file strategy. One fix is to
put a prototype of the problem function at the top of the file, with an inline directive.
Another is to put the inline directive into the class declaration, but this is bad practice.
Inlining is implementation detail, better not made into part of the class's interface.
Another bad practice: you can put the body of a member function in the class
declaration, in which case it's assumed to be inlined. This again puts part of the
implementation into the publicly advertised interface of the class.
caution: if an inline is called before the compiler sees it's an inline, it
generates a normal call. This causes an error when the compiler sees
the inline directive, or later when the linker tries to find non-existent
code (the compiler doesn't generate code for an inline, but substitutes it
in directly). If you get link errors, this is one thing to check
C++ requires type-safe linking. This isn't as good as it could be since it assumes a
UNIX-style linker, with no type information from the compiler. Instead, name mangling is
implemented by most C++ compilers: argument types of functions and their class are
added to the name, so overloaded versions of the function can be distinguished, and only
functions with correct argument types are linked. This is not guaranteed to find all errors;
you still need to be sure to use consistent header files, and to use a mechanism such as
make to force recompilation when necessary.
One other significant addition to C is the reference type, typename&. This differs from
a pointer in that you don't have to explicitly dereference it: the compiler generates pointer
manipulation for you. Reference types can be used as arguments in function calls, with the
same effect as Pascal var parameters:
void swap (int &a, int &b)
{ int temp;
  temp = a;
  a = b;
  b = temp;
}
// usage (correct, unlike Part 2 example):
swap (first, second);
In UNIX, C++ compilable files usually end in .C or .c++. PC compilers use
.cpp more often. Headers usually end in .h, though .hpp is sometimes
used on PCs
hands-on: simple example
Don't worry too much about the detail of the code below: we'll look at
detail in Part 10. For now, concentrate on how classes are used: fill
in the main program as suggested, and see what output results
#include <iostream.h>
#include <string.h>
class View
{public:
  View (const char new_name[]);
  ~View ();
private:
  char name[100]; /* real thing would have something useful */
};
class Render
{public:
  virtual void draw (View *to_draw) = 0;
};
class Print : public Render
{public:
  virtual void draw (View *to_draw);
};
class Update : public Render
{public:
  virtual void draw (View *to_draw);
};
void Print::draw (View *to_draw)
{ cout << "Print" << endl;
}
void Update::draw (View *to_draw)
{ cout << "Update" << endl;
}
View::View (const char new_name[])
{ strncpy(name, new_name, sizeof(name)-1); // copy max. 99 chars
  name[sizeof(name)-1] = '\0'; // strncpy may leave the string unterminated
}
View::~View ()
{ // nothing to undo, but a declared destructor must be defined
}
void main()
{ View *window = new View ("window");
  Render *renderer;
  renderer = new Print;
  renderer->draw (window);
  // based on the above, create an object of class Update
  // by replacing the word Print by Update in the last 3
  // lines - now try to relate this to the object-oriented design
  // exercise
}
part 10: Classes in More Detail
constructors and destructors
A constructor is a special function that is called automatically. As defined, it does not
return a value (i.e., doesn't contain a statement return expression;). Its effect is to turn
uninitialized memory into an object through a combination of compiler-generated code and
the code you write if you supply your own constructor. If you don't supply a constructor,
the compiler supplies one: a default constructor, with no arguments. The additional code the
compiler generates for your own constructor is calls to default constructors for objects
within the object which don't have their own constructor, and code to set up the virtual
function table (if necessary).
Here is an example of a constructor:
Shape::Shape (Point new_origin) : origin(new_origin)
{
}
The first thing that's interesting is the use of the scope operator :: to tell the compiler
that this is a member of the class Shape (through writing Shape::). Then there's the
initializer for origin after the colon. Why could we not do the following?
Shape::Shape (Point new_origin)
{ origin = new_origin;
}
This is not allowed because origin hasn't had a constructor called on it. Look back at
class Point. The only constructor defined for Point has to take two int arguments. Once
we define a constructor, the default parameterless constructor is no longer available
(unless we explicitly put one in). However there is another compiler-supplied constructor,
the copy constructor, which allows you to initialize an object from another. That's what
origin(new_origin) does. Initializers for contained classes follow a colon after the
arguments, separated by commas. Shape is very simple, so there's very little for the
constructor to do, so here's a more interesting example:
Circle::Circle (Point new_origin, int new_radius) :
  Shape (new_origin), radius (new_radius)
{ area = PI * radius * radius; // PI must be defined (<math.h> often has M_PI)
}
area could also be initialized in the heading, but it's easier to read if initialization that
doesn't require its own constructor is in the body of the constructor. I would usually write
the above as follows, with only the base class initialized in the heading:
Circle::Circle (Point new_origin, int new_radius) :
  Shape (new_origin)
{ radius = new_radius;
  area = PI * radius * radius;
}
Any base classes that have a default constructor need not have the constructor
explicitly called, but if you want to pass parameters to the constructor of a base class, it
must be done using this mechanism.
Another thing to notice: once you've passed the scope operator in Circle::, you can
refer to class members without further qualification.
A destructor is only needed if some kind of global state needs to be undone. For
example, if an object contains pointers, the constructor will probably allocate memory for
them (though this could happen later, e.g., if the object is the root of a tree or node of a
list), and the destructor should see to it that it is deallocated; otherwise the memory is not
reclaimed, resulting in a memory leak. (In general a memory leak is a gradual
disappearance of available memory through failing to reclaim memory from pointers that
are either no longer active, or have had their value changed.)
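The allocate-in-constructor, free-in-destructor pattern can be sketched as follows. Int_buffer and the live_blocks counter are hypothetical; the counter exists only so that the sketch can show the memory being given back. Copying is disabled (by declaring the copy operations private and leaving them undefined) so that two objects never share one block.

```cpp
#include <cassert>
#include <cstddef>

static int live_blocks = 0;   // bookkeeping for the demonstration only

class Int_buffer
{public:
    Int_buffer (std::size_t new_size)
        : data (new int[new_size]), size (new_size)
    {   live_blocks++;
    }
    ~Int_buffer ()
    {   delete [] data;       // without this the block would leak
        live_blocks--;
    }
    std::size_t get_size () const { return size; }
private:
    Int_buffer (const Int_buffer &);             // copying is disabled so two
    Int_buffer &operator= (const Int_buffer &);  // objects never share data
    int *data;
    std::size_t size;
};

int demo_buffer ()
{   {   Int_buffer scratch (100);
        assert (live_blocks == 1);
        assert (scratch.get_size () == 100);
    }                          // destructor runs here
    return live_blocks;
}
```

After the inner block ends, the count is back to zero: the destructor has released what the constructor allocated.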
Destructors are automatically called in the correct order if an object is of a derived
class. The destructor for Circle is empty and could have been left out, but since it's in
the class specification we must define it:
Circle::~Circle ()
{
}
inheritance and virtual functions
Consider the following:
void Circle::draw ()
{
  // call some system routine to draw a circle
}
Shape *i_m_in = new Circle(of_no_return, 22);
i_m_in->draw ();
Where draw() is called, the compiler generates code to find the correct function in the
virtual function table stored with object *i_m_in, and calls Circle::draw().
What happens to virtual function calls in a constructor? Until all constructors terminate
(the current one may have been called by another constructor), the virtual function table
may not be set up. To be safe, the virtual function is called like a regular member
function, based on the class currently under construction.
On the other hand, destructors can both be virtual and can call virtual functions with
the expected semantics: the compiler ensures that the virtual function table is valid.
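Two of these points can be seen in a short sketch: a virtual call made inside a constructor resolves to the class under construction, and a virtual destructor lets delete through a base pointer run the derived destructor first. Base, Derived and log_text are hypothetical names used only for tracing.

```cpp
#include <cassert>
#include <string>

static std::string log_text;

class Base
{public:
    Base () { log_text += name (); }         // calls Base::name() here
    virtual ~Base () { log_text += "~Base"; }
    virtual const char *name () const { return "Base "; }
};

class Derived : public Base
{public:
    Derived () { log_text += name (); }      // now Derived::name()
    virtual ~Derived () { log_text += "~Derived "; }
    virtual const char *name () const { return "Derived "; }
};

std::string demo_virtual ()
{   log_text = "";
    Base *object = new Derived;   // runs Base(), then Derived()
    assert (log_text == "Base Derived ");
    delete object;                // virtual: ~Derived(), then ~Base()
    return log_text;
}
```

The result is "Base Derived ~Derived ~Base". Had ~Base() not been virtual, deleting a Derived through a Base pointer would not be guaranteed to run ~Derived() at all.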
information hiding
Classes are a much more robust way of hiding information than C's limited capabilities of
hiding local variables inside functions and hiding names at file scope by making them
static. These mechanisms are still available in C++, but with the additional machinery of
the class mechanism, names visible to a whole file are seldom necessary.
Furthermore, low-level detail can be hidden by putting it in a base class. The machine-
specific parts of the class can be made private, preventing direct access to them by even
classes derived from them.
We shall now look at a more robust mechanism for hiding global state than C's file-
scope statics.
static members
If a member of a class has the word static before it in the class declaration, it means
there is only one instance of it for the whole class in the case of a data member, or in the
case of a function, that it can be called without having to go through an object.
For example, to extend the class for shapes, it would be useful to have a global count
of all shapes that have been created. To do this, we need a count that is stored only once,
not for every shape, and a function to look up the count (because we don't make data
members public):
// in the header file:
class Shape
{public:
  Shape (Point new_origin);
  ~Shape ();
  virtual void draw () = 0;
  static int get_count ();
private:
  Point origin;
  static int count;
};
// in the compilable file:
int Shape::count(0); // could also use "count = 0"
int Shape::get_count ()
{ return count;
}
The line int Shape::count(0); is needed because a static member must be defined.
A class declaration is like a typedef in the sense that it doesn't cause memory to be
allocated. Non-static members have memory allocated for them when a variable of the
class is defined, or operator new is used, but static data members must be explicitly
defined in this way. We must also change the constructor and destructor to keep the count
(since count is private, this can only be done by Shape):
Shape::Shape (Point new_origin) : origin(new_origin)
{ count++;
}
Shape::~Shape ()
{ count--;
}
Now, whenever a new object of any class derived from Shape is created, count is
incremented, and decremented whenever an object of such a class ceases to exist. You can
look up the count as follows:
#include <iostream.h>
// assume the above classes etc.
// can leave out arguments to main if not used
void main()
{ Circle letter(origin, 100), round(letter);
  Shape *i_m_in = new Circle(of_no_return, 22);
  cout << "Number of shapes is " << Shape::get_count () << endl;
}
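which results in the output
Number of shapes is 2
(round is created by the compiler-supplied copy constructor, which does not go through
Shape::Shape(Point), so it is not counted.)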
hands-on: adding to a class
Define a class Double_circle which contains an offset and whose draw()
calls Circle::draw() twice (once with the radius increased by the
offset) and use it in a simple test program (you may fake drawing by
writing out the radius). Use the Circle class defined in this section,
adding the missing part of draw()
Hint: set the radius to its new value before calling Circle::draw() the
second time, and reset it afterwards
part 11: style and idioms
access functions
It's generally bad style to put data members into the public interface of a class. To do so is
bad for maintenance, and is not much better than unrestricted use of global variables.
It's much better practice, if parts of the data of a class need to be accessed elsewhere,
to use access functions. If the representation is changed later, only the access functions
need change, not every place in the code where the data is accessed. Also, the places
where values change are easier to find, which makes for better maintainability. For
example:
// shapes.h
#ifndef shapes_h
#define shapes_h
class Circle : public Shape // class Shape as before
{public:
  Circle (Point new_origin, int new_radius);
  ~Circle ();
  virtual void draw ();
  int get_radius ();
  int get_area ();
  void put_radius (int new_radius);
private:
  int radius, area;
};
#include "shapes.inl"
#endif /* shapes_h */
// shapes.inl
inline int Circle::get_radius ()
{ return radius;
}
inline int Circle::get_area ()
{ return area;
}
inline void Circle::put_radius (int new_radius)
{ radius = new_radius;
  area = PI * radius*radius;
}
// file main.c++
#include "shapes.h"
void main()
{ Circle letter(origin, 100);
  letter.put_radius (200);
}
protected vs. private
This has been mentioned before but is worth repeating. As much of a class as possible
should be private. Private members are hidden even from derived classes. Protected is
better than public, but derived classes can still see protected members.
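This relationship applies if a derived class is derived publicly, as in our examples. If a
class is derived privately, the public and protected names of the base class become
private in the derived class: its own member functions can still use them, but its clients
and any classes derived from it cannot. For example: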
class Circle : private Shape // etc.
usage of constructors
Constructors are mostly automatically invoked. There are a few points about constructors
that are important to understand, aside from those mentioned already.
An array of objects can only be defined using default (parameterless) constructors. If
you have an array declaration, you'll get a compiler error if you defined constructors with
arguments, and haven't supplied a default constructor (only if you supply no constructor
is a default constructor supplied by the compiler).
If you have common code across several constructors, or want to re-initialize an
object, can you call a constructor explicitly? No. You have to separate out the initialization
code you want to re-use into an ordinary member function.
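As a rough sketch of both points (the class Counter and its member names are made up for this illustration): the constructors and reset() all go through one ordinary member function, init(), and the default constructor is what allows the array to be declared.

```cpp
#include <cassert>

// Hypothetical class: both constructors and reset() share one init().
class Counter
{public:
  Counter () { init (0); }                 // default: needed for arrays
  explicit Counter (int start) { init (start); }
  void reset () { init (0); }              // re-initialize an existing object
  void bump () { ++count; }
  int value () const { return count; }
private:
  void init (int start)
  { assert (start >= 0);                   // counters never start negative
    count = start;
  }
  int count;
};

// Every element of the array is built by the default constructor.
int sum_of_fresh_array ()
{ Counter tallies[4];
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += tallies[i].value ();
  return sum;
}
```

Removing the default constructor from Counter should make the array declaration in sum_of_fresh_array() fail to compile.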
In C, a type cast looks like this:
(int*) string_var; /* turn array of char into pointer to int */
In C++, you can do the same thing. But you can also define type conversions by
defining a constructor taking a given type as an argument:
#include <string.h>
class P_string
{public:
  P_string ();
  P_string (const char *c_string);
private:
  char p_string[256];
};
P_string::P_string ()
{
}
P_string::P_string (const char *c_string)
{ int length = strlen(c_string);
  char *start = &p_string[1]; // &p_string[1] is its address
  if (length >= 256)
    p_string[0] = 255; // need 1 byte for length
  else
    p_string[0] = length;
  strncpy(start, c_string, 255); // copy at most 255 chars
}
P_string dialog_name = P_string ("Save File"); // for example
Constructor P_string("Save File") does type conversion from a null-character-terminated
C string to a Pascal string (first byte is the length; common on Mac and PC).
A very useful property of a destructor: it's automatically called for non-pointer
variables, even if you return from a function, or break from a loop or switch.
The following is useful to ensure that bracketing operations are properly paired (e.g.,
set the graphics state, do something, restore it to its previous state):
class State
{public:
  State ();
  ~State ();
private:
  State_data state_info;
};
State::State ()
{ // save previous state, then set new state
}
State::~State ()
{ // reset to stored state
}
void update_window ()
{ State graphics_state;
  // do stuff to the window
  if (some_condition)
    return;
}
The destructor is called in two places: before the return, and before the final }. If you
used explicit calls to save and restore state, it would be easy to forget the restore in one of
these cases.
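To see the pairing at work, here is a small sketch in which the "graphics state" is faked by a counter (saved_states and the int argument of update_window are assumptions made for this illustration). Whichever way the function exits, the count is back to zero afterwards.

```cpp
#include <cassert>

// Stand-in for a real graphics state: counts how many saves are outstanding.
static int saved_states = 0;

class State
{public:
  State () { ++saved_states; }     // "save previous state, set new state"
  ~State () { --saved_states; }    // "reset to stored state"
};

// Returns early for odd values; the destructor runs on both paths.
int update_window (int value)
{ State graphics_state;
  assert (saved_states > 0);       // the state is saved while we work
  if (value % 2 != 0)
    return -1;                     // early exit: ~State() still runs
  return value / 2;
}
```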
hands-on: implementing a simple design
Choose a manageable part of your event-driven interface design, and
implement it in C++, using classes. You should find it a lot easier than
trying to implement an object-oriented design in C. In particular, a deep
hierarchy with many additions to the state at each level should no longer
present a big problem
part 12: Advanced Features
mixing C and C++
Sometimes it's useful to link separately compiled C code with a C++ program. You are
likely to run into problems if the main program is not written in C++, because most C++
compilers insert initialization code into the main program. Otherwise, the major problem to
overcome is the expectation of the C++ compiler that type-safe linking is used. The
extern "C" mechanism is supplied to solve this problem:
extern "C"
{ // {} needed only if more than 1 declaration
#include "sort.h"
}
Everything bracketed this way is exempt from type-safe linkage (i.e., names aren't
mangled; name mangling is the mechanism for type-safe linking). The C++ compiler
generates calls to the functions declared in sort.h without mangling names.
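A widely used variation, sketched here with a made-up function c_sum, is to put the extern "C" inside the header itself, guarded by the __cplusplus macro that C++ compilers define, so the same header can be read by both C and C++ compilers. The definition is given in C++ only to keep the sketch in one file; normally it would live in a separately compiled C file.

```cpp
#include <cassert>

/* What a header shared by C and C++ might contain (c_sum is a made-up
   name): C++ compilers define __cplusplus, C compilers do not. */
#ifdef __cplusplus
extern "C" {
#endif
int c_sum (const int *values, int n);
#ifdef __cplusplus
}
#endif

// Defined here in C++, but it keeps the C linkage of the declaration
// above, so its name is not mangled and C code could call it too.
int c_sum (const int *values, int n)
{ assert (values != 0 || n == 0);
  int total = 0;
  for (int i = 0; i < n; i++)
    total += values[i];
  return total;
}
```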
overloading operators
One of the more advanced features of C++, also found in a few recent languages such as
Ada (and some older ones like Algol68), is the ability to define new behaviours for built-in
operators.
In C++, aside from obvious operator symbols such as +, -, etc., some other things
are operators, including assignment (=), new, delete, and array indexing ([]).
For example, if you do not like the limitations of built-in array indexing and want to
define your own, you can create a class containing the array data and indexing operations
of your own design. One reason to do this: as with many other languages, C++ is limited
as to its support for freely specifying all dimensions of a multi-dimensional array at run-
time. The reason for this is that the conventional array indexing operation needs to
multiply by all but the last dimension to find the actual place in memory that an array
element occupies.
With C's capability of using pointers and arrays interchangeably, this problem can
usually be worked around by implementing multi-dimensional arrays as arrays of
pointers. C++'s class mechanism provides a cleaner way of hiding the detail of this,
allowing you to use code that looks like an ordinary array indexing operation once you
have worked out the detail of your array class's index operation.
Here is how you could declare such an indexing operation (we shall extend this to a 3-
dimensional array class in the next hands-on session):
class Array1D
{public:
  Array1D (int new_max);
  ~Array1D ();
  int& operator [] (int ind);
private:
  int *data;
  int max;
};
Note the &: it specifies that the return type of the operator is a reference to int, which
means that it is effectively a pointer to the actual data item. However the compiler
automatically dereferences the pointer as necessary. The reason for doing this is to make it
possible to use the index operator on the left-hand-side of an assignment, as in
Array1D scores(100); // not []: 100 is constructor arg
int i;
for (i = 0; i < 100; i++)
  scores[i] = 0;
Here is how operator[] could be defined:
int& Array1D::operator [] (int ind)
{
#ifdef BOUNDS_CHECK
  if ((ind<0) || (ind>=max))
    ; // insert error message
  else
#endif /* BOUNDS_CHECK */
    return data[ind];
}
The constructor has to allocate data. I've put in an option of checking the index
against the array bounds, which usually isn't available in C or C++ (an array is a pointer,
so the compiler may not know the bounds: the array could be an arbitrary piece of
memory). If you don't want bounds checking you can compile without defining
BOUNDS_CHECK. To improve performance, you can inline the operator, in which case it's
no less efficient than the usual operator[], but with the option of bounds checking.
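Putting the pieces together, here is one way the whole class might look, as a sketch rather than the definitive version. The choices not stated in the text are my assumptions: elements start at zero, a bad index stops the program with abort(), and copying is simply disallowed.

```cpp
#include <cassert>
#include <cstdlib>

#define BOUNDS_CHECK   // remove this line to compile without checking

class Array1D
{public:
  explicit Array1D (int new_max);
  ~Array1D ();
  int& operator [] (int ind);
private:
  Array1D (const Array1D&);             // copying not supported in this sketch
  Array1D& operator = (const Array1D&);
  int *data;
  int max;
};

Array1D::Array1D (int new_max) : data (0), max (new_max)
{ assert (new_max > 0);
  data = new int[max];
  for (int i = 0; i < max; i++)
    data[i] = 0;                        // assumed: elements start at zero
}

Array1D::~Array1D ()
{ delete [] data;
}

inline int& Array1D::operator [] (int ind)
{
#ifdef BOUNDS_CHECK
  if ((ind < 0) || (ind >= max))
    std::abort ();                      // assumed policy: stop the program
#endif /* BOUNDS_CHECK */
  return data[ind];
}

// Fill an array with squares and add them up, using [] on both sides.
int sum_of_squares (int n)
{ Array1D scores (n);
  for (int i = 0; i < n; i++)
    scores[i] = i * i;
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += scores[i];
  return sum;
}
```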
memory management
It's also possible to redefine the built-in operators new and delete. This is useful because
the standard strategy for memory allocation may not always be efficient enough for every
application.
For example, I once encountered a situation where someone was reading a large
amount of data off disk, sorting it in memory then writing it back. The data was too large
to fit in memory, but he relied on the operating system's virtual memory to allow him to
get away with this. While the sort was running, he noticed the disk was constantly busy,
indicating that there was a very high number of page faults. He re-examined his sorting
strategy, which should have been very efficient:
1. divide the possible keys on which the sort is being done into
   a number of buckets
2. read the data sequentially, putting each item directly into
   the right bucket (quick if you know the range of key values)
3. write out the buckets sequentially to disk
What was happening was his memory allocator was allocating data in the order it was
read from disk, so by step 3, data in each bucket was scattered all over memory. The
figure below illustrates the problem.
[Figure: big boxes are buckets; smaller boxes are shaded to show the order of arrival of
bucket contents]
Once he realized this was the problem, he wrote his own memory allocator that
allocated a large chunk of memory for each bucket, and when a new data item was added
to a bucket, it was given memory allocated for the bucket.
The result? A 100-fold speedup.
Here is how you can write your own versions of new and delete:
#include <stddef.h> /* sometimes needed to overload new */
#include <new.h> /* usually needed to overload new */
class Bucket_data; // can now use Bucket_data*
class Bucket
{public:
  void add_data (Bucket_data *new_data);
  Bucket_data *get_new ();
  void recycle_old (Bucket_data *old_data);
private:
  Data_list data, free_list; // some detail to work out
};
class Bucket_data
{public:
  Bucket_data (Bucket *new_owner);
  static void* operator new (size_t size, Bucket &owner);
  static void operator delete (void *old_data, size_t size);
private:
  Bucket *owner;
};
An implementation could look like this:
void* Bucket_data::operator new (size_t size, Bucket &owner)
{ return owner.get_new ();
}
void Bucket_data::operator delete (void *old_data, size_t size)
{ ((Bucket_data*) old_data)->owner->
    recycle_old((Bucket_data*) old_data);
  old_data = NULL;
}
Usage:
Bucket_data *new_data = new (pail) Bucket_data (&pail);
The first (pail) is the last argument to operator new (the compiler automatically puts
in the size), and the later (&pail), where the & makes a pointer to pail, is passed to the
constructor, Bucket_data (Bucket *new_owner).
Exercise: fill in the detail. Bucket::get_new() should use ::new to grab a
large chunk of memory when it runs out, otherwise just return the next
piece of what's left of the chunk.
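A possible starting point for the exercise, as a deliberately cut-down sketch and not a full solution: Item stands in for Bucket_data, the bucket owns a single fixed-size chunk, and there is no free list or second chunk (all of these simplifications, and the names Item and neighbours_in_memory, are my assumptions). It does show the main effect: items created for one bucket sit next to each other in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Bucket;

// Stand-in for Bucket_data: always allocated from its owner's chunk.
class Item
{public:
  explicit Item (int new_key) : key (new_key) { }
  int get_key () const { return key; }
  static void* operator new (std::size_t size, Bucket &owner);
  // Matching form, used only if the constructor throws.
  static void operator delete (void *, Bucket &) { }
  // Slots are released all at once when the bucket goes away.
  static void operator delete (void *) { }
private:
  int key;
};

class Bucket
{public:
  explicit Bucket (std::size_t slots)
  : capacity (slots), used (0),
    chunk (static_cast<char*> (::operator new (slots * sizeof (Item))))
  { }
  ~Bucket () { ::operator delete (chunk); }
  void* get_new ()
  { assert (used < capacity);   // a fuller version would grab a new chunk
    return chunk + sizeof (Item) * used++;
  }
private:
  Bucket (const Bucket&);              // copying would free the chunk twice
  Bucket& operator = (const Bucket&);
  std::size_t capacity, used;
  char *chunk;
};

void* Item::operator new (std::size_t size, Bucket &owner)
{ assert (size == sizeof (Item));
  return owner.get_new ();
}

// Two items made for the same bucket occupy adjacent slots.
bool neighbours_in_memory ()
{ Bucket pail (8);
  Item *first = new (pail) Item (1);
  Item *second = new (pail) Item (2);
  return reinterpret_cast<char*> (second) - reinterpret_cast<char*> (first)
         == static_cast<std::ptrdiff_t> (sizeof (Item));
}
```

Items here must not outlive their bucket, since the bucket's destructor releases the whole chunk.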
multiple inheritance
Sometimes it's useful to base a class on more than one other class. For example, we
would like to add a capability of printing error messages to our buckets, with a default
message for each bucket. This is a useful capability to add to other things, so let's create a
separate Error class and make a new class built up out of it and Bucket:
class Error
{public:
  Error (const char *new_message);
  void print_err (const char *message = "none");
private:
  char *default_message;
};
class Error_bucket : public Bucket, public Error
{public:
  Error_bucket (const char *new_default = "Hole in bucket");
};
Note the default argument in the Error_bucket constructor: if an Error_bucket is
created with no argument for the constructor, it's as if the argument had actually been
"Hole in bucket". The constructors and implementation are straightforward:
Error::Error (const char *new_message)
{ default_message = (char*) new_message;
}
// no constructor call for Bucket: has default constructor
Error_bucket::Error_bucket (const char *new_default) :
  Error (new_default)
{
}
void Error::print_err (const char *message)
{ if (strcmp(message, "none") == 0)
    cerr << default_message << endl;
  else
    cerr << message << endl;
}
The following:
Error_bucket beyond_pale, holy_bucket ("Leaky");
beyond_pale.print_err ();
holy_bucket.print_err ();
holy_bucket.print_err ("Fixed");
results in this output:
Hole in bucket
Leaky
Fixed
LISP programmers call little classes designed to be added to new classes mixins.
cloning
Sometimes it's useful to be able to make a new object based on an existing one, without
knowing what class the original is.
One way of doing this is to define a clone() member function:
class Bucket
{public:
  void add_data (Bucket_data *new_data);
  Bucket_data *get_new ();
  void recycle_old (Bucket_data *old_data);
  virtual Bucket* clone ();
private:
  Data_list data, free_list;
};
class Error_bucket : public Bucket, public Error
{public:
  Error_bucket (const char *new_default = "Hole in bucket");
  virtual Bucket* clone ();
};
The two versions of clone look like this:
Bucket* Bucket::clone ()
{ return new Bucket (*this); // note use of copy constructor
}
Bucket* Error_bucket::clone ()
{ return (Bucket*) new Error_bucket (*this);
}
And a call like this:
Bucket *kicked = new Error_bucket ("kicked over"), *spilt;
spilt = kicked->clone ();
would result in the creation of a new object of class Error_bucket, copied
from the object pointed to by kicked.
There are variations on cloning: deep cloning doesn't copy any pointers, but always
makes a completely new object, including allocating new memory and copying any
contained objects; shallow cloning only copies the outermost level, which may mean more
than one pointer is pointing to the same piece of memory (called an alias).
caution: there's no direct way in C++ to force a member function to be
redefined for every derived class. It's easy to forget to redefine the
clone() virtual function in a class and clone the wrong type of object.
Use cloning with care, and not for deep class hierarchies.
double caution: an alias is bad news. One thing it can result in, for
example, is calling delete more than once on the same piece of memory,
with probably disastrous consequences for the internal state of memory
allocation/deallocation.
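To make the deep-versus-shallow distinction concrete, here is a small sketch with made-up classes Item and Label. Label owns a heap-allocated string, and its copy constructor duplicates the string, so clone() gives a deep copy and each object can safely delete its own text.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical hierarchy for this sketch.
class Item
{public:
  virtual ~Item () { }
  virtual Item* clone () const { return new Item (*this); }
  virtual const char* kind () const { return "Item"; }
};

class Label : public Item
{public:
  explicit Label (const char *new_text) : text (copy_of (new_text)) { }
  // Deep copy: the new object gets its own string, so there is no alias.
  Label (const Label &other) : Item (other), text (copy_of (other.text)) { }
  ~Label () { delete [] text; }
  virtual Item* clone () const { return new Label (*this); }
  virtual const char* kind () const { return "Label"; }
  const char* get_text () const { return text; }
private:
  Label& operator = (const Label&);   // not needed here, so disallowed
  static char* copy_of (const char *source)
  { assert (source != 0);
    char *result = new char[std::strlen (source) + 1];
    std::strcpy (result, source);
    return result;
  }
  char *text;
};
```

Had Label relied on the compiler-generated copy constructor, the clone would have been shallow: both objects would point at the same text, and the second delete would hit memory that had already been freed.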
hands-on: 3-D array class
Let's put some of these ideas together now, and define a 3-dimensional
array class, capable of storing objects of any class in a hierarchy that
has a clone member function. The following is a start:
// 3-dimensional array - size set at allocation, check bounds
// #ifdef BOUNDS_CHECK, each dimension indexed 0..initial_max-1
// supply example of object to clone for elements if all the
// same class, otherwise the elements initialized as NULL
class Array2D;
class Array3D;
class Array1D
{public:
  Array1D (int new_max, Bucket *example);
  ~Array1D ();
  Bucket *& operator [] (int ind);
  friend class Array2D; // let 2D see 1D's private members
private:
  Bucket **data;
  int max;
};
class Array2D
{public:
  Array2D (int new_max_y, int new_max_z, Bucket *example);
  ~Array2D ();
  Array1D& operator [] (int ind);
  friend class Array3D;
private:
  int get_z (); // 3D can't see Array1D's max
  Array1D **rows;
  int max;
};
class Array3D
{public:
  // supply example to clone from if all to be same type and
  // allocated when array is allocated
  Array3D (int new_max_x, int new_max_y, int new_max_z,
           Bucket *example = NULL);
  ~Array3D ();
  Array2D& operator [] (int ind);
  int get_max_x ();
  int get_max_y ();
  int get_max_z ();
private:
  Array2D **planes;
  int max;
};
part 13: Design Trade-Offs
case study: vector class
Another useful class in many applications is one for vectors, including vector arithmetic,
such as addition. To keep things simple, we'll stick to a vector of three dimensions, and
only look at a small number of possible operations.
defining operators vs. functions
The ability to define your own overloaded versions of built-in operations in C++ makes it
tempting to always use them when the possibility arises. However, this can sometimes
lead to complications, especially the temporary problem described below. Before
going into problems, here is an example of defining a simple vector operation, -=,
as both a function and an operator. The operator subtracts a scalar from every element of
the vector and returns a reference to the vector, so the expression could appear on the
left-hand-side of an assignment.
const int n_dim = 3;
class Vector
{public: // constructor sets all to zero if no args
  Vector (float first=0.0, float second=0.0, float third=0.0);
  Vector& operator -=(float scalar);
  void decrement (float scalar);
private:
  float data[n_dim];
};
Vector::Vector (float first, float second, float third)
{ data[0] = first;
  data[1] = second;
  data[2] = third;
}
Vector& Vector::operator -=(float scalar)
{ int i;
  for (i = 0; i < n_dim; i++)
    data[i] -= scalar;
  return *this;
}
void Vector::decrement (float scalar)
{ int i;
  for (i = 0; i < n_dim; i++)
    data[i] -= scalar;
}
An example of usage
Vector velocity (100.0, 37.0, 500.6);
velocity -= 25;
velocity.decrement (20);
illustrates how notationally convenient overloading operators can be.
when to inline
Overloading operators is a good topic under which to discuss the issue of when to inline
more thoroughly.
Although inlining generally gives a performance advantage, it has some drawbacks.
Unless the function (or operator) is smaller than the overhead of setting up a conventional
call, the overall size of the program is bigger, since the code is duplicated. Also, the
compiler has to process the inline's source code more often: it has to be #included into
every file that uses it, instead of compiled once, then not seen again until link time. This
slows compilation. If you inline often, you'll frequently run into the problem mentioned in
Part 9 (confusing link errors). Finally, many debuggers lose track of where you are in the
source if you inline a lot, and other tools such as profilers have less information at run
time.
That's not to say you should never use inlines. Once you have written your program
and are starting to tune it for performance, you can start to work out which function calls
are too expensive, and try inlining them. Remember the lesson of the sorting algorithm:
optimizing only makes sense once you know you have the most efficient design.
the temporary problem
An additional problem with operators is that many require returning a value to be
consistent with the built-in operator. In a case where the value has to be an l-value
(capable of appearing on the left-hand-side of an assignment), it's possible to return a
reference to the current object (*this), as in the -= example.
However if the value returned is meant to be a completely new value, as in the result
of an addition, it must be stored somewhere. In the case of built-in operators, that
somewhere is generated by the compiler (a temporary space in memory, or more likely, a
register) and merged into the target of the assignment if possible. If you write your own
operator, the compiler can't manage temporary values as efficiently, resulting in
unnecessary construction of a new object, copying and deletion of the temporary.
By contrast, if you use a function for operations such as addition, you can use a
technique such as making the current object the destination for the result.
We can add addition to the vector class, to illustrate the alternative styles:
Vector Vector::operator +(Vector other)
{ Vector result; // no constructor args: default is all zeroes
  for (int i = 0; i < n_dim; i++)
    result.data[i] = data[i] + other.data[i];
  return result;
}
// the current object is the destination: *this = first + second
void Vector::add (Vector first, Vector second)
{ for (int i = 0; i < n_dim; i++)
    data[i] = first.data[i] + second.data[i];
}
which could be used as follows:
Vector velocity (100.0, 37.0, 500.6), accel (-1.0, 1.0, 0.0),
  final_velocity;
final_velocity = velocity + accel;
final_velocity.add (velocity, accel);
accel -= 10;
final_velocity.add (final_velocity, accel);
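A common compromise, shown here as a sketch and not as the approach this section recommends, is to write += as the member that updates the current object in place, and then build + out of it. Vec is a cut-down stand-in for Vector, and the read-only operator [] is an assumption added so the result can be inspected.

```cpp
#include <cassert>

const int n_dim = 3;

// Cut-down stand-in for the Vector class.
class Vec
{public:
  Vec (float first = 0, float second = 0, float third = 0)
  { data[0] = first;
    data[1] = second;
    data[2] = third;
  }
  // Updates the current object in place: no temporary is needed.
  Vec& operator += (const Vec &other)
  { for (int i = 0; i < n_dim; i++)
      data[i] += other.data[i];
    return *this;
  }
  float operator [] (int ind) const
  { assert (ind >= 0 && ind < n_dim);
    return data[ind];
  }
private:
  float data[n_dim];
};

// + built on +=: the left operand is copied once on the way in,
// and the right operand is passed by reference, so it is not copied.
inline Vec operator + (Vec left, const Vec &right)
{ left += right;
  return left;
}
```

Code that cares about speed can call += directly and avoid the new object altogether; code that prefers the notation can still write a + b.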
hands-on: vector class using operators
Extend the vector class to include a few common operations, like
multiply by scalar (operator *=), inner product (operator *) and
indexing (operator []).
Experiment with both implementing and using these operations as
operators, as well as functions.
part 14: More Advanced Features and Concepts
templates
Templates are a relatively late addition to the language and do not work properly on all
compilers. Nonetheless they are a useful concept and worth explaining.
A template is a parametrized type. The sorts of Parts 3 to 5 started as a sort for strings,
which became a sort for employee records, and finally a more general one with function
pointers. Imagine how much better it would be if we could define a generic sort, which
would work on any data type we could compare and exchange. C++ has templates for this
purpose. Ada has a similar feature called generics. To do this in Pascal or Modula2, you
have to use a text editor to create multiple versions of a routine such as a sort, whereas in
Ada or C++, the compiler can do this for you.
In C++ you can define a generic sort as follows:
// assume operator == and operator < defined on T
template<class T> int compare (T data[], int i, int j)
{ if (data[i] < data[j])
    return -1;
  else if (data[i] == data[j])
    return 0;
  else
    return 1;
}
template<class T> void swap (T data[], int i, int j)
{ T temp = data[i];
  data[i] = data[j];
  data[j] = temp;
}
// compare() and swap() are defined first so sort() can see them
template<class T> void sort (T data[], int n)
{ int i, j;
  for (i = 0; i < n-1; i++)
    for (j = i + 1; j > 0; j--)
      if (compare (data, j-1, j) > 0)
        swap (data, j-1, j);
}
Some examples of usage:
int data[] = {0, 1, 4, 3, 45, 2, 1, 4, 6, 89};
float money[] = {1.20, 1.50, 0.59, 500.55, 89, 5};
sort (data, int (sizeof data / sizeof (int)));
sort (money, int (sizeof money / sizeof (float)));
The compiler automatically generates versions of sort() for int and float arrays
when it sees the two calls.
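The same template also works for your own types, provided operator < and operator == are defined. The sketch below repeats the three templates so that it stands alone; the Employee record and its ordering by salary are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical record type, ordered by salary only.
struct Employee
{ const char *name;
  int salary;
};
bool operator < (const Employee &a, const Employee &b)
{ return a.salary < b.salary;
}
bool operator == (const Employee &a, const Employee &b)
{ return a.salary == b.salary;
}

template<class T> int compare (T data[], int i, int j)
{ if (data[i] < data[j])
    return -1;
  else if (data[i] == data[j])
    return 0;
  else
    return 1;
}
template<class T> void swap (T data[], int i, int j)
{ T temp = data[i];
  data[i] = data[j];
  data[j] = temp;
}
template<class T> void sort (T data[], int n)
{ assert (n >= 0);
  for (int i = 0; i < n-1; i++)
    for (int j = i + 1; j > 0; j--)
      if (compare (data, j-1, j) > 0)
        swap (data, j-1, j);
}
```

A call such as sort (staff, 3) on an array of Employee makes the compiler generate a third version of sort(), alongside any for int and float.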
It's also possible to parametrize a class. For example, the vector class of Part 13 could
be generalized to make vectors of general objects (some detail left out):
template<class T> class Vector
{public:
  T& operator [] (int ind);
  void add (Vector<T> first, Vector<T> second);
private:
  T data[n_dim];
};
template<class T> T& Vector<T>::operator [] (int ind)
{ return data[ind];
}
// the current object is the destination: *this = first + second
template<class T> void Vector<T>::add (Vector<T> first,
                                       Vector<T> second)
{ for (int i = 0; i < n_dim; i++)
    data[i] = first.data[i] + second.data[i];
}
Here are some examples of usage:
Vector<int> pos, offset;
Vector<float> vel, acc;
for (int i = 0; i < n_dim; i++)
{ pos[i] = 10-i; // use operator []
  offset[i] = -1;
}
pos.add (pos, offset);
vel.add (vel, acc);
exceptions
Exceptions are another late addition to the language. Since they are not fully implemented
in all compilers, I'll give a quick overview rather than detail.
The essential idea is that you try to execute a piece of code. If it fails (either through a
built-in exception like floating-point overflow or one you throw), you fall through to a
catch which handles the exception:
class Overflow
{// whatever state you want to store about overflows
};
try
{ Overflow status;
  // code that causes an exception results in:
  throw status;
}
catch (Overflow &overflow_info)
{ // use overflow_info to handle the exception
}
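Here is the same pattern filled in enough to compile, as a sketch: the overflow is detected by an explicit test and thrown by our own code, and the function names checked_add and add_or_minus_one, as well as the operand stored in Overflow, are made up for this illustration.

```cpp
#include <cassert>
#include <climits>

// Carries some information about what went wrong.
class Overflow
{public:
  explicit Overflow (int new_operand) : operand (new_operand) { }
  int get_operand () const { return operand; }
private:
  int operand;
};

// Adds two non-negative ints, throwing instead of overflowing.
int checked_add (int a, int b)
{ assert (a >= 0 && b >= 0);
  if (a > INT_MAX - b)
    throw Overflow (a);
  return a + b;
}

// Returns the sum, or -1 if the addition would overflow.
int add_or_minus_one (int a, int b)
{ try
  { return checked_add (a, b);
  }
  catch (Overflow &overflow_info)
  { assert (overflow_info.get_operand () == a);
    return -1;
  }
}
```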
virtual base classes
With multiple inheritance, if the same base class appears more than once in the hierarchy,
it is duplicated. If you only want it to appear once, you declare it as a virtual base class.
For example:
class Error
{public:
  Error (const char *new_message);
  void print_err (const char *message = "none");
private:
  char *default_message;
};
class Error_bucket : public Bucket, public Error
{public:
  Error_bucket (const char *new_default = "Hole in bucket");
};
class Error_spade : public Spade, public Error
{public:
  Error_spade (const char *new_default = "Hole in bucket");
};
class Error_beach : public Error_spade, public Error_bucket
{ // etc.
};
will result in an object of class Error_beach having two places to store errors. If this is
not desired, the following will fix the problem:
class Error_bucket : public Bucket, virtual public Error // etc.
class Error_spade : public Spade, virtual public Error // etc.
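The difference can be checked by counting constructor calls. In this sketch the classes are stripped to the bare minimum (no Bucket or Spade, an Error with no message), and the names Plain_* and Shared_* and the counter errors_built are assumptions made for illustration.

```cpp
#include <cassert>

// Counts how many Error sub-objects have been constructed.
static int errors_built = 0;

class Error
{public:
  Error () { ++errors_built; }
};

// Without virtual: each path through the hierarchy gets its own Error.
class Plain_bucket : public Error { };
class Plain_spade : public Error { };
class Plain_beach : public Plain_spade, public Plain_bucket { };

// With virtual: the two paths share a single Error.
class Shared_bucket : virtual public Error { };
class Shared_spade : virtual public Error { };
class Shared_beach : public Shared_spade, public Shared_bucket { };

// Number of Error sub-objects built for one Plain_beach.
int errors_in_plain_beach ()
{ int before = errors_built;
  Plain_beach beach;
  assert (errors_built > before);
  return errors_built - before;
}

// Number of Error sub-objects built for one Shared_beach.
int errors_in_shared_beach ()
{ int before = errors_built;
  Shared_beach beach;
  assert (errors_built > before);
  return errors_built - before;
}
```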
future feature: name spaces
A big problem with mixing class libraries from various sources is that natural choices of
names tend to be duplicated.
For example, it's very common to have class hierarchies descended from a common
ancestor with a name like Object, or T_object. Also, conventions for making symbolic
names for boolean values are not standardized. Most use the C convention:
#ifndef TRUE
# define TRUE 1
# define FALSE 0
#endif
or something along those lines, but some define a boolean enum, and libraries that
do this may be hard to mix with others that use a slightly different strategy.
A proposal which is likely to be added to the language is a way of giving a name to a
collection of names: a name space. Other languages like Ada and Modula2 have module
or package mechanisms which are slightly more robust than C++'s naming conventions,
but the problem of name management in large programs exists even with these languages.
Look out for name spaces in future C++ compilers.
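For readers with a compiler that already supports the feature, this is roughly what it looks like in the form that was eventually standardized, using the namespace keyword. The two library names are made up; the point is only that both can define a class called Object without clashing.

```cpp
#include <cassert>
#include <cstring>

namespace graphics_lib   // made-up library name
{ class Object
  {public:
    const char* describe () const { return "graphics object"; }
  };
}

namespace database_lib   // made-up library name
{ class Object
  {public:
    const char* describe () const { return "database object"; }
  };
}

// Both Object classes can be used side by side with qualified names.
bool names_do_not_clash ()
{ graphics_lib::Object window;
  database_lib::Object record;
  assert (window.describe () != 0 && record.describe () != 0);
  return std::strcmp (window.describe (), record.describe ()) != 0;
}
```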
libraries vs. frameworks
Reusability is one of the selling points of object-oriented programming.
Libraries are a traditional way of making code reusable. A library is a collection of
type, class and procedure definitions, designed for greater generality than code written for
a special purpose. Examples of libraries include FORTRAN floating-point libraries like
IMSL, the Smalltalk-80 class library, and linkable libraries typically distributed with
compilers to handle routine tasks like I/O.
Some advocate going a step further, and pre-writing a large part of an application,
trying to keep the code as general as possible. Functionality like updating windows and
printing is supplied in very general form, and you fill in the details to make a real
application.
This application framework approach has advantages and disadvantages. The biggest
drawback is you have to understand the programming style of the framework designer.
This can be a major task. Some have claimed it takes about 3 months to feel at home with
MacApp, for example (one of the earlier frameworks, for writing Macintosh applications).
On the other hand once you've understood the framework, you don't have to worry about
many details that don't change across most applications.
My view is that a compromise is the best strategy. A good library that you can use
whatever the style of program can be designed around a relatively simple application
framework. This framework should be designed so it can be learnt quickly, and only
implements very common functionality, or features which are tricky to get right. When
you start to use it, you will tend to mostly use it as a library, gradually graduating to using
it more like a framework, particularly once you start enhancing the framework with your
own tricks.