Unit 3
CPF
Interpreter (computing)
In computer science, an interpreter is a computer program that directly executes, i.e. performs,
instructions written in a programming or scripting language, without previously compiling them into
a machine language program. An interpreter generally uses one of the following strategies for
program execution:
1. parse the source code and perform its behavior directly;
2. translate source code into some efficient intermediate representation and immediately execute that;
3. explicitly execute stored precompiled code made by a compiler which is part of the interpreter system.
Early versions of the Lisp programming language and Dartmouth BASIC would be examples of the
first type. Perl, Python, MATLAB, and Ruby are examples of the second, while UCSD Pascal is an
example of the third type. Source programs are compiled ahead of time and stored as machine-independent
code, which is then linked at run-time and executed by an interpreter and/or compiler
(for JIT systems). Some systems, such as Smalltalk, contemporary versions of BASIC, Java, and
others, may also combine strategies two and three.
While interpretation and compilation are the two main means by which programming languages are
implemented, they are not mutually exclusive, as most interpreting systems also perform some
translation work, just like compilers. The terms "interpreted language" or "compiled language" signify
that the canonical implementation of that language is an interpreter or a compiler, respectively.
A high level language is ideally an abstraction independent of particular implementations.
History
The first interpreted high-level language was Lisp. Lisp was first implemented in 1958 by Steve
Russell on an IBM 704 computer. Russell had read John McCarthy's paper, and realized (to
McCarthy's surprise) that the Lisp eval function could be implemented in machine code.[2] The result
was a working Lisp interpreter which could be used to run Lisp programs, or more properly,
"evaluate Lisp expressions".
[Figure: An illustration of the linking process. Object files and static libraries are assembled into a new library or executable.]
Programs written in a high level language are either directly executed by some kind of interpreter or
converted into machine code by a compiler (and assembler and linker) for the CPU to execute.
While compilers (and assemblers) generally produce machine code directly executable by computer
hardware, they can often (optionally) produce an intermediate form called object code. This is
basically the same machine-specific code, but augmented with a symbol table with names and tags
to make executable blocks (or modules) identifiable and relocatable. Compiled programs will
typically use building blocks (functions) kept in a library of such object code modules. A linker is
used to combine (pre-made) library files with the object file(s) of the application to form a single
executable file. The object files that are used to generate an executable file are thus often produced
at different times, and sometimes even by different languages (capable of generating the same
object format).
A simple interpreter written in a low-level language (e.g. assembly) may have similar machine code
blocks implementing functions of the high-level language stored, and executed when a function's
entry in a lookup table points to that code. However, an interpreter written in a high-level language
typically uses another approach, such as generating and then walking a parse tree, or generating
and executing intermediate software-defined instructions, or both.
Development cycle
During the software development cycle, programmers make frequent changes to source code. When
using a compiler, each time a change is made to the source code, they must wait for the compiler to
translate the altered source files and link all of the binary code files together before the program can
be executed. The larger the program, the longer the wait. By contrast, a programmer using an
interpreter does a lot less waiting, as the interpreter usually just needs to translate the code being
worked on to an intermediate representation (or not translate it at all), thus requiring much less time
before the changes can be tested. Effects are evident upon saving the source code and reloading
the program. Compiled code is generally less readily debugged as editing, compiling, and linking are
sequential processes that have to be conducted in the proper sequence with a proper set of
commands. For this reason, many compilers also have an executive aid, known as a Make file and
program. The Make file lists compiler and linker command lines and program source code files, but
might take a simple command line menu input (e.g. "Make 3") which selects the third group (set) of
instructions then issues the commands to the compiler, and linker feeding the specified source code
files.
Distribution
A compiler converts source code into binary instructions for a specific processor's architecture, thus
making it less portable. This conversion is made just once, on the developer's environment, and after
that the same binary can be distributed to the user's machines where it can be executed without
further translation. A cross compiler can generate binary code for the user machine even if it has a
different processor than the machine where the code is compiled.
An interpreted program can be distributed as source code. It needs to be translated in each final
machine, which takes more time but makes the program distribution independent of the machine's
architecture. However, the portability of interpreted source code is dependent on the target machine
actually having a suitable interpreter. If the interpreter needs to be supplied along with the source,
the overall installation process is more complex than delivery of a monolithic executable since the
interpreter itself is part of what needs to be installed.
The fact that interpreted code can easily be read and copied by humans can be of concern from the
point of view of copyright. However, various systems of encryption and obfuscation exist. Delivery of
intermediate code, such as bytecode, has a similar effect to obfuscation, but bytecode could be
decoded with a decompiler or disassembler.[citation needed]
Efficiency
The main disadvantage of interpreters is that an interpreted program typically runs slower than if it
had been compiled. The difference in speeds could be tiny or great; often an order of magnitude and
sometimes more. It generally takes longer to run a program under an interpreter than to run the
compiled code but it can take less time to interpret it than the total time required to compile and run
it. This is especially important when prototyping and testing code when an edit-interpret-debug cycle
can often be much shorter than an edit-compile-run-debug cycle.[citation needed]
Interpreting code is slower than running the compiled code because the interpreter must analyze
each statement in the program each time it is executed and then perform the desired action,
whereas the compiled code just performs the action within a fixed context determined by the
compilation. This run-time analysis is known as "interpretive overhead". Access to variables is also
slower in an interpreter because the mapping of identifiers to storage locations must be done
repeatedly at run-time rather than at compile time.[citation needed]
There are various compromises between the development speed when using an interpreter and the
execution speed when using a compiler. Some systems (such as some Lisps) allow interpreted and
compiled code to call each other and to share variables. This means that once a routine has been
tested and debugged under the interpreter it can be compiled and thus benefit from faster execution
while other routines are being developed.[citation needed] Many interpreters do not execute the source code
as it stands but convert it into some more compact internal form. Many BASIC interpreters
replace keywords with single-byte tokens which can be used to find the instruction in a jump table. A
few interpreters, such as the PBASIC interpreter, achieve even higher levels of program compaction
by using a bit-oriented rather than a byte-oriented program memory structure, where command
tokens occupy perhaps 5 bits, nominally "16-bit" constants are stored in a variable-length
code requiring 3, 6, 10, or 18 bits, and address operands include a "bit offset". Many BASIC
interpreters can store and read back their own tokenized internal representation. An interpreter might
well use the same lexical analyzer and parser as the compiler and then interpret the
resulting abstract syntax tree; a toy interpreter for such syntax trees is sketched below.
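A minimal Python sketch of such a tree-walking interpreter follows (the box in the original article used C data type definitions; the Num and BinOp classes and the evaluate function here are invented for the illustration):

class Num:
    def __init__(self, value):
        self.value = value

class BinOp:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

def evaluate(node):
    # Walk the syntax tree recursively and compute its value.
    if isinstance(node, Num):
        return node.value
    if isinstance(node, BinOp):
        left = evaluate(node.left)
        right = evaluate(node.right)
        if node.op == '+':
            return left + right
        if node.op == '-':
            return left - right
        if node.op == '*':
            return left * right
        if node.op == '/':
            return left / right
    raise TypeError("unknown node type")

# The tree for (1 + 2) * 3 evaluates to 9.
tree = BinOp('*', BinOp('+', Num(1), Num(2)), Num(3))
print(evaluate(tree))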
Regress
Interpretation cannot be used as the sole method of execution: even though an interpreter can itself
be interpreted and so on, a directly executed program is needed somewhere at the bottom of the
stack because the code being interpreted is not, by definition, the same as the machine code that
the CPU can execute.[4][5]
Machine code (machine language)
Numerical machine code (i.e., not assembly code) may be regarded as the lowest-level
representation of a compiled or assembled computer program or as a primitive and hardware-
dependent programming language. While it is possible to write programs directly in numerical
machine code, it is tedious and error prone to manage individual bits and calculate numerical
addresses and constants manually. It is thus rarely done today, except for situations that require
extreme optimization or debugging.
Almost all practical programs today are written in higher-level languages or assembly language and
translated to executable machine code by utilities such as compilers, assemblers, and linkers.
Programs in interpreted languages[1] are not translated into machine code although
their interpreter (which may be seen as an executor or processor) typically consists of directly
executable machine code (generated from assembly or high level language source code).
Every processor or processor family has its own machine code instruction set. Instructions are
patterns of bits that by physical design correspond to different commands to the machine. Thus, the
instruction set is specific to a class of processors using (mostly) the same architecture. Successor or
derivative processor designs often include all the instructions of a predecessor and may add
additional instructions. Occasionally, a successor design will discontinue or alter the meaning of
some instruction code (typically because it is needed for new purposes), affecting code compatibility
to some extent; even nearly completely compatible processors may show slightly different behavior
for some instructions, but this is rarely a problem. Systems may also differ in other details, such as
memory arrangement, operating systems, or peripheral devices. Because a program normally relies
on such factors, different systems will typically not run the same machine code, even when the same
type of processor is used.
A machine code instruction set may have all instructions of the same length, or it may have variable-
length instructions. How the patterns are organized varies strongly with the particular architecture
and often also with the type of instruction. Most instructions have one or more opcode fields that
specify the basic instruction type (such as arithmetic, logical, jump, etc.) and the actual operation
(such as add or compare), and other fields that may give the type of the operand(s), the addressing
mode(s), the addressing offset(s) or index, or the actual value itself (such constant operands
contained in an instruction are called immediates).[2]
Not all machines or individual instructions have explicit operands. An accumulator machine has a
combined left operand and result in an implicit accumulator for most arithmetic instructions. Other
architectures (such as 8086 and the x86-family) have accumulator versions of common instructions,
with the accumulator regarded as one of the general registers by longer instructions. A stack
machine has most or all of its operands on an implicit stack. Special purpose instructions also often
lack explicit operands (CPUID in the x86 architecture writes values into four implicit destination
registers, for instance). This distinction between explicit and implicit operands is important in
machine code generators, especially in the register allocation and live range tracking parts. A good
code optimizer can track implicit as well as explicit operands, which may allow more
frequent constant propagation, constant folding of registers (a register assigned the result of a
constant expression is freed up by replacing it with that constant), and other code enhancements.
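As a small illustration of constant folding, CPython performs it on constant expressions at compile time, which can be observed with the standard dis module:

import dis

def seconds_per_day():
    return 24 * 60 * 60   # a constant expression

# The disassembly shows a single LOAD_CONST 86400: the two
# multiplications were folded away before the code ever runs.
dis.dis(seconds_per_day)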
Programs
A computer program is a sequence of instructions that are executed by a CPU. While simple
processors execute instructions one after another, superscalar processors are capable of executing
several instructions at once.
Program flow may be influenced by special 'jump' instructions that transfer execution to an
instruction other than the numerically following one. Conditional jumps are taken (execution
continues at another address) or not (execution continues at the next instruction) depending on
some condition.
Assembly languages
A much more readable rendition of machine language, called assembly language, uses mnemonic
codes to refer to machine code instructions, rather than using the instructions' numeric values
directly. For example, on the Zilog Z80 processor, the machine code 00000101 , which causes the
CPU to decrement the B processor register, would be represented in assembly language as DEC B .
Example
The MIPS architecture provides a specific example for a machine code whose instructions are
always 32 bits long. The general type of instruction is given by the op (operation) field, the highest 6
bits. J-type (jump) and I-type (immediate) instructions are fully specified by op. R-type (register)
instructions include an additional field funct to determine the exact operation. The fields used in
these types are:
6 5 5 5 5 6 bits
[ op | rs | rt | rd |shamt| funct] R-type
[ op | rs | rt | address/immediate] I-type
[ op | target address ] J-type
rs, rt, and rd indicate register operands; shamt gives a shift amount; and
the address or immediate fields contain an operand directly.
For example, adding the registers 1 and 2 and placing the result in register 6 is encoded:
[ op | rs | rt | rd |shamt| funct]
0 1 2 6 0 32 decimal
000000 00001 00010 00110 00000 100000 binary
Load a value into register 8, taken from the memory cell 68 cells after the location listed in register 3:
[ op | rs | rt | address/immediate]
35 3 8 68 decimal
100011 00011 01000 00000 00001 000100 binary
Jumping to the address 1024:
[ op | target address ]
2 1024 decimal
000010 00000 00000 00000 10000 000000 binary
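To make the field layout concrete, here is a small Python sketch that packs the R-type add example above into a 32-bit word (the encode_r_type function is invented for this illustration):

def encode_r_type(op, rs, rt, rd, shamt, funct):
    # Pack the six R-type fields (6+5+5+5+5+6 = 32 bits) into one word.
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $6, $1, $2  ->  op=0, rs=1, rt=2, rd=6, shamt=0, funct=32
word = encode_r_type(0, 1, 2, 6, 0, 32)
print(f"{word:032b}")   # 00000000001000100011000000100000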
Relationship to microcode
Using a microcode layer to implement an emulator enables the computer to present the architecture
of an entirely different computer. The System/360 line used this to allow porting programs from
earlier IBM machines to the new family of computers, e.g. an IBM 1401/1440/1460 emulator on the
IBM S/360 model 40.
Relationship to bytecode
Machine code should not be confused with so-called "bytecode" (or the older term p-code), which is
either executed by an interpreter or itself compiled into machine code for faster (direct) execution.
Machine code and assembly code are sometimes called native code when referring to platform-
dependent parts of language features or libraries.[3]
Storing in memory
The Harvard architecture is a computer architecture with physically separate storage and signal
pathways for the code (instructions) and data. Today, most processors implement such separate
signal pathways for performance reasons but actually implement a Modified Harvard
architecture,[citation needed] so they can support tasks like loading an executable program from
disk storage as data and then executing it. Harvard architecture is contrasted to the Von Neumann
architecture, where data and code are stored in the same memory which is read by the processor
allowing the computer to execute commands.
From the point of view of a process, the code space is the part of its address space where the code
in execution is stored. In multitasking systems this comprises the program's code segment and
usually shared libraries. In a multi-threading environment, different threads of one process share code
space along with data space, which reduces the overhead of context switching considerably as
compared to process switching.
Readability by humans
It has been said that machine code is so unreadable that the United States Copyright Office cannot
identify whether a particular encoded program is an original work of authorship; [4] however, the US
Copyright Office does allow for copyright registration of computer programs.[5] Douglas
Hofstadter compares machine code with the genetic code: "Looking at a program written in machine
language is vaguely comparable to looking at a DNA molecule atom by atom."[6]
Program design
The activity of progressing from a specification of some required program to a
description of the program itself. Most phase models of the software life cycle recognize
program design as one of the phases. The input to this phase is a specification of what the
program is required to do. During the phase the design decisions are made as to how the program
will meet these requirements, and the output of the phase is a description of the program in some
form that provides a suitable basis for subsequent implementation.
Frequently the design phase is divided into two subphases, one of coarse architectural design
and one of detailed design. The architectural design produces a description of the program at a
gross level; it is normally given in terms of the major components of the program and their
interrelationships, the main algorithms that these components employ, and the major data
structures. The detailed design then refines the architectural design to the stage where actual
implementation can begin. See also program design language.
Debugging
Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a
computer program or a piece of electronic hardware, thus making it behave as expected.
Debugging tends to be harder when various subsystems are tightly coupled, as changes in one
may cause bugs to emerge in another.
Numerous books have been written about debugging (see below: Further reading), as it involves
numerous aspects, including interactive debugging, control flow, integration testing, log files,
monitoring (application, system), memory dumps, profiling, Statistical Process Control, and
special design tactics to improve detection while simplifying changes.
Origin
[Figure: A computer log entry from the Mark II, with a moth taped to the page.]
There is some controversy over the origin of the term "debugging". The terms "bug" and
"debugging" are both popularly attributed to Admiral Grace Hopper in the 1940s.[1] While she
was working on a Mark II Computer at Harvard University, her associates discovered a moth
stuck in a relay and thereby impeding operation, whereupon she remarked that they were
"debugging" the system. However the term "bug" in the meaning of technical error dates back at
least to 1878 and Thomas Edison (see software bug for a full discussion), and "debugging"
seems to have been used as a term in aeronautics before entering the world of computers. Indeed,
in an interview Grace Hopper remarked that she was not coining the term[citation needed]. The moth fit
the already existing terminology, so it was saved. J. Robert Oppenheimer (director of the WWII
atomic bomb "Manhattan" project at Los Alamos, NM) used the term in a letter to Dr. Ernest
Lawrence at UC Berkeley, dated October 27, 1944,[2] regarding the recruitment of additional
technical staff.
The Oxford English Dictionary entry for "debug" quotes the term "debugging" used in reference
to airplane engine testing in a 1945 article in the Journal of the Royal Aeronautical Society. An
article in "Airforce" (June 1945 p. 50) also refers to debugging, this time of aircraft cameras.
Hopper's bug was found on September 9, 1947. The term was not adopted by computer
programmers until the early 1950s. The seminal article by Gill[3] in 1951 is the earliest in-depth
discussion of programming errors, but it does not use the term "bug" or "debugging". In the
ACM's digital library, the term "debugging" is first used in three papers from 1952 ACM
National Meetings.[4][5][6] Two of the three use the term in quotation marks. By 1963, "debugging"
was a common enough term to be mentioned in passing without explanation on page 1 of the
CTSS manual.[7]
Kidwell's article Stalking the Elusive Computer Bug[8] discusses the etymology of "bug" and
"debug" in greater detail.
Scope
As software and electronic systems have become generally more complex, the various common
debugging techniques have expanded with more methods to detect anomalies, assess impact, and
schedule software patches or full updates to a system. The words "anomaly" and "discrepancy"
can be used, as being more neutral terms, to avoid the words "error" and "defect" or "bug" where
there might be an implication that all so-called errors, defects or bugs must be fixed (at all costs).
Instead, an impact assessment can be made to determine if changes to remove an anomaly (or
discrepancy) would be cost-effective for the system, or perhaps a scheduled new release might
render the change(s) unnecessary. Not all issues are life-critical or mission-critical in a system.
Also, it is important to avoid the situation where a change might be more upsetting to users,
long-term, than living with the known problem(s) (where the "cure would be worse than the
disease"). Basing decisions of the acceptability of some anomalies can avoid a culture of a "zero-
defects" mandate, where people might be tempted to deny the existence of problems so that the
result would appear as zero defects. Considering the collateral issues, such as the cost-versus-
benefit impact assessment, then broader debugging techniques will expand to determine the
frequency of anomalies (how often the same "bugs" occur) to help assess their impact to the
overall system.
Tools
[Figure: Debugging on video game consoles is usually done with special hardware, such as an Xbox debug unit intended only for developers.]
Generally, high-level programming languages, such as Java, make debugging easier, because
they have features such as exception handling that make real sources of erratic behaviour easier
to spot. In programming languages such as C or assembly, bugs may cause silent problems such
as memory corruption, and it is often difficult to see where the initial problem happened. In those
cases, memory debugger tools may be needed.
In certain situations, general purpose software tools that are language specific in nature can be
very useful. These take the form of static code analysis tools. These tools look for a very specific
set of known problems, some common and some rare, within the source code. Issues detected by
these tools would rarely be picked up by a compiler or interpreter; they are thus not syntax
checkers but semantic checkers. Some tools claim to be able to detect 300+ unique
problems. Both commercial and free tools exist in various languages. These tools can be
extremely useful when checking very large source trees, where it is impractical to do code
walkthroughs. A typical example of a problem detected would be a variable dereference that
occurs before the variable is assigned a value. Another example would be to perform strong type
checking when the language does not require it. Thus, these tools are better at locating likely
errors than actual errors, and as a result they have a reputation for false positives. The old Unix
lint program is an early example.
For debugging electronic hardware (e.g., computer hardware) as well as low-level software (e.g.,
BIOSes, device drivers) and firmware, instruments such as oscilloscopes, logic analyzers or in-
circuit emulators (ICEs) are often used, alone or in combination. An ICE may perform many of
the typical software debugger's tasks on low-level software and firmware.
Typical debugging process
Normally the first step in debugging is to attempt to reproduce the problem. This can be a non-
trivial task, for example as with parallel processes or some unusual software bugs. Also, specific
user environment and usage history can make it difficult to reproduce the problem.
After the bug is reproduced, the input of the program may need to be simplified to make it easier
to debug. For example, a bug in a compiler can make it crash when parsing some large source
file. However, after simplification of the test case, only a few lines from the original source file
may be sufficient to reproduce the same crash. Such simplification can be made manually, using a
divide-and-conquer approach: the programmer tries to remove some parts of the original test
case and checks if the problem still exists. When debugging the problem in a GUI, the
programmer can try to skip some user interaction from the original problem description and
check if remaining actions are sufficient for bugs to appear.
After the test case is sufficiently simplified, a programmer can use a debugger tool to examine
program states (values of variables, plus the call stack) and track down the origin of the
problem(s). Alternatively, tracing can be used. In simple cases, tracing is just a few print
statements, which output the values of variables at certain points of program execution.
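As a sketch of the divide-and-conquer simplification described above, the following Python function repeatedly tries to discard half of a failing input; the fails predicate, which would rerun the program on a candidate input and report whether the bug still appears, is hypothetical:

def simplify(lines, fails):
    # Keep whichever half of the input still reproduces the failure;
    # stop when neither half alone does. (A real delta-debugging tool
    # would go on to try finer-grained subsets.)
    changed = True
    while changed and len(lines) > 1:
        changed = False
        half = len(lines) // 2
        for candidate in (lines[:half], lines[half:]):
            if fails(candidate):
                lines = candidate
                changed = True
                break
    return lines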
Techniques
Print debugging (or tracing) is the act of watching (live or recorded) trace statements, or print
statements, that indicate the flow of execution of a process. This is sometimes called printf
debugging, due to the use of the printf function in C. This kind of debugging was turned on by
the command TRON in the original versions of the novice-oriented BASIC programming
language. TRON stood for "Trace On"; it caused the line number of each BASIC command line to
print as the program ran.
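For example, print debugging in Python might look like the sketch below; the function and the values traced are invented for the illustration:

def average(numbers):
    total = 0
    for n in numbers:
        total += n
        print(f"TRACE: n={n}, running total={total}")   # trace statement
    print(f"TRACE: returning {total / len(numbers)}")
    return total / len(numbers)

average([2, 4, 6])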
Remote debugging is the process of debugging a program running on a system different from
the debugger. To start remote debugging, a debugger connects to a remote system over a
network. The debugger can then control the execution of the program on the remote system
and retrieve information about its state.
Post-mortem debugging is debugging of the program after it has already crashed. Related
techniques often include various tracing techniques (for example, [9]) and/or analysis of the memory
dump (or core dump) of the crashed process. The dump of the process could be obtained
automatically by the system (for example, when process has terminated due to an unhandled
exception), or by a programmer-inserted instruction, or manually by the interactive user.
"Wolf fence" algorithm: Edward Gauss described this simple but very useful and now famous
algorithm in a 1982 article for Communications of the ACM as follows: "There's one wolf in
Alaska; how do you find it? First build a fence down the middle of the state, wait for the wolf to
howl, determine which side of the fence it is on. Repeat process on that side only, until you get
to the point where you can see the wolf."[10] This is implemented e.g. in the Git version control
system as the command git bisect, which uses the above algorithm to determine which commit
introduced a particular bug.
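A minimal Python sketch of the same idea applied to a revision history, in the spirit of git bisect, follows; the is_bad test, which would rebuild and run the program at a given revision, is hypothetical:

def find_first_bad(revisions, is_bad):
    # revisions is ordered oldest..newest; the newest is known bad.
    low, high = 0, len(revisions) - 1
    while low < high:
        mid = (low + high) // 2        # build the fence down the middle
        if is_bad(revisions[mid]):
            high = mid                 # the wolf is on the older side
        else:
            low = mid + 1              # the wolf is on the newer side
    return revisions[low]              # first revision that shows the bug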
Saff Squeeze – a technique of isolating failure within the test using progressive inlining of parts
of the failing test.[12]
Debugging for embedded systems
In embedded systems, debugging is typically used:
to identify and fix bugs in the system (e.g. logical or synchronization problems in the code, or a
design error in the hardware);
to collect information about the operating states of the system that may then be used to
analyze the system: to find ways to boost its performance or to optimize other important
characteristics (e.g. energy consumption, reliability, real-time response, etc.).
Anti-debugging
Anti-debugging is "the implementation of one or more techniques within computer code that
hinders attempts at reverse engineering or debugging a target process".[13] It is actively used by
recognized publishers in copy-protection schemas, but is also used by malware to complicate its
detection and elimination.[14] Techniques used in anti-debugging include:
API-based: check for the existence of a debugger using system information
Exception-based: check to see if exceptions are interfered with
Process and thread blocks: check whether process and thread blocks have been manipulated
Modified code: check for code modifications made by a debugger handling software breakpoints
Hardware- and register-based: check for hardware breakpoints and CPU registers
Timing and latency: check the time taken for the execution of instructions
Errors
Three kinds of errors can occur in a program:
Syntax errors
Runtime errors
Logical errors
Syntax errors
Python will find these kinds of errors when it tries to parse your program, and
exit with an error message without running anything. Syntax errors are
mistakes in the use of the Python language, and are analogous to spelling or
grammar mistakes in a language like English: for example, the sentence
Would you some tea? does not make sense – it is missing a verb.
Note: it is illegal for any block (like an if body, or the body of a function) to be left
completely empty. If you want a block to do nothing, you can use the pass statement inside the
block.
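For instance, a block can be stubbed out like this (mark is a hypothetical variable):

if mark >= 50:
    pass   # nothing to do yet, but pass keeps the block legal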
Python will do its best to tell you where the error is located, but sometimes
its messages can be misleading: for example, if you forget to escape a
quotation mark inside a string you may get a syntax error referring to a
place later in your code, even though that is not the real source of the
problem. If you can’t see anything wrong on the line specified in the error
message, try backtracking through the previous few lines. As you program
more, you will get better at identifying and fixing errors.
Here are some examples of syntax errors in Python:

myfunction(x, y):          # the def keyword is missing
    return x + y

else:                      # an else without a matching if
    print("Hello!")

if mark >= 50              # the colon at the end of the if line is missing
    print("You passed!")

if arriving:
    print("Hi!")
esle:                      # else is misspelled
    print("Bye!")

if flag:
print("Flag is set!")      # the body of the if block is not indented
Runtime errors
Consider the English instruction flap your arms and fly to Australia. While the
instruction is structurally correct and you can understand its meaning
perfectly, it is impossible for you to follow it. In the same way, a syntactically correct Python
program may still fail when it runs. Some examples of Python runtime errors:
division by zero
performing an operation on incompatible types
using an identifier which has not been defined
accessing a list element, dictionary value or object attribute which doesn’t exist
trying to access a file which doesn’t exist
Runtime errors often creep in if you don’t consider all possible values that a
variable could contain, especially when you are processing user input. You
should always try to add checks to your code to make sure that it can deal
with bad input and edge cases gracefully. We will look at this in more detail
in the chapter about exception handling.
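As a brief sketch of such a check, the safe_divide function below (an invented name) guards against division by zero instead of letting the program crash:

def safe_divide(a, b):
    if b == 0:
        return None   # guard against bad input instead of crashing
    return a / b

print(safe_divide(10, 2))   # 5.0
print(safe_divide(10, 0))   # None rather than a ZeroDivisionError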
Control structures
Following the structured program theorem, all programs are seen as composed of three control
structures:
Sequence: statements are executed one after another, in the order in which they appear.
Selection: one of a number of statements is executed depending on the state of the program (for
example, an if..else statement).
Iteration: a statement or block is executed repeatedly until the program reaches a certain state (for
example, a while or for loop).
[Figure: Graphical representations of the three basic patterns using NS diagrams (blue) and flow charts (green).]
Subroutines
Subroutines: callable units such as procedures, functions, methods, or subprograms are used to
allow a sequence to be referred to by a single statement.
Blocks
Blocks are used to enable groups of statements to be treated as if they were one statement. Block-structured
languages have a syntax for enclosing structures in some formal way, such as an if-statement
bracketed by if..fi as in ALGOL 68, a code section bracketed by BEGIN..END as in PL/I,
whitespace indentation as in Python, or the curly braces {...} of C and many later languages.
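As a brief Python illustration (the names are invented for the example), here is a subroutine whose indented blocks combine all three control structures:

def describe(numbers):               # subroutine: callable by name
    total = 0                        # sequence: statements run in order
    for n in numbers:                # iteration: repeat the block below
        total += n
    if total > 100:                  # selection: choose one branch
        return "large"
    else:
        return "small"

print(describe([10, 20, 30]))        # prints "small"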
History
Theoretical foundation
The structured program theorem provides the theoretical basis of structured programming. It
states that three ways of combining programs—sequencing, selection, and iteration—are
sufficient to express any computable function. This observation did not originate with the
structured programming movement; these structures are sufficient to describe the instruction
cycle of a central processing unit, as well as the operation of a Turing machine. Therefore a
processor is always executing a "structured program" in this sense, even if the instructions it
reads from memory are not part of a structured program. However, authors usually credit the
result to a 1966 paper by Böhm and Jacopini, possibly because Dijkstra cited this paper himself.[2]
The structured program theorem does not address how to write and analyze a usefully
structured program. These issues were addressed during the late 1960s and early 1970s, with
major contributions by Dijkstra, Robert W. Floyd, Tony Hoare, Ole-Johan Dahl, and David
Gries.
Top-down and bottom-up design
Top-down and bottom-up are both strategies of information processing and knowledge
ordering, used in a variety of fields including software, humanistic and scientific theories
(see systemics), and management and organization. In practice, they can be seen as a style
of thinking and teaching.
A top-down approach (also known as stepwise design and in some cases used as a synonym
of decomposition) is essentially the breaking down of a system to gain insight into its
compositional sub-systems. In a top-down approach an overview of the system is
formulated, specifying but not detailing any first-level subsystems. Each subsystem is then
refined in yet greater detail, sometimes in many additional subsystem levels, until the entire
specification is reduced to base elements. A top-down model is often specified with the
assistance of "black boxes", which make it easier to manipulate. However, black boxes may
fail to elucidate elementary mechanisms or be detailed enough to realistically validate the
model. A top-down approach starts with the big picture and breaks it down into smaller
segments.[1]
A bottom-up approach is the piecing together of systems to give rise to more complex
systems, thus making the original systems sub-systems of the emergent system. Bottom-up
processing is a type of information processing based on incoming data from the environment
to form a perception. From a Cognitive Psychology perspective, information enters the eyes
in one direction (sensory input, or the "bottom"), and is then turned into an image by the
brain that can be interpreted and recognized as a perception (output that is "built up"
from processing to final cognition). In a bottom-up approach the individual base elements
of the system are first specified in great detail. These elements are then linked together to
form larger subsystems, which then in turn are linked, sometimes in many levels, until a
complete top-level system is formed. This strategy often resembles a "seed" model,
whereby the beginnings are small but eventually grow in complexity and completeness.
However, "organic strategies" may result in a tangle of elements and subsystems,
developed in isolation and subject to local optimization as opposed to meeting a global
purpose.
Product design and development
During the design and development of new products, designers and engineers rely on both a
bottom-up and top-down approach. The bottom-up approach is utilized when off-the-shelf
or existing components are selected and integrated into the product. An example would include
selecting a particular fastener, such as a bolt, and designing the receiving components such that
the fastener will fit properly. In a top-down approach, a custom fastener would be designed such
that it would fit properly in the receiving components. For perspective, for a product with more
restrictive requirements (such as weight, geometry, safety, environment, etc.), such as a space-
suit, a more top-down approach is taken and almost everything is custom designed. However,
when it's more important to minimize cost and increase component availability, such as with
manufacturing equipment, a more bottom-up approach would be taken, and as many off-the-shelf
components (bolts, gears, bearings, etc.) would be selected as possible. In the latter case, the
receiving housings would be designed around the selected components.
Computer science
Software development
In the software development process, the top-down and bottom-up approaches play a key role.
Top-down design was promoted in the 1970s by IBM researchers Harlan Mills and Niklaus
Wirth. Mills developed structured programming concepts for practical use and tested them in a
1969 project to automate the New York Times morgue index. The engineering and management
success of this project led to the spread of the top-down approach through IBM and the rest of
the computer industry. Among other achievements, Niklaus Wirth, the developer of Pascal
programming language, wrote the influential paper Program Development by Stepwise
Refinement. Since Niklaus Wirth went on to develop languages such as Modula and Oberon
(where one could define a module before knowing about the entire program specification), one
can infer that top-down programming was not strictly what he promoted. Top-down methods
were favored in software engineering until the late 1980s,[3] and object-oriented programming
assisted in demonstrating the idea that both aspects of top-down and bottom-up programming
could be utilized.
Modern software design approaches usually combine both top-down and bottom-up approaches.
Although an understanding of the complete system is usually considered necessary for good
design, leading theoretically to a top-down approach, most software projects attempt to make use
of existing code to some degree. Pre-existing modules give designs a bottom-up flavor. Some
design approaches also use an approach where a partially functional system is designed and
coded to completion, and this system is then expanded to fulfill all the requirements for the
project.
Programming
Building blocks are an example of bottom-up design because the parts are first created and then
assembled without regard to how the parts will work in the assembly.
As described above, in a bottom-up approach the individual base elements of the system are first
specified in great detail and then linked together into progressively larger subsystems until a
complete top-level system is formed. Object-oriented programming (OOP) is a paradigm that uses
"objects" to design applications and computer programs. In mechanical engineering, with
software programs such as Pro/ENGINEER, SolidWorks, and Autodesk Inventor, users can
design products as pieces not part of the whole and later add those pieces together to form
assemblies, like building with LEGO. Engineers call this piece part design.
This bottom-up approach has one weakness: good intuition is necessary to decide the
functionality that is to be provided by the module. If a system is to be built from an existing
system, this approach is more suitable, as it starts from some existing modules.
Parsing
Parsing is the process of analyzing an input sequence (such as that read from a file or a
keyboard) in order to determine its grammatical structure. This method is used in the analysis of
both natural languages and computer languages, as in a compiler.
Bottom-up parsing is a strategy for analyzing unknown data relationships that attempts to
identify the most fundamental units first, and then to infer higher-order structures from them.
Top-down parsers, on the other hand, hypothesize general parse tree structures and then consider
whether the known fundamental structures are compatible with the hypothesis. See Top-down
parsing and Bottom-up parsing.
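As a minimal Python sketch of top-down (recursive descent) parsing, the functions below, whose names and grammar are invented for the illustration, parse and evaluate sums of single-digit integers such as "1+2+3":

def parse_expr(tokens, pos=0):
    # Grammar: expr -> NUM ('+' NUM)*
    value, pos = parse_num(tokens, pos)
    while pos < len(tokens) and tokens[pos] == '+':
        right, pos = parse_num(tokens, pos + 1)
        value += right
    return value, pos

def parse_num(tokens, pos):
    return int(tokens[pos]), pos + 1

tokens = list("1+2+3")
print(parse_expr(tokens)[0])   # prints 6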
Nanotechnology
Top-down and bottom-up are two approaches for the manufacture of products. These terms
were first applied to the field of nanotechnology by the Foresight Institute in 1989 in order to
distinguish between molecular manufacturing (to mass-produce large atomically precise objects)
and conventional manufacturing (which can mass-produce large objects that are not atomically
precise). Bottom-up approaches seek to have smaller (usually molecular) components built up
into more complex assemblies, while top-down approaches seek to create nanoscale devices by
using larger, externally controlled ones to direct their assembly.
The top-down approach often uses the traditional workshop or microfabrication methods where
externally controlled tools are used to cut, mill, and shape materials into the desired shape and
order. Micropatterning techniques, such as photolithography and inkjet printing, belong to this
category.
Bottom-up approaches, in contrast, use the chemical properties of single molecules to cause
single-molecule components to (a) self-organize or self-assemble into some useful conformation,
or (b) rely on positional assembly. These approaches utilize the concepts of molecular self-
assembly and/or molecular recognition. See also Supramolecular chemistry. Such bottom-up
approaches should, broadly speaking, be able to produce devices in parallel and much more cheaply
than top-down methods, but could potentially be overwhelmed as the size and complexity of the
desired assembly increases.