Data-flow Testing
Laurie Williams
North Carolina State University
[email protected]
Data-flow Testing
• Most failures involve*:
– Execution of an incorrect definition
• Incorrect assignment or input statement
• Definition is missing (use of a null definition)
• Predicate is faulty (an incorrect path is taken, leading to an incorrect
definition)
• At least half of the source code consists of data
declaration or definition statements
• Need to focus some testing effort on these as
well (not a focus of coverage-based testing)
– Explore the sequences of events related to the data state and the
unreasonable things that can happen to data.
– Explore the effect of using the value produced by each and every
computation
Weyuker, E., More Experiences with Data Flow Testing, TSE Vol. 19, Sept. 1993.
Data flow graph annotations ("link weights")
• d – defined, created, initialized (data declaration; left-hand side of a
computation)
• k – killed, undefined, released
• u – used for something (u is either c or p)
– c – used on the right-hand side of a computation, or as a pointer
(a computational use)
– p – used in a predicate, or as the control variable of a loop
The control flow graph's structure is the same for every variable; only the
link weights change for each variable.
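A minimal Java sketch (hypothetical example and names) annotating each statement of one method with the link weights above, mostly from the point of view of the variable total:

// Hypothetical example: link weights for the variable "total"
public class LinkWeightExample {
    static int sumPositive(int[] values) {
        int total = 0;                            // d: total defined and initialized
        for (int i = 0; i < values.length; i++) { // p: i and values used in the loop predicate
            if (values[i] > 0) {                  // p: values[i] used in a predicate
                total = total + values[i];        // c: total used on the right-hand side, then d: total redefined
            }
        }
        return total;                             // c: total used in a computation (returned)
    }                                             // k: total released when the method exits
}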
Control flow statement classification I
• v = expression
– c-use of all variables in expression
– definition of v
• read(v1, v2, . . . vn)
– definitions of v1 . . . vn
• write(v1, v2, . . . vn)
– c-uses of v1 . . . vn
• method call: P(c1, . . ., cn)
– definition of each formal parameter
• while: while B do S
– p-use of each variable in boolean expression (B)
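As an illustration, here is a small hypothetical Java method with each statement labeled using the classifications above:

import java.util.Scanner;

// Hypothetical example: statement classification I
public class ClassificationExampleOne {
    static void process(int limit) {     // method call P(limit): the formal parameter limit is defined
        Scanner in = new Scanner(System.in);
        int x = in.nextInt();            // read(x): definition of x
        int y = x * 2 + limit;           // c-use of x and limit; definition of y
        while (y > 0) {                  // while: p-use of y in the boolean expression B
            y = y - 1;                   // c-use of y; definition of y
        }
        System.out.println(y);           // write(y): c-use of y
    }
}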
Control flow statement classification II
• for statement: for (v=e1 to e2)
– c-use of each variable in e1 . . . e2
– definition of v
– p-use of v
• if-then-else: if B then S1; if B then S1 else S2
– p-use of each variable in boolean expression (B)
– S1 and S2 classified depending upon their composition
• case: case e1
S1:
S2:
– p-use of each variable in expression e1
– S1 and S2 classified depending upon their composition
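A companion sketch (again a hypothetical example) for the for, if-then-else, and case classifications:

// Hypothetical example: statement classification II
public class ClassificationExampleTwo {
    static int demo(int e1, int e2, int b) {
        int sum = 0;                     // definition of sum
        for (int v = e1; v <= e2; v++) { // for: c-use of e1 and e2; definition of v; p-use of v
            sum = sum + v;               // c-use of sum and v; definition of sum
        }
        if (b > 0) {                     // if-then-else: p-use of b
            sum = sum + 1;               // S1: classified by its own composition
        } else {
            sum = sum - 1;               // S2: classified by its own composition
        }
        switch (e1) {                    // case: p-use of e1
            case 0:  sum = 0;  break;    // S1
            default: sum = -1; break;    // S2
        }
        return sum;
    }
}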
Time-sequence pairs of d, k, and u
• dd – why define twice without use?
• dk -- why define and kill without use?
• ku – a bug; killed then used
• kk – harmless but probably a bug
• du – normal (a “du pair”)
• kd – normal (kill then redefine)
• ud – usually normal, reassignment after use
• uk – normal
• uu – normal
• Often these anomalies are caught by compilers or static
analyzers today – depending upon the language
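The sketch below (a hypothetical example) shows three of the suspicious pairs; as noted above, a modern Java compiler or static analyzer would warn about most of them.

import java.util.ArrayList;
import java.util.List;

// Hypothetical example: suspicious time-sequence pairs
public class AnomalyPairs {
    static void examples() {
        int x = 1;                       // d
        x = 2;                           // d again with no intervening use: dd (why define twice?)

        String s = "name";               // d
        s = null;                        // killed (released) without any use: dk (suspicious)

        List<Integer> list = new ArrayList<>();
        list = null;                     // k
        list.add(5);                     // u after kill: ku -- throws NullPointerException (a bug)
    }
}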
Data-flow Anomalies
• A dash (-) means that nothing of interest (d, k, or u) happens before
(leading dash) or after (trailing dash) that point on the path
• -k – variable not defined but killed
• -d – OK, first definition in path
• -u – used before defined
• k- – normal; last thing done is to kill variable
• d- – defined but never used
• These are often integration testing issues.
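A hypothetical Java sketch of the dash anomalies; note that Java rejects the -u case at compile time, and most IDEs flag the d- case as an unused variable.

// Hypothetical example: dash anomalies
public class DashAnomalies {
    static int dashExamples() {
        int y;
        // return y + 1;                 // -u: y used before being defined (a compile error in Java)
        y = 0;                           // -d: first definition of y on this path (normal)
        int unused = 42;                 // d-: defined but never used afterwards
        return y;                        // after this, y is released: k- (normal)
    }
}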
Data flow graph (all)
Data flow graph (x)
[Figure: the flow graph weighted for variable x; the nodes define and use x,
and the annotated sequences are -d (normal), du (normal), and dd (a concern).]
Data flow graph (y)
[Figure: the flow graph weighted for variable y; the annotated sequences are
-u (bad), ud (ok), du (normal), uk (normal), and dk (a probable error).]
Data flow graph (z)
[Figure: the flow graph weighted for variable z; the nodes kill, define, and
use z, and the annotated sequences are -k (problem), kk (probable problem),
ku (problem), kd (normal), du (normal), ud (normal), and uu (normal).]
Strategies - 1
• All-definitions (AD): Test cases are generated to cover each definition of
each variable for at least one use of the variable.
• All-predicate-uses (all p-uses, APU): Test cases are generated so that there
is at least one path from each variable definition to each p-use of the
variable.
• All-computational-uses (all c-uses, ACU): Test cases are generated so that
there is at least one path from each variable definition to each c-use of the
variable.
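To make the three strategies concrete, consider this hypothetical method; the variable max has one definition, one p-use, and one c-use.

// Hypothetical example: definition, p-use, and c-use of "max"
public class StrategyExample {
    static int clamp(int value, int limit) {
        int max = limit;          // definition of max (and a c-use of limit)
        if (value > max) {        // p-use of max
            return max;           // c-use of max
        }
        return value;             // c-use of value
    }
}

All-definitions is satisfied by any test that reaches the definition of max and some use of it, e.g. clamp(5, 3). All-p-uses requires a path from that definition to the p-use in the if-predicate, which any call covers. All-c-uses requires a path from the definition to the c-use in "return max", e.g. clamp(5, 3), which takes the true branch.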
Strategies - 1 (example)
[Figure: an example flow graph containing definitions, p-uses, c-uses, and
kills of x, y, and z, marking which definition-use paths all-definitions,
all-p-uses, and all-c-uses each require.]
Strategies - 2
All-p-uses/some-c-uses (APU+C): Test cases are generated so that there is at
least one path from each variable definition to each p-use of the variable. If
there are any variable definitions that are not covered, use c-uses.
[Figure: the same example flow graph, marking the paths required under APU and
the additional c-use paths added under APU+C.]
Strategies - 3
All-c-uses/some-p-uses (ACU+P): Test cases are generated so that there is at
least one path from each variable definition to each c-use of the variable. If
there are any variable definitions that are not covered, use p-uses.
[Figure: the same example flow graph, marking the paths required under ACU and
the additional p-use paths added under ACU+P.]
Strategies - 4
• All-uses (AU): Test cases are generated so that there is at least one path
from each variable definition to each p-use and each c-use of the definition.
• All-du-paths (ADUP): Test cases are generated which cause the traversal of
every simple subpath from each variable definition to every p-use and every
c-use of that definition.
– This is the strongest data-flow testing strategy.
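A hypothetical sketch of why ADUP is stronger than AU: the definition of code reaches the c-use in the return along two simple subpaths, one through the then-branch and one around it.

// Hypothetical example: one du-pair, two simple subpaths
public class DuPathExample {
    static int report(boolean verbose, int flag) {
        int code = flag;                 // definition of code
        if (verbose) {                   // p-use of verbose
            System.out.println("verbose report requested");
        }
        return code;                     // c-use of code
    }
}

All-uses is satisfied by a single test for the pair (definition of code, c-use in the return), e.g. report(true, 7). All-du-paths additionally requires report(false, 7), because every simple subpath from the definition to that use must be traversed.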
Relative strength of white box test strategies
A testing strategy X is stronger than some other strategy Y if all test cases
produced under Y are included in those produced under X.
[Figure: the subsumption hierarchy, from strongest to weakest: all paths; all
du-paths; all uses; all-c-uses/some-p-uses and all-p-uses/some-c-uses; all-p
uses and all-c uses; all definitions; branch; statement.]