07.search Tree Applications

The document discusses various data structures and algorithms for handling interval and range queries, focusing on interval trees, segment trees, and binary search trees. It details the construction, querying, and performance characteristics of these structures, including time complexities for building and querying. Additionally, it explores the challenges of multi-dimensional range queries and the use of the Inclusion-Exclusion Principle for counting and reporting points within specified ranges.

Uploaded by

Yue Chen

07- BST Application

A
Interval Tree

Your instinct, rather than precision stabbing, is more about just random bludgeoning.

Stabbing Query

• Given a set of intervals in general position on the x-axis:
  $S = \{ s_i = [x_i, x_i'] \mid 1 \le i \le n \}$
  and a query point $q_x$
• Find all intervals that contain $q_x$
• To solve this query, we will use the so-called interval tree ...

Data Structures & Algorithms, Tsinghua University 1


Median

[Figure: the intervals A to H and X on the x-axis, with the median endpoint marked]

• Let $P = \partial S$ be the set of all endpoints
  (by the general position assumption, $|P| = 2n$)
• Let $x_{mid} = \mathrm{median}(P)$ be the median of P


Partitioning

[Figure: the vertical line x = xmid(v) separates Sleft (L) from Sright (R); the intervals of Smid cross it]

• All intervals can then be categorized into 3 subsets:
  - $S_{left} = \{ s_i \mid x_i' < x_{mid} \}$
  - $S_{mid} = \{ s_i \mid x_i \le x_{mid} \le x_i' \}$
  - $S_{right} = \{ s_i \mid x_{mid} < x_i \}$
• Sleft and Sright are recursively partitioned until they are empty (leaves)


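Balance & O(logn) Depth

• Since xmid is the median of the 2n endpoints, at most n endpoints lie on either side of it; hence max{ |Sleft|, |Sright| } <= n/2, and the depth of the tree is O(logn)

[Figure: best-case and worst-case partitions, and the interval tree for the example, listed level by level; each node shows its Smid as two lists, [...> and <...]]

[D G B> <G D B]
[A F> <F A]   [H X E> <E H X]
[C> <C]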


Associative Lists

• Lleft/Lright = all intervals of Smid, sorted by their left/right endpoints

O(n) Size

• Each segment appears twice (once in each list)

O(nlogn) Construction Time

• Hint: avoid repeated sorting


queryIntervalTree( v, qx )

   if ( ! v ) return; //base
   if ( qx < xmid(v) )
      use L-list to report all intervals of Smid(v) containing qx
      queryIntervalTree( lc(v), qx )
   else if ( xmid(v) < qx )
      use R-list to report all intervals of Smid(v) containing qx
      queryIntervalTree( rc(v), qx )
   else //with a probability ≈ 0
      report all segments of Smid( v ) //both rc(v) & lc(v) can be ignored

[Figure: intervals A to E crossing xmid(v), with qx to the right of xmid(v); Sleft (L), Smid and Sright (R) are marked]


O(r + logn) Query Time

• Each query visits O(logn) nodes //LINEAR recursion
• At each visited node v, the L-list or R-list is scanned only up to the first interval that misses qx, which costs O(1 + r_v) time for the r_v intervals reported there
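The construction and query slides above can be sketched in Python as follows. This is only a minimal sketch, not the lecture's implementation: the names `Node`, `build` and `query` are hypothetical, intervals are assumed to be closed `(lo, hi)` tuples, and for brevity the endpoints are re-sorted at every node, so this construction takes O(n log^2 n) time rather than the O(nlogn) that presorting once would give.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # closed interval (lo, hi); an assumed representation


@dataclass
class Node:
    xmid: float
    l_list: List[Interval]      # Smid sorted by left endpoint, ascending
    r_list: List[Interval]      # Smid sorted by right endpoint, descending
    lc: Optional["Node"] = None
    rc: Optional["Node"] = None


def build(intervals: List[Interval]) -> Optional[Node]:
    if not intervals:           # empty subset: nothing to store
        return None
    endpoints = sorted(x for iv in intervals for x in iv)
    xmid = endpoints[(len(endpoints) - 1) // 2]      # (lower) median endpoint
    s_left = [iv for iv in intervals if iv[1] < xmid]
    s_right = [iv for iv in intervals if xmid < iv[0]]
    s_mid = [iv for iv in intervals if iv[0] <= xmid <= iv[1]]
    return Node(
        xmid,
        sorted(s_mid, key=lambda iv: iv[0]),
        sorted(s_mid, key=lambda iv: iv[1], reverse=True),
        build(s_left),
        build(s_right),
    )


def query(v: Optional[Node], qx: float, out: List[Interval]) -> None:
    if v is None:
        return
    if qx < v.xmid:
        for iv in v.l_list:     # every right endpoint here is >= xmid > qx
            if iv[0] > qx:
                break
            out.append(iv)
        query(v.lc, qx, out)
    elif v.xmid < qx:
        for iv in v.r_list:     # every left endpoint here is <= xmid < qx
            if iv[1] < qx:
                break
            out.append(iv)
        query(v.rc, qx, out)
    else:
        out.extend(v.l_list)    # qx == xmid: all of Smid, both children ignored


S = [(1, 5), (2, 3), (4, 8), (6, 9), (7, 10)]
hits: List[Interval] = []
query(build(S), 4.5, hits)
print(sorted(hits))             # [(1, 5), (4, 8)]
```

Each interval lands in the Smid of exactly one node, which is why the two sorted lists add up to O(n) space overall.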


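B
Segment Tree

Divide a line into two unequal segments, then divide each of the two segments again in the same ratio. Suppose that, of the two segments from the first division, one stands for the visible world and the other for the intelligible world. Now look at the two parts from the second division, which stand for degrees of clarity and obscurity: you will find that the first part of the visible segment is its images.
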
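Elementary Intervals

• Let $I = \{ [x_i, x_i'] \mid i = 1, 2, 3, ..., n \}$ be n intervals on the x-axis
• Sort all the endpoints into $\{ p_1, p_2, p_3, ..., p_m \}$, $m \le 2n$

[Figure: the sorted endpoints p1, p2, ..., p7 on the x-axis]

• m+1 elementary intervals are hence defined as:
  $(-\infty, p_1], (p_1, p_2], (p_2, p_3], ..., (p_{m-1}, p_m], (p_m, +\infty)$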


Discretization

• Within each EI, all stabbing queries share the same output
• If we sort all EI's into a vector and store the corresponding output with each EI, then ...
• once a query position is determined, //by an O(logn) time binary search
  the output can be returned directly //O(r)


Worst Case

• Every interval spans $\Omega(n)$ EI's, and a total space of $\Omega(n^2)$ is required
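The sorted-vector idea can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: intervals are taken as half-open `(lo, hi]` so that they align exactly with elementary intervals of the form `(p[k-1], p[k]]`, and the names `build_ei_table` and `stab` are hypothetical. The nested loop is what makes the table quadratic in the worst case.

```python
from bisect import bisect_left


def build_ei_table(intervals):
    """intervals: half-open (lo, hi] pairs with lo < hi (an assumption).

    Returns (p, outputs): p is the sorted endpoint vector, and outputs[k]
    lists the intervals spanning the k-th elementary interval (p[k-1], p[k]].
    """
    p = sorted({x for iv in intervals for x in iv})
    outputs = [[] for _ in range(len(p) + 1)]
    for lo, hi in intervals:
        # (lo, hi] spans the EIs numbered idx(lo)+1 .. idx(hi)
        for k in range(bisect_left(p, lo) + 1, bisect_left(p, hi) + 1):
            outputs[k].append((lo, hi))
    return p, outputs


def stab(p, outputs, qx):
    # binary search for the EI containing qx, then return its stored output
    return outputs[bisect_left(p, qx)]


p, outputs = build_ei_table([(1, 5), (2, 3), (4, 8)])
print(stab(p, outputs, 2.5))    # [(1, 5), (2, 3)]
```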


Sorted Vector --> BBST

• For each leaf v,
  - denote the corresponding elementary interval as R(v), //range of domination
  - denote the subset of intervals spanning R(v) as Int(v), and store Int(v) at v

1D Stabbing Query with BBST

• To find all intervals containing qx, we can
  - find the leaf v whose R(v) contains qx //O(logn) time for a BBST
  - and then report Int(v) //O(1 + r) time

$\Omega(n^2)$ Total Space in the Worst Case

Store each interval at O(logn) common ancestors by greedy merging

Canonical Subsets with O(nlogn) Space


BuildSegmentTree( I )

   Sort all endpoints in I before determining all the EI's //O(nlogn)
   Create T, a BBST on all the EI's //O(n)
   Determine R(v) for each node v //O(n) if done in a bottom-up manner
   For each s of I
      InsertSegment( T.root, s )

InsertSegment( v , s )

   if ( R(v) ⊆ s ) //greedy by top-down
      store s at v and return;
   if ( R( lc(v) ) ∩ s ≠ ∅ ) //recurse
      InsertSegment( lc(v), s );
   if ( R( rc(v) ) ∩ s ≠ ∅ ) //recurse
      InsertSegment( rc(v), s );

• At each level, at most 4 nodes are visited (2 stores + 2 recursions)
• Hence each insertion takes O(logn) time


Query( v , qx )

   report all the intervals in Int(v)
   if ( v is a leaf )
      return
   if ( qx ∈ R( lc(v) ) )
      Query( lc(v), qx )
   else //qx ∈ R( rc(v) )
      Query( rc(v), qx )

O(r + logn)

• Only one node is visited per level, altogether O(logn) nodes
• At each node v
  - the canonical subset Int(v) is reported
  - in time 1 + |Int(v)| = O(1 + r_v)
• Reporting all the intervals costs O(r) time

Conclusion

• For a set of n intervals,
  - a segment tree of size O(nlogn)
  - can be built in O(nlogn) time
  - which reports all intervals containing a query point in O(r + logn) time
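BuildSegmentTree, InsertSegment and Query can be put together in a compact Python sketch. It rests on assumptions that are mine rather than the slides': intervals are half-open `(lo, hi]` with `lo < hi`, the BBST is kept implicit (a node is the pair `(a, b)` of the first and last elementary-interval index it covers), and the class name `SegmentTree` and its fields are hypothetical.

```python
from bisect import bisect_left


class SegmentTree:
    def __init__(self, intervals):
        # sorted endpoints; EI number k is (p[k-1], p[k]], for k = 0 .. len(p)
        self.p = sorted({x for iv in intervals for x in iv})
        self.stored = {}                   # node (a, b) -> intervals stored there
        for lo, hi in intervals:
            assert lo < hi, "sketch assumes non-degenerate intervals"
            l = bisect_left(self.p, lo) + 1   # first EI inside (lo, hi]
            r = bisect_left(self.p, hi)       # last EI inside (lo, hi]
            self._insert(0, len(self.p), l, r, (lo, hi))

    def _insert(self, a, b, l, r, s):
        if l <= a and b <= r:              # R(v) is contained in s: store greedily
            self.stored.setdefault((a, b), []).append(s)
            return
        m = (a + b) // 2
        if l <= m:                         # R(lc(v)) intersects s
            self._insert(a, m, l, r, s)
        if m < r:                          # R(rc(v)) intersects s
            self._insert(m + 1, b, l, r, s)

    def query(self, qx):
        k = bisect_left(self.p, qx)        # index of the EI containing qx
        a, b, out = 0, len(self.p), []
        while True:
            out.extend(self.stored.get((a, b), ()))   # report Int(v)
            if a == b:                     # leaf reached
                return out
            m = (a + b) // 2
            if k <= m:
                b = m
            else:
                a = m + 1


tree = SegmentTree([(1, 5), (2, 3), (4, 8), (6, 9)])
print(sorted(tree.query(4.5)))             # [(1, 5), (4, 8)]
```

The query walks a single root-to-leaf path, so it touches O(logn) nodes and spends O(1 + r_v) at each.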


C1
Range Query: 1D

You are far too sensitive. This society needs everything, except sensitivity.

1D Range Query

• Let $P = \{ p_1, p_2, p_3, ..., p_n \}$ be a set of n points on the x-axis
• For any given interval $I = (x_1, x_2]$
  - COUNTING: how many points of P lie in the interval?
  - REPORTING: enumerate all points in $I \cap P$ (if not empty)
• [Online] P is fixed, while I is randomly and repeatedly given
• How to PREPROCESS P into a certain data structure s.t. the queries can be answered efficiently?

Brute-Force

• For each point p of P, test if $p \in (x_1, x_2]$
• Thus each query can be answered in LINEAR time
• Can we do it faster? It seems we can't, for ...
• in the worst case, the interval contains up to O(n) points, which need O(n) time to enumerate
• However, what if we ignore the time for enumerating and count only the searching time?


Binary Search

• Sort all points into a sorted vector and add an extra sentinel $p[0] = -\infty$
• For any interval $I = (x_1, x_2]$
  - find t = search(x2) = max{ i | p[i] <= x2 } //O(logn)
  - traverse the vector BACKWARD from p[t] and report each point //O(r)
    until escaping from I at point p[s]
  - return r = t - s //output size

Output-Sensitivity

• An enumerating query can be answered in $O(r + \log n)$ time
• p[s] can also be found by binary search in O(logn) time
• Hence for a COUNTING query, O(logn) time is enough //independent of r
• Can this simple strategy be extended to PLANAR range queries?
  To the best of my knowledge (TTBOMK), unfortunately, no!
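Both query types take only a few lines on top of Python's standard `bisect` module. This is a sketch: the function names are hypothetical, the interval is read as half-open `(x1, x2]`, and a bounds check replaces the sentinel `p[0]`.

```python
from bisect import bisect_right


def range_count(p, x1, x2):
    """p is sorted; count the points in (x1, x2] with two binary searches."""
    return bisect_right(p, x2) - bisect_right(p, x1)


def range_report(p, x1, x2):
    """Walk BACKWARD from the last point <= x2 until leaving (x1, x2]."""
    out = []
    i = bisect_right(p, x2) - 1     # index of max{ p[i] <= x2 }, or -1
    while i >= 0 and p[i] > x1:
        out.append(p[i])
        i -= 1
    return out


p = [1, 5, 8, 12, 16, 22, 26, 33, 39, 42]
print(range_count(p, 5, 22))        # 4
print(range_report(p, 5, 22))       # [22, 16, 12, 8]
```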


C2
Range Query: 2D

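Planar Range Query

• Let $P = \{ p_1, p_2, p_3, ..., p_n \}$ be a planar set
• Given $R = (x_1, x_2] \times (y_1, y_2]$
  - COUNTING: $| R \cap P | = ?$
  - REPORTING: $R \cap P = ?$
• Binary search doesn't help this kind of query
• You might consider extending the counting method using the Inclusion-Exclusion Principle

[Figure: points in the plane, with axes age (horizontal) and salary (vertical), and a rectangular query range]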


Preprocessing

• For each point (x, y), let n(x, y) be the number of points of P in $(-\infty, x] \times (-\infty, y]$
• This requires $O(n^2)$ time/space

Domination

• A point (u, v) is said to be DOMINATED by point (x, y) if $u \le x$ and $v \le y$

Inclusion-Exclusion Principle

• Then for any rectangular range $R = (x_1, x_2] \times (y_1, y_2]$, we have
  $| R \cap P | = n(x_2, y_2) - n(x_1, y_2) - n(x_2, y_1) + n(x_1, y_1)$

Performance

• Each query needs only $O(\log n)$ time
• Uses $O(n^2)$ storage, and even more for higher dimensions
• To find a better solution, let's go back to the 1D case ...


D1
Multi-Level Search Tree: 1D

He is somewhere on this very mountain, but the clouds are so deep that no one knows where.

Structure: A Complete (Balanced) BST

[Figure: a complete BBST with the keys stored at its leaves; levels listed top-down]

66
39 85
16 52 74 93
8 26 45 61 70 78 90 99
5 12 22 33 42 45 52 61 66 70 74 78 85 90 93 99
1 5 8 12 16 22 26 33 39 42

• search(x): returns the MAXIMUM key not greater than x

Lowest Common Ancestor

• Consider, as an example, the query for [17, 79] ...

[Figure: the same BBST with the query range [17, 79] marked; leaf 16 covers 16~21 and leaf 78 covers 78~84]

• Do search(17) = 16 (might be rejected) and search(79) = 78 (must be accepted)
• Consider LCA(16, 78) = 66 ...

Union of O(logn) Disjoint Subtrees

• Starting from the LCA, traverse path(16) and path(78) once more, respectively:
  - all R-turns along path(16) and all L-turns along path(78) are ignored, and
  - at each L-turn along path(16), the R subtree is reported; at each R-turn along path(78), the L subtree is reported

Complexity

• Query: $O(r + \log n)$
• Preprocessing: $O(n \log n)$
• Storage: $O(n)$

Hot Knives Through A Chocolate Cake of Height O(logn)

• Region(u) is enclosed by Region(v) iff u is a descendant of v in the 1d-tree
• Region(u) and Region(v) are disjoint iff neither is the ancestor of the other

[Figure: the nodes visited by a query, each marked as accepted, rejected or recursion]

• All nodes are partitioned into 3 types: accepted + rejected + recursion
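The accepted / rejected / recursion classification can be tried out on an implicit balanced tree over a sorted key array. This is a sketch under assumptions of my own: a node is the half-open index range `[a, b)` rather than an explicit BBST node, so the resulting subtrees need not coincide with those drawn on the slides, and the name `canonical_subsets` is hypothetical.

```python
def canonical_subsets(keys, lo, hi):
    """Return disjoint index ranges [a, b) whose keys all lie in [lo, hi].

    keys must be sorted. Every node is either rejected (disjoint from the
    query), accepted (contained in it) or recursed into.
    """
    out = []

    def visit(a, b):
        if a >= b or keys[b - 1] < lo or hi < keys[a]:
            return                          # rejected
        if lo <= keys[a] and keys[b - 1] <= hi:
            out.append((a, b))              # accepted: a whole subtree
            return
        m = (a + b) // 2                    # recursion
        visit(a, m)
        visit(m, b)

    visit(0, len(keys))
    return out


keys = [1, 5, 8, 12, 16, 22, 26, 33, 39, 42, 45,
        52, 61, 66, 70, 74, 78, 85, 90, 93, 99]
print(canonical_subsets(keys, 17, 79))
# [(5, 10), (10, 15), (15, 16), (16, 17)]
```

For the query [17, 79], the twelve keys from 22 to 78 are covered by only four subtrees.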
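
D2
Multi-Level Search Tree: 2D & kD

A few trees whose names I do not know have already shed their yellow leaves;
only two or three are left, trembling so pitifully on the branches.
They feel that autumn has come, and that they must take leave of this world.
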
2D Range Query = x-Query + y-Query

• An m-D orthogonal range query can be answered by the INTERSECTION of m 1D queries
• For example, a 2D range query can be divided into two 1D range queries:
  - find all points in [x1, x2]; and then
  - find among these candidates those lying in [y1, y2]

Worst Cases

• The x-query returns (almost) all points, whereas the y-query rejects (almost) all
• We spend $\Omega(n)$ time before getting r = 0 points

Painters' Strategy


2D Range Query = x-Query * y-Query

• Tree of trees
  - build a 1D BBST (called the x-tree) for the first range query (x-query);
  - for each node v in the x-tree, build a y-coordinate BBST (y-tree) containing the canonical subset associated with v
• It's an x-tree of (a number of) y-trees, called a Multi-Level Search Tree (MLST)
• How to answer range queries with such an MLST?


2D Range Query = x-Query * y-Queries

• Query Time ~ $O(\log n) \times O(\log n)$: each of the O(logn) canonical nodes found by the x-query triggers one O(logn) y-query


Query Algorithm

1. Determine the canonical subsets of points that satisfy the first query
   // there will be O(logn) such canonical sets,
   // each of which is just represented as a node in the x-tree

2. Find out from each canonical subset which points lie within the y-range
   // to do this, for each canonical subset,
   // we access the y-tree for the corresponding node;
   // this will be again a 1D range search (on the y-range)


Complexity: Preprocessing Time + Storage

• A 2-level search tree for n points in the plane can be built in $O(n \log n)$ time
• Each input point is stored in $O(\log n)$ y-trees
• A 2-level search tree for n points in the plane needs $O(n \log n)$ space

Complexity: Query Time

• Claim: a 2-level search tree for n points in the plane answers each planar range query in $O(r + \log^2 n)$ time
• The x-range query needs $O(\log n)$ time to locate the $O(\log n)$ nodes representing the canonical subsets
• Then for each of these nodes, a y-range search is invoked, which needs $O(\log n)$ time

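The two-step query can be sketched as below. Several choices here are assumptions of mine: the x-tree is implicit (a node is an index range `[a, b)` over the points sorted by x), each y-tree is replaced by a plain sorted list searched with `bisect`, which supports the same 1D query, and the names `TwoLevelTree` and `ylist` are hypothetical. The query range is closed, `[x1, x2] x [y1, y2]`.

```python
from bisect import bisect_left, bisect_right
from heapq import merge

INF = float("inf")


class TwoLevelTree:
    def __init__(self, points):
        self.pts = sorted(points)          # leaves of the x-tree, ordered by x
        self.ylist = {}                    # node (a, b) -> [(y, x), ...] sorted by y
        if self.pts:
            self._build(0, len(self.pts))

    def _build(self, a, b):
        if b - a == 1:
            x, y = self.pts[a]
            ys = [(y, x)]
        else:
            m = (a + b) // 2               # merge the children's y-lists
            ys = list(merge(self._build(a, m), self._build(m, b)))
        self.ylist[(a, b)] = ys
        return ys

    def query(self, x1, x2, y1, y2):
        out = []
        self._query(0, len(self.pts), x1, x2, y1, y2, out)
        return out

    def _query(self, a, b, x1, x2, y1, y2, out):
        if a >= b or self.pts[b - 1][0] < x1 or x2 < self.pts[a][0]:
            return                          # rejected by the x-query
        if x1 <= self.pts[a][0] and self.pts[b - 1][0] <= x2:
            ys = self.ylist[(a, b)]         # canonical subset: run the y-query
            i = bisect_left(ys, (y1, -INF))
            j = bisect_right(ys, (y2, INF))
            out.extend((x, y) for y, x in ys[i:j])
            return
        m = (a + b) // 2                    # recursion
        self._query(a, m, x1, x2, y1, y2, out)
        self._query(m, b, x1, x2, y1, y2, out)


pts = [(1, 9), (2, 3), (3, 7), (4, 1), (5, 8), (6, 2), (7, 6), (8, 4), (9, 5)]
tree2 = TwoLevelTree(pts)
print(sorted(tree2.query(2, 7, 2, 7)))      # [(2, 3), (3, 7), (6, 2), (7, 6)]
```

Every point appears in the y-list of each of its O(logn) ancestors, which matches the O(nlogn) storage bound stated above.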
Beyond 2D

• Let S be a set of n points in $E^d$, $d \ge 2$
  - a d-level tree for S uses $O(n \log^{d-1} n)$ storage
  - such a tree can be constructed in $O(n \log^{d-1} n)$ time
  - each orthogonal range query of S can be answered in $O(r + \log^d n)$ time
• For the planar case, can the query time be improved to, say, $O(r + \log n)$?


E
Range Tree

Follow the vine to find the melon.

Coherence

• For each query, we
  - need to repeatedly search DIFFERENT y-lists,
  - but always with the SAME key

BBST<BBST<T>> --> BBST<List<T>>

• Each y-search is just a 1D query without further recursions
• So it is not necessary to store each canonical subset as a BBST
• Instead, a sorted y-list simply works

Links Between Lists

• We can CONNECT all the different lists into a SINGLE massive list
• Thus, once a parent y-list is searched, we can get, in O(1) time, the entry for the child y-list by following the link between them

Using Coherence

• To answer a 2D range query, we will do an O(logn) search on the y-list for the LCA
• Thereafter, while descending the x-tree, we can keep track of the position of the y-range in each successive list in O(1) time
• This technique is called ......


Fractional Cascading

• For each item in Av, we store two pointers: one to the smallest item of not-less-than (NLT) value in AL, and one to that in AR
• Hence for any y-query with qy, once we know its entry in Av, we can determine its entry in either AL or AR in O(1) additional time

Construction By 2-Way Merging

• Let v be an internal node in the x-tree, with L/R its left/right child, respectively
• Let Av be the y-list for v, and AL/AR be the y-lists for its children
• Assuming no duplicate y-coordinates, we have
  - Av is the disjoint union of AL and AR, and hence
  - Av can be obtained by merging AL and AR (in linear time)


Complexity

• An MLST with fractional cascading is called a range tree
• A y-search for the root is done in O(logn) time
• To drop down to each next level, we can get, in O(1) time, the current y-interval from that of the prior level
• Hence, each 2D orthogonal range query
  - can be answered in $O(r + \log n)$ time
  - from a data structure of size $O(n \log n)$,
  - which can be constructed in $O(n \log n)$ time
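The links between y-lists can be built during the 2-way merge itself, as in the sketch below. It is an illustration under assumptions of mine, not the lecture's code: the x-tree is implicit (a node is an index range `[a, b)` over the points sorted by x), `ptr[v]` stores for every position in Av the matching positions in AL and AR, and the names `LayeredRangeTree`, `A` and `ptr` are hypothetical. Only the root's y-list is binary-searched; every other position is obtained by following a stored pointer.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


class LayeredRangeTree:
    def __init__(self, points):
        self.pts = sorted(points)          # ordered by x
        self.A = {}                        # node -> y-list [(y, x), ...]
        self.ptr = {}                      # internal node -> (to_left, to_right)
        if self.pts:
            self._build(0, len(self.pts))

    def _build(self, a, b):
        if b - a == 1:
            x, y = self.pts[a]
            self.A[(a, b)] = [(y, x)]
            return
        m = (a + b) // 2
        self._build(a, m)
        self._build(m, b)
        L, R = self.A[(a, m)], self.A[(m, b)]
        merged, to_l, to_r = [], [], []
        i = j = 0
        while i < len(L) or j < len(R):    # 2-way merge, recording the links
            to_l.append(i)                 # items of L among merged[:k]
            to_r.append(j)                 # items of R among merged[:k]
            if j == len(R) or (i < len(L) and L[i] <= R[j]):
                merged.append(L[i])
                i += 1
            else:
                merged.append(R[j])
                j += 1
        to_l.append(len(L))                # links for the past-the-end position
        to_r.append(len(R))
        self.A[(a, b)] = merged
        self.ptr[(a, b)] = (to_l, to_r)

    def query(self, x1, x2, y1, y2):
        out = []
        if self.pts:
            root = self.A[(0, len(self.pts))]
            i = bisect_left(root, (y1, -INF))    # the only binary searches
            j = bisect_right(root, (y2, INF))
            self._query(0, len(self.pts), x1, x2, i, j, out)
        return out

    def _query(self, a, b, x1, x2, i, j, out):
        if i >= j or self.pts[b - 1][0] < x1 or x2 < self.pts[a][0]:
            return                          # rejected (or empty y-interval)
        if x1 <= self.pts[a][0] and self.pts[b - 1][0] <= x2:
            out.extend((x, y) for y, x in self.A[(a, b)][i:j])   # accepted
            return
        m = (a + b) // 2
        to_l, to_r = self.ptr[(a, b)]       # O(1) per level: follow the links
        self._query(a, m, x1, x2, to_l[i], to_l[j], out)
        self._query(m, b, x1, x2, to_r[i], to_r[j], out)


pts = [(1, 9), (2, 3), (3, 7), (4, 1), (5, 8), (6, 2), (7, 6), (8, 4), (9, 5)]
rt = LayeredRangeTree(pts)
print(sorted(rt.query(2, 7, 2, 7)))         # [(2, 3), (3, 7), (6, 2), (7, 6)]
```

A position in a y-list is simply the number of items below the query key, so the link for position k in Av is the number of AL (or AR) items among the first k items of Av; this is why the links stay correct even when items tie.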


Beyond 2D

• Unfortunately, the trick of fractional cascading can ONLY be applied to the LAST level of the search structure
• Given a set of n points in $E^d$ ($d \ge 2$), an orthogonal range query
  - can be answered in $O(r + \log^{d-1} n)$ time
  - from a data structure of size $O(n \log^{d-1} n)$,
  - which can be constructed in $O(n \log^{d-1} n)$ time


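F1
kd-Tree: 2D

Wei Xiaobao followed her to the table and saw thousands of embroidery needles pinned into the large white cloth on it; the thousands of fragments had been pieced together into one large, complete map. What was remarkable was that, with thousands of scraps of hide pieced together, there was neither one piece too many nor one piece too few.

Benevolent government must begin with the demarcation of land boundaries ... once the boundaries are set right, the division of fields and the regulation of stipends can be settled with ease.
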
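Idea

• To support planar range queries with a BBST, we can recursively subdivide the plane and organize the resulting rectangular regions into a binary tree
• First, the root corresponds to the entire plane; then we split alternately by x- and y-coordinates, until ...
• To avoid ambiguity, we may stipulate that every rectangular region is closed on the left and open on the right, closed at the bottom and open at the top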


Construction: Algorithm buildKdTree(P,d)

   // Construct a 2d-tree for point set P at depth d
   if ( P == {p} ) return createLeaf( p ) //base
   Root = createKdNode()
   Root->SplitDirection = even(d) ? VERTICAL : HORIZONTAL
   Root->SplitLine = findMedian( Root->SplitDirection, P ) //O(n)!
   ( P1, P2 ) = divide( P, Root->SplitDirection, Root->SplitLine ) //DAC
   Root->LC = buildKdTree( P1, d + 1 ) //recurse
   Root->RC = buildKdTree( P2, d + 1 ) //recurse
   return( Root )


Construction: Example

[Figure: step-by-step construction of a 2d-tree on the points A to G; the subsets at each level are listed top-down]

{A,B,C,D,F,G,E}                 (split by a vertical line)
{A,C,B}  {D,E,F,G}              (each split by a horizontal line)
{A}  {B,C}  {D,E}  {F,G}        (each split by a vertical line)
A  B  C  D  E  F  G             (leaves)

Special Case: Quadtree

[Figure: a quadtree on the points A to G; levels listed top-down]

ABCDEFG
DEG  AB  CF
EG


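Properties

• Each node v of the tree corresponds to a rectangular region region(v) in the plane, and to the subset of P lying in that region
• Tree height: $O(\log n)$
• For any two nodes u and v at the same depth: $region(u) \cap region(v) = \emptyset$
• If sibling nodes u and v have parent p, then $region(u) \cup region(v) = region(p)$
• The regions of all nodes at the same depth cover the entire plane seamlessly: their union is the whole plane, and they are pairwise disjoint
• The answer to any rectangular query can be expressed as the union of the subsets corresponding to a number of nodes //how many?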


F2
kd-Tree: Query

Query: Algorithm kdSearch(v,R): Hot Knives Through logn Layers of Chocolate

   // Pre-condition: region( v ) ∩ R ≠ ∅

   if ( isLeaf( v ) )
      if ( inside( v, R ) ) report(v)
      return

   if ( region( v->lc ) ⊆ R )
      reportSubtree( v->lc )
   else if ( region( v->lc ) ∩ R ≠ ∅ )
      kdSearch( v->lc, R ) //recurse

   if ( region( v->rc ) ⊆ R )
      reportSubtree( v->rc )
   else if ( region( v->rc ) ∩ R ≠ ∅ )
      kdSearch( v->rc, R ) //recurse
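buildKdTree and kdSearch can be combined into one runnable Python sketch. The assumptions are mine: points have pairwise distinct x- and y-coordinates (general position), the median is found by sorting rather than by a linear-time selection, regions and the query R are half-open boxes `((x1, x2), (y1, y2))` following the left-closed, right-open convention, and all names (`KdNode`, `build`, `kd_search`, ...) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INF = float("inf")
WHOLE_PLANE = ((-INF, INF), (-INF, INF))


@dataclass
class KdNode:
    point: Optional[Tuple[float, float]] = None   # set for leaves only
    axis: int = 0                                 # 0: vertical split line, 1: horizontal
    split: float = 0.0
    lc: Optional["KdNode"] = None
    rc: Optional["KdNode"] = None


def build(pts, d=0):
    """pts: non-empty list of points with distinct coordinates (assumed)."""
    if len(pts) == 1:
        return KdNode(point=pts[0])               # base
    axis = d % 2
    pts = sorted(pts, key=lambda p: p[axis])      # sorting stands in for findMedian
    m = len(pts) // 2
    return KdNode(axis=axis, split=pts[m][axis],
                  lc=build(pts[:m], d + 1), rc=build(pts[m:], d + 1))


def contained(reg, R):                            # reg is a subset of R
    return all(R[a][0] <= reg[a][0] and reg[a][1] <= R[a][1] for a in (0, 1))


def intersects(reg, R):                           # half-open boxes overlap
    return all(reg[a][0] < R[a][1] and R[a][0] < reg[a][1] for a in (0, 1))


def report_subtree(v, out):
    if v.point is not None:
        out.append(v.point)
    else:
        report_subtree(v.lc, out)
        report_subtree(v.rc, out)


def kd_search(v, R, out, reg=WHOLE_PLANE):
    if v.point is not None:                       # leaf: check the point itself
        if all(R[a][0] <= v.point[a] < R[a][1] for a in (0, 1)):
            out.append(v.point)
        return
    lo, hi = reg[v.axis]
    for child, side in ((v.lc, (lo, v.split)), (v.rc, (v.split, hi))):
        creg = tuple(side if a == v.axis else reg[a] for a in (0, 1))
        if contained(creg, R):
            report_subtree(child, out)            # accepted as a whole
        elif intersects(creg, R):
            kd_search(child, R, out, creg)        # recurse


pts = [(1, 9), (2, 3), (3, 7), (4, 1), (5, 8), (6, 2), (7, 6), (8, 4), (9, 5)]
found = []
kd_search(build(pts), ((2, 8), (2, 8)), found)
print(sorted(found))                              # [(2, 3), (3, 7), (6, 2), (7, 6)]
```

The region of each child is derived on the way down from the parent's region and the split line, so nothing beyond the split value needs to be stored per node.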


Example

Legend: recursion (undecided) · hit (reported individually or in batch) · rejected after checking · never visited due to pruning

[Figure: a kd-tree on the points A to T, searched without bounding boxes; levels listed top-down]

HBAFGDCJEIQPKLSMRNTO
CABEDFIGJH  MNKLOPRQTS
BADCE  HFGJI  KLMNO  QPSRT
AB  CED  FH  IGJ  KL  MNO  PQ  RTS
DE  GJ  NO  ST

Bounding Box

Bounding Box: Example

Legend: recursion (undecided) · hit (reported individually or in batch) · rejected after checking · never visited due to pruning

[Figure: the same kd-tree on the points A to T, searched with bounding boxes, for comparison with the previous example]


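F3
kd-Tree: Complexity

The details could not be made out with the naked eye, but they all knew that was where Jupiter lay; the largest planet of the solar system had already fallen onto the two-dimensional plane.

Some have ridiculed this system by saying that, in order to find this mean proportional and form the body of the government, by my method one need only take the square root of the number of the people.
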
Preprocessing

• T(n) = 2*T(n/2) + O(n) = O(nlogn)

Storage

• The tree has a height of O(logn)
• 1 + 2 + 4 + ... + $O(2^{\log n})$ = O(n)


Query Time

• Claim: Report + Search = $O(r + \sqrt{n})$
• The searching time depends on Q(n), the number of
  - recursive calls, or
  - sub-regions intersecting with R (at all levels)
• No more than 2 of the 4 grandchildren of each node will recurse:
  - $Q(1) = O(1)$
  - $Q(n) = 2 + 2\,Q(n/4)$
• Solve to $Q(n) = O(\sqrt{n})$
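The square-root bound can be checked by unrolling the recurrence. The derivation below assumes the recurrence reads $Q(n) = 2 + 2\,Q(n/4)$ with $Q(1) = O(1)$, which is the usual way to formalize "no more than 2 of the 4 grandchildren recurse"; the constants are illustrative only.

```latex
\begin{align*}
Q(n) &= 2 + 2\,Q(n/4) \\
     &= 2 + 4 + 4\,Q(n/16) \\
     &= \sum_{i=1}^{k} 2^{i} + 2^{k}\,Q(n/4^{k}) \\
     &= (2^{k+1} - 2) + 2^{k}\,Q(1) \qquad \text{at } k = \log_4 n \\
     &= O(\sqrt{n}), \qquad \text{since } 2^{\log_4 n} = n^{1/2}.
\end{align*}
```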


Worst Case

Beyond 2D

• Can the 2d-tree be extended to a kd-tree that helps with HIGHER dimensional range queries? If yes, how efficiently can it help?
• A kd-tree in k-dimensional space is constructed by recursively dividing along the 1st, 2nd, ..., kth dimensions
• An orthogonal range query on a set of n points in $E^d$
  - can be answered in $O(r + n^{1 - 1/d})$ time,
  - using a kd-tree of size $O(n)$, which
  - can be constructed in $O(n \log n)$ time
