
Data Mining and Warehousing Lecture 02


Data Warehouse Implementation

Madava Viranjan
You will learn
• Cube operations
• Materialization
• Bit Map Indexing
• Join Index
• ROLAP, MOLAP, HOLAP Server Architectures
Data Warehouse Implementation
• A data warehouse contains a huge volume of data. OLAP servers should be able to answer OLAP queries in seconds.
• Pre-computation of all or part of a data cube can greatly enhance performance.
• This is a challenging task, as it requires substantial computation time and storage space.
Data Warehouse Implementation contd.
• What is the purpose of GROUP BY in SQL?
• Queries
– Compute the sum of sales, grouped by City and Item
– Compute the sum of sales, grouped by City
• What is the total number of cuboids (group-bys)?
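As a minimal sketch of the last question: each subset of the dimensions gives one group-by, so n dimensions give 2^n cuboids when no concept hierarchies are involved. The dimension names below are assumed for illustration (Year does not appear in the queries above), and all_group_bys is a hypothetical helper.

```python
from itertools import combinations

def all_group_bys(dimensions):
    """Return every subset of the dimensions; each subset is one group-by (cuboid)."""
    subsets = []
    for size in range(len(dimensions), -1, -1):
        subsets.extend(combinations(dimensions, size))
    return subsets

# Assumed example dimensions; the empty tuple () is the apex cuboid (grand total).
cuboids = all_group_bys(["City", "Item", "Year"])
for group_by in cuboids:
    print(group_by)
print(len(cuboids))  # 2 ** 3 = 8
```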
The compute cube Operator
• This operator computes aggregations over all subsets of the dimensions
specified in the operation
– compute cube sales_cube

• Proposed and studied by Gray et al.
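A minimal Python sketch of what such an operator produces, assuming a tiny hypothetical sales table with dimensions city, item and year and a measure sales. The rows and the helper name compute_cube are illustrative only and are not part of the operator's definition. Every subset of the dimensions is aggregated, and a dimension that has been aggregated away is shown as *.

```python
from collections import defaultdict
from itertools import combinations

def compute_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions`; '*' marks an aggregated dimension."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for kept in combinations(dimensions, size):
            for row in rows:
                cell = tuple(row[d] if d in kept else "*" for d in dimensions)
                cube[cell] += row[measure]
    return dict(cube)

# Hypothetical base data, for illustration only.
sales = [
    {"city": "Colombo", "item": "Computer", "year": 2018, "sales": 100},
    {"city": "Colombo", "item": "Phone", "year": 2018, "sales": 40},
    {"city": "Gampaha", "item": "Computer", "year": 2018, "sales": 60},
]
cube = compute_cube(sales, ["city", "item", "year"], "sales")
print(cube[("Colombo", "*", 2018)])  # 140
print(cube[("*", "*", "*")])         # 200
```

This naive version rescans the rows once per group-by, which is enough to show the result but not how an efficient implementation would share work between cuboids.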

The Curse of Dimensionality
• Pre-computation of most, if not all, cuboids is required.

• Storage is the issue: the number of cuboids grows exponentially with the number of dimensions.

• When dimensions have concept hierarchies, things get worse.
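A small sketch of why hierarchies make things worse, assuming the usual counting rule in which a dimension with L levels (not counting the all-inclusive top level) contributes L + 1 choices, so the total number of cuboids is the product of (L_i + 1) over all dimensions. The function name and the example level counts are illustrative.

```python
from math import prod

def total_cuboids(levels_per_dimension):
    """Number of cuboids when dimension i has levels_per_dimension[i] hierarchy
    levels, not counting the virtual top level 'all'."""
    return prod(levels + 1 for levels in levels_per_dimension)

# Without hierarchies each dimension has one level, giving 2 ** n cuboids.
print(total_cuboids([1] * 10))  # 1024
# Assumed example: 10 dimensions with 4 levels each.
print(total_cuboids([4] * 10))  # 9765625
```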

Cube Materialization
No Materialization
• Do not precompute any of the non-base cuboids
Full Materialization
• Precompute all cuboids
Partial Materialization
• Selectively compute a proper subset of all possible cuboids
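To illustrate the trade-off, the sketch below answers a "sales by city" query in two ways: by scanning the base data (no materialization) and by rolling up an already stored (city, item) cuboid (partial materialization). All names and rows are hypothetical.

```python
from collections import defaultdict

def group_sum(rows, keys, measure):
    """Aggregate `measure` over `rows`, grouped by the columns in `keys`."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[measure]
    return dict(totals)

# Hypothetical base cuboid rows.
base = [
    {"city": "Colombo", "item": "Computer", "year": 2017, "sales": 70},
    {"city": "Colombo", "item": "Computer", "year": 2018, "sales": 30},
    {"city": "Colombo", "item": "Phone", "year": 2018, "sales": 40},
    {"city": "Gampaha", "item": "Phone", "year": 2018, "sales": 60},
]

# No materialization: every query scans the base data.
by_city_from_base = group_sum(base, ["city"], "sales")

# Partial materialization: the (city, item) cuboid is computed once and stored.
city_item = group_sum(base, ["city", "item"], "sales")
stored = [{"city": c, "item": i, "sales": s} for (c, i), s in city_item.items()]

# The same query is then answered from the smaller stored cuboid.
by_city_from_cuboid = group_sum(stored, ["city"], "sales")

print(by_city_from_base == by_city_from_cuboid)  # True
print(len(base), len(stored))                    # 4 3
```

The stored cuboid costs extra space but is smaller than the base data, so queries that can be answered from it read fewer rows.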
Full Cube
• Base Cell: Cell in the base cuboid
– { Colombo, Computer, 2018, 15000}

• Aggregate Cell: Cell from a non-base cuboid; * marks a dimension that has been aggregated over

– { Colombo, * , 2018, 150000}

• Compute all the cells of all the cuboids

Full Cube Contd.
• Multidimensional array-based cube construction is used.

• Best performance in query processing

• Many cells may be of little or no interest in query processing

Iceberg Cube
• Partially materialized cube

• A minimum threshold defines which cells are included; cells below it are not materialized.

compute cube sales_iceberg as
select month, city, customer_group, count(*)
from salesInfo
cube by month, city, customer_group
having count(*) >= min_sup
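The query above can be mimicked in a few lines of Python. This is a naive sketch under stated assumptions: the salesInfo rows are hypothetical, iceberg_cube is an illustrative helper name, and the code counts every cell first and filters afterwards, whereas practical iceberg algorithms try to avoid computing cells that fall below the threshold.

```python
from collections import Counter
from itertools import combinations

def iceberg_cube(rows, dimensions, min_sup):
    """Count rows per cell for every group-by and keep cells with count >= min_sup."""
    counts = Counter()
    for size in range(len(dimensions) + 1):
        for kept in combinations(dimensions, size):
            for row in rows:
                counts[tuple(row[d] if d in kept else "*" for d in dimensions)] += 1
    return {cell: n for cell, n in counts.items() if n >= min_sup}

# Hypothetical salesInfo rows.
sales_info = [
    {"month": "Jan", "city": "Colombo", "customer_group": "Retail"},
    {"month": "Jan", "city": "Colombo", "customer_group": "Business"},
    {"month": "Feb", "city": "Gampaha", "customer_group": "Retail"},
]
result = iceberg_cube(sales_info, ["month", "city", "customer_group"], min_sup=2)
for cell, n in sorted(result.items()):
    print(cell, n)
```

With min_sup = 2, only cells backed by at least two rows survive, for example ('Jan', 'Colombo', '*') and the apex cell ('*', '*', '*').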
Bitmap Indexing
• Each record is identified by its record ID (RID)

• A distinct bit vector is defined for each value in the attribute's domain

• If a row has a given value for the attribute, the bit for that row in that value's bit vector is set to 1; the row's bits in the other vectors are 0.
Bitmap Indexing
Base table
RID  Item       City
R1   Computer   Colombo
R2   Phone      Colombo
R3   Computer   Gampaha
R4   Home Ent.  Colombo
R5   Computer   Colombo
R6   Phone      Gampaha

Item bitmap index table (H = Home Ent., C = Computer, P = Phone; no row shown has the value S)
RID  H  C  P  S
R1   0  1  0  0
R2   0  0  1  0
R3   0  1  0  0
R4   1  0  0  0
R5   0  1  0  0
R6   0  0  1  0
Bitmap Indexing
• Comparison, join, and aggregation operations are reduced to bit arithmetic.

• Storage space is saved because a value such as a character string is represented by a single bit.
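A minimal sketch of the idea, using Python integers as bit vectors over the base table from the slide above. The helper names build_bitmap_index and matching_rids are illustrative, and bit i of each vector corresponds to the i-th row.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is 1 when row i holds that value."""
    index = {}
    for position, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << position)
    return index

def matching_rids(bits, rids):
    """Translate a bit vector back into the list of record IDs."""
    return [rid for position, rid in enumerate(rids) if (bits >> position) & 1]

# Base table from the slide above.
rids = ["R1", "R2", "R3", "R4", "R5", "R6"]
items = ["Computer", "Phone", "Computer", "Home Ent.", "Computer", "Phone"]
cities = ["Colombo", "Colombo", "Gampaha", "Colombo", "Colombo", "Gampaha"]

item_index = build_bitmap_index(items)
city_index = build_bitmap_index(cities)

# "Item = Computer AND City = Colombo" becomes a single bitwise AND.
both = item_index["Computer"] & city_index["Colombo"]
print(matching_rids(both, rids))  # ['R1', 'R5']
# Counting the matches is a population count on the bit vector.
print(bin(both).count("1"))       # 2
```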

Join Indexing
• Registers the joinable rows of two relations from a relational database.

• Star schemas benefit significantly, since a join index can link dimension values to the fact table rows that contain them.

Join Indexing
Join index table for location and sales
Location      SalesKey
……            ……
Main Street   T57
Main Street   T238
Main Street   T884
……            ……

Figure: star schema (fact and dimension tables)
Join Indexing

Join index table for item and sales
Item      SalesKey
……        ……
Sony-TV   T57
Sony-TV   T459
……        ……

Figure: star schema (fact and dimension tables)
Join Indexing

Join index table for location and item to sales
Item      Location      SalesKey
……        ……            ……
Sony-TV   Main Street   T57
……        ……            ……

Figure: star schema (fact and dimension tables)
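The three join index tables above can be derived from the fact table in one pass each. The sketch below assumes a hypothetical in-memory fact table keyed by sales key; only the pairings shown in the tables come from the slides, and "other item" and "other location" are placeholders for values the slides elide.

```python
from collections import defaultdict

# Hypothetical fact table rows: sales key -> (location, item).
# "other location" and "other item" are placeholders, not values from the slides.
sales_fact = {
    "T57": ("Main Street", "Sony-TV"),
    "T238": ("Main Street", "other item"),
    "T459": ("other location", "Sony-TV"),
    "T884": ("Main Street", "other item"),
}

def build_join_index(fact, column):
    """Map each dimension value to the sales keys of the fact rows that join with it."""
    index = defaultdict(list)
    for sales_key, row in fact.items():
        index[row[column]].append(sales_key)
    return dict(index)

location_index = build_join_index(sales_fact, 0)  # location -> sales keys
item_index = build_join_index(sales_fact, 1)      # item -> sales keys

# Composite join index: (item, location) -> sales keys.
composite = defaultdict(list)
for sales_key, (location, item) in sales_fact.items():
    composite[(item, location)].append(sales_key)

print(location_index["Main Street"])          # ['T57', 'T238', 'T884']
print(item_index["Sony-TV"])                  # ['T57', 'T459']
print(composite[("Sony-TV", "Main Street")])  # ['T57']
```

Once the index exists, finding the sales for a location or item is a lookup instead of a join over the full fact table.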
OLAP Server Architectures
• Business users want to view data in a multidimensional way.

• Physical implementation needs to consider storage issues.

• Three types
– Relational OLAP (ROLAP) Servers
– Multidimensional OLAP (MOLAP) Servers
– Hybrid OLAP (HOLAP) Servers
ROLAP Servers
• Intermediate servers that sit between relational back-end servers and client front-end tools

• Use a relational or extended-relational DBMS to store and manage the data, and OLAP middleware for the rest

• Greater scalability than MOLAP

• Decision support systems mostly use ROLAP servers

MOLAP Servers
• Support multidimensional data views

• Use array-based multidimensional storage engines

• Faster computation
HOLAP Servers
• Combine ROLAP and MOLAP architectures

• Large volumes of detailed data are stored in a relational database, while aggregations are kept in a separate MOLAP store

• Greater scalability from ROLAP and faster computation from MOLAP
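A toy sketch of the hybrid split, under the assumption that detail records live in relational-style rows while aggregates sit in a dense array addressed by dimension position. The data and the function names total_sales and detail are hypothetical, and no real OLAP server is this simple.

```python
# Detailed data kept as relational-style rows (the ROLAP side); hypothetical values.
detail_rows = [
    ("Colombo", "Computer", 100),
    ("Colombo", "Phone", 40),
    ("Gampaha", "Computer", 60),
]

# Aggregations kept in a dense array addressed by dimension position (the MOLAP side).
cities = ["Colombo", "Gampaha"]
items = ["Computer", "Phone"]
agg = [[0] * len(items) for _ in cities]
for city, item, amount in detail_rows:
    agg[cities.index(city)][items.index(item)] += amount

def total_sales(city, item):
    """Summary query: a direct array lookup, no scan of the detail rows."""
    return agg[cities.index(city)][items.index(item)]

def detail(city):
    """Drill-through query: falls back to scanning the relational rows."""
    return [row for row in detail_rows if row[0] == city]

print(total_sales("Colombo", "Phone"))  # 40
print(detail("Gampaha"))                # [('Gampaha', 'Computer', 60)]
```

Note that the array reserves a cell for (Gampaha, Phone) even though no such sale exists, which hints at why sparse cubes make pure array storage costly.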
