OLAP Operations in The Multidimensional Data Model

dmdw

Uploaded by

Subhadra

OLAP Operations in the Multidimensional Data Model

In the multidimensional model, data are organized into multiple dimensions, and each dimension
includes multiple levels of abstraction defined by concept hierarchies. This organization gives
users the flexibility to view data from different perspectives. A number of OLAP data cube
operations exist to materialize these different views, allowing interactive querying and analysis of the
data at hand. Hence, OLAP provides a user-friendly environment for interactive data analysis.

Consider the OLAP operations that can be performed on multidimensional data. The figure shows
a data cube for the sales of a shop. The cube contains the dimensions location, time, and item, where
location is aggregated with respect to city values, time is aggregated with respect to quarters,
and item is aggregated with respect to item types.

Roll-Up
The roll-up operation (also known as drill-up or aggregation) performs aggregation on
a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction.
Roll-up is like zooming out on the data cube. The figure shows the result of a roll-up operation
performed on the dimension location. The hierarchy for location is defined as the total order
street < city < province or state < country. The roll-up operation aggregates the data by
ascending the location hierarchy from the level of city to the level of country.

When roll-up is performed by dimension reduction, one or more dimensions are removed from the
cube. For example, consider a sales data cube with two dimensions, location and time. Roll-up may
be performed by removing the time dimension, resulting in an aggregation of the total sales by
location, rather than by location and by time.
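Roll-up by dimension reduction can be sketched in a few lines of Python. This is only an illustration: the cube is a hypothetical two-dimensional sales cube keyed by (location, quarter), and the city names and figures are made up rather than taken from the figure.

```python
from collections import defaultdict

# Hypothetical 2-D sales cube: (location, quarter) -> units sold.
# All names and numbers are illustrative assumptions.
sales = {
    ("Toronto", "Q1"): 100,
    ("Toronto", "Q2"): 120,
    ("Vancouver", "Q1"): 80,
    ("Vancouver", "Q2"): 90,
}


def roll_up_drop_time(cube):
    """Remove the time dimension by summing over it for each location."""
    totals = defaultdict(int)
    for (location, _quarter), units in cube.items():
        totals[location] += units
    return dict(totals)


sales_by_location = roll_up_drop_time(sales)
print(sales_by_location)  # {'Toronto': 220, 'Vancouver': 170}
```

The result has one dimension fewer than the input: every cell is now a total per location.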

Example

Consider the following cube, which records the temperatures of certain days, grouped by week:

Temperature   64  65  68  69  70  71  72  75  80  81  83  85
Week1          1   0   1   0   1   0   0   0   0   0   1   0
Week2          0   0   0   1   0   0   1   2   0   1   0   0

Suppose we want to set up levels (hot (80-85), mild (70-75), cool (64-69)) for temperature in
the above cube.

To do this, we group the columns and add up the values according to the concept hierarchy. This
operation is known as a roll-up.

By doing this, we obtain the following cube:

Temperature   cool  mild  hot
Week1         2     1     1
Week2         1     3     1

The roll-up operation groups the information by levels of temperature.
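The grouping step can be sketched as follows, using the weekly counts from the example. The function names `level_of` and `roll_up` are illustrative choices, not part of any OLAP product. Summing the Week2 row of the base cube this way gives cool 1, mild 3, hot 1.

```python
# Base cube from the example: number of days per temperature, per week.
temperatures = [64, 65, 68, 69, 70, 71, 72, 75, 80, 81, 83, 85]
cube = {
    "Week1": [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    "Week2": [0, 0, 0, 1, 0, 0, 1, 2, 0, 1, 0, 0],
}


def level_of(temp):
    """Map a temperature to its parent level in the concept hierarchy."""
    if 64 <= temp <= 69:
        return "cool"
    if 70 <= temp <= 75:
        return "mild"
    if 80 <= temp <= 85:
        return "hot"
    raise ValueError(f"no level defined for {temp}")


def roll_up(cube, temperatures):
    """Sum the counts of all temperatures that share a level."""
    result = {}
    for week, counts in cube.items():
        totals = {"cool": 0, "mild": 0, "hot": 0}
        for temp, count in zip(temperatures, counts):
            totals[level_of(temp)] += count
        result[week] = totals
    return result


rolled = roll_up(cube, temperatures)
for week, totals in rolled.items():
    print(week, totals)
```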

The following diagram illustrates how roll-up works.

Drill-Down
The drill-down operation (also called roll-down) is the reverse of roll-up. Drill-down is
like zooming in on the data cube. It navigates from less detailed data to more detailed data.
Drill-down can be performed either by stepping down a concept hierarchy for a dimension or by
adding additional dimensions.

The figure shows a drill-down operation performed on the dimension time by stepping down a concept
hierarchy defined as day < month < quarter < year. Drill-down occurs by descending the
time hierarchy from the level of quarter to the more detailed level of month.

Because drill-down adds more detail to the given data, it can also be performed by adding a new
dimension to a cube. For example, a drill-down on the central cube of the figure can occur by
introducing an additional dimension, such as customer group.

Example

Drilling down from weeks to days adds more detail to the given data:

Temperature   cool  mild  hot
Day 1         0     0     0
Day 2         0     0     0
Day 3         0     0     1
Day 4         0     1     0
Day 5         1     0     0
Day 6         0     0     0
Day 7         1     0     0
Day 8         0     0     0
Day 9         1     0     0
Day 10        0     1     0
Day 11        0     1     0
Day 12        0     1     0
Day 13        0     0     1
Day 14        0     0     0
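A minimal sketch of this drill-down follows, using the day-level values from the table. It assumes that days 1 to 7 belong to week 1 and days 8 to 14 to week 2. The helper names are hypothetical. Rolling the day rows back up is included only to show that drill-down and roll-up are inverse navigations.

```python
LEVELS = ("cool", "mild", "hot")

# Day-level cube from the example: day -> level recorded on that day.
# Days whose row is all zeros in the table are simply omitted.
daily = {3: "hot", 4: "mild", 5: "cool", 7: "cool",
         9: "cool", 10: "mild", 11: "mild", 12: "mild", 13: "hot"}


def week_of(day):
    """Parent of a day in the time hierarchy (assumes 7 days per week)."""
    return (day - 1) // 7 + 1


def drill_down(week):
    """Return the day-level rows (cool, mild, hot) that make up one week."""
    rows = {}
    for day in range(1, 15):
        if week_of(day) == week:
            rows[day] = tuple(int(daily.get(day) == lvl) for lvl in LEVELS)
    return rows


def roll_up(rows):
    """Aggregate day-level rows back into a single weekly row."""
    return tuple(sum(column) for column in zip(*rows.values()))


week2 = drill_down(2)
print(week2[10])       # (0, 1, 0)
print(roll_up(week2))  # (1, 3, 1)
```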

The following diagram illustrates how Drill-down works.

Slice
A slice is a subset of the cube corresponding to a single value chosen for one
dimension. For example, a slice operation is executed when the user wants a selection on one
dimension of a three-dimensional cube, resulting in a two-dimensional slice. So, the slice operation
performs a selection on one dimension of the given cube, thus resulting in a subcube.

For example, if we make the selection temperature = cool on the day-level cube, we obtain the following cube:

Temperature   cool
Day 1         0
Day 2         0
Day 3         0
Day 4         0
Day 5         1
Day 6         0
Day 7         1
Day 8         0
Day 9         1
Day 10        0
Day 11        0
Day 12        0
Day 13        0
Day 14        0

The following diagram illustrates how Slice works.

Here, slice is performed on the dimension "time" using the criterion time = "Q1".

It forms a new subcube by fixing a single value on one dimension.
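The temperature = cool selection on the day-level cube from the drill-down example can be sketched as below. The cube is stored as a plain dictionary and `slice_cube` is a hypothetical helper name. Fixing the temperature dimension to one value leaves only the day dimension.

```python
LEVELS = ("cool", "mild", "hot")

# Day-level cube from the drill-down example: day -> (cool, mild, hot).
cube = {
    1: (0, 0, 0), 2: (0, 0, 0), 3: (0, 0, 1), 4: (0, 1, 0),
    5: (1, 0, 0), 6: (0, 0, 0), 7: (1, 0, 0), 8: (0, 0, 0),
    9: (1, 0, 0), 10: (0, 1, 0), 11: (0, 1, 0), 12: (0, 1, 0),
    13: (0, 0, 1), 14: (0, 0, 0),
}


def slice_cube(cube, level):
    """Fix the temperature dimension to one value; only days remain."""
    index = LEVELS.index(level)
    return {day: row[index] for day, row in cube.items()}


cool = slice_cube(cube, "cool")
print([day for day, count in cool.items() if count])  # [5, 7, 9]
```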

Dice
The dice operation defines a subcube by performing a selection on two or more dimensions.

For example, applying the selection (time = day 3 OR time = day 4) AND (temperature = cool OR
temperature = hot) to the original cube, we get the following subcube (still two-dimensional):

Temperature   cool  hot
Day 3         0     1
Day 4         0     0
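The same selection can be sketched in Python. The cube below contains only the day-level rows needed for the illustration, and `dice` is a hypothetical helper name. Unlike a slice, the selection restricts two dimensions at once and keeps both of them in the result.

```python
LEVELS = ("cool", "mild", "hot")

# Excerpt of the day-level cube: day -> (cool, mild, hot).
cube = {
    2: (0, 0, 0), 3: (0, 0, 1), 4: (0, 1, 0), 5: (1, 0, 0),
}


def dice(cube, days, levels):
    """Keep only cells whose day and temperature level are both selected."""
    indexes = [LEVELS.index(level) for level in levels]
    return {
        day: {LEVELS[i]: cube[day][i] for i in indexes}
        for day in days
        if day in cube
    }


subcube = dice(cube, days=[3, 4], levels=["cool", "hot"])
print(subcube)  # {3: {'cool': 0, 'hot': 1}, 4: {'cool': 0, 'hot': 0}}
```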

Consider the following diagram, which shows the dice operations.

The dice operation on the cube is based on the following selection criteria, which involve three dimensions:

o (location = "Toronto" or "Vancouver")

o (time = "Q1" or "Q2")

o (item = "Mobile" or "Modem")

Data Mining - Issues

Data mining is not an easy task, as the algorithms used can get very complex and data is not always
available in one place. It needs to be integrated from various heterogeneous data sources. These
factors also create some issues. Here we discuss the major issues regarding −

o Mining Methodology and User Interaction

o Performance Issues

o Diverse Data Types Issues

The following diagram describes the major issues.

Mining Methodology and User Interaction Issues

These are the following kinds of issues −

Mining different kinds of knowledge in databases − Different users may be interested in
different kinds of knowledge. Therefore, data mining needs to cover a broad range of
knowledge discovery tasks.

Interactive mining of knowledge at multiple levels of abstraction − The data mining process
needs to be interactive, because interactivity allows users to focus the search for patterns,
providing and refining data mining requests based on the returned results.

Incorporation of background knowledge − Background knowledge can be used to guide the
discovery process and to express the discovered patterns. It allows the discovered patterns
to be expressed not only in concise terms but also at multiple levels of abstraction.

Data mining query languages and ad hoc data mining − A data mining query language that
allows the user to describe ad hoc mining tasks should be integrated with a data warehouse
query language and optimized for efficient and flexible data mining.

Presentation and visualization of data mining results − Once patterns are discovered, they
need to be expressed in high-level languages and visual representations. These representations
should be easily understandable.

Handling noisy or incomplete data − Data cleaning methods are required to handle noise
and incomplete objects while mining data regularities. Without data cleaning methods,
the accuracy of the discovered patterns will be poor.

Pattern evaluation − The patterns discovered should be interesting. Patterns that merely
represent common knowledge or lack novelty are of little use, so discovered patterns need to be
evaluated for interestingness.


Performance Issues

Performance-related issues include the following −

Efficiency and scalability of data mining algorithms − In order to effectively extract
information from the huge amount of data in databases, data mining algorithms must be efficient and
scalable.

Parallel, distributed, and incremental mining algorithms − Factors such as the huge size
of databases, the wide distribution of data, and the complexity of data mining methods motivate the
development of parallel and distributed data mining algorithms. These algorithms divide the data
into partitions, which are then processed in parallel. The results from the
partitions are then merged. Incremental algorithms incorporate database updates without mining
the entire data again from scratch.
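The partition, merge, and incremental-update pattern can be sketched with a deliberately simple mining task: counting how often each item occurs. The transactions and function names are hypothetical. A thread pool is used only to keep the sketch short; CPU-bound mining would more likely use separate processes or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_items(partition):
    """Mine one partition: count how often each item occurs."""
    return Counter(item for transaction in partition for item in transaction)


def mine_in_parallel(partitions):
    """Process the partitions concurrently, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partial_counts = list(pool.map(count_items, partitions))
    merged = Counter()
    for counts in partial_counts:
        merged.update(counts)
    return merged


def mine_incrementally(current_counts, new_transactions):
    """Fold new transactions into existing counts without rescanning old data."""
    updated = Counter(current_counts)
    updated.update(count_items(new_transactions))
    return updated


# Hypothetical transaction data, already split into two partitions.
partitions = [
    [["milk", "bread"], ["milk"]],
    [["bread", "eggs"], ["milk", "eggs"]],
]
counts = mine_in_parallel(partitions)
counts = mine_incrementally(counts, [["eggs"]])
print(counts["milk"], counts["bread"], counts["eggs"])  # 3 2 3
```

Merging works here because item counts from different partitions can simply be added. Mining tasks whose partial results do not combine this easily need a more careful merge step.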

Diverse Data Types Issues

Handling of relational and complex types of data − The database may contain complex
data objects, multimedia data objects, spatial data, temporal data, etc. It is not possible for one
system to mine all these kinds of data.

Mining information from heterogeneous databases and global information systems −
The data is available from different data sources on a LAN or WAN. These data sources may be
structured, semi-structured, or unstructured. Therefore, mining knowledge from them adds
challenges to data mining.

Difference between Supervised and Unsupervised Learning

Supervised and unsupervised learning are two techniques of machine learning. The two
techniques are used in different scenarios and with different datasets. Both learning methods
are explained below, followed by a table of their differences.

Supervised Machine Learning:

Supervised learning is a machine learning method in which models are trained using labeled data. In
supervised learning, models need to find the mapping function that maps the input variable (X) to the
output variable (Y).

Supervised learning needs supervision to train the model, much as a student learns things
in the presence of a teacher. Supervised learning can be used for two types of
problems: Classification and Regression.

Example: Suppose we have an image of different types of fruits. The task of our supervised learning
model is to identify the fruits and classify them accordingly. To identify the fruits in supervised
learning, we give the model the input data as well as the corresponding output, which means we train
the model on the shape, size, color, and taste of each fruit together with its label. Once the training
is completed, we test the model by giving it a new set of fruits. The model identifies each fruit and
predicts the output using a suitable algorithm.
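The train-then-predict workflow of the fruit example can be sketched with a nearest-centroid classifier. The text does not name an algorithm, so this choice is an assumption made purely for illustration, as are the two numeric features (weight and redness) and all of the sample values.

```python
import math

# Hypothetical labeled training data:
# (weight in grams, redness from 0 to 1) -> fruit name.
training = [
    ((150, 0.9), "apple"), ((170, 0.8), "apple"),
    ((120, 0.1), "banana"), ((130, 0.2), "banana"),
]


def train(samples):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(model, features):
    """Predict the label whose centroid is closest to the new input."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(training)
print(predict(model, (165, 0.7)))  # apple
print(predict(model, (118, 0.3)))  # banana
```

The labels supplied during training are what make this supervised: the model is told which fruit each training sample is, and is then tested on inputs it has not seen.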

Unsupervised Machine Learning:

Unsupervised learning is another machine learning method, in which patterns are inferred from
unlabeled input data. The goal of unsupervised learning is to find the structure and patterns in the
input data. Unsupervised learning does not need any supervision. Instead, it finds patterns in the
data on its own.

Unsupervised learning can be used for two types of problems: Clustering and Association.

Example: To understand unsupervised learning, we will use the example given above. Unlike
supervised learning, here we do not provide any supervision to the model. We just provide the
input dataset and allow the model to find the patterns in the data. With the help of a
suitable algorithm, the model trains itself and divides the fruits into different groups according to
the features they have most in common.
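Grouping unlabeled fruits can be sketched with k-means, one common clustering algorithm. The text does not name an algorithm, so this choice, the two features, and the sample values are assumptions. The starting centroids are picked by hand to keep the run deterministic; practical implementations usually choose them at random.

```python
import math


def k_means(points, centroids, iterations=10):
    """Group points around the given centroids without using any labels."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(c) / len(cluster)
                                     for c in zip(*cluster))
    return clusters


# Hypothetical unlabeled measurements: (weight in grams, redness 0 to 1).
fruits = [(150, 0.9), (170, 0.8), (120, 0.1), (130, 0.2)]
groups = k_means(fruits, centroids=[fruits[0], fruits[2]])
print(groups)
```

No fruit names appear anywhere in the input. The algorithm only discovers that the measurements fall into two groups; naming those groups is left to the analyst.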

The main differences between Supervised and Unsupervised learning are given below:

Supervised Learning                           Unsupervised Learning

Algorithms are trained using labeled data.    Algorithms are trained using unlabeled data.

The model takes direct feedback to check      The model does not take any feedback.
whether it is predicting the correct output.

The model predicts the output.                The model finds hidden patterns in the data.

Input data is provided to the model along     Only input data is provided to the model.
with the output.

The goal is to train the model so that it     The goal is to find hidden patterns and
can predict the output when it is given       useful insights in an unknown dataset.
new data.

Needs supervision to train the model.         Needs no supervision to train the model.

Can be categorized into Classification and    Can be categorized into Clustering and
Regression problems.                          Association problems.

Used where we know the inputs as well as      Used where we have only input data and no
the corresponding outputs.                    corresponding output data.

Generally produces more accurate results.     May give less accurate results than
                                              supervised learning.

Is not close to true artificial               Is closer to true artificial intelligence,
intelligence, because the model is first      because it learns much as a child learns
trained on labeled data and only then can     daily routine things from experience.
predict the correct output.

Includes algorithms such as Linear            Includes algorithms such as clustering
Regression, Logistic Regression, Support      (for example, k-means) and the Apriori
Vector Machine, Multi-class                   algorithm.
Classification, Decision Tree, Bayesian
Logic, etc.
