DWM 2

Uploaded by

bhimapasare45

Samarth

Unit 2. Data Warehouse Modeling and Online Analytical Processing-I

Dimensional Modeling-
-The concept of Dimensional Modeling was developed by Ralph Kimball and consists of “fact”
and “dimension” tables.
-It is a logical design technique used for data warehouses.
-Every dimensional model is composed of at least one table with a multipart key, called the fact
table, and a set of smaller tables called dimension tables.
Elements of Dimensional Data Model

Fact

It is a collection of associated data items, consisting of measures and context data. It typically
represents business items or business transactions.

Dimensions

It is a collection of data which describe one business dimension. In simple terms, they give who,
what, where of a fact. In the Sales business process, for the fact quarterly sales number,
dimensions would be

● Who – Customer Names

● Where – Location
● What – Product Name
In other words, a dimension is a window to view information in the facts.

Fact Table
Fact tables are used to store the facts or measures of the business.
A fact table is the primary table in dimensional modeling.
A Fact Table contains
1. Measurements/facts
2. Foreign key to dimension table

Dimension Table
-Dimension tables establish the context of the facts. Dimensional tables store fields that
describe the facts.

● A dimension table contains dimensions of a fact.

● They are joined to the fact table via a foreign key.
● Dimension tables are denormalized tables.
● The dimension can also contain one or more hierarchical relationships
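
The relationship between a fact table and its dimension tables can be sketched in plain Python. This is only an illustration: the table contents, the key names (customer_key, product_key, location_key) and the helper describe_fact are assumptions made up for this example, not part of the notes.

```python
# Dimension tables: each maps a key to descriptive attributes.
# All names and values here are illustrative assumptions.
customer_dim = {1: {"customer_name": "Asha"}, 2: {"customer_name": "Ravi"}}
product_dim = {10: {"product_name": "Laptop"}, 11: {"product_name": "Phone"}}
location_dim = {100: {"city": "Pune"}, 101: {"city": "Mumbai"}}

# Fact table: foreign keys to each dimension plus a numeric measure.
sales_fact = [
    {"customer_key": 1, "product_key": 10, "location_key": 100, "quarterly_sales": 250},
    {"customer_key": 2, "product_key": 11, "location_key": 101, "quarterly_sales": 400},
]


def describe_fact(row):
    """Resolve the foreign keys of one fact row into who/what/where context."""
    return {
        "who": customer_dim[row["customer_key"]]["customer_name"],
        "what": product_dim[row["product_key"]]["product_name"],
        "where": location_dim[row["location_key"]]["city"],
        "quarterly_sales": row["quarterly_sales"],
    }


for row in sales_fact:
    print(describe_fact(row))
```

The fact rows hold only keys and the measure; the descriptive context comes from the dimension lookups.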

Comparison between Database & Data Warehouse

Database | Data Warehouse
It supports OLTP (Online Transaction Processing). | It supports OLAP (Online Analytical Processing).
A database is an organized collection of data that makes it easier to access, retrieve, and manipulate information. | A data warehouse is a big, centralized repository of data created specifically for reporting and data analysis.
It is designed for the purpose of storing data. | It is designed for the purpose of analysing data.
Designing is done using ER modelling methods. | Designing is done using data modelling methods.
A database contains detailed data. | Data warehouses keep highly summarized data.
Databases are typically smaller in size. | Data warehouses are larger than databases.
Data is frequently updated to maintain accuracy and consistency within the database. | Data is usually static and historical, so this existing data can be used for effective data analysis.
Application developers and operational employees frequently use databases. | Business analysts and executives frequently use data warehouses.
A few examples of databases are MySQL, Oracle, etc. | A few examples of data warehouses are Google BigQuery, IBM Db2, etc.

Data Cube:-
-A data cube is a multidimensional matrix in which data is grouped or combined. The data
cube method has a few alternative names or variants, such as "multidimensional
databases," "materialized views," and "OLAP (On-Line Analytical Processing)."

-The general idea of this approach is to materialize certain expensive computations that are
frequently queried.

-For example, a relation with the schema sales (part, supplier, customer, and sale-price) can be
materialized into a set of eight views as shown in fig, where psc indicates a view consisting of
aggregate function value (such as total-sales) computed by grouping three attributes part,
supplier, and customer, p indicates a view composed of the corresponding aggregate function
values calculated by grouping part alone, etc.
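
As a minimal sketch of the eight views, the following Python computes a total for every subset of the three grouping attributes. The sample rows and the function name materialize_views are assumptions for illustration; a real system would use a database rather than in-memory tuples.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical rows of sales(part, supplier, customer, sale_price).
sales = [
    ("bolt", "S1", "C1", 10),
    ("bolt", "S2", "C1", 20),
    ("nut", "S1", "C2", 5),
]
ATTRS = ("part", "supplier", "customer")


def materialize_views(rows):
    """Return {grouping attributes: {group values: total sales}} for all 2^3 groupings."""
    views = {}
    for k in range(len(ATTRS) + 1):
        for idxs in combinations(range(len(ATTRS)), k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in idxs)
                totals[key] += row[3]  # aggregate the sale price
            views[tuple(ATTRS[i] for i in idxs)] = dict(totals)
    return views


views = materialize_views(sales)
print(len(views))        # number of materialized views
print(views[("part",)])  # the "p" view: totals grouped by part alone
print(views[()])         # the view with no grouping attribute: the grand total
```

The key ("part", "supplier", "customer") corresponds to the psc view, and the empty key to the fully aggregated view.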

-The multidimensional data model views data in the form of a data cube. OLAP tools are based
on the multidimensional data model. Data cubes usually model n-dimensional data.

-A data cube enables data to be modeled and viewed in multiple dimensions. A
multidimensional data model is organized around a central theme, like sales or transactions. A
fact table represents this theme. Facts are numerical measures. Thus, the fact table contains
measures (such as Rs_sold) and keys to each of the related dimension tables.

-Dimensions are the perspectives or entities that define a data cube. Facts are generally
quantities, which are used for analyzing the relationships between dimensions.

Example: In the 2-D representation, we will look at the All Electronics sales data for items sold
per quarter in the city of Vancouver. The measure displayed is dollars sold (in thousands).

3-Dimensional Cuboids
Suppose we would like to view the sales data with a third dimension. For example, suppose
we would like to view the data according to time and item, as well as location, for the cities
Chicago, New York, Toronto, and Vancouver. The measure displayed is dollars sold (in
thousands). These 3-D data are shown in the table. The 3-D data of the table are represented
as a series of 2-D tables.

Conceptually, we may represent the same data in the form of 3-D data cubes, as shown in fig:

Let us suppose that we would like to view our sales data with an additional fourth dimension,
such as a supplier.

In data warehousing, the data cubes are n-dimensional. The cuboid which holds the lowest level
of summarization is called a base cuboid.

For example, the 4-D cuboid in the figure is the base cuboid for the given time, item, location,
and supplier dimensions.

The figure shows a 4-D data cube representation of sales data, according to the dimensions time,
item, location, and supplier. The measure displayed is dollars sold (in thousands).

Consider the data of a shop for items sold per quarter in the city of Delhi. The data is shown in
the table. In this 2D representation, the sales for Delhi are shown for the time dimension
(organized in quarters) and the item dimension (classified according to the types of item
sold). The fact or measure displayed is rupee_sold (in thousands).

Now, suppose we want to view the sales data with a third dimension: the data according to
time and item, as well as location, for the cities Chennai, Kolkata, Mumbai, and Delhi.
These 3D data are shown in the table. The 3D data of the table are represented as a
series of 2D tables.

Conceptually, the same data may also be represented in the form of a 3D data cube, as
shown in fig:

Schema:-
-A schema is a logical description of the entire database. It includes the name and description of
records of all record types, including all associated data items and aggregates. Much like a
database, a data warehouse also needs to maintain a schema. A database uses the relational
model, while a data warehouse uses the Star, Snowflake, or Fact Constellation schema.

Types of Schema
1.Star Schema
2.Snowflake Schema
3.Fact Constellation Schema

Star Schema

-The star schema is the most popular schema design for a data warehouse.
-In a star schema, there is one fact table in the middle and a number of associated dimension
tables. This structure resembles a star, and hence it is known as a star schema.
-The primary key of each dimension table is related to a foreign key in the fact table.
-The fact table is large compared to the dimension tables.

Figure – General structure of Star Schema

The following diagram shows the sales data of a company with respect to the four dimensions,
namely time, item, branch, and location.

-There is a fact table at the center. It contains the keys to each of four dimensions.

-The fact table also contains the measures, namely dollars sold and units sold.
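
A star schema along these lines can be sketched with Python's built-in sqlite3 module. The table names (time_dim, item_dim, branch_dim, location_dim, sales_fact), the column choices, and the sample rows are assumptions for illustration only; the diagram in the notes may use different attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Four dimension tables (illustrative attributes).
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT, type TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- Central fact table: one foreign key per dimension plus two measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
INSERT INTO time_dim VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO item_dim VALUES (1, 'Mobile', 'electronics'), (2, 'Modem', 'electronics');
INSERT INTO branch_dim VALUES (1, 'B1');
INSERT INTO location_dim VALUES (1, 'Toronto', 'Canada'), (2, 'Vancouver', 'Canada');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 1, 600.0, 6),
    (1, 2, 1, 2, 150.0, 3),
    (2, 1, 1, 2, 400.0, 4);
""")

# A typical star query: join the fact table to the dimensions it needs, then aggregate.
rows = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.dollars_sold), SUM(f.units_sold)
    FROM sales_fact f
    JOIN time_dim t     ON f.time_key = t.time_key
    JOIN location_dim l ON f.location_key = l.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
""").fetchall()
print(rows)
```

Each dimension is reached from the fact table with a single join, which is why star queries tend to stay simple.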

Advantages of Star Schema-

1.Simplest DW schema
2.Easy to understand
3.Most suitable for query processing
4.It is a fully denormalized schema

Disadvantages of Star schema-

1.Data redundancy: Star schema can result in data redundancy, as the same data
may be stored in multiple places in the schema. This can lead to data
inconsistencies and difficulties in maintaining data accuracy.
2.Increased costs: Adding redundant data increases computing and storage costs.
This can be especially troubling when handling large datasets.

Snowflake Schema

-Snowflake schema acts like an extended version of a star schema.

-In a snowflake schema, the fact table is still located at the center of the
schema, surrounded by the dimension tables. However, some dimension tables are further
broken down into multiple related tables.

Figure – General structure of Snowflake Schema

-Unlike the star schema, some dimension tables in a snowflake schema are normalized. For
example, the item dimension table of the star schema is normalized and split into two
dimension tables, namely the item and supplier tables.

Now the item dimension table contains the attributes item_key, item_name, type, brand, and
supplier_key.

The supplier key is linked to the supplier dimension table. The supplier dimension table
contains the attributes supplier_key and supplier_type.
Due to normalization in the snowflake schema, redundancy is reduced; it therefore
becomes easier to maintain and saves storage space.
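
The item/supplier split can be sketched with sqlite3 as follows. The column names follow the attributes listed above; the sales_fact table is reduced to a single measure, and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Sub-dimension table created by normalizing the item dimension.
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
-- The item dimension now stores only a key that points to supplier.
CREATE TABLE item (
    item_key     INTEGER PRIMARY KEY,
    item_name    TEXT,
    type         TEXT,
    brand        TEXT,
    supplier_key INTEGER REFERENCES supplier(supplier_key)
);
-- Simplified fact table (illustrative: only one measure).
CREATE TABLE sales_fact (item_key INTEGER REFERENCES item(item_key), units_sold INTEGER);
INSERT INTO supplier VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item VALUES (10, 'Mobile', 'phone', 'BrandA', 1),
                        (11, 'Modem', 'network', 'BrandB', 2),
                        (12, 'Tablet', 'phone', 'BrandA', 1);
INSERT INTO sales_fact VALUES (10, 5), (11, 2), (12, 3);
""")

# Reaching supplier_type takes two joins: fact -> item -> supplier.
units_by_supplier_type = conn.execute("""
    SELECT s.supplier_type, SUM(f.units_sold)
    FROM sales_fact f
    JOIN item i     ON f.item_key = i.item_key
    JOIN supplier s ON i.supplier_key = s.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()
print(units_by_supplier_type)
```

The supplier type is stored once per supplier instead of once per item, at the cost of one extra join in the query.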

Advantages of Snowflake Schema

1.It provides structured data, which reduces data integrity problems.
2.It uses less disk space because the data is highly structured.

Disadvantages of Snowflake Schema

1. The primary disadvantage of the snowflake schema is the additional maintenance effort
required due to the increasing number of lookup tables.

2. Queries are more complex and hence more difficult to understand.

3. More tables mean more joins and therefore more query execution time.

Difference Between Star Schema and Snowflake Schema

Parameters | Star Schema | Snowflake Schema
Definition and Meaning | A star schema contains both dimension tables and fact tables. | A snowflake schema contains all three: dimension tables, fact tables, and sub-dimension tables.
Type of Model | It is a top-down model type. | It is a bottom-up model type.
Space Occupied | It makes use of more allotted space. | It makes use of less allotted space.
Time Taken for Queries | With the Star Schema, the execution of queries takes less time. | With the Snowflake Schema, the execution of queries takes more time.
Use of Normalization | The Star Schema does not make use of normalization. | The Snowflake Schema makes use of both denormalization and normalization.
Complexity of Design | The design of a Star Schema is very simple. | The design of a Snowflake Schema is very complex.
Complexity of Understanding | It is very easy to understand a Star Schema. | It is comparatively more difficult to understand a Snowflake Schema.
Total Number of Foreign Keys | The total number of foreign keys is less in the case of a Star Schema. | The total number of foreign keys is more in the case of a Snowflake Schema.
Data Redundancy | Data redundancy is comparatively higher in Star Schema. | Data redundancy is comparatively lower in Snowflake Schema.

Fact Constellation Schema

A fact constellation has multiple fact tables. It is also known as galaxy schema.
A Fact constellation means two or more fact tables sharing one or more dimensions.

■ Figure – General structure of Fact Constellation

The following diagram shows two fact tables, namely sales and shipping.

The sales fact table is the same as that in the star schema.

The shipping fact table has five dimensions, namely item_key, time_key, shipper_key,
from_location, to_location.
The shipping fact table also contains two measures, namely dollars sold and units sold.
It is also possible to share dimension tables between fact tables. For example, the time, item, and
location dimension tables are shared between the sales and shipping fact tables.
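
The idea of two fact tables sharing a dimension can be sketched with sqlite3. This is a reduced illustration: only the shared item dimension is modelled, the measure column follows the units sold name used in the text, and the table names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimension table.
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT);
-- Two fact tables that both reference the shared dimension.
CREATE TABLE sales_fact (
    item_key   INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER
);
CREATE TABLE shipping_fact (
    item_key    INTEGER REFERENCES item_dim(item_key),
    shipper_key INTEGER,
    units_sold  INTEGER
);
INSERT INTO item_dim VALUES (1, 'Mobile'), (2, 'Modem');
INSERT INTO sales_fact VALUES (1, 5), (1, 3), (2, 4);
INSERT INTO shipping_fact VALUES (1, 7, 6), (2, 7, 4);
""")

# Both fact tables are summarized through the same item dimension.
rows = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(s.units_sold)  FROM sales_fact s     WHERE s.item_key = i.item_key),
           (SELECT SUM(sh.units_sold) FROM shipping_fact sh WHERE sh.item_key = i.item_key)
    FROM item_dim i
    ORDER BY i.item_name
""").fetchall()
print(rows)
```

Because item_dim is defined once and referenced by both fact tables, the sales and shipping figures line up on the same item names.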

Advantage: Provides a flexible schema.

Disadvantage: It is much more complex and hence, hard to implement and maintain.

Difference Between Fact Constellation Schema and Snowflake Schema

OLAP:-
-Online Analytical Processing (OLAP) is based on the multidimensional data model.
OLAP is a category of software technology that enables analysts, managers, and
executives to gain insight into information through fast, consistent, interactive access to a wide
variety of possible views of data that has been transformed from raw information to reflect the
real dimensionality of the enterprise as understood by the clients.

-Online Analytical Processing(OLAP) refers to a set of software tools used for data analysis in
order to make business decisions.
-It provides easy & efficient access to the various views of information to the users.
-The complex queries are also processed by using OLAP.
-It is a powerful technology for data discovery.
-It performs multidimensional analysis of business data.
-It has the ability to achieve fast access to shared multidimensional information.

Difference Between OLAP & OLTP

Types of OLAP:-
OLAP can be divided into following types:
1. MOLAP
2. ROLAP
3. HOLAP

1. MOLAP

-MOLAP stands for Multidimensional Online Analytical Processing, an application based on
multidimensional DBMSs.
-It is the classical form of OLAP & stores the data in optimized multidimensional array
storage.

Advantages

Excellent Performance: A MOLAP cube is built for fast information retrieval, and is optimal for
slicing and dicing operations.

Can perform complex calculations: All evaluations have been pre-generated when the cube is
created. Hence, complex calculations are not only possible, but they return quickly.

It performs fast query operation due to optimized storage, multidimensional indexing & caching.

Disadvantages

Limited in the amount of information it can handle: Because all calculations are performed when
the cube is built, it is not possible to contain a large amount of data in the cube itself.

Requires additional investment: Cube technology is generally proprietary and does not already
exist in the organization. Therefore, to adopt MOLAP technology, chances are other
investments in human and capital resources are needed.

-MOLAP comes with data redundancy.

-Sometimes the processing step can be lengthy, especially on large data volumes.

2. ROLAP

-ROLAP stands for Relational Online Analytical Processing, an application based on relational
DBMSs.
-It works with relational databases.
-ROLAP depends on specialized schema design.
-It has the ability to drill down to the lowest level in the database.

Advantages

Can handle large amounts of information: The data size limitation of ROLAP technology
depends on the data size of the underlying RDBMS. So, ROLAP itself does not restrict the data
amount.

An RDBMS already comes with a lot of features, so ROLAP technologies, which work on top of
the RDBMS, can make use of these functionalities.
Data can be stored efficiently.

Disadvantages

Performance can be slow: Because each ROLAP report is a SQL query (or multiple SQL queries) in
the relational database, the query time can be long if the underlying data size is large.

Limited by SQL functionalities: ROLAP technology relies on generating SQL statements to
query the relational database, and SQL statements do not suit all needs.

3. HOLAP

-HOLAP stands for Hybrid Online Analytical Processing, an application using both relational and
multidimensional techniques.
-It uses relational tables to hold the larger quantities of detailed data.

Advantages of HOLAP
1.HOLAP provides benefits of both MOLAP and ROLAP.
2.It provides fast access at all levels of aggregation.
3.HOLAP balances the disk space requirement, as it only stores the aggregate information on
the OLAP server and the detail record remains in the relational database. So no duplicate copy
of the detail record is maintained.

Disadvantages of HOLAP

HOLAP architecture is very complicated because it supports both MOLAP and ROLAP servers.

Advantages of OLAP

-It enables managers to solve the problems.

-It controls the access to strategic information for more effective decision making.

-It enables the organization to respond more quickly to market demands.

-It enables users to analyze multidimensional data interactively from multiple perspectives.

-It does not require a large data warehouse.

Need of OLAP

-It supports multidimensional data.

-It provides fast, steady and proficient access to the various views of the information.

-Complex queries can be processed.

-It makes it easy to analyze information by processing complex queries on multidimensional
views of data.

OLAP Guidelines

Dr. E.F. Codd, the father of the relational model, created a list of rules for OLAP
systems. Users should prioritize these rules according to their needs to match their business
requirements. These rules are as follows:

1. Multidimensional conceptual view: The OLAP tool should provide an appropriate
multidimensional business model that suits the business problems & requirements.

2. Transparency: The OLAP tool should provide transparency to the input data for the users.

3. Accessibility: The OLAP tool should access only the data required for the analysis
needed.

4. Consistent reporting performance: The size of the database should not affect performance
in any way.

5. Client/server architecture: The OLAP tool should use the client/server architecture to ensure
better performance & flexibility.

6. Generic dimensionality: Every data dimension should be equivalent in its structure &
operational capabilities.

7. Dynamic sparse matrix handling: The OLAP tool should be able to manage sparse matrices
& so maintain the level of performance.

8. Multi-user support: The OLAP tool should allow several users to work concurrently.

9. Unrestricted cross-dimensional operations: The OLAP tool should be able to perform
operations across the dimensions of the cube.

10. Intuitive data manipulation: Manipulations inherent in the consolidation path, such
as reorientation (pivoting), drill-down and roll-up, and other manipulations should be accomplished
naturally and precisely via point-and-click and drag-and-drop actions on the cells of the
analytical model. This avoids the use of a menu or multiple trips to a user interface.

11. Flexible reporting: It is the ability of the tool to present the rows & columns in a manner
suitable for analysis.

12. Unlimited dimensions & aggregation levels: This depends on the kind of business, where
multiple dimensions & hierarchies can be defined.

OLAP Operations:-

1.Roll-up
2.Drill-down
3.Slice
4.Dice
5.Pivot (rotate)

1.Roll-up

-Roll-up performs aggregation on a data cube in any of the following ways −

-By climbing up a concept hierarchy for a dimension

-By dimension reduction

The following diagram illustrates how roll-up works.

-Roll-up is performed by climbing up a concept hierarchy for the dimension location. Initially the
concept hierarchy was "street < city < province < country".

-On rolling up, the data is aggregated by ascending the location hierarchy from the level of city
to the level of country.

-The data is grouped into countries rather than cities.

-When roll-up is performed by dimension reduction, one or more dimensions are removed from
the data cube.
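
Both forms of roll-up can be sketched on a small in-memory cube. The cell values, the city-to-country mapping, and the function names are assumptions made up for this illustration.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {
    "Toronto": "Canada", "Vancouver": "Canada",
    "Chicago": "USA", "New York": "USA",
}

# Cells of a small cube keyed by (city, quarter, item); the values are made up.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Vancouver", "Q1", "Mobile"): 20,
    ("Chicago", "Q1", "Mobile"): 30,
    ("New York", "Q2", "Modem"): 40,
}


def roll_up_location(cells):
    """Climb the hierarchy: aggregate city-level cells to country level."""
    out = defaultdict(int)
    for (city, quarter, item), value in cells.items():
        out[(CITY_TO_COUNTRY[city], quarter, item)] += value
    return dict(out)


def roll_up_drop_item(cells):
    """Dimension reduction: remove the item dimension by summing over it."""
    out = defaultdict(int)
    for (location, quarter, _item), value in cells.items():
        out[(location, quarter)] += value
    return dict(out)


by_country = roll_up_location(cube)
print(by_country)
print(roll_up_drop_item(by_country))
```

The first function keeps all three dimensions but at a coarser level of location; the second returns a cube with one dimension fewer.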

2.Drill-down

-Drill-down is the reverse operation of roll-up. It is performed in either of the following
ways −

-By stepping down a concept hierarchy for a dimension

-By introducing a new dimension.

The following diagram illustrates how drill-down works −

-Drill-down is performed by stepping down a concept hierarchy for the dimension time.

-Initially the concept hierarchy was "day < month < quarter < year."

-On drilling down, the time dimension is descended from the level of quarter to the level of
month.

-When drill-down is performed by introducing a new dimension, one or more dimensions are added
to the data cube.

-It navigates the data from less detailed data to highly detailed data.
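
Drill-down along the time hierarchy can be sketched as follows. The sketch assumes the detailed (month-level) data is still available, since a summary alone cannot be drilled into; the values, the month-to-quarter mapping, and the function name view_at are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical hierarchy for the time dimension: month -> quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}

# Base data kept at month level; the values are made up.
base_cells = {
    ("Jan", "Mobile"): 5,
    ("Feb", "Mobile"): 7,
    ("Mar", "Mobile"): 8,
    ("Apr", "Mobile"): 4,
}


def view_at(level):
    """Aggregate the base cells at the requested level of the time hierarchy."""
    out = defaultdict(int)
    for (month, item), value in base_cells.items():
        time_value = month if level == "month" else MONTH_TO_QUARTER[month]
        out[(time_value, item)] += value
    return dict(out)


quarter_view = view_at("quarter")  # the summarized starting point
month_view = view_at("month")      # drill-down: quarter -> month
print(quarter_view)
print(month_view)
```

Moving from quarter_view to month_view replaces each quarter cell with the more detailed month cells that make it up.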

3.Slice

The slice operation performs a selection on one particular dimension of a given cube and
provides a new sub-cube. Consider the following diagram that shows how slice works.

-Here slice is performed for the dimension "time" using the criterion time = "Q1".
-It forms a new sub-cube by fixing a single value on one dimension.
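
A slice on time = "Q1" can be sketched like this. The cube cells and the function name slice_time are assumptions for illustration.

```python
# Hypothetical cube cells keyed by (location, time, item); the values are made up.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Toronto", "Q2", "Mobile"): 12,
    ("Vancouver", "Q1", "Modem"): 20,
    ("Vancouver", "Q3", "Modem"): 25,
}


def slice_time(cells, quarter):
    """Fix time to one value and return the remaining (location, item) sub-cube."""
    return {
        (location, item): value
        for (location, time, item), value in cells.items()
        if time == quarter
    }


q1_slice = slice_time(cube, "Q1")
print(q1_slice)
```

Because time is fixed to a single value, it no longer appears in the keys of the result.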

4.Dice

-Dice performs a selection on two or more dimensions of a given cube and provides a new
sub-cube. Consider the following diagram that shows the dice operation.

-The dice operation on the cube based on the following selection criteria involves three
dimensions.

(location = "Toronto" or "Vancouver")

(time = "Q1" or "Q2")
(item = "Mobile" or "Modem")
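
The three selection criteria above can be sketched as a filter over cube cells. The cell values and the function name dice are assumptions for illustration.

```python
# Hypothetical cube cells keyed by (location, time, item); the values are made up.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Toronto", "Q3", "Mobile"): 12,
    ("Vancouver", "Q2", "Modem"): 20,
    ("Chicago", "Q1", "Modem"): 30,
    ("Vancouver", "Q1", "Phone"): 5,
}


def dice(cells, locations, times, items):
    """Keep only the cells that satisfy the criteria on all three dimensions."""
    return {
        key: value
        for key, value in cells.items()
        if key[0] in locations and key[1] in times and key[2] in items
    }


sub_cube = dice(
    cube,
    locations={"Toronto", "Vancouver"},
    times={"Q1", "Q2"},
    items={"Mobile", "Modem"},
)
print(sub_cube)
```

Unlike slice, the result keeps all three dimensions; each is just restricted to a subset of its values.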

5.Pivot

The pivot operation is also known as rotation. It rotates the data axes in view in order to provide
an alternative presentation of data. Consider the following diagram that shows the pivot
operation.
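
For a 2-D view, pivoting amounts to swapping the row and column axes. The sketch below assumes a small made-up table with items as rows and locations as columns; the function name pivot is illustrative.

```python
# 2-D view: rows = item, columns = location; the values are made up.
items = ["Mobile", "Modem"]
locations = ["Toronto", "Vancouver", "Chicago"]
view = [
    [10, 20, 30],  # Mobile
    [40, 50, 60],  # Modem
]


def pivot(row_labels, col_labels, table):
    """Swap the row and column axes of a 2-D view (a transpose)."""
    rotated = [list(column) for column in zip(*table)]
    return col_labels, row_labels, rotated


new_rows, new_cols, rotated = pivot(items, locations, view)
print(new_rows)
print(new_cols)
print(rotated)
```

The data values are unchanged; only the presentation differs, with locations now as rows and items as columns.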
