Data warehousing with MySQL
[Diagram labels: MySQL, MS-SQL, Oracle, DB2, Flat Files]
Free and Open Source Software
MySQL is licensed under the GPL.
The GPL is a Free and Open Source
Software (FOSS) license that grants
licensees many rights to the software
under the condition that, if they
choose to share the software, or
software built with GPL-licensed
software, they share it under the
same liberal terms.
Advantages of Open Source
MySQL has an active installation base
of more than 5 million.
New releases are downloaded
immediately by users, who provide early
feedback on bugs and features.
Access to the source code.
Write your own features or a proprietary
storage engine.
Freedom!
Data Warehousing application
A data warehouse is a relational
database.
It is designed for query and analysis
rather than for transaction
processing.
It enables an organization to
consolidate data from several
sources.
Extraction, Transformation and Loading
[Diagram: ETL pipeline from data sources through staging tables (extract), MERGE and BULK INSERT into MERGE tables (load, transform, storage), to indexes, memory (HEAP) tables, views and summary tables (performance) for OLTP/BI users. Other labels: SWH, AWH.]
Staging database
LOAD DATA INFILE command
Merging of SQLs
Segregating information
View enhancements
Index enhancements
Memory Manipulation
Staging Area and its benefits
Relational table structures are
flattened in the staging area to support
extract processes.
Data is first loaded into a temporary
table and then into the main DB tables.
Reduces the space required during ETL.
Data can be distributed to any number
of data marts.
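The staging flow described above can be sketched in SQL. This is only a minimal sketch under stated assumptions: the table names stg_sales and sales, their columns, and the file path /tmp/sales.csv are hypothetical and do not come from the presentation.

```sql
-- Hypothetical main table that reports are run against.
CREATE TABLE sales (
    sale_date DATE NOT NULL,
    store_id  INT NOT NULL,
    amount    DECIMAL(10,2) NOT NULL,
    INDEX (sale_date)
) ENGINE=MyISAM;

-- Hypothetical flat staging table: no indexes, so the bulk load stays cheap.
CREATE TABLE stg_sales (
    sale_date DATE,
    store_id  INT,
    amount    DECIMAL(10,2)
) ENGINE=MyISAM;

-- Bulk-load a comma-separated extract (path and file format are assumptions).
LOAD DATA INFILE '/tmp/sales.csv'
INTO TABLE stg_sales
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(sale_date, store_id, amount);

-- Move the staged rows into the main table, then empty the staging table.
INSERT INTO sales (sale_date, store_id, amount)
SELECT sale_date, store_id, amount
FROM stg_sales;

TRUNCATE TABLE stg_sales;
```

Keeping the staging table unindexed and emptying it after each run is one way to keep the space used during ETL small; other layouts are equally possible.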
Partitioning and Storage Engine
The MERGE Table
A collection of identical
MyISAM tables used as one.
You can use SELECT,
DELETE, UPDATE, and
INSERT on the collection of
tables.
Use it when you have large
tables.
If you DROP the MERGE table, you
drop only the MERGE specification;
the underlying tables remain.
Advantages: manageability
and performance.
[Diagram: MERGE SALES table "Sales for Yr04" over the monthly tables Aug04, Sep04 and Oct04]
MERGING based on month as Range
[Diagram: monthly tables JUN2004, JUL2004, AUG2004, SEP2004 and OCT2004 combined into the range JUN2004 to OCT2004]
MERGE Table Example
mysql> CREATE TABLE jan04 (
    ->   a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->   message CHAR(20)) ENGINE=MyISAM;
mysql> CREATE TABLE feb04 (
    ->   a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->   message CHAR(20)) ENGINE=MyISAM;
mysql> CREATE TABLE year04 (
    ->   a INT NOT NULL AUTO_INCREMENT,
    ->   message CHAR(20), INDEX(a))
    ->   ENGINE=MERGE UNION=(jan04,feb04) INSERT_METHOD=LAST;
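To show how the MERGE table behaves once it exists, here is a short sketch that reuses the jan04, feb04 and year04 tables from the example above. The inserted messages are made up for illustration.

```sql
-- Rows are stored in the underlying monthly tables.
INSERT INTO jan04 (message) VALUES ('Testing'), ('table'), ('jan04');
INSERT INTO feb04 (message) VALUES ('Testing'), ('table'), ('feb04');

-- The MERGE table presents both months as a single table.
SELECT * FROM year04;

-- INSERT_METHOD=LAST sends inserts on the MERGE table to the last
-- table in the UNION list, here feb04.
INSERT INTO year04 (message) VALUES ('via year04');
SELECT * FROM feb04;

-- Dropping the MERGE table removes only its definition;
-- jan04 and feb04 keep their rows.
DROP TABLE year04;
SELECT COUNT(*) FROM jan04;
```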
MyISAM Storage Engine
Supports MERGE tables.
Supports full-text indexing.
The INSERT DELAYED option is very useful
when clients cannot wait for the INSERT
to complete: rows from many clients are
bundled together and written in one block.
Compress MyISAM tables with
myisampack to take up much less
space.
Benefit from higher performance on
SELECT statements.
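Two of the MyISAM features listed above, full-text indexing and INSERT DELAYED, can be illustrated with a small sketch. The table notes and its contents are hypothetical. The sketch also assumes a server version that still supports DELAYED; recent MySQL releases accept the keyword but ignore it and perform a normal INSERT.

```sql
-- Hypothetical MyISAM table with a full-text index on the message column.
CREATE TABLE notes (
    id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    message TEXT,
    FULLTEXT (message)
) ENGINE=MyISAM;

-- The client gets control back at once; the server queues the row
-- and writes queued rows from many clients together.
INSERT DELAYED INTO notes (message) VALUES ('nightly load finished');

-- Full-text search over the indexed column.
SELECT id, message
FROM notes
WHERE MATCH (message) AGAINST ('nightly');
```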
Restrictions on MERGE tables
You can use only identical MyISAM tables
for a MERGE table.
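MERGE tables use more file descriptors. If
10 clients are using a MERGE table that
maps to 10 tables, the server uses (10*10)
+ 10 file descriptors: 10 data file
descriptors for each of the 10 clients, plus
10 index file descriptors shared among them.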
Key reads are slower. When you read a key,
the MERGE storage engine needs to issue a
read on all underlying tables to check
which one most closely matches the given
key.
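The file descriptor arithmetic above can be worked through for other sizes. The sketch below assumes the usual breakdown of that figure: one data file descriptor per client per underlying table, plus one index file descriptor per underlying table shared by all clients. The user variables @clients and @merged_tables are illustrative names only.

```sql
-- Illustrative inputs: number of clients and number of underlying tables.
SET @clients = 10, @merged_tables = 10;

-- One data file descriptor per client per table,
-- plus one shared index file descriptor per table.
SELECT (@clients * @merged_tables) + @merged_tables AS file_descriptors;

-- Compare the result against the server's limit on open files.
SHOW VARIABLES LIKE 'open_files_limit';
```

For 10 clients and 10 tables the SELECT returns 110, matching the (10*10) + 10 figure in the text.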
my.cnf parameters for DWH (example)
key_buffer = 1G
myisam_sort_buffer_size = 256M
sort_buffer = 5M
query_cache_type
query_cache_size = 100M
key_buffer is the most important one: it sets how much
memory MySQL uses to cache MyISAM index blocks.
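As a rough way to check whether the key buffer is sized sensibly, the server can be queried at runtime. This is a sketch, not a tuning recipe: it assumes an account with the privilege to set global variables, and it uses the variable name key_buffer_size, the full name of the key_buffer option.

```sql
-- Current key cache size in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Resize the key cache at runtime to 1 GB, matching the example value above.
SET GLOBAL key_buffer_size = 1024 * 1024 * 1024;

-- Key_reads counts index blocks read from disk; Key_read_requests counts
-- lookups in the key cache. A low Key_reads / Key_read_requests ratio
-- suggests the key cache is large enough.
SHOW GLOBAL STATUS LIKE 'Key_read%';
```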
Business Intelligence
Using MySQL database server
Drastically reduce information retrieval time by
distributing data into replicated clusters.
This enables parallel processing.
Tighter storage format (3 TB squeezed to
1 TB)
Aggregate huge amounts of data and deliver
reports for OLAP
Relieve overloaded OLTP databases
Availability, scalability and throughput for
the most demanding applications, and of
course affordability
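One common way to aggregate data for reporting and take load off the OLTP tables is a precomputed summary table. The sketch below is an assumption about how that might look, not something taken from the presentation: the tables sales_detail and sales_monthly_summary and all their columns are hypothetical.

```sql
-- Hypothetical detail table holding one row per sale.
CREATE TABLE sales_detail (
    sale_date DATE NOT NULL,
    store_id  INT NOT NULL,
    amount    DECIMAL(10,2) NOT NULL
) ENGINE=MyISAM;

-- Hypothetical summary table: one row per month and store.
CREATE TABLE sales_monthly_summary (
    sale_month   CHAR(7) NOT NULL,        -- for example '2004-08'
    store_id     INT NOT NULL,
    total_amount DECIMAL(14,2) NOT NULL,
    sale_count   INT NOT NULL,
    PRIMARY KEY (sale_month, store_id)
) ENGINE=MyISAM;

-- Aggregate once during the load instead of on every report.
INSERT INTO sales_monthly_summary (sale_month, store_id, total_amount, sale_count)
SELECT DATE_FORMAT(sale_date, '%Y-%m'), store_id, SUM(amount), COUNT(*)
FROM sales_detail
GROUP BY DATE_FORMAT(sale_date, '%Y-%m'), store_id;

-- Reports then read the much smaller summary table.
SELECT sale_month, SUM(total_amount) AS month_total
FROM sales_monthly_summary
GROUP BY sale_month;
```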
Summary
Free and Open Source under GPL
MyISAM Storage Engine
No Transactional Overhead
MERGE Table
Tighter storage format
Highly efficient