MonetDB/X100: Fast Column-Store Overview

This document provides an overview of MonetDB/X100, a fast column-store database. It summarizes improvements made in MonetDB and introduces MonetDB/X100. Key aspects of MonetDB/X100 include processing data in vectors of tuples rather than individually, optimized vectorized primitives, and pipelined query evaluation to improve scalability over MonetDB. Lightweight compression and maximizing disk scan sharing enable feeding the database's high throughput.

MonetDB/X100: a (very) fast column-store

Marcin Zukowski, Peter Boncz

Sándor Héman, Niels Nes

CWI, Amsterdam, The Netherlands

Disclaimer (a different one)

This talk is about data-intensive applications
- Data warehousing, analytical processing
- Scientific data, information retrieval

Transaction processing is a different story

A lot of content: wake up!

MonetDB/X100 overview 2

Outline
- Traditional database performance
- Improvements in MonetDB
- MonetDB/X100
  - Query execution
  - Storage

Database performance
TPC-H 1GB, Query 1
Selects 98% of the fact table (6M rows), computes net prices and aggregates all
Performance:
  C program:  0.2s
  MySQL:     26.2s
  DBMS X:    28.1s

Database performance analyzed

Why so slow?
- Inefficient data storage format
- Inefficient query processing model

N-ary storage model (NSM)

Fixed-width attributes in a record
  101  Joe     27  Black
  103  Edward  21  Scissorhand

Real-life NSM implementation

Slotted pages, example:
  101  27  Joe  Black
  103  Edward  Scissorhand

NSM problems
Poor bandwidth use
- Always read all the attributes
- Terrible on disk
- Bad in memory

Complex attribute access
- Variable-length fields
- Null fields

Column stores to the rescue!

Store attributes separately

Read only attributes used by a query
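The bandwidth argument can be sketched in C. The record layout, type names and function names below are illustrative assumptions, not MonetDB code. Summing one attribute over row-wise (NSM) records pulls every field of every record through memory, while the column layout touches only the one array the query needs.

```c
#include <stddef.h>

/* Hypothetical NSM record: all attributes of a tuple stored together. */
typedef struct {
    int  id;
    int  age;
    char first_name[16];
    char last_name[16];
} person_nsm;

/* Hypothetical column layout: one array per attribute. */
typedef struct {
    int   *id;
    int   *age;
    size_t n;
} people_columns;

/* NSM scan: each iteration strides over a whole record
 * to use a single int of it. */
long sum_age_nsm(const person_nsm *rows, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += rows[i].age;
    return sum;
}

/* Column scan: reads only the contiguous age array. */
long sum_age_columns(const people_columns *t)
{
    long sum = 0;
    for (size_t i = 0; i < t->n; i++)
        sum += t->age[i];
    return sum;
}
```

Both functions return the same sum; the difference is how many bytes have to be fetched from disk or memory to compute it.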

Traditional column stores

Data path
- Read columns from disk
- Convert into NSM
- Use NSM-based processing

Examples: Sybase IQ, Vertica

Not enough!
- Only I/O problem addressed

How databases run a query

Query
SELECT name, salary*.19 AS tax
FROM employee
WHERE age > 25

Database operators
Tuple-at-a-time iterator interface:
- open()
- next(): tuple
- close()
next() is called:
- for each operator
- for each tuple
Complex code repeated over and over
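A minimal sketch of such an iterator interface in C, assuming a hypothetical `tuple` layout and a single `op` struct shared by scan and select (all names here are made up for illustration). It shows where the per-tuple cost comes from: every tuple that flows through the plan costs at least one indirect `next()` call per operator.

```c
#include <stddef.h>

/* Hypothetical tuple for the employee example query. */
typedef struct { const char *name; double salary; int age; } tuple;

/* Tuple-at-a-time operator: next() returns one tuple or NULL at end. */
typedef struct op {
    void         (*open)(struct op *self);
    const tuple *(*next)(struct op *self);
    void         (*close)(struct op *self);
    struct op    *child;   /* input operator, NULL for a scan */
    const tuple  *rows;    /* scan state: source table */
    size_t        n, pos;
    int           min_age; /* select state: predicate constant */
} op;

static void scan_open(op *s)  { s->pos = 0; }
static void scan_close(op *s) { (void)s; }
static const tuple *scan_next(op *s)
{
    return s->pos < s->n ? &s->rows[s->pos++] : NULL;
}

static void select_open(op *s)  { s->child->open(s->child); }
static void select_close(op *s) { s->child->close(s->child); }
/* One indirect call into the child per input tuple. */
static const tuple *select_next(op *s)
{
    const tuple *t;
    while ((t = s->child->next(s->child)) != NULL)
        if (t->age > s->min_age)
            return t;
    return NULL;
}

op make_scan(const tuple *rows, size_t n)
{
    op s = { scan_open, scan_next, scan_close, NULL, rows, n, 0, 0 };
    return s;
}

op make_select(op *child, int min_age)
{
    op s = { select_open, select_next, select_close, child, NULL, 0, 0, min_age };
    return s;
}
```

A caller builds the plan with `make_scan` and `make_select`, calls `open()` on the root, and then pulls tuples with `next()` until it returns NULL.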

Primitive functions
- Provide data-specific computational functionality
- Called once for every operation on every tuple
- Even worse with complex tuple representation
- Perform one operation (e.g. addition) in one call

DBMS performance - IPT

Lots of repeated, unnecessary code
- Operator logic
- Function calls
- Attribute access
- Most instructions NOT processing any actual data!

High instructions-per-tuple (IPT) factor

Modern CPUs
New CPU features over last 20 years
- Instruction and data cache
- Deep pipeline with out-of-order execution
- Superscalar features: multiple instructions at once
- SIMD instructions (SSE)

Great for e.g. multimedia processing, but bad for database code!

DBMS performance - CPI

CPU-unfriendly code
- Complicated code
- Poor use of CPU cache (both data and instructions)
- Processing one value at a time
- Compilers can't help much

High cycles-per-instruction (CPI) factor

DBMS performance
Performance factors:
- High instructions-per-tuple
- High cycles-per-instruction
- Very high cycles-per-tuple (CPT = IPT x CPI)

Others can do better
- Scientific computing, ...

How can we?

MonetDB
MonetDB: 1993-now, developed at CWI
- Peter's PhD
- Column store
- Improves computational efficiency

Predecessor of MonetDB/X100

MonetDB: a column store

- save disk I/O when scan-intensive queries need a few columns
- avoid an expression interpreter to improve computational efficiency

MonetDB in action
SELECT id, name, (age-30)*50 AS bonus
FROM people
WHERE age > 30

CPU efficiency depends on "nice" code
- out-of-order execution
- few dependencies (control, data)
- compiler support

Compilers love simple loops over arrays
- loop-pipelining
- automatic SIMD

    int select_gt_float(oid* res, float* column, float val, int n)
    {
        int j = 0;                      /* number of qualifying positions */
        for (int i = 0; i < n; i++)
            if (column[i] > val)
                res[j++] = i;           /* store the matching position */
        return j;
    }

Simple, hard-coded semantics in operators

MonetDB: a column store

- save disk I/O when scan-intensive queries need a few columns
- avoid an expression interpreter to improve computational efficiency
  - Simple algebra: Monet Interpreter Language (MIL)
  - Hard-coded operator semantics: no function calls
  - Array-like processing

MonetDB problem
SELECT id, name, (age-30)*50 AS bonus
FROM people
WHERE age > 30

MATERIALIZED intermediate results
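What materialization means for this query can be sketched as follows. This is a hypothetical C helper, not actual MIL operators: each step runs over the whole column and writes a full-length intermediate array before the next step starts.

```c
#include <stdlib.h>

/* Column-at-a-time evaluation of
 *   SELECT (age-30)*50 AS bonus FROM people WHERE age > 30
 * Returns the number of qualifying rows and stores a malloc'ed result
 * array in *bonus_out (caller frees); returns -1 on allocation failure. */
int bonus_column_at_a_time(const int *age, int n, int **bonus_out)
{
    size_t bytes = sizeof(int) * (size_t)(n > 0 ? n : 1);
    int *sel   = malloc(bytes);  /* intermediate 1: qualifying row ids */
    int *diff  = malloc(bytes);  /* intermediate 2: age - 30 */
    int *bonus = malloc(bytes);  /* result: (age - 30) * 50 */
    int m = 0;

    if (!sel || !diff || !bonus) {
        free(sel);
        free(diff);
        free(bonus);
        return -1;
    }
    for (int i = 0; i < n; i++)          /* select: age > 30 */
        if (age[i] > 30)
            sel[m++] = i;
    for (int k = 0; k < m; k++)          /* map: subtract */
        diff[k] = age[sel[k]] - 30;
    for (int k = 0; k < m; k++)          /* map: multiply */
        bonus[k] = diff[k] * 50;

    free(sel);
    free(diff);
    *bonus_out = bonus;
    return m;
}
```

Each loop is simple and compiler-friendly, but `sel` and `diff` grow with the table, so for large inputs they are written to and read back from main memory.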

Materialization problem
Extra main-memory bandwidth
- Performance is sub-optimal
- ... but still faster than anything else (5 years ago)

Reduces scalability
- Can't afford writing to disk
- Only effective for limited data sizes and not all query types

MonetDB: a Faustian Pact

You want efficiency
- Simple hard-coded operators

I take scalability
- Result materialization

Supports SQL and XQuery
Open-source download: monetdb.cwi.nl

  C program:  0.2s
  MonetDB:    3.7s
  MySQL:     26.2s
  DBMS X:    28.1s

MonetDB/X100
My PhD thesis
Motivation:
- let's fix MonetDB scalability
- and improve the performance on the way

Core ideas:
- New execution model
- High performance column storage

Typical Relational DBMS Engine

Query
SELECT name, salary*.19 AS tax
FROM employee
WHERE age > 25


MonetDB/X100: Vectors

Vector contains data of multiple tuples (~100-1000)
All operations work on entire vectors
Effect: far fewer operator.next() and primitive calls.

Vectors
Column slices as unary arrays

NOT: "Vertical is a better table storage layout than horizontal" (though we still think it often is)

RATIONALE:
- Simple array operations are well-supported by compilers
- SIMD friendly layout
- Assumed cache-resident

Vectorized Primitives

    int select_lt_int_col_int_val(int *res, int *col, int val, int n)
    {
        int i, j;
        for (j = i = 0; i < n; i++)
            if (col[i] < val)
                res[j++] = i;           /* store the matching position */
        return j;                       /* number of selected positions */
    }

Most primitives take just 0.5 (!) to 10 cycles per tuple
10-100+ times faster than tuple-at-a-time
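How such primitives might be chained over vectors is sketched below for the employee query. The primitive names, the `VECTOR_SIZE` constant and the driver function are assumptions made for illustration, not X100 source code. The point is that the only intermediate (`sel`) has vector size, regardless of table size.

```c
enum { VECTOR_SIZE = 1024 };  /* assumed; the slides suggest ~100-1000 */

/* Selection primitive in the style of the slide: positions with col[i] > val. */
static int select_gt_int_col_int_val(int *res, const int *col, int val, int n)
{
    int j = 0;
    for (int i = 0; i < n; i++)
        if (col[i] > val)
            res[j++] = i;
    return j;
}

/* Map primitive: res[k] = col[sel[k]] * val for the selected positions. */
static void map_mul_dbl_col_dbl_val(double *res, const double *col,
                                    double val, const int *sel, int n)
{
    for (int k = 0; k < n; k++)
        res[k] = col[sel[k]] * val;
}

/* Evaluates SELECT salary*.19 AS tax FROM employee WHERE age > 25
 * one vector at a time. Writes results to tax_out (capacity n)
 * and returns how many were produced. */
int tax_vectorized(const int *age, const double *salary, int n, double *tax_out)
{
    int sel[VECTOR_SIZE];
    int out = 0;
    for (int start = 0; start < n; start += VECTOR_SIZE) {
        int len = n - start < VECTOR_SIZE ? n - start : VECTOR_SIZE;
        int m = select_gt_int_col_int_val(sel, age + start, 25, len);
        map_mul_dbl_col_dbl_val(tax_out + out, salary + start, 0.19, sel, m);
        out += m;
    }
    return out;
}
```

Per vector there are two primitive calls instead of two calls per tuple, and the inner loops stay simple enough for the compiler to pipeline.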

MonetDB/X100
Both efficiency
- Vectorized primitives

and scalability
- Pipelined query evaluation

  C program:     0.2s
  MonetDB/X100:  0.6s
  MonetDB:       3.7s
  MySQL:        26.2s
  DBMS X:       28.1s

Memory Hierarchy
[Figure: X100 query engine, CPU cache, RAM, ColumnBM (buffer manager), (RAID) disk(s)]

Optimal Vector size?

All vectors together should fit the CPU cache
Depends on the query

[Figure: X100 query engine, CPU cache, RAM, ColumnBM (buffer manager), (RAID) disk(s)]
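The "all vectors together should fit" rule can be turned into a rough upper bound. The helper below is a back-of-the-envelope sketch; its name and parameters are assumptions and it ignores everything else that competes for cache space.

```c
#include <stddef.h>

/* Largest vector length (in values) such that all vectors a query plan
 * keeps live fit in the cache together. All parameters are supplied by
 * the caller; returns 0 if the inputs make no sense. */
size_t max_vector_length(size_t cache_bytes, size_t live_vectors,
                         size_t bytes_per_value)
{
    if (live_vectors == 0 || bytes_per_value == 0)
        return 0;
    return cache_bytes / (live_vectors * bytes_per_value);
}
```

With an assumed 512KB cache and 8 live vectors of 8-byte values this gives 8192 values; a plan with more or wider vectors gets a smaller bound, which is why the answer depends on the query.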

Varying the Vector size

- Less and less operator.next() and primitive function calls ("interpretation overhead")
- Vectors start to exceed the CPU cache, causing additional memory traffic

MonetDB/MIL materializes columns

[Figure: X100 query engine and MonetDB/MIL; CPU cache, RAM, ColumnBM (buffer manager), (RAID) disk(s)]

Why is X100 so fast?

Reduced interpretation overhead
- 100x fewer function calls

Good CPU cache use
- High locality in the primitives
- Cache-conscious data placement

No tuple navigation
- Primitives only see arrays

Vectorization allows algorithmic optimization

CPU and compiler-friendly function bodies
- Multiple work units, loop-pipelining, SIMD

Feeding the Beast

X100 uses < 100 cycles per tuple for TPC-H Q1
- Q1 has ~30 bytes of used columns per tuple
- 3GHz CPU core eats 900MB/s
- No problem for RAM
- But disk-based data?
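The 900MB/s figure follows directly from the three numbers on the slide. A small helper (hypothetical name) spells the arithmetic out: 3GHz / 100 cycles per tuple = 30M tuples/s, times 30 bytes per tuple.

```c
/* Bytes per second one core consumes when it spends cycles_per_tuple
 * on each tuple and each tuple needs bytes_per_tuple of column data. */
double scan_bandwidth_bytes_per_sec(double clock_hz, double cycles_per_tuple,
                                    double bytes_per_tuple)
{
    double tuples_per_sec = clock_hz / cycles_per_tuple;
    return tuples_per_sec * bytes_per_tuple;
}
```

Since the slide says "< 100 cycles", 900MB/s is a lower bound on what the core can consume.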

Using Disk in the 21st century

Poor random disk access needs to be compensated with more and more disk heads (tens, hundreds, thousands!)

You're better off with scanning!

Using Disk in Data Warehousing

Goals:
1. Scan-based disk access *only* (full or partial scans)
2. Minimize bandwidth
3. Benefit, not suffer from concurrency

Database strategies:
- replicate tables in multiple orders (goal 1)
- clustering join-tables in foreign-key order (goal 1)
- keep dimension tables in RAM (goals 1&2)
- scan-optimized indices (goals 1&2)
- use a column-store (goal 2)
- increase disk bandwidth with lightweight compression (goal 2)
- coordinate concurrent disk access (goals 2&3)

Feeding the Beast (1)

Two ideas pursued:
- Lightweight compression to enhance disk bandwidth
- Maximizing disk scan sharing in concurrent queries.

Compression to improve I/O bandwidth

- 0.9GB/s query consumption
- 1/3 CPU for decompression
- 1.8GB/s needed

new lightweight compression schemes

Key Ingredients
Compress relations on a per-column basis
- Easy to exploit redundancy

Keep data compressed in main-memory
- More data can be buffered

Decompress vector at a time
- Minimize main-memory overhead

Use light-weight, CPU-efficient algorithms
- Exploit processing power of modern CPUs
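To make "decompress vector at a time" concrete, here is a sketch using plain frame-of-reference coding, where each value is stored as a small offset from a per-block base. This scheme, the `for_block` layout and the function name are stand-ins chosen for illustration; they are not the compression schemes presented in the talk.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical frame-of-reference block: each value is stored as a
 * one-byte offset from a per-block base. */
typedef struct {
    int32_t        base;
    const uint8_t *deltas;
    size_t         n;
} for_block;

/* Decompresses up to vec_len values starting at *pos into out and
 * advances *pos. Returns the number of values produced (0 at end). */
size_t for_decode_vector(const for_block *b, size_t *pos,
                         int32_t *out, size_t vec_len)
{
    size_t left = b->n - *pos;
    size_t len = left < vec_len ? left : vec_len;
    const uint8_t *src = b->deltas + *pos;
    for (size_t i = 0; i < len; i++)   /* branch-free loop over arrays */
        out[i] = b->base + (int32_t)src[i];
    *pos += len;
    return len;
}
```

The compressed block stays in the buffer; only one vector of decompressed values exists at a time, small enough to stay in the CPU cache while the query operators consume it.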

Results


TPC-H 100 GB

                      MonetDB/X100 on 1 CPU                   DB2, 8 CPUs
TPC-H   Compression   4 disks             12 disks            142 disks
query   ratio         Speedup  Time (s)   Speedup  Time (s)   Time (s)
01      4.33          4.41     69.6       1.29     50.9       111.9
03      3.04          3.10     11.3       1.48      6.0        15.1
04      8.15          7.58      2.4       2.67      1.8        12.5
05      3.81          3.55     15.3       1.06     16.2        84.0
06      4.39          4.50     10.7       2.35      4.6        17.1
07      1.71          1.66     72.0       0.84     40.8        86.5

- Linear speedup with slow disks
- Decent improvement with fast disks
- Competes with DB2 using ~10x fewer resources

Feeding the Beast (2)

Two ideas pursued:
- Lightweight compression to enhance disk bandwidth
- Maximizing disk scan sharing in concurrent queries.

Concurrent scans
Multiple queries scanning the same table
- Different start times
- Different scan ranges

Compete for disk access and buffer space
FCFS request scheduling: poor latency

Normal scans in real life


Shared scans
Observation: queries often do not need data in a sequential order
Idea: make queries "share" the scanning process
Two existing types:
- Attach
- Elevator
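The slides do not spell out the attach policy; as it is commonly described, a new query joins a scan that is already running at its current position, reads to the end of the table, and then wraps around for the part it missed. Under that assumption, the chunk order a late-starting query sees can be sketched as follows (function name and chunk numbering are illustrative):

```c
/* Fills order[0..n_chunks-1] with the chunk sequence seen by a query
 * that attaches to a running scan currently at chunk `current`:
 * it reads to the end of the table, then wraps around for the part
 * it missed. Returns n_chunks, or 0 on invalid input. */
int attach_scan_order(int n_chunks, int current, int *order)
{
    if (n_chunks <= 0 || current < 0 || current >= n_chunks)
        return 0;
    for (int k = 0; k < n_chunks; k++)
        order[k] = (current + k) % n_chunks;
    return n_chunks;
}
```

This only works for queries that, as the observation above says, do not need their data in sequential order.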

Attach in real life


Elevator in real life


Existing shared scans

Benefits
- Fewer I/O operations
- Better data reuse

Problems
- Sharing decisions static (when a query starts)
- Misses opportunities in a dynamic environment
- Not sensitive to different query types

Relevance scans
Core ideas
- Dynamically adapt to the current situation
- Allow fully arbitrary data order

Goals:
- Maximize data sharing
- Optimize latency and throughput
- Work for different types of queries
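One way to picture a dynamic, order-free policy is a greedy rule that always loads the chunk wanted by the most active queries. This is a deliberately simplified assumption for illustration, not the relevance functions used in the actual system; the `needs` matrix and function name are made up.

```c
/* needs[q * n_chunks + c] is nonzero if active query q still needs
 * chunk c. Returns the chunk wanted by the most queries (lowest index
 * on ties), or -1 if no query needs any chunk. */
int pick_next_chunk(const unsigned char *needs, int n_queries, int n_chunks)
{
    int best = -1, best_count = 0;
    for (int c = 0; c < n_chunks; c++) {
        int count = 0;
        for (int q = 0; q < n_queries; q++)
            if (needs[q * n_chunks + c])
                count++;
        if (count > best_count) {
            best_count = count;
            best = c;
        }
    }
    return best;
}
```

Because the decision is re-evaluated for every load, queries that arrive or finish change the outcome immediately, unlike a sharing decision fixed when a query starts. A real policy would also have to weigh latency, for example so that a query with few remaining chunks is not starved.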

Relevance in real life


Results


Conclusion
Presented MonetDB/X100
- A new database kernel developed at CWI
- Uses block-oriented iterator model (vectorization)
- works amazingly well

So fast: must reduce hunger for hard disks
- Column storage specialized in sequential access
- + Lightweight compression schemes (give ~ factor 3)
- + Cooperative Bandwidth Sharing (gives ~ factor 2)

Good performance results
- Fastest raw 100GB TPC-H performance around (** not fair)
- Beats IR systems on Terabyte TREC

Literature
MonetDB
- P.A. Boncz. MIL Primitives for Querying a Fragmented World. VLDB Journal, 1999.
- P.A. Boncz. Monet: A Next-Generation DBMS Kernel For Query-Intensive Applications. Ph.D. thesis, 2002.

MonetDB/X100
- P.A. Boncz, M. Zukowski, N. Nes. MonetDB/X100: Hyper-pipelining Query Execution. CIDR 2005.
- M. Zukowski, S. Heman, N. Nes, P.A. Boncz. Super-Scalar RAM-CPU Cache Compression. ICDE 2006.
- M. Zukowski, S. Heman, N. Nes, P.A. Boncz. Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS. VLDB 2007.

Other
- S. Padmanabhan, T. Malkemus, R. Agarwal, A. Jhingran. Block oriented processing of relational database operations in modern computer architectures. ICDE 2001.
- J. Goldstein, R. Ramakrishnan, U. Shaft. Compressing relations and indexes. ICDE 1998.
- M. Stonebraker, et al. C-Store: A Column Oriented DBMS. VLDB 2005.
- D.J. Abadi, S.R. Madden, M.C. Ferreira. Integrating Compression and Execution in Column-Oriented Database Systems. SIGMOD 2006.
- S. Harizopoulos, V. Liang, D. Abadi, S. Madden. Performance Tradeoffs in Read-Optimized Databases. VLDB 2006.
- D.J. Abadi, D.S. Myers, D.J. DeWitt, S.R. Madden. Materialization Strategies in a Column-Oriented DBMS. ICDE 2007.
- C.A. Lang, B. Bhattacharjee, T. Malkemus, S. Padmanabhan, K. Wong. Increasing buffer-locality for multiple relational table scans through grouping and throttling. ICDE 2007.

The End

Thank you! Questions?
