Postgres ConfChina2015 Suzuki-PostgreSQL ScaleOut

The document presents a talk on scaling out PostgreSQL, focusing on the motivations, technologies, and future directions for enhancing PostgreSQL's performance in handling large data workloads. Key topics include the use of Postgres-XC and Postgres-XL for multi-node updates, the importance of clustering efforts, and the impact of emerging technologies like IoT and big data analytics on PostgreSQL architecture. The speaker, Koichi Suzuki from NTT DATA Intellilink Corporation, emphasizes the need for innovative storage solutions and server architectures to meet growing demands.

Postgres Conference

HangZhou, China

Scaling Out PostgreSQL


Present and Future

November 21st, 2015


NTT DATA INTELLILINK Corporation
Koichi Suzuki

Copyright © 2015 NTT DATA INTELLILINK Corporation


Introduction



About the Speaker


Fellow at NTT DATA Intellilink Corporation

Principal, Technology Professionals at NTT DATA Group

In Charge Of


General Database Technology

Database in huge data warehouse and its design

PostgreSQL and its cluster technology

In The Past


Character Set Standard (Extended Unix Code, Unicode, etc)

Heisei-font development (Technical Committee)

Oracle Porting

Object-Relational Database



Agenda


Scaling out motivation.


Postgres-XC and Postgres-XL


Other PostgreSQL cluster efforts (example)


Effort in PostgreSQL core


Impact to scale-out feature


Storage and server technology innovation

IoT and Big Data


Scale-out architecture in the future



Scale-out Motivation


Performance requirements

Larger amount of data

In the petabyte range

More transactions in transactional workloads

More than 20,000 TPS

Growing demand for big data analytic workloads

Aggregates

Scanning tens of billions of tuples

→ Scale-out on top of a database cluster


Use of commodity hardware/software platform

No dedicated hardware

Shared nothing preferred



Clustering effort in PostgreSQL core


Streaming Replication

First active cluster

Originally for High-availability

Copies all database updates to slaves

A slave can take over when the master fails

Read Only Slave

Use a streaming replication slave to run read queries

Scales out reads

Logical replication

More sophisticated update transfer to other database servers

Configurable
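
The read scale-out idea above (writes go to the master, read queries go to streaming replication slaves) can be sketched as a small router. This is only an illustration under stated assumptions: the class name, the node labels, and the rule "a statement starting with SELECT is a read" are all hypothetical simplifications, not part of PostgreSQL or any pooling tool.

```python
import itertools


class ReadWriteRouter:
    """Toy router: writes go to the master, reads rotate over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through the slaves so read queries are spread evenly.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        # Assumption: anything starting with SELECT is read-only.
        # Real statements such as SELECT ... FOR UPDATE would break this rule.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["slave1", "slave2"])
    for stmt in ("SELECT * FROM t", "UPDATE t SET x = 1", "SELECT 1"):
        print(stmt, "->", router.route(stmt))
```

Because a slave applies updates only after receiving them from the master, a read routed this way may see slightly older data than a read on the master.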



Postgres-XC


Multi-node update

Issue update to any cluster node

Transparent transaction ACID property

Atomic visibility of updates

Initially for transactional workloads



Postgres-XL


Postgres-XC spin-off

More focus on analytic workload

More sophisticated execution for complex queries

Same architecture and code base as XC



Read Scale-out in PostgreSQL Master/Slave

[Figure: the Master handles read/write transactions and ships WAL (or Redo Log) to the Slave; the Slave serves read-only transactions, with a possible time delay.]



Scaling Out in Postgres XC/XL

[Figure: read/write transactions can go to any node; each node has its own local disk; backend transaction synchronization between the nodes means no delay in update visibility.]



DBT-1 Workload Scalability

DBT-1 (Rev)



MPP Performance – DBT-3 (TPC-H)

By courtesy of Mason Sharp, Postgres-XL leader


Scale Out Approach (1): Table Distribution/Replication

Categorize tables into two groups:

Large and frequently-updated tables

→ Distribute rows among nodes (Distributed Tables)


→ Based on a column value (distribution key)
→ Hash, modulo or round-robin

→ Parallelism among transactions (OLTP) or in SQL processing (OLAP)

Smaller and stable tables

→ Replicate among nodes (Replicated Tables)

→ Join Pushdown

As far as possible, avoid joins between Distributed Tables on join keys other than the distribution key.
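
The three row-distribution strategies named above (hash, modulo, round-robin) can be sketched as follows. This is a minimal illustration, not Postgres-XC/XL code: the function names, the four datanode names, and the use of CRC32 as the hash function are assumptions made for the example.

```python
import itertools
import zlib

NODES = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical datanode names


def by_hash(key):
    """Hash distribution: hash the distribution key, then pick a node."""
    # CRC32 stands in for the real hash function (an assumption).
    return NODES[zlib.crc32(str(key).encode("utf-8")) % len(NODES)]


def by_modulo(key):
    """Modulo distribution: integer key modulo the number of nodes."""
    return NODES[key % len(NODES)]


def make_round_robin():
    """Round-robin distribution: ignore the row and rotate through nodes."""
    nodes = itertools.cycle(NODES)
    return lambda: next(nodes)


if __name__ == "__main__":
    print([by_modulo(k) for k in (0, 1, 5, 8)])
    next_node = make_round_robin()
    print([next_node() for _ in range(5)])
    print(by_hash("customer-42") == by_hash("customer-42"))
```

With hash and modulo, a row's node can be computed from its key value alone, which is what lets a query that names the distribution key go to a single node. Round-robin spreads rows evenly but gives no such shortcut.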



Node Configuration: Two-Tier Approach

Coordinator:


Maintains global catalog information

Builds the global SQL plan and the SQL statements for datanodes

Interacts with datanodes to execute local SQL statements and accumulates the results

Datanode


Maintains actual data (local data)

Runs local SQL statements from the Coordinator
(In XL, a datanode may ask other datanodes for their local data)



Coordinator and Datanode

[Figure: read/write transactions enter through the Coordinators, which sit in front of the Datanodes.]



Other PostgreSQL cluster effort


PG Cluster
– Multi-node update

– Backend update synchronization



PGPool
– SQL-based database replication

– Multi-node update
– Read scalability
– Now pgpool-II, which incorporates streaming replication

Slony
– Trigger-based database replication
– Very flexible and robust as logical replication



Why GTM? Doesn't the Two-Phase Commit Protocol work?

Two-Phase Commit Protocol Does:


Maintain database consistency in transactions updating more than one node.

Two-Phase Commit Protocol Doesn't:


Maintain Atomic Visibility of Updates to other transactions (next slide)



Atomic Visibility and GTM

TXN 1 (writer, on Node A and Node B):

Updates A and B

Prepares A and B

Commits A and B

TXN 2 (concurrent reader):

Reads B and gets old value

Reads A and gets new value

→ Inconsistent Read!

GTM monitors TXN activity and makes the new value available on all nodes at the same timing.
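
The inconsistent read above, and how a global view of transaction status avoids it, can be sketched with a toy model. This is an illustration only, not how Postgres-XC/XL implements GTM: the `Node` and `GTM` classes, the single value per node, and the "set of committed transaction IDs" snapshot are all simplifying assumptions.

```python
class Node:
    """Toy datanode holding versions of one value, tagged by writer XID."""

    def __init__(self):
        self.versions = [(0, "old")]      # (xid, value), oldest first
        self.locally_committed = {0}      # XIDs this node has committed

    def write(self, xid, value):
        self.versions.append((xid, value))

    def read_local(self):
        """Visibility decided from this node's own commit state only."""
        return [v for x, v in self.versions if x in self.locally_committed][-1]

    def read_snapshot(self, visible_xids):
        """Visibility decided from a cluster-wide snapshot."""
        return [v for x, v in self.versions if x in visible_xids][-1]


class GTM:
    """Toy global transaction manager: issues XIDs and snapshots."""

    def __init__(self):
        self.next_xid = 1
        self.committed = {0}

    def begin(self):
        xid, self.next_xid = self.next_xid, self.next_xid + 1
        return xid

    def commit(self, xid):
        self.committed.add(xid)

    def snapshot(self):
        return frozenset(self.committed)


def demo():
    gtm, a, b = GTM(), Node(), Node()
    xid = gtm.begin()
    a.write(xid, "new")                   # TXN 1 updates A and B
    b.write(xid, "new")
    a.locally_committed.add(xid)          # commit has reached A but not B yet
    snap = gtm.snapshot()                 # TXN 2 starts in this window
    without_gtm = (a.read_local(), b.read_local())
    with_gtm = (a.read_snapshot(snap), b.read_snapshot(snap))
    b.locally_committed.add(xid)          # commit reaches B
    gtm.commit(xid)                       # now visible everywhere at once
    later = gtm.snapshot()
    after_commit = (a.read_snapshot(later), b.read_snapshot(later))
    return without_gtm, with_gtm, after_commit


if __name__ == "__main__":
    print(demo())
```

In this model, a reader relying on each node's local state sees the new value on A and the old value on B, while a reader using the global snapshot sees either both old values or both new values.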



Final Configuration: GTM, Coordinator and Datanode

[Figure: read/write transactions enter through the Coordinators; the GTM serves the Coordinators and the Datanodes.]



Configuration in Practice

Just like configuring many database servers to talk to each other


Many pitfalls

pgxc_ctl provides a simpler way to configure the whole cluster

Provide only the needed parameters

pgxc_ctl will do the rest, issuing the needed commands and SQL statements.
– Visit
http://sourceforge.net/p/postgres-xc/xc-wiki/PGOpen2013_Postgres_Open_2013/



OLTP Workload Characteristics

Number of Transactions: Many


Number of Involved Table Rows: Small
Locality of Row Allocation: High
Update Frequency: High



Scaling Out OLTP Workload

[Figure: many read/write transactions run in parallel across the Coordinators and Datanodes; the GTM carries a high workload.]



OLAP Workload Characteristics

Number of Transactions: Small


Number of Involved Table Rows: Huge
Locality of Row Allocation: Low
Update Frequency: Low



Scaling Out OLAP Workload

[Figure: a SQL statement arrives at a Coordinator, which does the top-level aggregation (fewer coordinators may be needed); the GTM carries a low workload; small local SQL statements run on each Datanode in parallel.]



Join Offloading: When row allocation is available

● Replicated Table and Partitioned Table


– The coordinator can determine which datanode to go to from the WHERE clause



Join Offloading: When row allocation is not available

● Replicated Table and Partitioned Table


– When the coordinator cannot determine which datanode to go to from the WHERE clause
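
The two join-offloading cases can be sketched as a node-selection step at the coordinator. This is a simplified illustration under stated assumptions: the WHERE clause is reduced to a dictionary of equality predicates, rows are placed by modulo on an integer distribution key, and the names `NODES`, `node_for` and `target_nodes` are hypothetical.

```python
NODES = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical datanode names


def node_for(key):
    """Assumed placement rule: integer distribution key modulo node count."""
    return NODES[key % len(NODES)]


def target_nodes(where, dist_key):
    """Pick the datanodes that must run the offloaded join.

    `where` maps column names to literal values, standing in for
    equality predicates such as `customer_id = 42`.
    """
    if dist_key in where:
        # Row allocation is known: one datanode holds all matching rows.
        return [node_for(where[dist_key])]
    # Row allocation is unknown: every datanode runs the join locally.
    return list(NODES)


if __name__ == "__main__":
    print(target_nodes({"customer_id": 42}, "customer_id"))
    print(target_nodes({"status": "open"}, "customer_id"))
```

Since a replicated table is present on every datanode, the join against the partitioned table can run locally in both cases. Only the number of datanodes involved differs.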



Aggregate Functions in PostgreSQL

State Transition Function (applied to each input row) → Finalize Function (produces the result)



Aggregate Functions in Postgres-XC/XL

AVG ← (Sum, Count)

Datanode: State Transition Function builds a partial (Sum, Count) on each datanode

Coordinator: Collector Function combines the partial (Sum, Count) states, then Finalize Function computes the result

Similar to Map Reduce!
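
The AVG example above can be sketched with three small functions, one per stage. This is a minimal illustration, not Postgres-XC/XL code: the function names and the list-of-lists stand-in for per-datanode rows are assumptions.

```python
from functools import reduce


def transition(state, value):
    """State transition function: runs on a datanode, once per row."""
    total, count = state
    return (total + value, count + 1)


def collect(left, right):
    """Collector function: runs on the coordinator, merges partial states."""
    return (left[0] + right[0], left[1] + right[1])


def finalize(state):
    """Finalize function: turns the merged (Sum, Count) into AVG."""
    total, count = state
    return total / count if count else None


def distributed_avg(rows_per_datanode):
    # "Map" side: each datanode folds its local rows into (Sum, Count).
    partials = [reduce(transition, rows, (0, 0)) for rows in rows_per_datanode]
    # "Reduce" side: the coordinator merges the partials and finalizes.
    return finalize(reduce(collect, partials, (0, 0)))


if __name__ == "__main__":
    print(distributed_avg([[1, 2, 3], [4, 5], [6]]))
```

Shipping (Sum, Count) rather than a per-node average matters: averaging the three local averages in this example would give a different, wrong answer because the datanodes hold different numbers of rows.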



Scale-out effort in PostgreSQL core


Efforts like Postgres-XC/XL
– Inter-node communication based upon FDW (Foreign Data Wrapper)

General PostgreSQL means to handle external data (not only
PostgreSQL)
– Introducing parallelism

Parallel sequential scan has just been committed

Some not discussed yet
– Update atomicity
– Node management/configuration
– Aggregate
– Other cluster-wide architecture

Visit https://wiki.postgresql.org/wiki/PG-EU_2015_Cluster_Summit for details



PostgreSQL Scale-Out Cluster
In the Future



Impact to the approach


Increasing demand for analytic workloads
– IoT

Larger amount of data

Not well-formatted

Just adding/archiving new/old data
– Big data analysis

SQL-based analysis (flexible and reasonable performance)

Semi-structured data (JSONB)

Storage innovation
– SSD via PCIe/NVMe

Solution to HDD performance bottleneck?
– Even faster storage like 3D XPoint

1000 times faster than HDD

Storage via memory bus

New kernel support expected



Impact to the approach (cont.)


New server architecture
– Software-Defined

RSA (Rack-Scale architecture)

More suited for scale-out approach

GPU as CPU accelerator
– Parallel filter
– On-memory sort

Applicable to external sort?
– Data compression

Too much for GPU? Need FPGA?

Server backbone N/W
– 1Gig → 10Gig → 100Gig
– Suitable for scale-out approach



Future scale-out technology forecast


Use datanode as data storage and intelligent scan
– Simpler statement
– Intelligent scan

Parallel scan: both intra/inter node

Allow coordinator to do more with less workload
– Use physical data distribution, index, etc. at coordinator
– More parallelism

Map-reduce aggregate

Allow local operation of cluster nodes
– Improve Global Transaction ID approach in XC

Make cluster more symmetric
– Get rid of central global transaction management

Simpler update synchronization
– Reduce 2PC overhead

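More great content in the PG community
•PostgreSQL China community: postgres.cn
•PostgreSQL professional group 1: 3336901 (full)
•PostgreSQL professional group 2: 100910388
•PostgreSQL professional group 3: 150657323
•Documentation translation group: 309292849

PostgresChina WeChat official account, PostgreSQL user group Weibo

Postgres Conference China 2015 (China user conference)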
