
DATASHEET

BLUEPRINT FOR BIG DATA SUCCESS: A BEST PRACTICE SOLUTION PATTERN

Filling the Data Lake

Simplify and Accelerate Hadoop Data Ingestion with a Scalable Approach

What is it?
As organizations scale up data onboarding from just a few sources going into Hadoop to hundreds or more, IT time and resources can be monopolized creating and maintaining hundreds of hard-coded data movement procedures, and the process is often highly manual and error-prone. The Pentaho Filling the Data Lake blueprint provides a template-based approach to solving these challenges and comprises:

• A flexible, scalable, and repeatable process to onboard a growing number of data sources into Hadoop data lakes
• Streamlined data ingestion from hundreds or thousands of disparate CSV files or database tables into Hadoop
• An automated, template-based approach to data workflow creation (sketched below)
• Simplified, regular data movement at scale into Hadoop in the Avro format
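The datasheet itself contains no code, but the template-plus-metadata idea can be illustrated with a minimal sketch: one generic ingestion routine whose behavior is driven entirely by per-source metadata, instead of one hard-coded job per source. The sketch below is in Python using the fastavro package; the catalog entries, field names, and file paths are hypothetical, and Pentaho Data Integration realizes this pattern through its own graphical template transformations rather than hand-written code like this.

import csv
from fastavro import parse_schema, writer

def ingest(source_meta):
    # Build an Avro schema from the field list carried in the metadata.
    field_names = source_meta["fields"]
    schema = parse_schema({
        "type": "record",
        "name": source_meta["record_name"],
        "fields": [{"name": f, "type": ["null", "string"]} for f in field_names],
    })
    # Read the delimited source; columns missing from a row become null.
    with open(source_meta["csv_path"], newline="") as src:
        reader = csv.DictReader(src, delimiter=source_meta.get("delimiter", ","))
        records = [{f: row.get(f) for f in field_names} for row in reader]
    # Write the records out in Avro format.
    with open(source_meta["avro_path"], "wb") as out:
        writer(out, schema, records)

# One catalog of metadata drives every source through the same template.
catalog = [
    {"record_name": "trades", "fields": ["trade_id", "symbol", "amount"],
     "csv_path": "trades.csv", "avro_path": "trades.avro"},
    {"record_name": "accounts", "fields": ["account_id", "branch"],
     "csv_path": "accounts.csv", "avro_path": "accounts.avro"},
]
for meta in catalog:
    ingest(meta)

Adding a new source then means adding one metadata entry, not writing a new ingestion job, which is the core of the metadata injection approach described above.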

Why do it?
• Reduce IT time and cost spent building and maintaining repetitive big data ingestion jobs, allowing valuable staff to dedicate time to more strategic projects
• Minimize risk of manual errors by decreasing dependence on hard-coded data ingestion procedures
• Automate business processes for efficiency and speed, while maintaining data governance
• Enable more sophisticated analysis by business users with new and emerging data sources

Value of Pentaho
• Unique metadata injection capability accelerates time-to-value by automating many onboarding jobs with just a few templates
• Intuitive graphical user interface for big data integration means existing ETL developers can create repeatable data movement flows without coding, in minutes rather than hours
• Ability to architect a governed process that is highly reusable
• Robust integration with the broader Hadoop ecosystem and semi-structured data
Example of how a Filling the Data Lake blueprint implementation may look in a financial organization
This company uses metadata injection to move thousands of data sources into Hadoop in a streamlined, dynamic integration process.

• Large financial services organization with thousands of input sources
• Reduces the number of ingest processes through metadata injection
• Delivers transformed data directly into Hadoop in the Avro format (see the sketch below)

[Diagram: disparate data sources (RDBMS tables and CSV files) flow through dynamic data integration processes and dynamic transformations, replacing per-source ingest procedures, and land in Hadoop in the Avro format.]
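As a companion to the sketch above, the snippet below shows one way the Avro output could be landed directly in Hadoop over WebHDFS. This is an illustration only, not Pentaho's actual delivery mechanism (PDI writes to Hadoop through its own output steps); it assumes the Python fastavro and hdfs packages, and the namenode URL, user, schema, and target path are placeholders.

from fastavro import parse_schema, writer
from hdfs import InsecureClient

# Placeholder connection details for a WebHDFS endpoint.
client = InsecureClient("http://namenode.example.com:9870", user="etl")

schema = parse_schema({
    "type": "record",
    "name": "trades",
    "fields": [{"name": "trade_id", "type": ["null", "string"]},
               {"name": "symbol", "type": ["null", "string"]}],
})
records = [{"trade_id": "T1", "symbol": "ACME"}]

# client.write() yields a writable file-like object, so fastavro can stream
# the Avro container straight into the data lake path.
with client.write("/data/lake/trades/part-0001.avro", overwrite=True) as out:
    writer(out, schema, records)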


Copyright ©2016 Pentaho Corporation. All rights reserved. Worldwide +1 (866) 660-7555 | pentaho.com/contact
