An Introduction to Hadoop
Mark Fei, Cloudera
Strata + Hadoop World 2012, New York City, October 23, 2012
Who Am I?
Mark Fei, Cloudera (Durango, Colorado)
Current: Senior Instructor at Cloudera
Past: Professional Services Education, VMware; Senior Member of Technical Staff, Hill Associates; Sales Engineer, Nortel Networks; Systems Programmer at a large bank; banking applications software developer
What's Ahead?
A solid introduction to Apache Hadoop:
  What it is
  Why it's relevant
  How it works
  The ecosystem
No prior experience needed
Feel free to ask questions
What is Apache Hadoop?
Scalable data storage and processing
Open source Apache project
Harnesses the power of commodity servers
Distributed and fault-tolerant
Core Hadoop consists of two main parts:
  HDFS (storage)
  MapReduce (processing)
A large ecosystem
Who Uses Hadoop?
Vendor integration: BI/analytics, ETL, database, OS/cloud/system management, hardware
About Cloudera
Cloudera is the commercial Hadoop company
Founded by leading experts on Hadoop from Facebook, Google, Oracle, and Yahoo
Provides consulting and training services for Hadoop users
Staff includes several committers to Hadoop projects
Cloudera Software
Cloudera's Distribution including Apache Hadoop (CDH)
A single, easy-to-install package from the Apache Hadoop core repository
Includes a stable version of Hadoop, plus critical bug fixes and solid new features from the development version
100% open source
Components: Apache Hadoop, Apache Hive, Apache Pig, Apache HBase, Apache ZooKeeper, Apache Flume, Hue, Apache Oozie, Apache Sqoop, Apache Mahout
Components of the CDH Stack: A Coherent Platform
  File system mount: FUSE-DFS
  UI framework / SDK: Hue and the Hue SDK
  Workflow and scheduling: Apache Oozie
  Metadata: Apache Hive
  Data integration: Apache Flume, Apache Sqoop
  Storage and computation: HDFS, MapReduce
  Languages / compilers: Apache Pig, Apache Hive, Apache Mahout
  Fast read/write access: Apache HBase
  Coordination: Apache ZooKeeper
Cloudera Manager, Free Edition
End-to-end deployment and management of your CDH cluster
Zero to Hadoop in 15 minutes
Supports up to 50 nodes
Free (but not open source)
Cloudera Enterprise
Big data storage, processing, and analytics platform based on CDH
  Cloudera's Distribution including Apache Hadoop (CDH)
  Cloudera Manager (full version): end-to-end deployment, management, and operation of CDH, with sophisticated cluster monitoring tools not present in the free version
  Production support: a team of experts on call to help you meet your Service Level Agreements (SLAs)
Cloudera University
Training for the entire Hadoop stack
Public and private classes offered:
  Cloudera Developer Training for Apache Hadoop
  Cloudera Administrator Training for Apache Hadoop
  Cloudera Training for Apache HBase
  Cloudera Training for Apache Hive and Pig
  Cloudera Essentials for Apache Hadoop
  More courses coming, including customized on-site private classes
Industry-recognized certifications:
  Cloudera Certified Developer for Apache Hadoop (CCDH)
  Cloudera Certified Administrator for Apache Hadoop (CCAH)
  Cloudera Certified Specialist in Apache HBase (CCSHB)
Professional Services
Solutions Architects provide guidance and hands-on expertise:
Use Case Discovery, New Hadoop Deployment, Proof of Concept, Production Pilot, Process and Team Development, Hadoop Deployment Certification
How Did Apache Hadoop Originate?
Heavily influenced by Google's architecture, notably the Google File System and MapReduce papers
Early adoption by Yahoo, Facebook, and others; other Web companies quickly saw the benefits
Timeline:
  2002: Nutch spun off from Lucene
  2003: Google publishes the GFS paper
  2004: Google publishes the MapReduce paper
  2005: Nutch rewritten for MapReduce
Why Do We Have So Much Data?
And what are we supposed to do with it?
Velocity: Why We're Generating Data Faster Than Ever
Processes are increasingly automated
Systems are increasingly interconnected
People are increasingly interacting online
Variety: What Types of Data Are We Producing?
Application logs, text messages, social network connections, tweets, photos
Not all of this maps cleanly to the relational model
Volume
The result of all this is that, every single day:
  Twitter processes 340 million messages
  Facebook stores 2.7 billion comments and Likes
  Google processes about 24 petabytes of data
And every single minute:
  More than 200 million e-mail messages are sent
  Foursquare processes more than 2,000 check-ins
Where Does Data Come From?
Science: medical imaging, sensor data, genome sequencing, weather data, satellite feeds, etc.
Industry: financial, pharmaceutical, manufacturing, insurance, online, energy, and retail data
Legacy: sales data, customer behavior, product databases, accounting data, etc.
System data: log files, health and status feeds, activity streams, network messages, Web analytics, intrusion detection, spam filters
Analyzing Data: The Challenges
Huge volumes of data
Mixed sources result in many different formats: XML, CSV, EDI, log files, objects, SQL, text, JSON, binary, etc.
What is Common Across Hadoop-able Problems?
Nature of the data:
  Complex data
  Multiple data sources
  Lots of it
Nature of the analysis:
  Batch processing
  Parallel execution
  Spread data over a cluster of servers and take the computation to the data
Benefits of Analyzing With Hadoop
Previously impossible or impractical analysis becomes feasible
Analysis conducted at lower cost
Analysis conducted in less time
Greater flexibility
Linear scalability
What Analysis is Possible With Hadoop?
Text mining
Index building
Graph creation and analysis
Pattern recognition
Collaborative filtering
Prediction models
Sentiment analysis
Risk assessment
Eight Common Hadoop-able Problems
1. Modeling true risk
2. Customer churn analysis
3. Recommendation engine
4. PoS transaction analysis
5. Analyzing network data to predict failure
6. Threat analysis
7. Search quality
8. Data sandbox
1. Modeling True Risk
Challenge: How much risk exposure does an organization really have with each customer? Data comes from multiple sources and across multiple lines of business.
Solution with Hadoop:
  Source and aggregate disparate data sources to build a data picture, e.g. credit card records, call recordings, chat sessions, emails, banking activity
  Structure and analyze: sentiment analysis, graph creation, pattern recognition
Typical industry: Financial Services (banks, insurance companies)
2. Customer Churn Analysis
Challenge: Why is an organization really losing customers? Data on these factors comes from different sources.
Solution with Hadoop:
  Rapidly build a behavioral model from disparate data sources
  Structure and analyze with Hadoop: graph traversal and creation, pattern recognition
Typical industry: Telecommunications, Financial Services
3. Recommendation Engine / Ad Targeting
Challenge: Using user data to predict which products to recommend
Solution with Hadoop:
  Batch processing framework allows execution in parallel over large datasets
  Collaborative filtering: collecting taste information from many users, then using it to predict what similar users will like
Typical industry: E-commerce, manufacturing, retail, advertising
4. Point of Sale Transaction Analysis
Challenge: Analyzing Point of Sale (PoS) data to target promotions and manage operations. Sources are complex and data volumes grow across chains of stores and other sources.
Solution with Hadoop:
  Batch processing framework allows execution in parallel over large datasets
  Pattern recognition: optimizing over multiple data sources and using the information to predict demand
Typical industry: Retail
5. Analyzing Network Data to Predict Failure
Challenge: Analyzing real-time data series from a network of sensors. Calculating average frequency over time is extremely tedious because of the need to analyze terabytes.
Solution with Hadoop:
  Take the computation to the data
  Expand from simple scans to more complex data mining; discrete anomalies may, in fact, be interconnected
  Better understand how the network reacts to fluctuations
  Identify leading indicators of component failure
Typical industry: Utilities, telecommunications, data centers
6. Threat Analysis / Trade Surveillance
Challenge: Detecting threats in the form of fraudulent activity or attacks. Large data volumes are involved; it is like looking for a needle in a haystack.
Solution with Hadoop:
  Parallel processing over huge datasets
  Pattern recognition to identify anomalies, i.e., threats
Typical industry: Security, Financial Services; general uses include spam fighting and click fraud
7. Search Quality
Challenge: Providing meaningful search results in real time
Solution with Hadoop:
  Analyzing search attempts in conjunction with structured data
  Pattern recognition: browsing patterns of users performing searches in different categories
Typical industry: Web, e-commerce
8. Data Sandbox
Challenge: Data deluge; you don't know what to do with the data or what analysis to run
Solution with Hadoop:
  Dump all this data into an HDFS cluster
  Use Hadoop to start trying out different analyses on the data
  Spot patterns and derive value from the data
Typical industry: Common across all industries
Hadoop: How Does It Work?
Moore's Law (and Not): Disk Capacity and Price
We're generating more data than ever before
Fortunately, the size and cost of storage has kept pace: capacity has increased while price has decreased

Year    Capacity (GB)    Cost per GB (USD)
1997    2.1              $157
2004    200              $1.05
2012    3,000            $0.05
Disk Capacity and Performance
Disk performance has also increased in the last 15 years
Unfortunately, transfer rates haven't kept pace with capacity

Year    Capacity (GB)    Transfer Rate (MB/s)    Disk Read Time
1997    2.1              16.6                    126 seconds
2004    200              56.5                    59 minutes
2012    3,000            210                     3 hours, 58 minutes
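As a quick back-of-the-envelope check of the read-time column (an illustrative sketch; it assumes 1 GB = 1,000 MB and a sustained sequential read at the quoted rate):

# Disk read time = capacity divided by transfer rate
disks = [(1997, 2.1, 16.6), (2004, 200.0, 56.5), (2012, 3000.0, 210.0)]
for year, capacity_gb, rate_mb_s in disks:
    seconds = capacity_gb * 1000 / rate_mb_s
    print("%d: about %.0f seconds (%.0f minutes)" % (year, seconds, seconds / 60))

This prints roughly 127 seconds, 59 minutes, and 238 minutes, which lines up with the figures in the table.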
Architecture of a Typical HPC System
Compute nodes are separate from the storage system, connected by a fast network
Step 1: Copy input data from the storage system to the compute nodes
Step 2: Process the data on the compute nodes
Step 3: Copy output data back to the storage system
You Don't Just Need Speed
The problem is that we have way more data than code:

$ du -ks code/
1,083
$ du -ks data/
854,632,947,314
You Need Speed, At Scale
With compute nodes separate from the storage system, the fast network between them becomes the bottleneck
HDFS: Hadoop Distributed Filesystem
Because 10,000 hard disks are better than one
Collocated Storage and Processing
Solution: store and process data on the same nodes (the "slave" nodes handle both storage and processing)
Data locality: bring the computation to the data
Reduces I/O and boosts performance
Hard Disk Latency
Disk seeks are expensive: the head must move from its current location to where the data you need is stored
Solution: read lots of data at once to amortize the cost
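As a rough illustration of that amortization (the 10 ms seek time is an assumed, typical figure; the 210 MB/s rate comes from the earlier table):

# Compare a small random read with a large sequential read
seek_ms = 10.0                       # assumed time for one disk seek
rate_mb_per_s = 210.0                # 2012 transfer rate from the table above
for read_mb in (0.004, 64.0):        # a 4 KB read vs. a 64 MB HDFS block
    transfer_ms = read_mb / rate_mb_per_s * 1000
    seek_share = 100 * seek_ms / (seek_ms + transfer_ms)
    print("%7.3f MB read: seek is %.1f%% of the total time" % (read_mb, seek_share))

For a 4 KB read the seek dominates (over 99% of the time); for a 64 MB read it is only about 3%, which is one reason HDFS favors large, sequential reads.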
Introducing HDFS
The Hadoop Distributed File System: scalable storage influenced by Google's file system paper
HDFS is optimized for Hadoop:
  Values high throughput much more than low latency
  It's a user-space Java process
  Primarily accessed via command-line utilities and a Java API
  It's not a general-purpose filesystem
HDFS is (Mostly) UNIX-like
In many ways, HDFS is similar to a UNIX filesystem:
  Hierarchical, UNIX-style paths (e.g. /foo/bar/myfile.txt)
  File ownership and permissions
There are also some major deviations from UNIX:
  No current working directory
  Files cannot be modified once written
HDFS High-Level Architecture
HDFS follows a master-slave architecture; there are two essential daemons in HDFS
Master: the NameNode
  Responsible for the namespace and metadata
  Namespace: the file hierarchy
  Metadata: ownership, permissions, block locations, etc.
Slave: the DataNode
  Responsible for storing the actual data blocks
Anatomy of a Small Hadoop Cluster
The diagram shows the HDFS-related daemons on a small cluster:
  The "master" node runs the NameNode daemon
  Each "slave" node runs a DataNode daemon
HDFS Blocks
When a file is added to HDFS, it's split into blocks
This is a similar concept to native filesystems
HDFS uses a much larger block size (64 MB) for performance
Example: a 150 MB input file becomes Block #1 (64 MB), Block #2 (64 MB), and Block #3 (the remaining 22 MB)
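A tiny sketch of that arithmetic (purely illustrative; the real splitting happens inside HDFS, not in user code, and 64 MB is the classic default block size):

def hdfs_block_sizes(file_size_mb, block_size_mb=64):
    # Return the sizes (in MB) of the blocks a file would be split into
    sizes = []
    remaining = file_size_mb
    while remaining > 0:
        sizes.append(min(block_size_mb, remaining))
        remaining -= block_size_mb
    return sizes

print(hdfs_block_sizes(150))   # [64, 64, 22]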
HDFS Replication
Those blocks are then replicated across machines (nodes A through E in the diagram)
The first block might be replicated to A, C, and D
The next block might be replicated to B, D, and E
The last block might be replicated to A, C, and E
HDFS Reliability
Replication helps to achieve reliability
Even when a node fails, two copies of each block remain
Those blocks will be re-replicated to other nodes automatically
In the diagram, the failed node held blocks #1 and #3, but both blocks are still available on other nodes
Data Processing with MapReduce
It not only works, it's functional
MapReduce High-Level Architecture
Like HDFS, MapReduce has a master-slave architecture; there are two daemons in classical MapReduce
Master: the JobTracker
  Responsible for dividing, scheduling, and monitoring work
Slave: the TaskTracker
  Responsible for the actual processing
Anatomy of a Small Hadoop Cluster
The diagram shows both the MapReduce and HDFS daemons:
  The "master" node runs the NameNode and JobTracker daemons
  Each "slave" node runs a DataNode daemon and a TaskTracker daemon
A Gentle Introduction to MapReduce
MapReduce is conceptually like a UNIX pipeline:

$ egrep 'INFO|WARN|ERROR' app.log | cut -f3 | sort | uniq -c
  941 ERROR
78264 INFO
 4312 WARN

One function (Map) processes data
That output is ultimately input to another function (Reduce)
Each piece is simple, but can be powerful when combined
The Map Function
Operates on each record individually
Typical uses include filtering, parsing, or transforming input
In the pipeline analogy, the Map stage corresponds to the egrep and cut portion:

$ egrep 'INFO|WARN|ERROR' app.log | cut -f3 | sort | uniq -c
Intermediate Processing
The Map function's output is grouped and sorted
This is the automatic "sort and shuffle" process in Hadoop
In the pipeline analogy, this corresponds to the sort stage between Map and Reduce:

$ egrep 'INFO|WARN|ERROR' app.log | cut -f3 | sort | uniq -c
The Reduce Function
Operates on all records in a group
Often used for sum, average, or other aggregate functions
In the pipeline analogy, this corresponds to the uniq -c stage after the sort and shuffle:

$ egrep 'INFO|WARN|ERROR' app.log | cut -f3 | sort | uniq -c
MapReduce History
MapReduce is not a language, it's a programming model
  A style of processing data you could implement in any language
Many languages have functions named map and reduce
  These functions have largely the same purpose in Hadoop
MapReduce has its roots in functional programming
Popularized for large-scale data processing by Google
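To make the functional-programming connection concrete, here is a tiny sketch using Python's own map and reduce on a toy problem (ordinary Python, not Hadoop code):

from functools import reduce

words = ['hadoop', 'hdfs', 'mapreduce']
lengths = map(len, words)                      # "map": transform each record
total = reduce(lambda a, b: a + b, lengths)    # "reduce": aggregate the results
print(total)                                   # 19

Hadoop applies the same two ideas, but distributes the map and reduce work across a cluster and handles the grouping in between.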
MapReduce Benefits
Complex details are abstracted away from the developer:
  No file I/O
  No networking code
  No synchronization
It's scalable because you process one record at a time
A record consists of a key and a corresponding value
  We often care about only one of these
MapReduce Example in Python
MapReduce code for Hadoop is typically written in Java
But it's possible to use nearly any language with Hadoop Streaming
I'll show the log event counter using MapReduce in Python
It's very helpful to see the data as well as the code
Job Input
Each mapper gets a chunk of the job's input data to process
This chunk is called an InputSplit
In most cases, this corresponds to a block in HDFS

2012-09-06 22:16:49.391 CDT INFO "This can wait"
2012-09-06 22:16:49.392 CDT INFO "Blah blah"
2012-09-06 22:16:49.394 CDT WARN "Hmmm..."
2012-09-06 22:16:49.395 CDT INFO "More blather"
2012-09-06 22:16:49.397 CDT WARN "Hey there"
2012-09-06 22:16:49.398 CDT INFO "Spewing data"
2012-09-06 22:16:49.399 CDT ERROR "Oh boy!"
Python Code for Map Function
Our map function will parse the event type, and then output that event (the key) and a literal 1 (the value):

#!/usr/bin/env python
# Boilerplate Python stuff
import sys

# Define the list of log levels
levels = ['TRACE', 'DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL']

# Split every line (record) we receive on standard input
# into fields, normalized by case
for line in sys.stdin:
    fields = line.split()
    for field in fields:
        field = field.strip().upper()
        # If this field matches a log level, print it (and a 1)
        if field in levels:
            print "%s\t1" % field
Output of Map Function
The map function produces key/value pairs as output:

INFO    1
INFO    1
WARN    1
INFO    1
WARN    1
INFO    1
ERROR   1
Input to Reduce Function
The Reducer receives a key and all values for that key
Keys are always passed to reducers in sorted order
Although it's not obvious here, the values are unordered

ERROR   1
INFO    1
INFO    1
INFO    1
INFO    1
WARN    1
WARN    1
Python Code for Reduce Function
The Reducer first extracts the key and value it was passed:

#!/usr/bin/env python
# Boilerplate Python stuff
import sys

# Initialize loop variables
previous_key = ''
sum = 0

for line in sys.stdin:
    # Extract the key and value passed via standard input
    key, value = line.split()
    value = int(value)
    # continued on the next slide
Python Code for Reduce Function (continued)
The Reducer then simply adds up the values for each key:

    # continued from the previous slide
    if key == previous_key:
        # If the key is unchanged, increment the count
        sum = sum + value
    else:
        # If the key changed, print the sum for the previous key
        if previous_key != '':
            print '%s\t%i' % (previous_key, sum)
        # Re-initialize the loop variables
        previous_key = key
        sum = value

# Print the sum for the final key
print '%s\t%i' % (previous_key, sum)
Output of Reduce Function
The output of this Reduce function is a sum for each level:

ERROR   1
INFO    4
WARN    2
Recap of Data Flow
Map input:
  2012-09-06 22:16:49.391 CDT INFO "This can wait"
  2012-09-06 22:16:49.392 CDT INFO "Blah blah"
  2012-09-06 22:16:49.394 CDT WARN "Hmmm..."
  2012-09-06 22:16:49.395 CDT INFO "More blather"
  2012-09-06 22:16:49.397 CDT WARN "Hey there"
  2012-09-06 22:16:49.398 CDT INFO "Spewing data"
  2012-09-06 22:16:49.399 CDT ERROR "Oh boy!"
Map output:
  INFO 1, INFO 1, WARN 1, INFO 1, WARN 1, INFO 1, ERROR 1
Reduce input (after sort and shuffle):
  ERROR 1, INFO 1, INFO 1, INFO 1, INFO 1, WARN 1, WARN 1
Reduce output:
  ERROR 1, INFO 4, WARN 2
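Because the mapper and reducer simply read standard input and write standard output, the whole flow can be rehearsed in a few lines of ordinary Python before submitting a real job. This is an illustrative sketch that mirrors the recap above (Hadoop performs the real partitioning and shuffle; here it is just a sorted() call):

levels = set(['TRACE', 'DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'])

map_input = [
    '2012-09-06 22:16:49.391 CDT INFO "This can wait"',
    '2012-09-06 22:16:49.392 CDT INFO "Blah blah"',
    '2012-09-06 22:16:49.394 CDT WARN "Hmmm..."',
    '2012-09-06 22:16:49.395 CDT INFO "More blather"',
    '2012-09-06 22:16:49.397 CDT WARN "Hey there"',
    '2012-09-06 22:16:49.398 CDT INFO "Spewing data"',
    '2012-09-06 22:16:49.399 CDT ERROR "Oh boy!"',
]

# Map: emit a (level, 1) pair for each record, like the mapper's output above
map_output = [(f, 1) for line in map_input for f in line.split() if f in levels]

# Sort and shuffle: Hadoop groups and sorts the pairs by key automatically
reduce_input = sorted(map_output)

# Reduce: sum the values for each key
totals = {}
for key, value in reduce_input:
    totals[key] = totals.get(key, 0) + value

for key in sorted(totals):
    print('%s\t%d' % (key, totals[key]))   # ERROR 1, INFO 4, WARN 2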
Input Splits Feed the Map Tasks
Input for the entire job is subdivided into InputSplits
An InputSplit usually corresponds to a single HDFS block
Each of these serves as input to a single Map task
Example: the input for an entire job (192 MB) becomes three 64 MB splits, feeding Mapper #1, Mapper #2, and Mapper #3
Mappers Feed the Shuffle and Sort
The output of all Mappers is partitioned, merged, and sorted
No code is required; Hadoop does this automatically
In the diagram, each Mapper (#1, #2, ..., #N) emits its own stream of (level, 1) pairs, and the shuffle brings all the ERROR pairs, all the INFO pairs, and all the WARN pairs together
Shuffle and Sort Feeds the Reducers
All values for a given key are then collapsed into a list
The key and all of its values are fed to reducers as input
In the diagram, INFO and its values go to Reducer #1, while ERROR and WARN and their values go to Reducer #2
Each Reducer Has an Output File
These are stored in HDFS below your output directory
Use hadoop fs -getmerge to combine them into a single local copy
In this example, Reducer #1 produces INFO 8, and Reducer #2 produces ERROR 3 and WARN 4
Apache Hadoop Ecosystem: Overview
"Core Hadoop" consists of HDFS and MapReduce
  These are the kernel of a much broader platform
Hadoop has many related projects
  Most are open source Apache projects, like Hadoop itself
  Some help you integrate Hadoop with other systems
  Others help you analyze your data
  Still others, like Oozie, help you use Hadoop more effectively
  Also like Hadoop, they have funny names
All of these are part of Cloudera's CDH distribution
Ecosystem: Apache Flume
Flume collects data into Hadoop from many sources: log files, program output, syslog, custom sources, and many more
Ecosystem: Apache Sqoop
Sqoop moves data between a relational database and a Hadoop cluster
Integrates with any JDBC-compatible database
Retrieve all tables, a single table, or a portion of a table to store in HDFS
Can also export data from HDFS back to the database
Ecosystem: Apache Hive
Hive allows you to run SQL-like queries on data in HDFS:

SELECT customers.id, customers.name, SUM(orders.cost)
FROM customers
JOIN orders ON (customers.id = orders.customer_id)
WHERE customers.zipcode = '63105'
GROUP BY customers.id, customers.name;

Hive turns this into MapReduce jobs that run on your cluster
This reduces development time and makes Hadoop more accessible to non-engineers
Ecosystem: Apache Pig
Apache Pig has a similar purpose to Hive
It has a high-level language (Pig Latin) for data analysis
Scripts yield MapReduce jobs that run on your cluster
But Pig's approach is quite different from Hive's
Ecosystem: Apache HBase
A NoSQL database built on HDFS
Low latency and high performance for reads and writes
Extremely scalable: tables can have billions of rows, and potentially millions of columns
You Should Be Using CDH
Cloudera's Distribution including Apache Hadoop (CDH)
  The most widely used distribution of Hadoop
  A stable, proven, and supported environment you can count on
Combines Hadoop with many important ecosystem tools, such as Hive, Pig, Sqoop, Flume, and many more
  All of these are integrated and work well together
How much does it cost? It's completely free, and it's Apache licensed, so it's 100% open source too
When is Hadoop (Not) a Good Choice?
Hadoop may be a great choice when:
  You need to process non-relational (unstructured) data
  You are processing large amounts of data
  You can run your jobs in batch mode
  And you know how to integrate it with other systems
Hadoop may not be a great choice when:
  You're processing small amounts of data
  Your algorithms require communication among nodes
  You need low latency or transactions
As always, use the best tool for the job
Managing the Elephant in the Room: Roles
System administrators
Developers
Analysts
Data stewards
System Administrators
Required skills:
  Strong Linux administration skills
  Networking knowledge
  Understanding of hardware
Job responsibilities:
  Install, configure, and upgrade Hadoop software
  Manage hardware components
  Monitor the cluster
  Integrate with other systems (e.g., Flume and Sqoop)
Developers
Required skills:
  Strong Java or scripting capabilities
  Understanding of MapReduce and algorithms
Job responsibilities:
  Write, package, and deploy MapReduce programs
  Optimize MapReduce jobs and Hive/Pig programs
Data Analyst / Business Analyst
Required skills:
  SQL
  Understanding of data analytics and data mining
Job responsibilities:
  Extract intelligence from the data
  Write Hive and/or Pig programs
Data Steward
Required skills:
  Data modeling and ETL
  Scripting skills
Job responsibilities:
  Catalog the data (analogous to a librarian for books)
  Manage the data lifecycle and retention
  Data quality control with SLAs
Combining Roles
System Administrator + Data Steward is analogous to a DBA
Required skills:
  Strong Linux administration skills
  Data modeling and ETL
  Scripting skills
Job responsibilities:
  Install, configure, and upgrade Hadoop software
  Manage hardware components
  Monitor the cluster
  Integrate with other systems (e.g., Flume and Sqoop)
  Manage the data lifecycle and retention
  Data quality control with SLAs
Conclusion
Thanks for your time! Questions?