Installing Hadoop on Ubuntu
AMRITPAL SINGH
Introduction
• Hadoop is a Java-based programming framework that supports the
processing and storage of extremely large datasets on a cluster of
inexpensive machines.
• It was the first major open source project in the big data playing field
and is sponsored by the Apache Software Foundation.
Introduction
• Hadoop 2.7 consists of four main layers:
• Hadoop Common is the collection of utilities and libraries that
support other Hadoop modules.
• HDFS, which stands for Hadoop Distributed File System, is responsible
for persisting data to disk.
Introduction
• YARN, short for Yet Another Resource Negotiator, acts as the
cluster's "operating system," scheduling and managing the compute
resources that work with the data stored in HDFS.
• MapReduce is the original processing model for Hadoop clusters. It
distributes work within the cluster in a map phase, then organizes and
reduces the results from the nodes into a response to a query.
• Many other processing models are available for the 2.x version of
Hadoop.
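As a loose analogy (not Hadoop itself), the map-then-reduce pattern resembles a familiar shell pipeline: tokenizing input is the "map" step, and grouping and counting identical keys is the "reduce" step:

```shell
# "Map": split the text into one word per line.
# "Reduce": sort groups identical words together, then uniq -c counts each group.
printf 'to be or not to be\n' | tr ' ' '\n' | sort | uniq -c
```

The output lists each distinct word with its count (for example, `2 be` and `2 to`), which is exactly the shape of a word-count MapReduce job.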
Introduction
• Hadoop clusters are relatively complex to set up, so the project
includes a stand-alone mode which is suitable for learning about
Hadoop, performing simple operations, and debugging.
• We'll install Hadoop in stand-alone mode and run one of the example
MapReduce programs it includes to verify the installation.
Prerequisites
• An Ubuntu 16.04 server with a non-root user with sudo privileges
• Java
Steps
• Step 1 — Installing Java
• To get started, we'll update our package list:
• sudo apt-get update
• Next, install OpenJDK, the default Java Development Kit on Ubuntu
16.04.
Steps
• sudo apt-get install default-jdk
• Once the installation is complete, let's check the version.
• java -version
• openjdk version "1.8.0_91"
• OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-
3ubuntu1~16.04.1-b14)
• OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
Steps
• Step 2 — Installing Hadoop
• With Java in place, we'll visit the Apache Hadoop Releases page to
find the most recent stable release.
• http://hadoop.apache.org/releases.html
Steps
• On the server, we'll use wget to fetch it:
• wget http://apache.mirrors.tds.net/hadoop/common/hadoop-
2.7.3/hadoop-2.7.3.tar.gz
• In order to make sure that the file we downloaded hasn't been
altered, we'll do a quick check using SHA-256.
Steps
• Next, we'll copy the link to the .mds checksum file from the releases
page, then use wget to transfer it:
• wget
https://dist.apache.org/repos/dist/release/hadoop/common/hadoop
-2.7.3/hadoop-2.7.3.tar.gz.mds
Steps
• Then run the verification:
• shasum -a 256 hadoop-2.7.3.tar.gz
• Output
• d489df3808244b906eb38f4d081ba49e50c4603db03efd5e594a1e98b
09259c2 hadoop-2.7.3.tar.gz
Steps
• Compare this value with the SHA-256 value in the .mds file:
• cat hadoop-2.7.3.tar.gz.mds
Steps
• You can safely ignore the difference in case and the spaces.
• The output of the command we ran against the file we downloaded
from the mirror should match the value in the file we downloaded
from apache.org.
Steps
• Now that we've verified that the file wasn't corrupted or changed,
we'll use the tar command with the -x flag to extract, -z to
uncompress, -v for verbose output, and -f to specify that we're
extracting from a file.
• Use tab-completion or substitute the correct version number in the
command below:
Steps
• tar -xzvf hadoop-2.7.3.tar.gz
• Finally, we'll move the extracted files into /usr/local, the appropriate
place for locally installed software.
• Change the version number, if needed, to match the version you
downloaded.
Steps
• sudo mv hadoop-2.7.3 /usr/local/hadoop
• With the software in place, we're ready to configure its environment.
Steps
• Step 3 — Configuring Hadoop's Java Home
• Hadoop requires that you set the path to Java, either as an
environment variable or in the Hadoop configuration file.
Steps
• The path to Java, /usr/bin/java, is a symlink to /etc/alternatives/java,
which is in turn a symlink to the default Java binary.
• We will use readlink with the -f flag to follow every symlink in every
part of the path, recursively.
• Then, we'll use sed to trim bin/java from the output to give us the
correct value for JAVA_HOME.
Steps
• To find the default Java path
• readlink -f /usr/bin/java | sed "s:bin/java::"
• Output
• /usr/lib/jvm/java-8-openjdk-amd64/jre/
Steps
• You can copy this output to set Hadoop's Java home to this specific
version, which ensures that if the default Java changes, this value will
not.
• Alternatively, you can use the readlink command dynamically in the
file so that Hadoop will automatically use whatever Java version is set
as the system default.
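As a sketch, the JAVA_HOME line in hadoop-env.sh could be written either way; the static path below is the one readlink reported above, so adjust it if your output differs:

```shell
# In /usr/local/hadoop/etc/hadoop/hadoop-env.sh, choose ONE of the following:

# Option 1: pin a specific Java installation (value copied from the readlink output above)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/

# Option 2: resolve whatever Java the system default points to, each time Hadoop starts
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
```

Option 1 keeps Hadoop on a known-good JDK even if the system default later changes; Option 2 tracks the system default automatically.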
Steps
• To begin, open hadoop-env.sh:
• sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Step 4 — Running Hadoop
• /usr/local/hadoop/bin/hadoop
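To verify the installation end to end, one of the example MapReduce programs that ships with Hadoop can be run against the bundled configuration files. This is a sketch: the jar name assumes version 2.7.3 (substitute yours if it differs), and the output directory must not exist before the job runs:

```shell
# Use Hadoop's bundled config files as sample input
mkdir ~/input
cp /usr/local/hadoop/etc/hadoop/*.xml ~/input

# Run the example grep job (jar name assumes Hadoop 2.7.3; adjust to your version).
# It counts matches of the given regular expression across the input files.
/usr/local/hadoop/bin/hadoop jar \
    /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
    grep ~/input ~/grep_example 'principal[.]*'

# Inspect the results written to the output directory
cat ~/grep_example/*
```

If the job prints a match count, stand-alone Hadoop is working.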