Installation of Hadoop

Prerequisites/requirements:

1. Java 8 Runtime Environment (JRE): Hadoop 3 requires a Java 8 installation. I prefer using the offline installer.
   https://www.java.com/en/download/windows_offline.jsp
   Java 8 Development Kit (JDK):
   https://www.oracle.com/java/technologies/downloads/#java8-windows
   7-Zip (to unzip the downloaded Hadoop binaries, we should install 7-Zip):
   https://www.7-zip.org/download.html
   Hadoop 3.2.4 binaries:
   https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.4/hadoop-3.2.4.tar.gz
Steps:
Extract hadoop-3.2.4.tar.gz:
Create a folder with the name hadoopsetup (This PC > C: drive > hadoopsetup).
Copy the downloaded hadoop-3.2.4.tar.gz file and paste it into the hadoopsetup folder.
Right-click on the .gz file > Show more options > 7-Zip > Extract to "hadoop-3.2.4.tar\".
Right-click on the hadoop-3.2.4.tar file > Show more options > 7-Zip > Extract to "hadoop-3.2.4\" (the actual contents of Hadoop are taken out and we can now access the files).
Move the hadoop-3.2.4 folder out to C:\hadoopsetup.
2. Download the libraries from the following link:
https://1drv.ms/f/s!ArSg3Xpur4Grml7l087JBp_4bzks?e=aSqIQV
After unpacking the package, we should add the Hadoop native I/O libraries (typically winutils.exe, hadoop.dll, and the related native binaries): copy the 7 files and paste all of them into the hadoop-3.2.4\bin folder.
3. Setting up environment variables:
Open Advanced system settings > Environment Variables and create two variables:
HADOOP_HOME = C:\hadoopsetup\hadoop-3.2.4
JAVA_HOME = C:\Progra~1\Java\jdk-1.8
Then click on Path and add two entries:
%HADOOP_HOME%\bin
%JAVA_HOME%\bin
Inside the Hadoop folder, open etc > hadoop > core-site.xml with Notepad, paste the given content inside the configuration tags, and save.

CORE-SITE
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9820</value>
</property>
(fs.default.name is the deprecated alias of fs.defaultFS; either name works here.)
Now open the file hadoop-env.cmd with Notepad, set the JAVA_HOME line as given below, and save. (This is a Windows command file, not XML, so there are no configuration tags.)

HADOOP ENV
set JAVA_HOME=C:\Progra~1\Java\jdk-1.8 (path of Java)
HDFS-SITE
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///C:/hadoopsetup/hadoop-3.2.4/data/dfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///C:/hadoopsetup/hadoop-3.2.4/data/dfs/datanode</value>
</property>
MAPRED-SITE
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>MapReduce framework name</description>
</property>
YARN-SITE
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>Yarn Node Manager Aux Service</description>
</property>
Formatting the File System
hdfs namenode -format

STARTING HADOOP
.\start-dfs.cmd
.\start-yarn.cmd
jps
Important Links
http://localhost:9870/dfshealth.html
http://localhost:9864/datanode.html
http://localhost:8088/cluster
COMMANDS:
1. cd C:\hadoopsetup\hadoop-3.2.4\sbin
2. start-all.cmd
3. hdfs dfs -ls /
4. hdfs dfs -mkdir /data
5. hdfs dfs -touchz /data/test.dat
6. hdfs dfs -ls /data/
7. hdfs dfs -du /data/test.dat
8. hdfs dfs -put "C:\Users\ASUS\Desktop\Queries.txt" /zahra
9. hdfs dfs -ls /zahra
10. hdfs dfs -cat /zahra/Queries.txt
11. hdfs dfs -rm -r /abc/student.txt
12. hdfs dfs -copyToLocal /zahra/Queries.txt C:\
13. hdfs dfs -get /data/folder "C:\Users\ASUS\Desktop\HADOOPFILES"
14. hdfs dfs -appendToFile - /data/folder
15. hdfs dfs -cp /abc/student.txt /data/
16. hdfs dfs -mv /data/student.txt /zeenat/
17. hdfs dfs -rmdir /test ----------------------- to remove an empty directory
18. hdfs dfs -rm /zeenat/student.txt --------------------- to remove files
19. hdfs dfs -rm -r /abc/student.txt -------------------- to remove directories/files recursively
20. C:\hadoopsetup\hadoop-3.2.4\sbin>hdfs dfs -usage mkdir
21. C:\hadoopsetup\hadoop-3.2.4\sbin>hdfs dfs -help
22. hdfs dfs -moveFromLocal "C:\Users\ASUS\Desktop\HADOOPFILES\hdfs.txt" /zeenat
23. hdfs dfs -getmerge /zeenat/myfile.txt /zeenat/hdfs.txt "C:/Users/ASUS/Desktop/HADOOPFILES/result.txt"
24. Command that is used to list files of the local file system:
hdfs dfs -ls file:///C:/Users/ASUS/Desktop/HADOOPFILES
25. Command that is used to display content of the local file system:
hdfs dfs -cat file:///C:/Users/ASUS/Desktop/HADOOPFILES/result.txt
26. hdfs dfs -checksum /zeenat/hdfs.txt
27. hdfs dfs -chgrp zeenat /zeenat/hdfs.txt
28. hdfs dfs -chown Asus:zeenat /zeenat/myfile.txt
29. hdfs dfs -expunge
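The same operations are also available programmatically. Below is a minimal Java sketch (the class name and paths are illustrative assumptions, not from the steps above) using the Hadoop FileSystem API to mirror -mkdir, -put, and -ls:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsOps {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode configured in core-site.xml
        conf.set("fs.defaultFS", "hdfs://localhost:9820");
        FileSystem fs = FileSystem.get(conf);

        // Equivalent of: hdfs dfs -mkdir /data
        fs.mkdirs(new Path("/data"));

        // Equivalent of: hdfs dfs -put "C:\Users\ASUS\Desktop\Queries.txt" /data
        fs.copyFromLocalFile(new Path("C:/Users/ASUS/Desktop/Queries.txt"),
                new Path("/data/Queries.txt"));

        // Equivalent of: hdfs dfs -ls /data
        for (FileStatus st : fs.listStatus(new Path("/data"))) {
            System.out.println(st.getPath());
        }
        fs.close();
    }
}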
Word Count example:
cd C:\hadoopsetup\hadoop-3.2.4\sbin
start-all.cmd
start-yarn.cmd
jps

The examples jar is located at hadoopsetup > hadoop-3.2.4 > share > hadoop > mapreduce > hadoop-mapreduce-examples-3.2.4.jar.

Open Notepad, write/add some content, and save the file.

hdfs dfs -mkdir /directory1
hdfs dfs -put "C:\Users\ASUS\Desktop\HADOOPFILES\ab.txt" /directory1
hadoop jar C:\hadoopsetup\hadoop-3.2.4\share\hadoop\mapreduce\hadoop-mapreduce-examples-3.2.4.jar wordcount /directory1 outputdir1

(Copy the path of the jar file and the name of the jar file, paste the file name adding .jar, then enter the class name (wordcount), the input directory, and the output directory.)

localhost:9870
localhost:8088 (to check the status)

In localhost:9870, click on user Asus, click on the output directory, and download the file.
Problem 1: Character Count
Objective: Count the number of occurrences of each character in a text file.
Step-by-Step Solution:
1. Setup the Project:
   - Create a new Java project in Eclipse.
   - Add the Hadoop library to the build path.
2. Create the Mapper Class:
   - This Mapper reads each line of text, splits it into characters, and emits each character with a count of one.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CharCountMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text character = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // Emit each character of the line with a count of one
        for (char c : line.toCharArray()) {
            character.set(Character.toString(c));
            context.write(character, one);
        }
    }
}
3. Create the Reducer Class:
This Reducer sums the counts for each
character.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CharCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all counts received for this character
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
4. Create the Driver Class:
This class sets up and runs the
MapReduce job.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CharCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "character count");
        job.setJarByClass(CharCount.class);
        job.setMapperClass(CharCountMapper.class);
        // The reducer is associative, so it can also serve as the combiner
        job.setCombinerClass(CharCountReducer.class);
        job.setReducerClass(CharCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
5. Run the Job:
Create input and output directories.
Place the input text file in the input directory.
Run the job from Eclipse, passing the input and output paths as arguments (a jar-based run is sketched below).
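Alternatively, if the three classes are exported to a jar (the jar path, jar name, and HDFS paths here are assumed for illustration), the job can be launched from the command line just like the WordCount example:

hadoop jar C:\Users\ASUS\Desktop\HADOOPFILES\CharCount.jar CharCount /directory1 /charoutput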
2. WordCount
hdfs dfs -mkdir /input
Copy a text file from the local system and paste it into the "input" directory in HDFS.
Open Eclipse.
Click on File...> New...> Java Project.
Enter the project name "MapReduceWordCount".
Click on Next...> Finish.
Right-click on "MapReduceWordCount"...> New...> Package...> enter the package name "com.mapreduce.wc"...> Finish.
Again right-click on "MapReduceWordCount"...> New...> Build Path...> Configure Build Path...> Libraries...> Add External JARs.
Go to hadoopsetup...> hadoop-3.2.4...> share...> hadoop...> client...> add all JAR files.
Click on Add External JARs...> common...> add all JAR files.
Click on Add External JARs...> common...> lib...> add all JAR files.
Click on Add External JARs...> yarn...> add all JAR files.
Click on Add External JARs...> mapreduce...> add all JAR files.
Click on Add External JARs...> hdfs...> add all JAR files.
After adding all JAR files, click on Apply and Close.
Click on the package name "com.mapreduce.wc"...> New...> Class...> in the Name field enter "WordCount"...> Finish.
Copy and paste the program...> File...> Save.
If any error occurs in the program...> right-click on "MapReduceWordCount"...> Build Path...> Configure Build Path...> Libraries...> Add External JARs...> JAR files of the lib folders of yarn, hdfs, mapreduce, and common.
Go to the project file...> right-click...> Export...> inside the Java folder click on JAR file...> Next.
To change the path, click on Browse and choose any location...> create a folder with the name "JARFILES"...> click on the JARFILES folder, save the file as WordCountMapReduce...> Save...> Finish...> OK.

hdfs dfs -mkdir /input
hdfs dfs -put C:\Users\ASUS\Desktop\HADOOPFILES\bigdata.txt /input
hadoop jar C:\Users\ASUS\Downloads\b.jar com.mapreduce.wc.WordCount /input/bigdata.txt /output
hdfs dfs -cat /output/*
Mapper class: In Hadoop's MapReduce
framework, the Mapper class is a core component
responsible for processing the input data and
producing key-value pairs that are used as the input
for the subsequent stages of the MapReduce job.
Context:
It helps us interact with the outside world (other components of Hadoop such as YARN, MapReduce, ...).
Mapper class: Processes input data and produces key-value pairs.
map() method: Transforms each input record into intermediate key-value pairs.
Context: Used to emit key-value pairs to the framework for further processing.
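For reference, here is a minimal sketch of the TokenizerMapper that the WordCount driver below refers to, following the shape of the classic Hadoop WordCount example:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Split each input line into tokens and emit (word, 1) for every token
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}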
MAPPER OUTPUT <K,V> ...> PARTITIONER OUTPUT <K, LIST[V]> ...> REDUCER
PARTITIONER OUTPUT: Generates a list of values against every key.
Reducer class: In Hadoop's MapReduce
framework, the Reducer class plays a crucial role
in processing the intermediate key-value pairs
generated by the Mapper. It is responsible for
aggregating, summarizing, or otherwise processing
the data to produce the final output of the
MapReduce job.
Reducer class: Processes the intermediate key-
value pairs generated by the Mapper.
reduce() method: Aggregates or processes the list of values for each key to produce the final output.
Context: Used to emit the final key-value
pairs, which are written as the output of the
MapReduce job.
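Likewise, a minimal sketch of the IntSumReducer used by the driver below, again following the classic WordCount example:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    // Sum the counts for each word and emit (word, total)
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}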
Driver Class: The major component in a MapReduce job is the Driver Class. It is responsible for setting up a MapReduce job to run in Hadoop.
public static void main(String[] args) throws
Exception {
Configuration conf = new Configuration();
The Configuration object contains all Hadoop settings necessary to launch your app. It is in key-value format and is read from the XML files in etc/hadoop. You can also use Configuration to change configuration parameters.
Job job = Job.getInstance(conf, "word count");
It allows the user to configure the job, submit it,
control its execution, and query the state.
job.setJarByClass(WordCount.class);
//specify various job-specific parameters
job.setMapperClass(TokenizerMapper.class);
//setting mapper class
job.setCombinerClass(IntSumReducer.class);
//setting combiner class
job.setReducerClass(IntSumReducer.class);
//setting reducer class
job.setOutputKeyClass(Text.class);
//setting output key
job.setOutputValueClass(IntWritable.class);
//setting output value
FileInputFormat.addInputPath(job, new Path(args[0]));
//setting the input path
FileOutputFormat.setOutputPath(job, new Path(args[1]));
//setting the output path
System.exit(job.waitForCompletion(true) ? 0 : 1);
//submitting the job and waiting for it to complete
}
Driver Class: The main entry point of a
MapReduce job, responsible for setting up,
configuring, and submitting the job to the Hadoop
cluster.
Job Configuration: Defines input/output paths,
Mapper/Reducer classes, and other job settings.
Job Submission: Submits the job to the cluster and
monitors its progress until completion.
Hive:
cd C:\hadoopsetup\hadoop-3.2.4\sbin
start-all.cmd
start-yarn.cmd
cd C:\hive\apache-hive-3.1.2-bin\apache-hive-3.1.2-bin\bin
hive --service schematool -dbType derby -initSchema
hdfs dfsadmin -safemode leave
hive --service schematool -dbType derby -initSchema
(if the first schematool run fails because HDFS is still in safe mode, leave safe mode and run it again)
C:\hive\apache-hive-3.1.2-bin\apache-hive-3.1.2-bin\bin>hive
hive> create database if not exists abc;
hive> show databases;
show databases like 'm*';
describe database abc;
drop database abc;
Create table syntax:
CREATE TABLE table_name (
  column_name1 data_type,
  column_name2 data_type, ...
)
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION 'path']

ROW FORMAT: Optional specification of how rows are formatted (e.g., DELIMITED).
STORED AS: Optional file format for storing the data (e.g., TEXTFILE, PARQUET).

hive> create table customer(id INT, fname STRING, lname STRING, city STRING)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '|'
    > STORED AS TEXTFILE;
describe customer;
Create any text file, insert data into the file, and save it.
LOAD DATA LOCAL INPATH 'C:/Users/ASUS/Desktop/HADOOPFILES/hive.txt' INTO TABLE customer;
(paste the path of the saved text file)
select * from customer;
drop table customer;
alter table customer rename to employees;
alter table employees add columns (salary int);
hive> alter table employees
    > change column lname mname string;
hive> alter table employees replace columns(id int, fname string, mname string, city string);
DML:
hive> insert into table stu values(100, 'Rohan', 10, 'ECE');
hive> insert into stu values (200, 'Priya', 9, 'CE'), (300, 'Amit', 7, 'CSE'), (400, 'mohit', 10, 'CSE');
hive> create table result(id INT, name STRING, marks INT, branch STRING);
Append data from an existing table:
hive> insert into result select id, name, marks, course from stu;
Truncate: hive> truncate table result;
INSERT OVERWRITE TABLE result SELECT * FROM stu;
Hive Partitioning:
Partitioning in Hive is a way of
dividing a large table into smaller, more
manageable pieces based on the value
of one or more columns. This helps in
faster query execution by scanning only
relevant partitions instead of the entire
table.
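For example, once the part_stu_branch table built below is in place, a query that filters on the partition column (a hypothetical query, assuming a CSE partition exists):

select * from part_stu_branch where branch = 'CSE';

scans only the branch=CSE directory instead of the whole table.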
cd C:\hadoopsetup\hadoop-3.2.4\sbin
start-all.cmd
start-yarn.cmd
cd C:\hive\apache-hive-3.1.2-bin\apache-hive-3.1.2-bin\bin
hdfs dfsadmin -safemode leave
hive --service schematool -dbType derby -initSchema
hive
hive> show tables;
create table students(id INT, name STRING, branch STRING)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > STORED AS TEXTFILE;
load data local inpath 'C:\Users\ASUS\Desktop\HADOOPFILES\hivepartitioning.txt' into table students;
create table part_stu_branch(id INT, name STRING)
partitioned by (branch STRING);
set hive.exec.dynamic.partition.mode = nonstrict;
insert overwrite table part_stu_branch partition(branch)
    > select id, name, branch from students;
Open another command prompt terminal:
cd C:\hadoopsetup\hadoop-3.2.4\sbin
start-all.cmd
hdfs dfs -ls /user/hive/warehouse/part_stu_branch
hdfs dfs -ls "/user/hive/warehouse/part_stu_branch/branch=CSE"
hdfs dfs -cat "/user/hive/warehouse/part_stu_branch/branch=CSE/000000_0"
Hive Bucketing:
SET hive.enforce.bucketing=true;
hive> create table st_bucket(id INT, name STRING, branch STRING)
    > clustered by (id) into 3 buckets
    > row format delimited
    > fields terminated by ',';
insert overwrite table st_bucket select * from students;
Open a new cmd terminal:
hdfs dfs -ls "/user/hive/warehouse/st_bucket"
hdfs dfs -cat "/user/hive/warehouse/st_bucket/000000_0"
Hive Operators:
Hive operators are used in Hive Query Language
(HiveQL) to perform various types of data
manipulation and calculations, much like operators
in SQL.
1. Relational Operators:
Relational operators compare two values and return
a Boolean result (TRUE or FALSE).
= (Equal to): Checks if two values are equal.
!= or <> (Not equal to): Checks if two values are not equal.
> (Greater than): Checks if the left value is
greater than the right value.
< (Less than): Checks if the left value is less than
the right value.
>= (Greater than or equal to): Checks if the left
value is greater than or equal to the right value.
<= (Less than or equal to): Checks if the left
value is less than or equal to the right value.
E.g.:
SELECT * FROM students WHERE id > 20;
SELECT * FROM students WHERE id >= 20 AND branch = 'ME';