Department of Artificial Intelligence and Data Science 23ADR508 / Big Data Analytics
EX NO: 2
MAP REDUCE APPLICATION – WORD COUNT
DATE:
AIM:
To implement the word count example using MapReduce.
DESCRIPTION:
In the MapReduce word count example, we find the frequency of each word in the input. The
Mapper reads the input line by line and emits a (word, 1) pair for every word it encounters; the
Reducer then receives all the values for a given word and adds them up to produce the final count.
So, everything is represented in the form of key-value pairs.
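The same key-value flow can be sketched in plain Java without Hadoop (the class name and sample input below are illustrative, not part of the experiment): the "map" step emits a (word, 1) pair per token, and the "reduce" step sums the values that share a key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WordCountSketch {
    // Map + reduce in one pass: emit (token, 1) and immediately sum per key.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.split("\\s+")) {     // "map": one (word, 1) per token
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum);     // "reduce": sum values per key
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Lion Tiger Lion"));
        // prints {Lion=2, Tiger=1}
    }
}
```

In the real job, the shuffle phase between map and reduce is what groups all the 1s for a word onto one reducer; the `merge` call above stands in for that grouping.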
PROGRAM:
//wordcount.class
package wordcount2;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
public class Wordcount2 {

    // Mapper: for every whitespace-separated token in a line, emit (word, 1).
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum all the 1s emitted for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "WordCounter"); // new Job(conf, ...) is deprecated
        job.setJarByClass(Wordcount2.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input file/directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory (must not already exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
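Note that the mapper relies on StringTokenizer, which splits on whitespace only, so punctuation stays attached to words ("Tiger." and "Tiger" count separately). A quick standalone check of that behavior (the class and sample line below are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Collect tokens the same way the mapper's while-loop does.
    public static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer t = new StringTokenizer(line);
        while (t.hasMoreTokens()) {
            out.add(t.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Lion  Tiger.\tLion"));
        // prints [Lion, Tiger., Lion]
    }
}
```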
CODE :
1. Create Directory
Command : hdfs dfs -mkdir /experiment/ex1
Output:
2. Put the input file into the directory
Command : hdfs dfs -put /home/edureka/Desktop/i228.txt /experiment/ex1
Output:
3. Run the word count job
Command : hadoop jar /home/edureka/Desktop/hadoop-mapreduce-examples-2.2.0.jar wordcount
/experiment/ex1/i228.txt /experiment/output
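Once the job finishes, the result can be inspected with standard HDFS shell commands (a sketch assuming the paths used above; these need a running cluster, and the part-file name can vary with the number of reducers):

```shell
# List the job's output directory; a single reducer writes part-r-00000
hdfs dfs -ls /experiment/output
# Print the word counts (one "word<TAB>count" per line)
hdfs dfs -cat /experiment/output/part-r-00000
```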
Input file:
Output file :
Lion
Tiger
PREPARATION 30
LAB PERFORMANCE 30
REPORT 40
TOTAL 100
INITIAL OF THE FACULTY
RESULT:
Thus, the word count example using MapReduce was executed successfully and the output was
verified.