Ex. No:
HADOOP COMMANDS
AIM:
To study and explain the implementation of Hadoop commands.
COMMANDS:
1) ls: This command lists the files and directories under the given path. Use -ls -R for a
recursive listing (the older -lsr form is deprecated); it is useful when we want the full
hierarchy of a folder.
SYNTAX:
bin/hdfs dfs -ls <path>
EX:
[cloudera@quickstart ~]$ hdfs dfs -ls /
Found 21 items
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:14 /19140404
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:29 /19140404partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /Partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /Computer
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /Desktop
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /File
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:53 /partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:41 /parthamramachandran
drwxrwxrwx - hdfs supergroup 0 2017-10-23 09:15 /benchmarks
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /cloudera
drwxr-xr-x - hbase supergroup 0 2022-04-10 21:55 /hbase
-rw-r--r-- 1 cloudera supergroup 0 2022-03-13 23:26 /msg.txt
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /one
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:59 /simple
drwxr-xr-x - solr solr 0 2017-10-23 09:18 /solr
drwxrwxrwt - hdfs supergroup 0 2022-02-25 20:30 /tmp
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /two
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /user
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /var
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 22:52 /ramachandran.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 23:15 /ramachandran1.txt
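Each line of the listing can be read column by column: permissions, replication factor (- for directories), owner, group, size in bytes, modification date and time, and path. As a minimal sketch, the script below counts directories and files from three lines copied from the listing above. The variable names and the use of awk are illustrative assumptions, not part of the lab setup.

```shell
# Sample lines copied from the listing above (not live cluster output).
listing='drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:14 /19140404
drwxr-xr-x - hbase supergroup 0 2022-04-10 21:55 /hbase
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 22:52 /ramachandran.txt'

# The first character of the permission string is "d" for a directory
# and "-" for a regular file.
summary=$(printf '%s\n' "$listing" | awk '
  /^d/ { dirs++ }
  /^-/ { files++ }
  END  { printf "directories=%d files=%d", dirs, files }')

echo "$summary"
```

On a cluster, the same awk program could be fed from hdfs dfs -ls / directly; the "Found N items" header matches neither pattern and is ignored.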
2) mkdir: Creates a directory. HDFS does not provide a home directory for the user by
default, so a directory has to be created before files can be stored.
SYNTAX:
bin/hdfs dfs -mkdir <folder name>
EX:
[cloudera@quickstart ~]$ hdfs dfs -mkdir /bigdata
[cloudera@quickstart ~]$ hdfs dfs -ls /
Found 22 items
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:14 /19140404
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:29 /19140404partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /Partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /Computer
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /Desktop
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /File
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:53 /partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:41 /parthamramachandran
drwxrwxrwx - hdfs supergroup 0 2017-10-23 09:15 /benchmarks
drwxr-xr-x - cloudera supergroup 0 2022-04-17 22:12 /bigdata
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /cloudera
drwxr-xr-x - hbase supergroup 0 2022-04-10 21:55 /hbase
-rw-r--r-- 1 cloudera supergroup 0 2022-03-13 23:26 /msg.txt
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /one
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:59 /simple
drwxr-xr-x - solr solr 0 2017-10-23 09:18 /solr
drwxrwxrwt - hdfs supergroup 0 2022-02-25 20:30 /tmp
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /two
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /user
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /var
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 22:52 /ramachandran.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 23:15 /ramachandran1.txt
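The description notes that HDFS has no home directory by default, while the example creates /bigdata. A minimal sketch of creating the home directory itself is given below, assuming the conventional location /user/<username> and the -p option of -mkdir, which also creates missing parent directories. The helper name make_home and the RUN switch are hypothetical; by default the command is only printed, so the script can be tried without a cluster.

```shell
# Hypothetical helper: prints the command by default;
# set RUN=1 to actually execute it against a cluster.
make_home() {
  user_name="$1"
  if [ "${RUN:-0}" = "1" ]; then
    hdfs dfs -mkdir -p "/user/$user_name"
  else
    echo "hdfs dfs -mkdir -p /user/$user_name"
  fi
}

make_home cloudera
```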
3) touchz: Creates an empty (zero-length) file at the given path.
SYNTAX:
bin/hdfs dfs -touchz <file_path>
EX:
[cloudera@quickstart ~]$ hdfs dfs -touchz /bigdata/hello404.txt
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 1 item
-rw-r--r-- 1 cloudera supergroup 0 2022-03-09 03:47 /bigdata/hello404.txt
4) copyFromLocal (or) put: Copies files/folders from the local file system to HDFS.
This is one of the most frequently used commands. The local file system means the files
present on the OS.
SYNTAX:
bin/hdfs dfs -copyFromLocal <local file path> <dest(present on hdfs)>
EX:
[cloudera@quickstart ~]$ hdfs dfs -copyFromLocal hello404.txt /bigdata
copyFromLocal: `/bigdata/hello404.txt': File exists
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 2 items
-rw-r--r-- 1 cloudera supergroup 0 2022-04-17 22:15 /bigdata/file404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bigdata/hello404.txt
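The "File exists" message in the example is presumably raised because /bigdata/hello404.txt had already been created with -touchz, and -copyFromLocal does not overwrite an existing destination by default. One way around this, sketched below, is the -f option, which overwrites the destination. The helper name put_overwrite and the RUN switch are hypothetical; by default the command is only printed rather than executed.

```shell
# Hypothetical helper: prints the command by default;
# set RUN=1 to actually execute it against a cluster.
put_overwrite() {
  local_src="$1"
  hdfs_dest="$2"
  if [ "${RUN:-0}" = "1" ]; then
    # -f overwrites the destination if it already exists
    hdfs dfs -copyFromLocal -f "$local_src" "$hdfs_dest"
  else
    echo "hdfs dfs -copyFromLocal -f $local_src $hdfs_dest"
  fi
}

put_overwrite hello404.txt /bigdata
```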
5) cat: To print file contents.
SYNTAX:
bin/hdfs dfs -cat <path>
EX:
[cloudera@quickstart ~]$ hdfs dfs -cat /bigdata/hello404.txt
Hello I am partham from national engineering college
bye
6) moveFromLocal: This command moves a file from the local file system to HDFS; the local
copy is deleted once the transfer succeeds.
SYNTAX:
bin/hdfs dfs -moveFromLocal <local src> <dest(on hdfs)>
EX:
[cloudera@quickstart ~]$ mkdir bda
[cloudera@quickstart ~]$ cd bda
[cloudera@quickstart bda]$ vi bda404.txt
[cloudera@quickstart bda]$ cat bda404.txt
thanks for visiting bda lab
welcome
[cloudera@quickstart bda]$ cd
[cloudera@quickstart ~]$ hdfs dfs -moveFromLocal bda/bda404.txt /bigdata
22/04/17 22:45:19 WARN hdfs.DFSClient: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1281)
at java.lang.Thread.join(Thread.java:1355)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 3 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:45 /bigdata/bda404.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-04-17 22:15 /bigdata/file404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bigdata/hello404.txt
7) copy (cp): This command is used to copy files within HDFS.
SYNTAX:
bin/hdfs dfs -cp <src(on hdfs)> <dest(on hdfs)>
EX:
[cloudera@quickstart ~]$ hdfs dfs -mkdir /bda
[cloudera@quickstart ~]$ hdfs dfs -ls /
Found 23 items
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:14 /19140404
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:29 /19140404partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /Partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /Computer
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /Desktop
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /File
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:53 /partham
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:41 /parthamramachandran
drwxr-xr-x - cloudera supergroup 0 2022-04-17 22:49 /bda
drwxrwxrwx - hdfs supergroup 0 2017-10-23 09:15 /benchmarks
drwxr-xr-x - cloudera supergroup 0 2022-04-17 22:45 /bigdata
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:08 /cloudera
drwxr-xr-x - hbase supergroup 0 2022-04-10 21:55 /hbase
-rw-r--r-- 1 cloudera supergroup 0 2022-03-13 23:26 /msg.txt
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:10 /one
drwxr-xr-x - cloudera supergroup 0 2022-03-13 22:59 /simple
drwxr-xr-x - solr solr 0 2017-10-23 09:18 /solr
drwxrwxrwt - hdfs supergroup 0 2022-02-25 20:30 /tmp
drwxr-xr-x - cloudera supergroup 0 2022-03-13 23:13 /two
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /user
drwxr-xr-x - hdfs supergroup 0 2017-10-23 09:17 /var
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 22:52 /ramachandran.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-03-15 23:15 /ramachandran1.txt
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 3 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:45 /bigdata/bda404.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-04-17 22:15 /bigdata/file404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bigdata/hello404.txt
[cloudera@quickstart ~]$ hdfs dfs -cp /bigdata/bda404.txt /bda
[cloudera@quickstart ~]$ hdfs dfs -ls /bda
Found 1 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:52 /bda/bda404.txt
8) move (mv): This command is used to move files within HDFS.
SYNTAX:
bin/hdfs dfs -mv <src(on hdfs)> <dest(on hdfs)>
EX:
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 3 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:45 /bigdata/bda404.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-04-17 22:15 /bigdata/file404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bigdata/hello404.txt
[cloudera@quickstart ~]$ hdfs dfs -mv /bigdata/hello404.txt /bda
[cloudera@quickstart ~]$ hdfs dfs -ls /bigdata
Found 2 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:45 /bigdata/bda404.txt
-rw-r--r-- 1 cloudera supergroup 0 2022-04-17 22:15 /bigdata/file404.txt
[cloudera@quickstart ~]$ hdfs dfs -ls /bda
Found 2 items
-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:52 /bda/bda404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bda/hello404.txt
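9) du: Gives the size of each file in a directory. The first column is the file size in bytes
and the second is the disk space consumed including all replicas.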
SYNTAX:
bin/hdfs dfs -du <dirName>
EX:
[cloudera@quickstart ~]$ hdfs dfs -cat /bda/bda404.txt
thanks for visiting bda lab
welcome
[cloudera@quickstart ~]$ hdfs dfs -du /bda/bda404.txt
36 36 /bda/bda404.txt
10) dus: This command gives the total size of a directory/file. The -dus form is deprecated;
-du -s is the current equivalent and is used in the example.
SYNTAX:
bin/hdfs dfs -du -s <dirName>
EX:
[cloudera@quickstart ~]$ hdfs dfs -du -s /bda
94 94 /bda
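The total reported by -du -s is the sum of the sizes of the files under the directory. As a quick check, the sketch below adds up the size column (fifth field) of the /bda listing shown in the move example. The awk one-liner and the variable names are illustrative assumptions, not part of the lab procedure.

```shell
# The two /bda entries copied from the move example (not live cluster output).
listing='-rw-r--r-- 1 cloudera supergroup 36 2022-04-17 22:52 /bda/bda404.txt
-rw-r--r-- 1 cloudera supergroup 58 2022-04-17 22:31 /bda/hello404.txt'

# Field 5 of each listing line is the file size in bytes.
total=$(printf '%s\n' "$listing" | awk '{ total += $5 } END { print total }')

echo "$total"   # 36 + 58 = 94, matching the -du -s output
```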
11) stat: Prints statistics about a file or directory. By default it prints the last modified
time, reported in UTC.
SYNTAX:
bin/hdfs dfs -stat <hdfs file>
EX:
[cloudera@quickstart ~]$ hdfs dfs -stat /bigdata
2022-04-18 05:58:50
[cloudera@quickstart ~]$ hdfs dfs -stat /bda/bda404.txt
2022-04-18 05:52:22
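12) set replication factor: This command is used to change the replication factor of a
file in HDFS. If the path is a directory, the change is applied to all files under it.
The -w option makes the command wait until the replication completes.
SYNTAX:
bin/hdfs dfs -setrep [-R] [-w] <replication factor> <path>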
EX:
[cloudera@quickstart ~]$ hdfs dfs -setrep -R 5 /bda/bda404.txt
Replication 5 set: /bda/bda404.txt
[cloudera@quickstart ~]$ hdfs dfs -du /bda/bda404.txt
36 180 /bda/bda404.txt
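The second column of the -du output is the space consumed across all replicas, so it should equal the file size multiplied by the replication factor. A small arithmetic check using the values from the example follows; the variable names are illustrative.

```shell
file_size=36     # bytes, first column of the -du output
replication=5    # value passed to -setrep in the example

# Space consumed with all replicas = file size x replication factor.
consumed=$((file_size * replication))

echo "$file_size $consumed /bda/bda404.txt"
```

This reproduces the 36 and 180 seen in the -du line of the example.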
13) remove: Removes a file from HDFS. Use -rm -r to remove a directory and its contents.
SYNTAX:
bin/hdfs dfs -rm <path>
EX:
[cloudera@quickstart ~]$ hdfs dfs -rm /bda/bda404.txt
Deleted /bda/bda404.txt
[cloudera@quickstart ~]$ hdfs dfs -du /bda/bda404.txt
du: `/bda/bda404.txt': No such file or directory
RESULT:
Thus, the Hadoop commands were studied, executed, and verified successfully.