Hadoop 3.x Installation with HA – Automatic Failover
By: Venkata Narasimha Rao B, Contact: +91 9342707000
Host Details:
IP Address      FQDN               Hostname  Role in Storage Layer                  Role in Processing Layer
192.168.56.181  h3n1.hadoop.com    h3n1      Namenode, NFS Client, Zookeeper, ZKFC  Resource Manager
192.168.56.182  h3n2.hadoop.com    h3n2      Namenode, NFS Client, Zookeeper, ZKFC  NA
192.168.56.183  h3n3.hadoop.com    h3n3      Namenode, Datanode, Zookeeper          Node Manager
192.168.56.184  h3n4.hadoop.com    h3n4      Namenode, Datanode                     Node Manager
192.168.56.185  h3n5.hadoop.com    h3n5      Datanode                               Node Manager
192.168.56.186  h3edge.hadoop.com  h3edge    Edge Node, NFS Server                  Eco System Tools
TAR balls to be downloaded for this installation:
TAR Ball Name               Download Location                                                  TAR Ball Location in VM
hadoop-3.0.0-alpha4.tar.gz  https://archive.apache.org/dist/hadoop/core/hadoop-3.0.0-alpha4/  /var/www/html/hadoop_tools/
zookeeper-3.4.6.tar.gz      https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/        /var/www/html/hadoop_tools/
Clone the VMs and change the IP addresses as above.
1. Setup Password-less SSH for hadoop cluster installation
NOTE: This step should be followed on all the masters (Active NN, Standby NN, RM, etc.).
rm -rf ~/.ssh/id_rsa*
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ls -ltr ~/.ssh
for i in 192.168.56.{181,182,183,184,185,186}; do sshpass -p welcome1 ssh-copy-id $i; done
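To confirm that key-based login works before proceeding, a quick check (a minimal sketch using the same host list, not part of the original steps) can be run from each master; it should print every hostname without prompting for a password:
for i in 192.168.56.{181,182,183,184,185,186}; do ssh -o BatchMode=yes $i hostname; done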
2. Add the below on any one of the master hosts
sudo vi /etc/clustershell/groups.d/local.cfg
nn: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184
jn: 192.168.56.181 192.168.56.182 192.168.56.183
dn: 192.168.56.183 192.168.56.184 192.168.56.185
zk: 192.168.56.181 192.168.56.182 192.168.56.183
rm: 192.168.56.181 192.168.56.182 192.168.56.183
hadoop: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185
all: 192.168.56.181 192.168.56.182 192.168.56.183 192.168.56.184 192.168.56.185 192.168.56.186
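As a quick sanity check (not part of the original steps), the group definitions can be exercised from the same host; each command should print only the hostnames belonging to that group:
clush -g dn -b "hostname"
clush -g zk -b "hostname"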
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.182:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.183:/etc/clustershell/groups.d/local.cfg
sudo sshpass -p "welcome1" scp /etc/clustershell/groups.d/local.cfg 192.168.56.184:/etc/clustershell/groups.d/local.cfg
Verify that clustershell can reach all the hosts:
clush -g all -b "date"
3. Configure NTPD Service.
clush -g all -b "sudo sed -i 's/^server /#server /g' /etc/ntp.conf"
clush -g all -x 192.168.56.181 -b "echo 'server 192.168.56.181 prefer' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"
If you don't have internet access to your hosts:
clush -w 192.168.56.181 -b "echo 'server 127.127.1.0' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"
clush -w 192.168.56.181 -b "echo 'fudge 127.127.1.0 stratum 10' | sudo tee -a /etc/ntp.conf > /dev/null 2>&1"
Restart NTPD Service & Sync Time:
clush -g all -b "sudo systemctl restart ntpd"
clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpdate -d 192.168.56.181"
clush -g all -x 192.168.56.181 -b "/usr/sbin/ntpq -p"
clush -g all -b "date"
4. Download the Hadoop 3.x tarball
Download hadoop-3.0.0-alpha4.tar.gz from the internet and untar it as below:
http://mirror.fibergrid.in/apache/hadoop/common/
From Clush node:
clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;
clush -g all -b "sudo rm -rf /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo tar -xvzf /var/www/html/hadoop_tools/hadoop-3.0.0-alpha4.tar.gz -C /usr/local/"
clush -g all -b "du -sch /usr/local/hadoop-3.0.0-alpha4"
clush -g all -b "sudo ln -s /usr/local/hadoop-3.0.0-alpha4 /usr/local/hadoop"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /usr/local/hadoop*"
clush -g all -b "ls -ltr /usr/local | grep -i hadoop"
5. Set up HOME paths
sudo vi /etc/profile
Copy the below to the end of the file:
export JAVA_HOME=/usr/java/default
export ZOOKEEPER_HOME=/usr/local/zookeeper
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_ROOT_LOGGER="WARN,DRFA"
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export YARN_HOME=$HADOOP_HOME
export YARN_HOME_WARN_SUPPRESS=1
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME_WARN_SUPPRESS=1
export HADOOP_COMMON_HOME=$HADOOP_HOME
PATH=$PATH:$HOME/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${ZOOKEEPER_HOME}/bin:${JAVA_HOME}/bin
export PATH
Copy /etc/profile file to all other nodes from h3n1
clush -g all -x 192.168.56.181 --copy /etc/profile --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp /tmp/profile /etc/"
clush -g all -b "source /etc/profile"
Add the above export commands to the env files below (excluding PATH).
sudo vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
sudo vi /usr/local/hadoop/etc/hadoop/yarn-env.sh
6. Change the XML files as below
sudo vi /usr/local/hadoop/etc/hadoop/core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://h3n1.hadoop.com:8485;h3n2.hadoop.com:8485;h3n3.hadoop.com:8485/mycluster</value>
</property>
<property>
<name>topology.script.file.name</name>
<value>/usr/local/hadoop/etc/hadoop/topology.sh</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>360</value>
</property>
<property>
<name>fs.trash.checkpoint.interval</name>
<value>2</value>
</property>
sudo vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>h3n1,h3n2,h3n3,h3n4</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.h3n1</name>
<value>h3n1.hadoop.com:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.h3n2</name>
<value>h3n2.hadoop.com:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.h3n3</name>
<value>h3n3.hadoop.com:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.h3n4</name>
<value>h3n4.hadoop.com:9870</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.block.size</name>
<value>268435456</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/mnt/disk1/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/mnt/disk1/data,file:/mnt/disk2/data,file:/mnt/disk3/data</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/mnt/disk1/jnedits</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hdpuser/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/mnt/disk1/snn</value>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>/mnt/disk1/snn</value>
</property>
<property>
<name>dfs.namenode.checkpoint.period</name>
<value>3600</value>
</property>
<property>
<name>dfs.ha.log-roll.period</name>
<value>600</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>
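As a verification aid (not part of the original steps), the effective HA settings can be cross-checked locally with hdfs getconf once the configuration is in place:
hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.mycluster
hdfs getconf -namenodes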
sudo vi /usr/local/hadoop/etc/hadoop/slaves
192.168.56.183
192.168.56.184
192.168.56.185
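Note: most Hadoop 3.x builds rename the slaves file to workers; if the start scripts on this build ignore etc/hadoop/slaves, the same entries can simply be copied into etc/hadoop/workers:
sudo cp /usr/local/hadoop/etc/hadoop/slaves /usr/local/hadoop/etc/hadoop/workers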
Create directories on the namenodes and datanodes. From the clush node:
clush -g nn -b "sudo mkdir -p /mnt/disk1/name"
clush -g jn -b "sudo mkdir -p /mnt/disk1/jnedits"
clush -g nn -b "sudo mkdir -p /mnt/disk1/snn"
clush -g dn -b "sudo mkdir -p /mnt/disk1/data"
clush -g dn -b "sudo mkdir -p /mnt/disk2/data"
clush -g dn -b "sudo mkdir -p /mnt/disk3/data"
clush -g all -b "sudo chown -R hdpuser:hdpadmin /mnt"
Copy all the XMLs to other nodes.
clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"
clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"
7. Create log directories to store logs and copy the updated XMLs to other machines.
clush -g all -b "sudo mkdir /usr/local/hadoop/logs"
clush -g all -b "sudo chmod 777 -R /usr/local/hadoop/logs"
clush -g all -b "sudo chown hdpuser:hdpadmin -R /usr/local/hadoop/logs"
8. Set up ZooKeeper
Run the below on the master host:
clush -g zk -b "date"
clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;
clush -g zk -b "sudo rm -rf /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo tar -xvzf /var/www/html/hadoop_tools/zookeeper-3.4.6.tar.gz -C /usr/local/"
clush -g zk -b "du -sch /usr/local/zookeeper-3.4.6"
clush -g zk -b "sudo ln -s /usr/local/zookeeper-3.4.6 /usr/local/zookeeper"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "ls -ltr /usr/local/ | grep -i zookeeper"
Change the ZooKeeper configuration file:
sudo cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
sudo sed -i 's/^dataDir/#dataDir/g' /usr/local/zookeeper/conf/zoo.cfg
sudo vi /usr/local/zookeeper/conf/zoo.cfg
Comment out the dataDir property and add the below at the end of the file:
dataDir=/mnt/disk1/zkdata
server.1=h3n1.hadoop.com:2888:3888
server.2=h3n2.hadoop.com:2888:3888
server.3=h3n3.hadoop.com:2888:3888
Copy the ZooKeeper conf folder to all other hosts:
clush -g zk -x 192.168.56.181 "sudo rm -rf /tmp/conf "
clush -g zk -x 192.168.56.181 --copy /usr/local/zookeeper/conf --dest /tmp/
clush -g zk -x 192.168.56.181 "sudo cp -r /tmp/conf /usr/local/zookeeper/"
clush -g zk -b " echo; echo -e "ZOO_LOG_DIR=/usr/local/zookeeper/logs" | sudo tee -a
/usr/local/zookeeper/bin/zkEnv.sh > /dev/null"
clush -g zk -b "sudo mkdir -p /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /mnt/disk1/zkdata"
clush -g zk -b "sudo chown -R hdpuser:hdpadmin /usr/local/zookeeper*"
clush -g zk -b "sudo touch /mnt/disk1/zkdata/myid"
clush -w 192.168.56.181 -b "echo 1 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.182 -b "echo 2 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -w 192.168.56.183 -b "echo 3 | sudo tee /mnt/disk1/zkdata/myid > /dev/null"
clush -g zk -b "cat /mnt/disk1/zkdata/myid"
The ZooKeeper myid file should show as below:
h3n1 - 1
h3n2 - 2
h3n3 - 3
Start ZooKeeper on all the nodes:
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"
To check whether ZooKeeper is working fine:
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh status"
or, on each node, connect with its hostname:
zkCli.sh -server h3n1.hadoop.com:2181
clush -g all -b "jps | grep -v Jps; echo;"
9. Set up Rack Topology
Rack Awareness: Create the topology.sh file as below.
sudo vi /usr/local/hadoop/etc/hadoop/topology.sh
#!/bin/bash
#==================================
# Rack-awareness script: maps each host argument passed by the
# NameNode to a rack by looking it up in topology.data.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec< /usr/local/hadoop/etc/hadoop/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    # No match in topology.data: fall back to the default rack
    echo -n "/default-rack "
  else
    echo -n "$result "
  fi
done
#==================================
sudo chmod 755 /usr/local/hadoop/etc/hadoop/topology.sh
Create topology.data file as below.
sudo vi /usr/local/hadoop/etc/hadoop/topology.data
192.168.56.181 /rack1
192.168.56.182 /rack2
192.168.56.183 /rack1
192.168.56.184 /rack2
192.168.56.185 /rack2
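The script can be exercised locally before the cluster is up (a quick check, not in the original steps); passing one or more of the IPs above should print their rack mappings from topology.data:
bash /usr/local/hadoop/etc/hadoop/topology.sh 192.168.56.183 192.168.56.185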
hdfs dfsadmin -printTopology
10. Start Hadoop Daemons
Format the ZKFC znode in ZooKeeper and start the ZKFC service:
clush -g nn -b "/usr/local/hadoop/bin/hdfs zkfc -formatZK -force"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
Start Journal Nodes:
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
First-time-only activities:
On the Active NN:
hdfs namenode -format
hdfs --daemon start namenode
On all Standby NNs:
hdfs namenode -bootstrapStandby
hdfs --daemon start namenode
clush -g all -b "jps | grep -v Jps; echo;"
Check the folders below:
clush -g nn -b "ls /mnt/disk1/name/current/"
clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/"
Start DataNodes
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g all -b "jps | grep -v Jps; echo;"
Check the status of each NameNode:
hdfs haadmin -getServiceState h3n1
hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4
Fail over to another node.
hdfs haadmin -failover h3n1 h3n2
hdfs haadmin -failover h3n2 h3n3
hdfs haadmin -failover h3n3 h3n4
hdfs haadmin -failover h3n4 h3n1
hdfs haadmin -getServiceState h3n1
hdfs haadmin -getServiceState h3n2
hdfs haadmin -getServiceState h3n3
hdfs haadmin -getServiceState h3n4
Check the folders below:
clush -g nn -b "ls /mnt/disk1/name/current/"
clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"
clush -g dn -b "ls /mnt/disk1/data/current/"
clush -g dn -b "ls /mnt/disk2/data/current/"
clush -g dn -b "ls /mnt/disk3/data/current/"
To save the namespace:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave
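After saveNamespace, a new fsimage_<txid> file should appear in the NameNode metadata directory; this can be confirmed with:
clush -g nn -b "ls -ltr /mnt/disk1/name/current/ | tail -5"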
11. Start Hadoop Storage Layer
Stop any services started earlier:
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"
To start the entire cluster:
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g all -b "jps | grep -v Jps; echo;"
To see the fsimage & edits files
clush -g nn -b "ls /mnt/disk1/name/current/"
clush -g jn -b "ls /mnt/disk1/jnedits/mycluster/current/"
seen_txid: Contains the transaction ID of the last checkpoint (merge of edits into an fsimage) or edit-log roll
(finalization of the current edits_inprogress and creation of a new one). The file is not updated on every transaction,
only on a checkpoint or an edit-log roll.
committed-txid: Tracks the last transaction ID committed by a NameNode.
last-promised-epoch: When a NN becomes active, it increments the last-promised-epoch. While writing edits to the edit
log, the NN sends this epoch to the JNs to confirm it is the latest Active NN. Edits from a previous Active NN are discarded.
last-writer-epoch: Contains the epoch number associated with the NN that last actually wrote a transaction.
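These values can be inspected directly from the metadata directories created earlier (paths as configured above):
clush -g nn -b "cat /mnt/disk1/name/current/seen_txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/committed-txid"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-promised-epoch"
clush -g jn -b "cat /mnt/disk1/jnedits/mycluster/current/last-writer-epoch"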
Command to roll the edits manually:
hdfs dfsadmin -rollEdits
12. Some interesting points about the storage layer
Check default Hadoop values in 3.x:
sudo jar -tf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar | grep core-
sudo jar -tf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar | grep hdfs-
sudo jar -xf /usr/local/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha4.jar core-default.xml
sudo jar -xf /usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-3.0.0-alpha4.jar hdfs-default.xml
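After extracting the default XMLs (into the current working directory), the shipped default for any key can be checked with grep; dfs.replication is just an example key here:
grep -A1 '<name>dfs.replication</name>' hdfs-default.xml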
13. To start YARN
clush -g rm -b "date"
On h3n1:
sudo vi /usr/local/hadoop/etc/hadoop/yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>h3n1,h3n2,h3n3</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.h3n1</name>
<value>h3n1.hadoop.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.h3n2</name>
<value>h3n2.hadoop.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.h3n3</name>
<value>h3n3.hadoop.com</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.h3n1</name>
<value>h3n1.hadoop.com:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.h3n2</name>
<value>h3n2.hadoop.com:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.h3n3</name>
<value>h3n3.hadoop.com:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>h3n1.hadoop.com:2181,h3n2.hadoop.com:2181,h3n3.hadoop.com:2181</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/apps/yarn/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>1296000</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
(If this 3.x build ships mapred-site.xml directly and there is no .template file, skip the copy and edit mapred-site.xml in place.)
sudo vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
Add the below between <configuration> and </configuration>:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>h3n1.hadoop.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>h3n1.hadoop.com:19888</value>
</property>
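If MapReduce jobs later fail with an error about not finding MRAppMaster, some Hadoop 3.x setups also need HADOOP_MAPRED_HOME passed to the MR processes; the properties below are an optional, hedged addition to mapred-site.xml (not part of the original walkthrough), using the install path from this guide:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>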
Copy all the XMLs to other nodes.
clush -g all -x 192.168.56.181 "sudo rm -rf /tmp/hadoop"
clush -g all -x 192.168.56.181 --copy /usr/local/hadoop/etc/hadoop --dest /tmp/
clush -g all -x 192.168.56.181 "sudo cp -r /tmp/hadoop /usr/local/hadoop/etc/"
Start YARN daemons:
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -g all -b "jps | grep -v Jps; echo;"
On h3n1:
mapred --daemon stop historyserver
mapred --daemon start historyserver
yarn rmadmin -getServiceState h3n1
yarn rmadmin -getServiceState h3n2
yarn rmadmin -getServiceState h3n3
To stop the entire cluster:
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"
To start the entire cluster:
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"
NameNode Web UI: http://192.168.56.181:9870
ResourceManager Web UI: http://192.168.56.181:8088
To enable log aggregation, create the remote app-log directories in HDFS:
hdfs dfs -mkdir -p /apps/yarn/logs
hdfs dfs -chmod -R 777 /apps
hdfs dfs -mkdir -p /tmp
hdfs dfs -chmod -R 777 /tmp
hdfs dfs -ls /
Check whether the example jar works:
yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar
Run a MapReduce program:
hdfs dfs -rm -r /out
yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha4.jar wordcount /sample.txt /out
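The wordcount job above assumes /sample.txt already exists in HDFS; if it does not, an input file can be created first (assuming a local text file named sample.txt):
hdfs dfs -put sample.txt /sample.txt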
Check for the MRAppMaster and YarnChild processes with the jps command below while the job is running. This shows how the RM launches them to run the job.
clush -g all -b "jps | grep -v Jps; echo;"
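Once the job completes, the word counts are written to the /out path given above; the reducer output (typically part-r-00000) can be viewed with:
hdfs dfs -ls /out
hdfs dfs -cat /out/part-r-00000 | head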
Check YARN logs through the UI:
http://192.168.56.181:8088/
If you are using the VMs from Windows, add the host entries to C:\Windows\System32\drivers\etc\hosts so the hostnames resolve and the logs display.
Check old YARN job logs through the UI (this is the JobHistory Server URL):
http://h3n1.hadoop.com:19888/
Check YARN logs through the command line:
yarn application -list -appStates ALL
yarn logs -applicationId application_1464914540546_0002
Check the tracking URL and view the logs in the browser.
To stop the entire cluster:
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon stop historyserver"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon stop resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon stop nodemanager"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon stop datanode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon stop zkfc"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon stop journalnode"
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh stop"
clush -g all -b "jps | grep -v Jps; echo;"
To start the entire cluster:
clush -g zk -b "/usr/local/zookeeper/bin/zkServer.sh start"
clush -g jn -b "/usr/local/hadoop/bin/hdfs --daemon start journalnode"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start zkfc"
clush -g nn -b "/usr/local/hadoop/bin/hdfs --daemon start namenode"
clush -g dn -b "/usr/local/hadoop/bin/hdfs --daemon start datanode"
clush -g rm -b "/usr/local/hadoop/bin/yarn --daemon start resourcemanager"
clush -g dn -b "/usr/local/hadoop/bin/yarn --daemon start nodemanager"
clush -w 192.168.56.181 -b "/usr/local/hadoop/bin/mapred --daemon start historyserver"
clush -g all -b "jps | grep -v Jps; echo;"
14. To delete the entire Hadoop installation:
clush -g all -b "sudo unlink /usr/local/hadoop" > /dev/null 2>&1;
clush -g all -b "sudo rm -rf /usr/local/hadoop*"
clush -g zk -b "sudo unlink /usr/local/zookeeper" > /dev/null 2>&1;
clush -g zk -b "sudo rm -rf /usr/local/zookeeper*"
clush -g nn -b "sudo umount -l /mnt/disk1/nfsedits" > /dev/null 2>&1;
clush -g all -b "sudo rm -rf /mnt/disk1/*"
clush -g all -b "sudo rm -rf /mnt/disk2/*"
clush -g all -b "sudo rm -rf /mnt/disk3/*"
clush -g all -b "sudo ls /mnt/*"
clush -g all -b "sudo sed -i '/JAVA_HOME/,\$d' /etc/profile"
clush -g nn -b "sudo sed -i '/nn:/,\$d' /etc/clustershell/groups.d/local.cfg"