Hadoop 2.x Quick Notes :: Part - 2
Bhaskar S | 12/25/2014
Overview
In Part-1 we laid out the steps to install, setup, and start-up Hadoop 2.x on a single node (localhost).
In this part, we will lay out the steps to set up a 3-node Hadoop 2.x cluster.
Installation and Setup
We will install Hadoop 2.x on a 3-node Ubuntu 14.04 LTS based cluster.
To simulate a 3-node cluster, we leveraged VirtualBox to create three virtual machines with identical specs (1 CPU, 4 GB RAM, 20 GB disk) running on an Ubuntu 14.04 host computer (with one ethernet card).
CAUTION :: The only VirtualBox network setting that will work for this 3-node cluster setup is the Host-only Adapter option.
To use Host-only networking for a virtual machine, we first need to define a new Host-only network adapter by choosing the menu option File->Preferences->Network and then clicking on the Host-only Networks tab.
The following screenshot shows the Adapter entry under Host-only Networks:
The following screenshot shows the DHCP Server entry under Host-only Networks:
Use the Host-only Adapter defined above for each virtual machine by choosing the menu option Machine->Settings->Network.
The following screenshot shows the network settings for one of our virtual machines:
NOTE :: The host computer will be able to connect to the 3 virtual machines. The virtual machines will be able to connect to each other. The virtual machines will *NOT* be able to connect to the host computer or the internet.
We named the three virtual machine nodes vb-host-1, vb-host-2, and vb-host-3 respectively.
In addition, we assigned a static IP address to each of the three virtual machine hosts: 192.168.50.101 (vb-host-1), 192.168.50.102 (vb-host-2), and 192.168.50.103 (vb-host-3).
The following screenshot shows how we assigned a static IP address using the Ubuntu NetworkManager:
Next, we modified the /etc/hosts file in each of the three virtual machine hosts vb-host-1, vb-host-2, and vb-host-3 to add the host names and IP addresses.
The following screenshot shows the modified /etc/hosts file:
Make sure these are the only entries in the /etc/hosts file; delete all other entries. In particular, remove the 127.0.1.1 entry that Ubuntu adds by default, as it can cause the Hadoop daemons to bind to the loopback address instead of the node's real address.
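For reference, and assuming the standard loopback entry is retained, the resulting /etc/hosts on each node would look along these lines:
127.0.0.1       localhost
192.168.50.101  vb-host-1
192.168.50.102  vb-host-2
192.168.50.103  vb-host-3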
In our 3-node cluster, we will designate vb-host-1 as the master node and have vb-host-2 and vb-host-3 as the slave nodes.
Following are the steps to install and setup Hadoop 2.x on our 3-node cluster:
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Ensure Java SE 7 or above is installed.
We installed Oracle Java SE 8 by issuing the following commands:
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer
$ sudo apt-get install oracle-java8-set-default
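To confirm that the Oracle JDK is now the default Java, a quick check (the exact build number may differ):
$ java -version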
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Ensure openssh-server is installed.
We installed openssh-server by issuing the following command:
$ sudo apt-get install openssh-server
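To verify that the ssh daemon is running (Ubuntu 14.04 manages it via upstart, so the service command applies):
$ sudo service ssh status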
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Ensure that both the group hadoop and the user hadoop (with the home directory /home/hadoop) exist.
We created the group hadoop and the user hadoop with home directory /home/hadoop by issuing the following commands:
$ sudo groupadd hadoop
$ sudo useradd -g hadoop -m -d /home/hadoop -s /bin/bash hadoop
$ sudo passwd hadoop
$ sudo usermod -a -G sudo hadoop
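To verify the account, the following should report hadoop as the primary group and sudo among the supplementary groups:
$ id hadoop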
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Ensure IPv6 networking is disabled; Hadoop 2.x does not support IPv6 and may end up binding to IPv6 addresses if it is left enabled. For this, modify the /etc/sysctl.conf file (using sudo), add the following lines to the end of the file, and save the changes:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
After saving these changes, REBOOT the node.
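After the reboot, the following check should print 1 on each node, confirming that IPv6 is indeed disabled:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6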
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Log in as the user hadoop. You should land in the home directory /home/hadoop.
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Download the latest stable version of Hadoop 2.x from the project site located at the URL hadoop.apache.org.
The current stable 2.x release at the time of this writing is 2.6.0.
Extract the downloaded package hadoop-2.6.0.tar.gz under the home directory /home/hadoop. The extracted package will be in the sub-directory hadoop-2.6.0.
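For example, assuming the Apache archive mirror (any Apache download mirror will work), the download and extraction boil down to:
$ cd /home/hadoop
$ wget http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
$ tar -xzf hadoop-2.6.0.tar.gz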
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Modify the .bashrc file located in the home directory /home/hadoop and add the following lines to the end of the file and save the changes:
export HADOOP_PREFIX=/home/hadoop/hadoop-2.6.0
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=.:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Here is a copy of the .bashrc
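To pick up the new environment variables in the current shell and sanity-check the setup, one can run:
$ source /home/hadoop/.bashrc
$ hadoop version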
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Create a base data directory $HADOOP_HOME/data.
Also, create a logs directory $HADOOP_HOME/logs.
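Since $HADOOP_HOME is already set in the .bashrc, both directories can be created in one shot:
$ mkdir -p $HADOOP_HOME/data $HADOOP_HOME/logs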
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Hadoop 2.x uses ssh between the nodes in the cluster, so we need to set up password-less ssh access. For this, we need to generate an ssh key pair on each of the nodes. Execute the following commands:
$ ssh-keygen -t rsa -P ""
$ cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
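If ssh still prompts for a password after this, the usual culprit is file permissions; the ssh daemon ignores keys whose files are too permissive:
$ chmod 700 /home/hadoop/.ssh
$ chmod 600 /home/hadoop/.ssh/authorized_keys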
On the master node vb-host-1:
The master node vb-host-1 needs password-less access to the slave nodes vb-host-2 and vb-host-3. For this, we need to distribute the ssh public key of vb-host-1 to the other two nodes vb-host-2 and vb-host-3. Execute the following commands:
$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@vb-host-2
$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@vb-host-3
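To verify the password-less setup from vb-host-1, each of the following commands should print the remote host name without prompting for a password:
$ ssh hadoop@vb-host-2 hostname
$ ssh hadoop@vb-host-3 hostname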
On the master node vb-host-1:
Add the slave node names vb-host-2 and vb-host-3 to the configuration file $HADOOP_CONF_DIR/slaves. For this, execute the following commands:
$ echo 'vb-host-2' > $HADOOP_CONF_DIR/slaves
$ echo 'vb-host-3' >> $HADOOP_CONF_DIR/slaves
Here is a copy of the slaves file
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Edit the file $HADOOP_CONF_DIR/hadoop-env.sh using any text editor.
Modify the line that begins with export HADOOP_OPTS= and change it to look like export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.library.path=$HADOOP_HOME/lib/native"
Also, modify the line that begins with export HADOOP_LOG_DIR= and change it to look like export HADOOP_LOG_DIR=$HADOOP_HOME/logs
Here is a copy of the hadoop-env.sh file
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Edit the file $HADOOP_CONF_DIR/core-site.xml using any text editor.
The contents should look like the following:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://vb-host-1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-2.6.0/data</value>
    </property>
</configuration>
Here is a copy of the core-site.xml file
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Edit the file $HADOOP_CONF_DIR/hdfs-site.xml using any text editor.
The contents should look like the following:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.permissions.superusergroup</name>
        <value>hadoop</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>vb-host-1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>vb-host-1:50090</value>
    </property>
</configuration>
Here is a copy of the hdfs-site.xml file
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Edit the file $HADOOP_CONF_DIR/yarn-site.xml using any text editor.
The contents should look like the following:
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>vb-host-1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Here is a copy of the yarn-site.xml file
On the nodes vb-host-1, vb-host-2, and vb-host-3:
Copy the file $HADOOP_CONF_DIR/mapred-site.xml.template to $HADOOP_CONF_DIR/mapred-site.xml.
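In other words:
$ cp $HADOOP_CONF_DIR/mapred-site.xml.template $HADOOP_CONF_DIR/mapred-site.xml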
Edit the file $HADOOP_CONF_DIR/mapred-site.xml using any text editor.
The contents should look like the following:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>vb-host-1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>vb-host-1:19888</value>
    </property>
</configuration>
Here is a copy of the mapred-site.xml file
On the master node vb-host-1:
Just like any other filesystem, the Hadoop Distributed File System (HDFS) must be prepared (formatted) before first use.
To do that, execute the following command:
$ hdfs namenode -format
The following is the typical output:
14/12/25 20:44:57 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = vb-host-1/192.168.1.100
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /home/hadoop/hadoop-2.6.0/etc/hadoop:/home/hadoop/hadoop-2.6.0/share/hadoop/common/lib/activation-1.1.jar: ... :/home/hadoop/hadoop-2.6.0/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.8.0_25
************************************************************/
14/12/25 20:44:57 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
14/12/25 20:44:57 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-95ccd75a-2b91-4478-a47b-729ad801f600
14/12/25 20:44:58 INFO namenode.FSNamesystem: No KeyProvider found.
14/12/25 20:44:58 INFO namenode.FSNamesystem: fsLock is fair:true
14/12/25 20:44:58 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
14/12/25 20:44:58 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
14/12/25 20:44:58 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
14/12/25 20:44:58 INFO blockmanagement.BlockManager: The block deletion will start around 2014 Dec 26 11:27:58
14/12/25 20:44:58 INFO util.GSet: Computing capacity for map BlocksMap
14/12/25 20:44:58 INFO util.GSet: VM type = 64-bit
14/12/25 20:44:58 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
14/12/25 20:44:58 INFO util.GSet: capacity = 2^21 = 2097152 entries
14/12/25 20:44:58 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
14/12/25 20:44:58 INFO blockmanagement.BlockManager: defaultReplication = 2
14/12/25 20:44:58 INFO blockmanagement.BlockManager: maxReplication = 512
14/12/25 20:44:58 INFO blockmanagement.BlockManager: minReplication = 1
14/12/25 20:44:58 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
14/12/25 20:44:58 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
14/12/25 20:44:58 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
14/12/25 20:44:58 INFO blockmanagement.BlockManager: encryptDataTransfer = false
14/12/25 20:44:58 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
14/12/25 20:44:58 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
14/12/25 20:44:58 INFO namenode.FSNamesystem: supergroup = hadoop
14/12/25 20:44:58 INFO namenode.FSNamesystem: isPermissionEnabled = true
14/12/25 20:44:58 INFO namenode.FSNamesystem: HA Enabled: false
14/12/25 20:44:58 INFO namenode.FSNamesystem: Append Enabled: true
14/12/25 20:44:58 INFO util.GSet: Computing capacity for map INodeMap
14/12/25 20:44:58 INFO util.GSet: VM type = 64-bit
14/12/25 20:44:58 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
14/12/25 20:44:58 INFO util.GSet: capacity = 2^20 = 1048576 entries
14/12/25 20:44:58 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/12/25 20:44:58 INFO util.GSet: Computing capacity for map cachedBlocks
14/12/25 20:44:58 INFO util.GSet: VM type = 64-bit
14/12/25 20:44:58 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
14/12/25 20:44:58 INFO util.GSet: capacity = 2^18 = 262144 entries
14/12/25 20:44:58 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
14/12/25 20:44:58 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
14/12/25 20:44:58 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
14/12/25 20:44:58 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
14/12/25 20:44:58 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
14/12/25 20:44:58 INFO util.GSet: Computing capacity for map NameNodeRetryCache
14/12/25 20:44:58 INFO util.GSet: VM type = 64-bit
14/12/25 20:44:58 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
14/12/25 20:44:58 INFO util.GSet: capacity = 2^15 = 32768 entries
14/12/25 20:44:58 INFO namenode.NNConf: ACLs enabled? false
14/12/25 20:44:58 INFO namenode.NNConf: XAttrs enabled? true
14/12/25 20:44:58 INFO namenode.NNConf: Maximum size of an xattr: 16384
14/12/25 20:44:59 INFO namenode.FSImage: Allocated new BlockPoolId: BP-932372205-192.168.1.100-1419611278968
14/12/25 20:44:59 INFO common.Storage: Storage directory /home/hadoop/hadoop-2.6.0/data/dfs/name has been successfully formatted.
14/12/25 20:44:59 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/12/25 20:44:59 INFO util.ExitUtil: Exiting with status 0
14/12/25 20:44:59 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vb-host-1/192.168.1.100
************************************************************/
This command must be executed *ONLY ONCE*. Re-formatting an existing filesystem will destroy all the data stored in HDFS.
On the master node vb-host-1:
We first need to start the NameNode.
To do that, execute the following command:
$ hadoop-daemon.sh start namenode
The following is the typical output:
starting namenode, logging to /home/hadoop/hadoop-2.6.0/logs/hadoop-hadoop-namenode-vb-host-1.out
NOTE :: To stop the NameNode, execute the following command:
$ hadoop-daemon.sh stop namenode
Next, we need to start the SecondaryNameNode.
To do that, execute the following command:
$ hadoop-daemon.sh start secondarynamenode
The following is the typical output:
starting secondarynamenode, logging to /home/hadoop/hadoop-2.6.0/logs/hadoop-hadoop-secondarynamenode-vb-host-1.out
NOTE :: To stop the SecondaryNameNode, execute the following command:
$ hadoop-daemon.sh stop secondarynamenode
The following screenshot shows the output of executing the jps command on the master node:
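In text form, the jps listing on the master at this point would look along these lines (the process ids will differ):
$ jps
2864 NameNode
3012 SecondaryNameNode
3105 Jps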
On the slave nodes vb-host-2 and vb-host-3:
We need to start the DataNode.
To do that, execute the following command:
$ hadoop-daemon.sh start datanode
The following is the typical output:
starting datanode, logging to /home/hadoop/hadoop-2.6.0/logs/hadoop-hadoop-datanode-vb-host-x.out
NOTE :: To stop the DataNode, execute the following command:
$ hadoop-daemon.sh stop datanode
The following screenshot shows the output of executing the jps command on the slave node:
The following screenshot shows the web browser pointing to the NameNode URL http://vb-host-1:50070:
The following screenshot shows the result of clicking on the Datanodes tab:
The following screenshot shows the web browser pointing to the SecondaryNameNode URL http://vb-host-1:50090:
On the master node vb-host-1:
We need to start the ResourceManager.
To do that, execute the following command:
$ yarn-daemon.sh start resourcemanager
The following is the typical output:
starting resourcemanager, logging to /home/hadoop/hadoop-2.6.0/logs/yarn-hadoop-resourcemanager-vb-host-1.out
NOTE :: To stop the ResourceManager, execute the following command:
$ yarn-daemon.sh stop resourcemanager
The following screenshot shows the output of executing the jps command on the master node:
The following screenshot shows the web browser pointing to the YARN Cluster URL http://vb-host-1:8088/cluster:
On the slave nodes vb-host-2 and vb-host-3:
We need to start the NodeManager.
To do that, execute the following command:
$ yarn-daemon.sh start nodemanager
The following is the typical output:
starting nodemanager, logging to /home/hadoop/hadoop-2.6.0/logs/yarn-hadoop-nodemanager-vb-host-x.out
NOTE :: To stop the NodeManager, execute the following command:
$ yarn-daemon.sh stop nodemanager
The following screenshot shows the output of executing the jps command on the slave node:
The following screenshot shows the web browser showing the Nodes of the YARN Cluster through the URL http://vb-host-1:8088/cluster/nodes:
This completes the installation, setup, and start-up of our 3-node Hadoop 2.x cluster.
We were also able to successfully use HDFS and run the example MapReduce job on our 3-node Hadoop 2.x cluster, along the lines of the smoke test sketched below.
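As a rough sketch of such a smoke test (using the examples jar that ships with the 2.6.0 release; the /user/hadoop paths are simply our choice):
$ hdfs dfs -mkdir -p /user/hadoop/input
$ hdfs dfs -put $HADOOP_CONF_DIR/*.xml /user/hadoop/input
$ yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hadoop/input /user/hadoop/output
$ hdfs dfs -cat /user/hadoop/output/part-r-00000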
References