This repository has been archived by the owner on Apr 5, 2022. It is now read-only.

Pseudo Distributed Hadoop Setup

Thomas Risberg edited this page May 14, 2013 · 24 revisions

Basic instructions for setting up Hadoop in pseudo-distributed mode

Make sure you have a current JDK installed (Java 6 or later)

Configure ssh

Make sure you can ssh to the system

Create ssh key and add it to authorized keys

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

Try connecting to localhost with ssh (you should not be prompted for a password)

ssh localhost
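
If you need to repeat the key setup, a small helper keeps it idempotent. This is only a sketch: the function name authorize_key is made up for illustration, and it assumes the public key is the `.pub` file that ssh-keygen writes next to the private key. It appends the key once and restricts the file to its owner, since sshd with its default StrictModes setting refuses an authorized_keys file that group or others can write to.

```shell
# Hypothetical helper: add a public key to an authorized_keys file exactly once.
# $1 = public key file, $2 = authorized_keys file (both supplied by the caller)
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x matches the whole line, -F treats the key as a literal string
    if ! grep -qxF -e "$(cat "$pub_file")" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    # Owner read/write only
    chmod 600 "$auth_file"
}
```

Call it as `authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys`. To verify without falling back to a password prompt, `ssh -o BatchMode=yes localhost true` should return without error.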

Download hadoop-1.0.4-bin.tar.gz

Navigate to

Click the Download link, then click the "Download release now!" link

Pick a download mirror

Click the hadoop-1.0.4/ directory link

Download "hadoop-1.0.4-bin.tar.gz"

Unpack the downloaded tar

I've unpacked it in a directory named '~/Hadoop'

mkdir -p ~/Hadoop
cd ~/Hadoop
tar xvzf ~/Downloads/hadoop-1.0.4-bin.tar.gz

Create a file to source for the environment settings

Create a hadoop-1.0.4-env file with the following content (modify the JAVA_HOME and HADOOP_PREFIX to match your system):

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-

export HADOOP_PREFIX="/home/trisberg/Hadoop/hadoop-1.0.4"
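
Later steps call `hadoop` directly after sourcing this file, so the file presumably also has to put the Hadoop bin directory on the PATH. That extra line is an assumption, as it is not shown above. The check function below (check_hadoop_env, a made-up name) is a minimal sketch for catching a wrong JAVA_HOME or HADOOP_PREFIX before going further.

```shell
# Assumed extra line for the env file, so that `hadoop` resolves after sourcing it:
#   export PATH="$PATH:$HADOOP_PREFIX/bin"

# Hypothetical sanity check: both variables must point at real installations.
check_hadoop_env() {
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME does not contain bin/java: $JAVA_HOME" >&2
        return 1
    fi
    if [ ! -x "$HADOOP_PREFIX/bin/hadoop" ]; then
        echo "HADOOP_PREFIX does not contain bin/hadoop: $HADOOP_PREFIX" >&2
        return 1
    fi
    echo "environment looks consistent"
}
```

Run `check_hadoop_env` right after `source hadoop-1.0.4-env`.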


Modify the main config files in the hadoop-1.0.4/conf directory

core-site.xml should have this content (you can modify the 'hadoop.tmp.dir' directory to your liking):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

        <description>A base for other temporary directories.</description>
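
The property definitions of this file are incomplete in the listing above. As a hedged sketch, the helper below writes a core-site.xml of the kind the Hadoop 1.x single-node guide describes. The function name write_core_site and the `hdfs://localhost:9000` value are assumptions, not necessarily what this page originally used. Only `hadoop.tmp.dir` and its description come from the text above.

```shell
# Hypothetical helper: write a minimal core-site.xml.
# $1 = conf directory, $2 = directory to use for hadoop.tmp.dir
write_core_site() {
    conf_dir=$1
    tmp_dir=$2
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.default.name</name>
        <!-- Assumed value: the address used in the Hadoop 1.x single-node guide -->
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>$tmp_dir</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>
EOF
}
```

For example, `write_core_site hadoop-1.0.4/conf /path/of/your/choice` with a directory your user can write to.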

hdfs-site.xml should have this content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->


mapred-site.xml should have this content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
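
The property definitions for hdfs-site.xml and mapred-site.xml are not shown in the listings above. A minimal sketch follows, assuming the usual single-node values from the Hadoop 1.x guide: one replica for HDFS and a job tracker on localhost:9001. The helper name write_single_property_site is made up, and both values are assumptions to adjust for your system.

```shell
# Hypothetical helper: write a Hadoop site file that holds a single property.
# $1 = target file, $2 = property name, $3 = property value
write_single_property_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>$2</name>
        <value>$3</value>
    </property>
</configuration>
EOF
}

# Assumed single-node settings (run from the hadoop-1.0.4 directory):
#   write_single_property_site conf/hdfs-site.xml dfs.replication 1
#   write_single_property_site conf/mapred-site.xml mapred.job.tracker localhost:9001
```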


Modify the environment script in the hadoop-1.0.4/conf directory.

You need to add the JAVA_HOME setting that we also set in the environment file above (again, adjust this to match your system):


# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-
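
The edit can also be scripted. This is a sketch only: set_java_home is a made-up name, and it assumes the script in question is conf/hadoop-env.sh, which is where a stock Hadoop 1.x distribution keeps the commented-out JAVA_HOME line quoted above. The helper appends the export line unless an identical line is already present.

```shell
# Hypothetical helper: append a JAVA_HOME export to an env script exactly once.
# $1 = env script (assumed: conf/hadoop-env.sh), $2 = JDK directory
set_java_home() {
    env_file=$1
    java_home=$2
    line="export JAVA_HOME=$java_home"
    # Only an identical, uncommented line counts as already present
    if ! grep -qxF -e "$line" "$env_file"; then
        printf '%s\n' "$line" >> "$env_file"
    fi
}
```

Pass the same JDK directory you used in the environment file.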


We are ready to run:

Start by sourcing the environment settings

source hadoop-1.0.4-env

Format the hadoop file system

Do this step only once, for a new cluster! Formatting again erases the existing HDFS metadata.

hadoop namenode -format
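
Since the format step must not be repeated, a guard can help when you script the setup. This sketch assumes the namenode keeps its metadata under `current/VERSION` inside its name directory, which is my understanding of the Hadoop 1.x on-disk layout (with default settings the name directory is `dfs/name` below `hadoop.tmp.dir`). The function name format_namenode_once is hypothetical.

```shell
# Hypothetical guard: format only when the name directory looks unformatted.
# $1 = namenode name directory (assumed default: <hadoop.tmp.dir>/dfs/name)
format_namenode_once() {
    name_dir=$1
    if [ -f "$name_dir/current/VERSION" ]; then
        echo "namenode already formatted in $name_dir; skipping"
        return 0
    fi
    hadoop namenode -format
}
```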

Start Hadoop namenode, datanode and secondary-namenode
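
The command for this step is not shown here. In a Hadoop 1.x distribution the script that starts these three daemons is `bin/start-dfs.sh`. A minimal sketch, assuming the Hadoop bin directory is on your PATH after sourcing the environment file. The wrapper name run_hadoop_script is made up.

```shell
# Hypothetical wrapper: run a Hadoop control script only if it is on the PATH.
# $1 = script name, for example start-dfs.sh
run_hadoop_script() {
    script=$1
    if ! command -v "$script" >/dev/null 2>&1; then
        echo "$script not found; source the environment file first" >&2
        return 1
    fi
    "$script"
}

# Starts NameNode, DataNode and SecondaryNameNode in a Hadoop 1.x install:
#   run_hadoop_script start-dfs.sh
# The job-tracker and task-tracker step further down would use start-mapred.sh.
```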

Check that the dfs daemons are running

jps

You should see something like:

[trisberg@localhost ~]$ jps
27932 SecondaryNameNode
27827 DataNode
26384 NameNode
27988 Jps
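
Instead of reading the jps listing by eye, you can test it. This is a sketch with a made-up function name, missing_daemons. It compares the second column of the jps output exactly, because a plain substring search for NameNode would also match SecondaryNameNode.

```shell
# Hypothetical helper: print each expected daemon that is absent from jps output.
# $1 = jps output, remaining arguments = expected daemon names
missing_daemons() {
    jps_output=$1
    shift
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; match the name field exactly
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

`missing_daemons "$(jps)" NameNode DataNode SecondaryNameNode` prints nothing when all three are up. Add JobTracker and TaskTracker to the list for the later check.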

Start Hadoop job-tracker and task-tracker

Check that the dfs and mapred daemons are running

jps

You should see something like:

[trisberg@localhost ~]$ jps
28170 TaskTracker
27932 SecondaryNameNode
28053 JobTracker
27827 DataNode
26384 NameNode
28259 Jps

Once the cluster is up and running you can access the web interfaces on these addresses:
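
The addresses themselves are not listed here. As a hedged sketch, the helper below prints the stock Hadoop 1.x web UI ports as I know them. They are assumptions about this setup and will differ if your configuration overrides the HTTP addresses. The function name web_ui_url is made up.

```shell
# Hypothetical helper: print the default web UI address of a Hadoop 1.x daemon.
# The ports are the stock defaults, not values taken from this page.
web_ui_url() {
    case $1 in
        namenode)    echo "http://localhost:50070/" ;;
        jobtracker)  echo "http://localhost:50030/" ;;
        tasktracker) echo "http://localhost:50060/" ;;
        *) echo "unknown daemon: $1" >&2; return 1 ;;
    esac
}
```

For a quick check from the shell, `curl -sf "$(web_ui_url namenode)" >/dev/null && echo up` succeeds only if the page answers.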

This would be a good time to run the tests for the spring-hadoop project.

When you are done testing you can use these commands to shut the cluster down:
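
The shutdown commands are not shown here. In a Hadoop 1.x distribution the matching scripts are `stop-mapred.sh` and `stop-dfs.sh`. The sketch below stops MapReduce first and HDFS second, the reverse of the start order, and assumes the Hadoop bin directory is on your PATH. The function name stop_cluster is made up.

```shell
# Hypothetical helper: stop MapReduce first, then HDFS (reverse of the start order).
stop_cluster() {
    for script in stop-mapred.sh stop-dfs.sh; do
        if command -v "$script" >/dev/null 2>&1; then
            "$script"
        else
            echo "$script not found on PATH" >&2
            return 1
        fi
    done
}
```

Afterwards jps should list only the Jps process itself.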