
Apache Hadoop 2.7.0 Docker image

docker pull sequenceiq/hadoop-docker:2.7.0

Start the Docker image with the right ports

docker run -it -p 50070:50070 -p 50075:50075 -p 2401:8088 -h localhost sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash -u $(date -u +%m%d%H%M%Y)
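
To check that the container came up and the ports are published as expected, something like the following can be used (the container name is assigned by Docker and will differ):

# list running containers and their published ports
docker ps
# the NameNode web UI should respond on the published port once HDFS is up
curl -I http://localhost:50070/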

Start script

cd $HADOOP_PREFIX
alias hadoop='$HADOOP_PREFIX/bin/hadoop'
alias hdfs='$HADOOP_PREFIX/bin/hdfs'
# disable HDFS permissions, skip the safe-mode threshold and enable WebHDFS
sed -i "6i <property>\n<name>dfs.permissions.enabled</name>\n<value>false</value>\n</property><property><name>dfs.safemode.threshold.pct</name><value>0</value></property><property><name>dfs.webhdfs.enabled</name><value>true</value></property>" /usr/local/hadoop-2.7.0/etc/hadoop/hdfs-site.xml
# restart HDFS so the new configuration takes effect
/usr/local/hadoop/sbin/stop-dfs.sh; /usr/local/hadoop/sbin/start-dfs.sh
# create the working directories
hadoop fs -mkdir /user/DICE
hadoop fs -mkdir /user/DICE/repo
hadoop fs -mkdir /user/DICE/workflow
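
To verify that the script worked, the created directories can be listed, for example:

hadoop fs -ls /user/DICE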

Directories and config files

Hadoop home directory

/usr/local/hadoop/

Hadoop log directory

/usr/local/hadoop/logs

Or visit http://localhost:2401/logs/.

Browse the HDFS System

http://localhost:50070/explorer.html
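
Because the start script enables dfs.webhdfs.enabled, the same listing is also available over the WebHDFS REST API, for example:

curl -s "http://localhost:50070/webhdfs/v1/user/DICE?op=LISTSTATUS"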

Sequence diagram

Upload files

The following diagram represents the communication between the microservices and the Hadoop Docker image. In this sequence, the user uploads an arbitrary number of files.
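
As a rough sketch of what a single upload looks like over WebHDFS (the file name example.txt and the target path are placeholders, not taken from the microservices): the client first asks the NameNode for a write location and is redirected to a DataNode on port 50075 (which is why that port is published in the docker run command above), then pushes the file contents to that location.

# step 1: request a write location from the NameNode (returns a 307 redirect to a DataNode on port 50075)
curl -i -X PUT "http://localhost:50070/webhdfs/v1/user/DICE/repo/example.txt?op=CREATE&overwrite=true"
# step 2: upload the file contents to the Location header returned by step 1
curl -i -X PUT -T example.txt "<Location from step 1>"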

Commands

Restart nodes

/usr/local/hadoop/sbin/stop-dfs.sh; /usr/local/hadoop/sbin/start-dfs.sh
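
After a restart it can take a moment for the NameNode to leave safe mode; this can be checked with, for example:

hdfs dfsadmin -safemode get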