
HowTo configure defaultFS for hadoop on singlenode/yarn setup #66

Open
blogmaniak opened this issue Sep 30, 2015 · 1 comment

Comments

@blogmaniak

Hi - I have built a GCE cluster using ./bdutil deploy --bucket anintelclustergen1-m-disk -n 2 -P anintelcluster -e extensions/spark/spark_on_yarn_env.sh.

In the bucket parameters, both on the command line and in bdutil_env.sh, I have specified a non-boot bucket.
The core-site.xml (under hadoop/etc) on the master shows the correct bucket value under defaultFS.
However, the Hadoop console (port 50070) does not show the non-boot bucket attached; it shows the boot disk attached on the NameNode.

| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| anintelcluster.c.anintelcluster.internal:50010 (10.240.0.2:50010) | 0 | In Service | 98.4 GB | 28 KB | 6.49 GB | 91.91 GB | 0 | 28 KB (0%) | 0 | 2.7.1 |

Is it possible to specify a non-boot bucket with the singlenode setup?
If not, what needs to be done to specify the non-boot disk so that it both gets attached to the instance as read/write and is also used by Hadoop for storage?
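
For reference, the defaultFS entry in core-site.xml on the master looks roughly like the sketch below (the bucket name here is just a placeholder, not my actual bucket):

```xml
<!-- core-site.xml (sketch): default filesystem pointed at a GCS bucket
     through the GCS connector. "my-nonboot-bucket" is a placeholder. -->
<property>
  <name>fs.defaultFS</name>
  <value>gs://my-nonboot-bucket/</value>
</property>
```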

@dennishuo
Contributor

So, the GCS connector actually isn't able to be mounted as a local filesystem; it simply plugs into Hadoop at Hadoop's FileSystem.java layer. This means it gets used as the FileSystem for Hadoop-specific jobs, but it doesn't change the way the local filesystem uses a real disk as a block device.

The GCS connector also lives independently alongside Hadoop's HDFS. So when you're looking at 50070, you're seeing the actual HDFS setup, which writes blocks out to the local disk rather than to GCS, and which would be accessible as an "hdfs:///" path for Hadoop jobs. In general, if you've configured defaultFS to use a GCS path, you can just ignore whatever the NameNode on 50070 is reporting, since in that case your typical Hadoop jobs simply won't interact with the HDFS setup at all.
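
A quick way to see the split is something like the following (the bucket name and NameNode host are placeholders): with defaultFS set to a gs:// path, unqualified paths resolve against the bucket, while the local HDFS reported on 50070 is only reached through an explicit hdfs:// URI.

```sh
# Unqualified paths resolve against fs.defaultFS, i.e. the GCS bucket
# ("my-nonboot-bucket" is a placeholder):
hadoop fs -ls gs://my-nonboot-bucket/
hadoop fs -ls /    # same listing, since defaultFS points at the gs:// bucket

# The HDFS instance that the 50070 UI reports on still exists, just unused
# by default; it can be addressed explicitly (host/port depend on your cluster):
hadoop fs -ls hdfs://<namenode-host>:8020/
```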
