Hi - I have built a GCE cluster using `./bdutil deploy --bucket anintelclustergen1-m-disk -n 2 -P anintelcluster -e extensions/spark/spark_on_yarn_env.sh`.
In the bucket parameters, both on the command line and in bdutil_env.sh, I have specified a non-boot bucket.
In core-site.xml (under hadoop/etc) on the master, the XML shows the correct bucket value under defaultFS.
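For reference, the entry looks roughly like this (the bucket name is the one I passed to --bucket):

```xml
<property>
  <name>fs.defaultFS</name>
  <value>gs://anintelclustergen1-m-disk</value>
</property>
```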
However, the Hadoop console (port 50070) does not show the non-boot bucket attached; it shows the boot disk attached on the NameNode:
| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|---|---|---|---|---|---|---|---|---|---|---|
| anintelcluster.c.anintelcluster.internal:50010 (10.240.0.2:50010) | 0 | In Service | 98.4 GB | 28 KB | 6.49 GB | 91.91 GB | 0 | 28 KB (0%) | 0 | 2.7.1 |
Is it possible to specify a non-boot bucket with the single-node setup?
If not, what needs to be done to specify the non-boot disk so that it both gets attached to the instance as read/write and is also used by Hadoop for storage, etc.?
So, the GCS connector can't actually be mounted as a local filesystem; it simply plugs into Hadoop at Hadoop's FileSystem.java layer. This means it gets used as the FileSystem for Hadoop jobs, but it doesn't change the way the local filesystem uses a real disk as a block device.
The GCS connector also lives independently alongside Hadoop's HDFS. So, when you're looking at 50070, you're seeing the actual HDFS setup, which writes blocks out to the local disk rather than to GCS and which remains accessible to Hadoop jobs as an "hdfs:///" path. In general, if you've configured defaultFS to use a GCS path, you can just ignore whatever the NameNode on 50070 is reporting, since in that case your typical Hadoop jobs simply won't interact with the HDFS setup at all.
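If you want to see the two filesystems side by side, commands along these lines should work (assuming defaultFS points at your gs:// bucket; the bucket name below is just the one from your deploy command):

```sh
# Resolves against fs.defaultFS, i.e. the GCS bucket in your setup
hadoop fs -ls /

# Explicitly target GCS, regardless of what defaultFS is set to
hadoop fs -ls gs://anintelclustergen1-m-disk/

# Explicitly target the local HDFS that the 50070 UI is reporting on
hadoop fs -ls hdfs:///

# Local-disk capacity report for HDFS (matches what you see on 50070)
hdfs dfsadmin -report
```

The bucket will never show up in the NameNode UI, because it isn't an HDFS volume at all.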