Sparking Water Shell: Cloud size under 12 Exception

In my last blog, I compared Sparking Water and H2O. Before I made Sparking-shell work, I run into a lot of issues. One of annoying errors was runtime exception: Cloud size under xx. Searched internet and found many people have the similar problems. There are many recommendations, ranging from downloading the latest and matching version, to set to certain parameters during startup. Unfortunately none of them were working for me. But finally I figured out the issue and would like to share my solution in this blog.

After I downloaded Sparking Water, unzipped the file, and run sparking-shell command as shown from http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.2/2/index.html. It looked good initially.

[sparkling-water-2.2.2]$ bin/sparkling-shell --conf "spark.executor.memory=1g"

-----
  Spark master (MASTER)     : local[*]
  Spark home   (SPARK_HOME) : /opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2
  H2O build version         : 3.14.0.7 (weierstrass)
  Spark build version       : 2.2.0
  Scala version             : 2.11
----

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://10.132.110.145:4040
Spark context available as 'sc' (master = yarn, app id = application_3608740912046_1343).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0.cloudera1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_144)
Type in expressions to have them evaluated.
Type :help for more information.

scala> 

But when I run command val h2oContext = H2OContext.getOrCreate(spark), it gave me many errors as follows:

scala> import org.apache.spark.h2o._
import org.apache.spark.h2o._

scala> val h2oContext = H2OContext.getOrCreate(spark)
17/11/05 10:07:48 WARN internal.InternalH2OBackend: Increasing 'spark.locality.wait' to value 30000
17/11/05 10:07:48 WARN internal.InternalH2OBackend: Due to non-deterministic behavior of Spark broadcast-based joins
We recommend to disable them by configuring `spark.sql.autoBroadcastJoinThreshold` variable to value `-1`:
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=-1")
17/11/05 10:07:48 WARN internal.InternalH2OBackend: The property 'spark.scheduler.minRegisteredResourcesRatio' is not specified!
We recommend to pass `--conf spark.scheduler.minRegisteredResourcesRatio=1`
17/11/05 10:07:48 WARN internal.InternalH2OBackend: Unsupported options spark.dynamicAllocation.enabled detected!
17/11/05 10:07:48 WARN internal.InternalH2OBackend:
The application is going down, since the parameter (spark.ext.h2o.fail.on.unsupported.spark.param,true) is true!
If you would like to skip the fail call, please, specify the value of the parameter to false.

java.lang.IllegalArgumentException: Unsupported argument: (spark.dynamicAllocation.enabled,true)
  at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:46)
  at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:38)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.h2o.backends.internal.InternalBackendUtils$class.checkUnsupportedSparkOptions(InternalBackendUtils.scala:38)
  at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkUnsupportedSparkOptions(InternalH2OBackend.scala:30)
  at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkAndUpdateConf(InternalH2OBackend.scala:60)
  at org.apache.spark.h2o.H2OContext.<init>(H2OContext.scala:90)
  at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:355)
  at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:383)
  ... 50 elided

You can see I need to pass in more parameters when starting sparking-shell. Change the parameters as follows:

bin/sparkling-shell \
--master yarn \
--conf spark.executor.memory=1g \
--conf spark.scheduler.maxRegisteredResourcesWaitingTime=1000000 \
--conf spark.ext.h2o.fail.on.unsupported.spark.param=false \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.sql.autoBroadcastJoinThreshold=-1 \
--conf spark.locality.wait=30000 \
--conf spark.scheduler.minRegisteredResourcesRatio=1

Ok, this time it looked better, at least warning messages disappeared. But got error message java.lang.RuntimeException: Cloud size under 2.

[sparkling-water-2.2.2]$ bin/sparkling-shell \
> --master yarn \
> --conf spark.executor.memory=1g \
> --conf spark.scheduler.maxRegisteredResourcesWaitingTime=1000000 \
> --conf spark.ext.h2o.fail.on.unsupported.spark.param=false \
> --conf spark.dynamicAllocation.enabled=false \
> --conf spark.sql.autoBroadcastJoinThreshold=-1 \
> --conf spark.locality.wait=30000 \
> --conf spark.scheduler.minRegisteredResourcesRatio=1

-----
  Spark master (MASTER)     : yarn
  Spark home   (SPARK_HOME) : /opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2
  H2O build version         : 3.14.0.7 (weierstrass)
  Spark build version       : 2.2.0
  Scala version             : 2.11
----

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://10.132.110.145:4040
Spark context available as 'sc' (master = yarn, app id = application_3608740912046_1344).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0.cloudera1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_144)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.apache.spark.h2o._
import org.apache.spark.h2o._

scala> val h2oContext = H2OContext.getOrCreate(spark)
java.lang.RuntimeException: Cloud size under 2
  at water.H2O.waitForCloudSize(H2O.java:1689)
  at org.apache.spark.h2o.backends.internal.InternalH2OBackend.init(InternalH2OBackend.scala:117)
  at org.apache.spark.h2o.H2OContext.init(H2OContext.scala:121)
  at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:355)
  at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:383)
  ... 50 elided

Not big deal. Anyway I didn’t specify the number of executors used and executor memory. It may complain about the size of the H2O cluster is too small. Add the following three parameters.

–conf spark.executor.instances=12 \
–conf spark.executor.memory=10g \
–conf spark.driver.memory=8g \

Rerun the whole thing got the same error with the size number changing to 12. It did not look right to me. Then I check out the H2O error logfile and found tons of messages as follows:

11-06 10:23:16.355 10.132.110.145:54325  30452  #09:54321 ERRR: Got IO error when sending batch UDP bytes: java.net.ConnectException: Connection refused
11-06 10:23:16.790 10.132.110.145:54325  30452  #06:54321 ERRR: Got IO error when sending batch UDP bytes: java.net.ConnectException: Connection refused

It looks like Sparking Water can not connect to Spark cluster. After some investigation, I then realized I installed and run sparking-shell from edge node on the BDA. If the H2O cluster was running inside a Spark application, the communication of Spark cluster on BDA is through BDA’s private network, or InfiniteBand network. Edge node can not directly communicate to IB network on BDA. With this assumption in mind, I installed and run Sparking Water on one of BDA nodes, it worked perfectly without any issue. Problem solved!

Advertisements

7 thoughts on “Sparking Water Shell: Cloud size under 12 Exception

  1. Pingback: Access Sparkling Water via R Studio | My Big Data World

  2. Pingback: Running H2O Cluster in Background and at Specific Port Number | My Big Data World

  3. Pingback: Weird Ref-count mismatch Message from H2O | My Big Data World

  4. Pingback: Use Python for H2O | My Big Data World

  5. Pingback: Parquet File Can not Be Read in Sparkling Water H2O | My Big Data World

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s