Using TestHiveContext/HiveContext in unit tests

This is an old question, but I ran into similar problems and ended up using spark-testing-base:

    import com.holdenkarau.spark.testing.SharedSparkContext
    import org.apache.spark.sql.hive.test.TestHiveContext
    import org.scalatest.FunSuite

    class RowToProtoMapper$Test extends FunSuite with SharedSparkContext {

      test("route mapping") {
        // sc is provided by SharedSparkContext
        val hc = new TestHiveContext(sc)
        /* Some test */
      }
    }
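To make the placeholder test above concrete, here is a minimal sketch of a suite that actually runs a query through the context. The suite name `HiveQuerySketchTest` and the query are hypothetical and not part of the original answer. It assumes a Spark 1.x build with the `spark-hive` test classes and spark-testing-base on the test classpath.

```scala
import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.spark.sql.hive.test.TestHiveContext
import org.scalatest.FunSuite

// Hypothetical suite: the name and the query are illustrative only.
class HiveQuerySketchTest extends FunSuite with SharedSparkContext {

  test("a trivial query runs against TestHiveContext") {
    // sc is the SparkContext created and shared by SharedSparkContext
    val hc = new TestHiveContext(sc)

    // Run a query that needs no existing table and collect the result locally
    val rows = hc.sql("SELECT 1 AS one").collect()

    assert(rows.length === 1)
    assert(rows.head.getInt(0) === 1)
  }
}
```

If a query this small passes, the Hive metastore client was instantiated successfully inside the test JVM, which is the step that fails in the question.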

I am trying to do this in unit tests:

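    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.test.TestHiveContext

    val sConf = new SparkConf()
      .setAppName("RandomAppName")
      .setMaster("local")
    val sc = new SparkContext(sConf)
    val sqlContext = new TestHiveContext(sc) // tried new HiveContext(sc) as well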

But I get this:

    [scalatest] Exception encountered when invoking run on a nested suite - java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient *** ABORTED ***
    [scalatest] java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    [scalatest] at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
    [scalatest] at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:120)
    [scalatest] at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:163)
    [scalatest] at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
    [scalatest] at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:168)
    [scalatest] at org.apache.spark.sql.hive.test.TestHiveContext.<init>(TestHive.scala:72)
    [scalatest] at mypackage.NewHiveTest.beforeAll(NewHiveTest.scala:48)
    [scalatest] at org.scalatest.BeforeAndAfterAll$class.beforeAll(BeforeAndAfterAll.scala:187)
    [scalatest] at mypackage.NewHiveTest.beforeAll(NewHiveTest.scala:35)
    [scalatest] at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:253)
    [scalatest] at mypackage.NewHiveTest.run(NewHiveTest.scala:35)
    [scalatest] at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1491)

The code works perfectly when I run it with spark-submit, but not in unit tests. How do I fix this for unit tests?