apache-spark - sqlcontext - spark sql example
Using TestHiveContext/HiveContext in unit tests
This is an old question, but I ran into similar problems and ended up using spark-testing-base:
import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.spark.sql.hive.test.TestHiveContext
import org.scalatest.FunSuite

class RowToProtoMapper$Test extends FunSuite with SharedSparkContext {
  test("route mapping") {
    // sc is the shared SparkContext provided by SharedSparkContext
    val hc = new TestHiveContext(sc)
    /* Some test */
  }
}
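For anyone wondering what the test body might look like, here is a minimal sketch, assuming a Spark 1.x build (where `registerTempTable` and `toDF` via `hc.implicits._` are available) and that spark-testing-base and spark-hive are on the test classpath. The suite name, the table name `routes` and the sample rows are made up for illustration; they are not from the original project.

```scala
import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.spark.sql.hive.test.TestHiveContext
import org.scalatest.FunSuite

// Hypothetical suite name, for illustration only.
class TempTableQueryTest extends FunSuite with SharedSparkContext {

  test("query a temp table") {
    // sc is provided by SharedSparkContext; no manual SparkConf/SparkContext setup.
    val hc = new TestHiveContext(sc)
    import hc.implicits._

    // Made-up sample rows, registered as a temp table so no real Hive table is needed.
    val routes = sc.parallelize(Seq((1, "north"), (2, "south"))).toDF("id", "name")
    routes.registerTempTable("routes")

    // Run a query through the Hive context and check the single matching row.
    val names = hc.sql("SELECT name FROM routes WHERE id = 2")
      .collect()
      .map(_.getString(0))

    assert(names.toSeq == Seq("south"))
  }
}
```

The point of the sketch is only that the context is built from the shared `sc` inside the test, rather than from a `SparkContext` created by hand in `beforeAll`.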
I am trying to do this in unit tests:
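import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.test.TestHiveContext

val sConf = new SparkConf()
  .setAppName("RandomAppName")
  .setMaster("local")
val sc = new SparkContext(sConf)
val sqlContext = new TestHiveContext(sc) // tried new HiveContext(sc) as well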
But I get this:
[scalatest] Exception encountered when invoking run on a nested suite - java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient *** ABORTED ***
[scalatest] java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
[scalatest] at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
[scalatest] at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:120)
[scalatest] at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:163)
[scalatest] at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
[scalatest] at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:168)
[scalatest] at org.apache.spark.sql.hive.test.TestHiveContext.<init>(TestHive.scala:72)
[scalatest] at mypackage.NewHiveTest.beforeAll(NewHiveTest.scala:48)
[scalatest] at org.scalatest.BeforeAndAfterAll$class.beforeAll(BeforeAndAfterAll.scala:187)
[scalatest] at mypackage.NewHiveTest.beforeAll(NewHiveTest.scala:35)
[scalatest] at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:253)
[scalatest] at mypackage.NewHiveTest.run(NewHiveTest.scala:35)
[scalatest] at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1491)
The code works perfectly when I run it with spark-submit, but not in unit tests. How do I fix this for unit tests?