Apache Spark - Brigham Young University

Spark SQL
• Shark: a modified Hive backend running over Spark
  – Limited integration with Spark
  – Hive optimizer not designed for Spark
• Spark SQL reuses parts of Shark
  – Hive data loading
  – In-memory column store
• Spark SQL also adds
  – RDD-aware optimizer
  – Rich language interfaces
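The "in-memory column store" item can be illustrated with a toy sketch in plain Python. This is not Spark SQL's or Shark's actual implementation. It only shows the general idea of keeping each column as its own array, so that a query reads just the columns it needs. The sample rows and the helper names `to_columns` and `sum_where` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"user": "ann", "country": "US", "visits": 3},
    {"user": "bob", "country": "DE", "visits": 5},
    {"user": "cat", "country": "US", "visits": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column.

    Assumes every row has the same keys, so all columns end up
    with the same length and positions line up across columns.
    """
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def sum_where(columns, value_col, filter_col, wanted):
    """Sum value_col over the positions where filter_col equals wanted.

    Only the two named columns are read; other columns are never touched.
    """
    return sum(
        value
        for value, key in zip(columns[value_col], columns[filter_col])
        if key == wanted
    )


columns = to_columns(rows)
us_visits = sum_where(columns, "visits", "country", "US")
print(us_visits)
```

In this layout the aggregation over `visits` never reads the `user` column, which is the kind of saving a columnar format is meant to give for analytic queries. A real column store would typically also compress each column, which this sketch leaves out.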