Elastic Spark Programming Framework (ESPF)


A Dependency-Injection Based Programming Framework for Spark Applications

Bruce Kuo, Software Engineer, APAC Data, email: bruce3557@yahoo-


Outline

- Motivation & Related Work
- Prerequisite
- Programming Framework
- Integration with Components
- Conclusion
- Q&A


Motivation & Related Work


Native Spark Application

    public class GainsChartDataGeneration {
        public static void main(String[] args) {
            String sortedPredictionResultTable = args[0];
            String gainTable = args[1];

            // Initialization
            SparkConf conf = new SparkConf();
            JavaSparkContext sc = new JavaSparkContext(conf);
            HiveContext sqlContext = new HiveContext(sc.sc());

            // Main logic
            DataFrame dataFrame = sqlContext.table(sortedPredictionResultTable)
                    .select("target", "score");

            // Generate schema
            ...
            StructType schema = DataTypes.createStructType(newFields);

            long totalCount = dataFrame.count();

            List seqList = new ArrayList();

            for (long i = 100; i ...