Introduction to Apache Spark

Patrick Wendell - Databricks

What is Spark?

Fast and Expressive Cluster Computing Engine Compatible with Apache Hadoop

Efficient

- General execution graphs

- In-memory storage

Usable

- Rich APIs in Java, Scala, Python

- Interactive shell

The Spark Community

+You!

Today's Talk

- The Spark programming model

- Language and deployment choices

- Example algorithm (PageRank)

Key Concept: RDDs

Write programs in terms of operations on distributed datasets

Resilient Distributed Datasets

- Collections of objects spread across a cluster, stored in RAM or on disk

- Built through parallel transformations

- Automatically rebuilt on failure

Operations

- Transformations (e.g. map, filter, groupBy)

- Actions (e.g. count, collect, save)
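
The split between transformations and actions can be sketched with a toy, single-process model in plain Python. This is not Spark code: `MiniRDD`, `square`, and `calls` are hypothetical names invented for illustration, and the sketch assumes only the behavior that transformations record how to build a dataset while actions trigger the computation. Because each dataset keeps its recipe rather than its data, it can be recomputed from that recipe, which is the idea behind rebuilding on failure.

```python
from typing import Callable, Iterable, Iterator, List


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical; not Spark's API)."""

    def __init__(self, compute: Callable[[], Iterator]):
        # Store only the recipe (lineage) for producing the data.
        self._compute = compute

    @classmethod
    def parallelize(cls, data: Iterable) -> "MiniRDD":
        items = list(data)
        return cls(lambda: iter(items))

    # Transformations: return a new MiniRDD and do no work yet.
    def map(self, fn: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (fn(x) for x in self._compute()))

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (x for x in self._compute() if pred(x)))

    # Actions: run the recorded recipe and return a result.
    def collect(self) -> List:
        return list(self._compute())

    def count(self) -> int:
        return sum(1 for _ in self._compute())


calls = []  # records every element that square() actually processes


def square(x: int) -> int:
    calls.append(x)
    return x * x


numbers = MiniRDD.parallelize(range(1, 6))
even_squares = numbers.map(square).filter(lambda x: x % 2 == 0)

calls_before_action = len(calls)  # still 0: nothing has been computed
result = even_squares.collect()   # action: runs map and filter
total = even_squares.count()      # action: recomputes from the recipe
```

In PySpark the corresponding calls would be `sc.parallelize(...)`, `rdd.map(...)`, `rdd.filter(...)`, `rdd.collect()`, and `rdd.count()`, with the work distributed across a cluster rather than run in one process.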
