Accelerating Data Science Workflows with RAPIDS

HIGH-PERFORMANCE DATA SCIENCE WITH RAPIDS

Zahra Ronaghi, Senior Data Scientist

END-TO-END ACCELERATED GPU DATA SCIENCE

Data Processing Evolution

Faster data access, less data movement

Hadoop Processing, Reading from Disk

HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train

Spark In-Memory Processing

HDFS Read -> Query -> ETL -> ML Train

25-100x Improvement, Less code, Language flexible, Primarily In-Memory

Traditional GPU Processing

HDFS Read -> GPU Read -> Query -> CPU Write -> GPU Read -> ETL -> CPU Write -> GPU Read -> ML Train

5-10x Improvement, More code, Language rigid, Substantially on GPU
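To make the "less data movement" point concrete, the sketch below counts the data-movement steps in each of the three pipelines. The step lists are transcribed from the slide. The rule that every step ending in "Read" or "Write" moves data is an assumption made for illustration, and the names PIPELINES, count_steps, and SUMMARY are hypothetical.

```python
# Step sequences transcribed from the slide's three pipeline diagrams.
PIPELINES = {
    "Hadoop": [
        "HDFS Read", "Query", "HDFS Write", "HDFS Read", "ETL",
        "HDFS Write", "HDFS Read", "ML Train",
    ],
    "Spark in-memory": ["HDFS Read", "Query", "ETL", "ML Train"],
    "Traditional GPU": [
        "HDFS Read", "GPU Read", "Query", "CPU Write", "GPU Read",
        "ETL", "CPU Write", "GPU Read", "ML Train",
    ],
}


def count_steps(steps):
    """Split a pipeline into data-movement steps and compute steps.

    Assumption: a step whose name ends in "Read" or "Write" moves data;
    everything else (Query, ETL, ML Train) is compute.
    """
    moves = sum(1 for step in steps if step.endswith(("Read", "Write")))
    return {"data_movement": moves, "compute": len(steps) - moves}


SUMMARY = {name: count_steps(steps) for name, steps in PIPELINES.items()}

if __name__ == "__main__":
    for name, counts in SUMMARY.items():
        print(f"{name}: {counts['data_movement']} data-movement steps, "
              f"{counts['compute']} compute steps")
```

All three pipelines do the same three compute steps; what differs is how many reads and writes sit between them. The counts say nothing about the cost of each step, so they are not a substitute for the improvement factors quoted on the slide.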

Data Movement and Transformation

The bane of productivity and performance

[Diagram: APP A loads data and APP B reads data. Each app holds its own GPU Data, with Copy & Convert steps between the CPU and the GPU and between the two apps.]

Data Movement and Transformation

What if we could keep data on the GPU?

[Diagram: APP A loads data and APP B reads data. Each app holds its own GPU Data, with Copy & Convert steps between the CPU and the GPU and between the two apps.]
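The question on this slide can be illustrated with a toy model that counts Copy & Convert steps when APP A hands data to APP B. This is a sketch, not a RAPIDS API: TransferLog, handoff_via_cpu, and handoff_on_gpu are hypothetical names, and the "shared GPU data" case assumes both apps can read the same GPU-resident data without conversion.

```python
from dataclasses import dataclass, field


@dataclass
class TransferLog:
    """Records hypothetical Copy & Convert steps as (source, destination) pairs."""
    copies: list = field(default_factory=list)

    def copy_and_convert(self, src, dst):
        self.copies.append((src, dst))


def handoff_via_cpu(log):
    """APP A loads data to its GPU, then passes it to APP B through the CPU."""
    log.copy_and_convert("CPU", "APP A GPU data")   # load data for APP A
    log.copy_and_convert("APP A GPU data", "CPU")   # bring it back to the host
    log.copy_and_convert("CPU", "APP B GPU data")   # push it to APP B's format
    return len(log.copies)


def handoff_on_gpu(log):
    """Assumption: both apps read the same GPU-resident data directly."""
    log.copy_and_convert("CPU", "shared GPU data")  # only the initial load
    return len(log.copies)


if __name__ == "__main__":
    print("via CPU:", handoff_via_cpu(TransferLog()), "copy-and-convert steps")
    print("on GPU: ", handoff_on_gpu(TransferLog()), "copy-and-convert step")
```

Under these assumptions the handoff through the CPU needs three Copy & Convert steps, matching the three labels in the diagram, while keeping the data on the GPU leaves only the initial load.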
