The Platform Inside and Out - NERSC

Nick Becker, RAPIDS Engineering

RAPIDS

End-to-End Accelerated GPU Data Science

Stages: Data Preparation, Model Training, Visualization

cuDF, cuIO: Analytics
cuML: Machine Learning
cuGraph: Graph Analytics
PyTorch, Chainer, MXNet: Deep Learning
cuxfilter, pyViz: Visualization

Dask

GPU Memory

Data Processing Evolution

Faster data access, less data movement

Hadoop Processing, Reading from disk

HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train

Spark In-Memory Processing

HDFS Read -> Query -> ETL -> ML Train

25-100x Improvement, Less code, Language flexible, Primarily In-Memory

Traditional GPU Processing

HDFS Read -> GPU Read -> Query -> CPU Write -> GPU Read -> ETL -> CPU Write -> GPU Read -> ML Train

5-10x Improvement, More code, Language rigid, Substantially on GPU
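
The three pipelines on this slide differ mainly in how often data crosses a storage or device boundary between compute stages. As a rough sketch, the stage lists below are transcribed from the slide (the interleaved read/write order is an assumption where the extracted text is ambiguous), and `count_movement_steps` is a hypothetical helper, not part of any library. It counts steps only and says nothing about the speedups quoted on the slide.

```python
# Stage lists transcribed from the slide; ordering of the interleaved
# read/write steps is an assumption. Helper names are hypothetical.
PIPELINES = {
    "Hadoop": ["HDFS Read", "Query", "HDFS Write", "HDFS Read", "ETL",
               "HDFS Write", "HDFS Read", "ML Train"],
    "Spark": ["HDFS Read", "Query", "ETL", "ML Train"],
    "Traditional GPU": ["HDFS Read", "GPU Read", "Query", "CPU Write",
                        "GPU Read", "ETL", "CPU Write", "GPU Read",
                        "ML Train"],
}


def count_movement_steps(stages):
    """Count stages that move data (any Read or Write) rather than compute."""
    return sum(1 for stage in stages if stage.endswith(("Read", "Write")))


def summarize(pipelines=PIPELINES):
    """Map each pipeline name to its number of data-movement steps."""
    return {name: count_movement_steps(stages)
            for name, stages in pipelines.items()}


if __name__ == "__main__":
    for name, moves in summarize().items():
        compute = len(PIPELINES[name]) - moves
        print(f"{name}: {moves} data-movement steps, {compute} compute steps")
```

All three pipelines run the same three compute stages (Query, ETL, ML Train); only the number of reads and writes around them changes.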

Data Movement and Transformation

The bane of productivity and performance

[Diagram: APP A and APP B read and load data through the CPU, each keeps its own GPU Data on the GPU, and the links between them are labeled Copy & Convert.]

Data Movement and Transformation

What if we could keep data on the GPU?

[Diagram: same layout as the previous slide, with APP A, APP B, CPU, GPU, and GPU Data joined by Copy & Convert links.]
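
To make the question on this slide concrete, here is a toy model in plain Python, with no GPU library involved. The `Buffer` class and the `handoff_via_cpu` / `handoff_on_gpu` functions are hypothetical names invented for this sketch. It assumes both apps could read the same in-GPU-memory layout, which the slide does not state, and it only counts Copy & Convert steps when APP A passes its GPU data to APP B.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Toy stand-in for a dataset; records where it lives and its copy count."""
    location: str      # "CPU" or "GPU"
    copies: int = 0    # Copy & Convert steps performed so far


def copy_and_convert(buf, target):
    """Move the data to `target`, counting one Copy & Convert step."""
    return Buffer(location=target, copies=buf.copies + 1)


def handoff_via_cpu(buf):
    """APP A -> CPU -> APP B: the data leaves the GPU and comes back."""
    on_cpu = copy_and_convert(buf, "CPU")
    return copy_and_convert(on_cpu, "GPU")


def handoff_on_gpu(buf):
    """APP A -> APP B with a shared GPU layout (assumed): nothing is copied."""
    return buf


if __name__ == "__main__":
    app_a_data = Buffer(location="GPU")
    print("via CPU:", handoff_via_cpu(app_a_data).copies, "Copy & Convert steps")
    print("on GPU: ", handoff_on_gpu(app_a_data).copies, "Copy & Convert steps")
```

In this model the round trip through the CPU costs two Copy & Convert steps per hand-off, while the hand-off that stays on the GPU costs none.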
