The Platform Inside and Out - NERSC
Nick Becker, RAPIDS Engineering
RAPIDS
End-to-End Accelerated GPU Data Science
Data Preparation, Model Training, Visualization
Dask
cuDF, cuIO: Analytics
cuML: Machine Learning
cuGraph: Graph Analytics
PyTorch, Chainer, MXNet: Deep Learning
cuxfilter, pyViz: Visualization
GPU Memory
Data Processing Evolution
Faster data access, less data movement
Hadoop Processing, Reading from disk
HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train
Spark In-Memory Processing
HDFS Read -> Query -> ETL -> ML Train
25-100x Improvement, Less code, Language flexible, Primarily In-Memory
Traditional GPU Processing
HDFS Read -> GPU Read -> Query -> CPU Write -> GPU Read -> ETL -> CPU Write -> GPU Read -> ML Train
5-10x Improvement, More code, Language rigid, Substantially on GPU
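The three pipelines above differ mainly in how often data is read or written between the Query, ETL and ML Train stages. A minimal Python sketch of that comparison follows. The stage lists are transcribed from the slide. Treating every step that ends in "Read" or "Write" as a data movement step is an assumption made for illustration, and the names PIPELINES, count_movement_steps and summarize are hypothetical.

```python
# Stage sequences transcribed from the slide. Classifying a step as
# "data movement" when its name ends in Read or Write is an assumption
# made for this illustration, not something the slide states.
PIPELINES = {
    "Hadoop": [
        "HDFS Read", "Query", "HDFS Write",
        "HDFS Read", "ETL", "HDFS Write",
        "HDFS Read", "ML Train",
    ],
    "Spark": ["HDFS Read", "Query", "ETL", "ML Train"],
    "Traditional GPU": [
        "HDFS Read", "GPU Read", "Query", "CPU Write",
        "GPU Read", "ETL", "CPU Write",
        "GPU Read", "ML Train",
    ],
}


def count_movement_steps(stages):
    """Count the steps that only move data (any Read or Write)."""
    return sum(1 for s in stages if s.endswith(("Read", "Write")))


def summarize(pipelines):
    """Map each pipeline name to (movement steps, compute steps)."""
    summary = {}
    for name, stages in pipelines.items():
        moves = count_movement_steps(stages)
        summary[name] = (moves, len(stages) - moves)
    return summary


if __name__ == "__main__":
    for name, (moves, compute) in summarize(PIPELINES).items():
        print(f"{name}: {moves} movement steps, {compute} compute steps")
```

Under this counting, all three pipelines run the same three compute stages and differ only in the number of reads and writes around them.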
Data Movement and Transformation
The bane of productivity and performance
[Diagram: APP A (Load Data) and APP B (Read Data) exchange data between the CPU and each app's GPU Data through repeated Copy & Convert steps.]
Data Movement and Transformation
What if we could keep data on the GPU?
[Diagram: the same APP A / APP B layout, with Load Data, Read Data, CPU, GPU Data and the Copy & Convert steps.]
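The difference between a Copy & Convert hand-off and keeping data in place can be sketched on the CPU with the Python standard library alone. This is only an analogy under stated assumptions: it uses no GPU and no RAPIDS API, and the function names copy_and_convert and share_view are hypothetical. A memoryview stands in for two applications sharing one buffer in a common format.

```python
from array import array


def copy_and_convert(values):
    """Hand-off by copying: the consumer receives its own converted copy."""
    return [float(v) for v in values]  # new list, new memory


def share_view(buffer):
    """Hand-off by sharing: the consumer receives a view of the same memory."""
    return memoryview(buffer)  # no element data is copied


# "APP A" owns a buffer of doubles.
app_a_data = array("d", [1.0, 2.0, 3.0])

copied = copy_and_convert(app_a_data)  # "APP B" with copy & convert
shared = share_view(app_a_data)        # "APP B" sharing the buffer

# APP A updates its data in place.
app_a_data[0] = 10.0

if __name__ == "__main__":
    print("copied sees:", copied[0])  # stale value from before the update
    print("shared sees:", shared[0])  # current value, same memory
```

The copied list costs extra memory and goes stale after the update, while the view always reflects the owner's buffer. The slide's question is about gaining the same property for data that lives in GPU memory.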