Distributed GPU Computing with Dask
Nick Becker, RAPIDS Engineering
RAPIDS
Scaling RAPIDS with Dask
[Diagram: the RAPIDS stack, built on Dask and sharing GPU memory]
- Data Preparation: cuDF, cuIO (Analytics)
- Model Training: cuML (Machine Learning), cuGraph (Graph Analytics)
- Deep Learning: PyTorch, Chainer, MxNet
- Visualization: cuXfilter, pyViz
Dask
What is Dask?
- Distributed compute scheduler built to scale Python
- Scales workloads from laptops to supercomputer clusters
- Extremely modular: disjoint scheduling, compute, data transfer, and out-of-core handling
- Multiple workers per node make a one-worker-per-GPU model easy
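The scheduling model above can be seen in miniature with a lazy task graph. A minimal sketch, assuming only the `dask` package is installed; the same graph runs unchanged on a multi-node cluster under `dask.distributed`:

```python
import dask
from dask import delayed

@delayed
def inc(x):
    # Each call becomes a task in the graph, not an immediate computation
    return x + 1

@delayed
def add(a, b):
    return a + b

# Building the expression only constructs a task graph; nothing runs yet
total = add(inc(1), inc(2))

# .compute() hands the graph to a scheduler (threads by default),
# which executes independent tasks in parallel
result = total.compute()  # → 5
```

The same `delayed` code can target a laptop's threads or a thousand-node cluster purely by changing which scheduler `.compute()` uses, which is the modularity the bullets above describe.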
Why Dask?
PyData Native
- Easy Migration: built on top of NumPy, Pandas, Scikit-Learn, etc.
- Easy Training: with the same APIs
- Trusted: with the same developer community

Deployable
- HPC: SLURM, PBS, LSF, SGE
- Cloud: Kubernetes
- Hadoop/Spark: Yarn

Easy Scalability
- Easy to install and use on a laptop
- Scales out to thousand-node clusters

Popular
- Most common parallelism framework today in the PyData and SciPy community
Combine Dask with cuDF
Many GPU DataFrames form a distributed GPU DataFrame