GPU Computing with Apache Spark and Python - NVIDIA

Stan Seibert Siu Kwan Lam

April 5, 2016

© 2015 Continuum Analytics - Confidential & Proprietary

My Background

• Trained in particle physics
• Using Python for data analysis for 10 years
• Using GPUs for data analysis for 7 years
• Currently lead the High Performance Python team at Continuum

About Continuum Analytics

• We give superpowers to people who change the world!
• We want to help everyone analyze their data with Python (and other tools), so we offer:
  • Enterprise Products
  • Consulting
  • Training
  • Open Source

• I'm going to use Anaconda throughout this presentation.
• Anaconda is a free Mac/Win/Linux Python distribution:
  • Based on conda, an open source package manager
  • Installs both Python and non-Python dependencies
  • Easiest way to get the software I will talk about today

Overview

1. Why Python?
2. Numba: A Python JIT compiler for the CPU and GPU
3. PySpark: Distributed computing for Python
4. Example: Image Registration
5. Tips and Tricks
6. Conclusion

WHY PYTHON?

Why is Python so popular?

• Straightforward, productive language for system administrators, programmers, scientists, analysts and hobbyists
• Great community:
  • Lots of tutorial and reference materials
  • Easy to interface with other languages
  • Vast ecosystem of useful libraries
