Manipulating and analyzing data with pandas - Eindhoven University of ...

Manipulating and analyzing data with pandas

Céline Comte, Nokia Bell Labs France & Télécom ParisTech

Python Academy - May 20, 2019

Introduction

• Pandas: Python Data Analysis Library

• "An open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language"

• Sponsored by NumFOCUS, a non-profit organization in the US that also sponsors NumPy, Matplotlib, Jupyter, and Julia

• Used in statsmodels, sklearn-pandas, Plotly, IPython, Jupyter, Spyder

2/50 © 2019 Nokia

Public

Side remark: BSD licenses

• BSD = Berkeley Software Distribution
- The first software (an operating system, actually) to be distributed under a BSD license
- "Permissive" license: the code can be used in proprietary software


Introduction

• Built on top of NumPy

• Part of the SciPy ecosystem (Scientific Computing Tools for Python)

• Version history (community.html#history-of-development)
- Project initiated in 2008
- Oldest version in the doc: 0.4.1 (September 2011)
- Current version: 0.24.2 (March 2019)
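
To illustrate what "built on top of NumPy" means in practice, here is a minimal sketch. The values and labels are invented for the example, and `to_numpy()` assumes pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

# A Series wraps a NumPy array and adds an index of labels.
# (Values and labels are made up for this illustration.)
values = np.array([1.0, 4.0, 9.0])
s = pd.Series(values, index=["a", "b", "c"])

# The underlying data can be retrieved as a plain NumPy array
# (to_numpy() is available from pandas 0.24 onwards).
arr = s.to_numpy()
print(type(arr))

# NumPy universal functions apply element-wise and keep the labels.
roots = np.sqrt(s)
print(roots)
```

The result of `np.sqrt(s)` is again a Series with the same index, so labels survive numerical operations.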


Objectives of the presentation

• Explain when one can benefit from using pandas

• Describe the data structures in pandas
- Series: 1-dimensional array with labels
- DataFrame: 2-dimensional array with labels
- Panel: 3-dimensional array with labels (deprecated since version 0.20.0)

• Review the data analysis tools in pandas
- Import and export data
- Select data and reshape arrays
- Merge, join, and concatenate arrays
- Visualize data
- ...
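
As a rough preview of the objectives above, the sketch below builds a Series and a DataFrame, then tries a few of the listed tools (import/export, selection, merge, concatenation). All data, column names, and variable names are hypothetical and chosen only for illustration. Visualization is left out because it needs an extra plotting library.

```python
import io
import pandas as pd

# Series: 1-dimensional array with labels.
temperature = pd.Series([18.5, 21.0, 19.2], index=["Mon", "Tue", "Wed"])

# DataFrame: 2-dimensional array with row and column labels.
df = pd.DataFrame(
    {"city": ["Paris", "Eindhoven", "Paris"], "sales": [10, 20, 30]},
    index=["r1", "r2", "r3"],
)

# Import and export: round-trip through CSV text held in memory.
csv_text = df.to_csv()
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Select: label-based selection with .loc and a boolean mask.
paris_sales = df.loc[df["city"] == "Paris", "sales"]

# Merge: combine two tables on a shared column, as in a SQL join.
countries = pd.DataFrame(
    {"city": ["Paris", "Eindhoven"], "country": ["France", "Netherlands"]}
)
merged = df.merge(countries, on="city", how="left")

# Concatenate: stack two frames vertically and renumber the rows.
stacked = pd.concat([df, df], ignore_index=True)

print(merged)
print(stacked.shape)
```

A left merge keeps every row of `df` in its original order and adds the matching `country` value, which is why it is a convenient default when enriching a table with a lookup table.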

