Python for Data Analysis - Boston University
Research Computing Services
Katia Oleinik (koleinik@bu.edu)
Tutorial Content
Overview of Python Libraries for Data Scientists
Reading Data; Selecting and Filtering the Data; Data manipulation, sorting, grouping, rearranging
Plotting the data
Descriptive statistics
Inferential statistics
Python Libraries for Data Science
Many popular Python toolboxes/libraries:
• NumPy
• SciPy
• Pandas
• SciKit-Learn
All these libraries are installed on the SCC
Visualization libraries
• matplotlib
• Seaborn
and many more ...
NumPy:
introduces objects for multidimensional arrays and matrices, as well as functions that make it easy to perform advanced mathematical and statistical operations on those objects
provides vectorization of mathematical operations on arrays and matrices, which significantly improves performance
many other Python libraries are built on NumPy
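A minimal sketch of the points above, assuming NumPy is installed. The temperature readings and variable names are made up for illustration and are not part of the tutorial data.

```python
import numpy as np

# Hypothetical sample data: five temperature readings in Celsius.
celsius = np.array([12.5, 18.0, 21.3, 25.1, 30.0])

# Vectorized arithmetic: one expression is applied to every element,
# with no explicit Python loop.
fahrenheit = celsius * 9 / 5 + 32

# Built-in statistical operations on the array.
mean_c = celsius.mean()
std_c = celsius.std()

# A 2-D array (matrix) and a matrix product.
m = np.array([[1, 2], [3, 4]])
product = m @ m

print(fahrenheit)
print(mean_c, std_c)
print(product)
```

The same conversion written as a Python loop over a list would give the same numbers; the array expression is shorter and usually much faster on large inputs.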
SciPy:
collection of algorithms for linear algebra, differential equations, numerical integration, optimization, statistics and more
part of the SciPy Stack; built on NumPy
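A short sketch of three of the areas listed above (numerical integration, optimization, linear algebra), assuming SciPy and NumPy are installed. The functions and the small linear system are chosen only for illustration.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Numerical integration: integral of sin(x) from 0 to pi (exact value is 2).
area, abs_err = integrate.quad(np.sin, 0, np.pi)

# Optimization: minimize (x - 3)^2, whose minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

# Linear algebra: solve the system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

print(area)
print(result.x)
print(solution)
```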
Pandas:
adds data structures and tools designed to work with table-like data (Series and DataFrame, similar to vectors and data frames in R)
provides tools for data manipulation: reshaping, merging, sorting, slicing, aggregation, etc.
handles missing data
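The operations listed above can be sketched as follows, assuming pandas and NumPy are installed. The table, its column names, and the department names are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing salary value.
df = pd.DataFrame({
    "dept": ["A", "B", "A", "B", "A"],
    "salary": [100.0, 120.0, np.nan, 90.0, 110.0],
})

# Missing data: count the gaps, then fill them with the column mean.
n_missing = df["salary"].isna().sum()
filled = df.assign(salary=df["salary"].fillna(df["salary"].mean()))

# Slicing / filtering: keep rows with salary above 100.
high = filled[filled["salary"] > 100]

# Sorting: highest salary first.
ranked = filled.sort_values("salary", ascending=False)

# Aggregation: mean salary per department.
mean_by_dept = filled.groupby("dept")["salary"].mean()

# Merging: attach a (hypothetical) department name to each row.
depts = pd.DataFrame({"dept": ["A", "B"], "name": ["Analytics", "Biology"]})
merged = filled.merge(depts, on="dept")

print(n_missing)
print(ranked)
print(mean_by_dept)
print(merged)
```

Filling with the mean is only one possible choice here; `dropna()` would instead discard the incomplete row.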
SciKit-Learn:
provides machine learning algorithms: classification, regression, clustering, model validation, etc.
built on NumPy, SciPy and matplotlib
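A minimal classification and model-validation sketch, assuming scikit-learn is installed. The iris dataset (bundled with scikit-learn) and logistic regression are chosen here only as a convenient example; they are not taken from the tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Small example dataset that ships with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; random_state fixes the split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: fit a model on the training rows only.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Model validation: accuracy on the held-out rows,
# plus 5-fold cross-validation on the full dataset.
accuracy = clf.score(X_test, y_test)
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

print(accuracy)
print(cv_scores.mean())
```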
matplotlib:
Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats
a set of functionalities similar to those of MATLAB
line plots, scatter plots, bar charts, histograms, pie charts, etc.
relatively low-level; some effort is needed to create advanced visualizations
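A sketch of four of the plot types listed above, assuming matplotlib and NumPy are installed. The data is randomly generated and the output file name `example_plots.png` is a placeholder.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration; the seed makes the random values repeatable.
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 50)
samples = rng.normal(size=200)

# One figure with a 2 x 2 grid of axes, one plot type per panel.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, np.sin(x))
axes[0, 0].set_title("Line plot")

axes[0, 1].scatter(x, np.cos(x) + rng.normal(scale=0.1, size=x.size))
axes[0, 1].set_title("Scatter plot")

axes[1, 0].bar(["a", "b", "c"], [3, 7, 5])
axes[1, 0].set_title("Bar chart")

axes[1, 1].hist(samples, bins=20)
axes[1, 1].set_title("Histogram")

fig.tight_layout()

# The file extension selects the output format (for example .png or .pdf).
fig.savefig("example_plots.png", dpi=150)
```

Each panel is configured separately (title, data, plot call), which illustrates the low-level style noted above.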