PYTHON VISUALIZATIONS
Notes by Michael Brothers
REV 0323

Table of Contents
What's What
Vocabulary
Jupyter Notebook Tips & Tricks
MATPLOTLIB
  Documentation
  Standard imports
  Plots, Figures and Axes
    The figure/axis object-oriented approach
    The pyplot function-based approach
    Running as a Python script
  Saving a figure
  Anatomy of a figure
  Figure size
    Axes aspect ratio
  Legend position
  Two charts on one figure
  Two figures on one plot
    Shared axes
  Inset figures
  Default colors
  Line and marker styles
MATPLOTLIB CHARTS
  Load sample data
  Scatter plots
    Marker transparency
    Marker size and color
  Bar charts
    The problem with matplotlib
    Vertical bar charts
    Error bars
    Stacked bar charts
    Bar labels
    Grouped bar charts
    Horizontal bar charts
  Histograms
    Layered histograms
    Edgecolors
  Data as an image with imshow
SEABORN
  Documentation
  Standard imports
  Seaborn vs. matplotlib
  Load dataset
  Available Seaborn Plots
SEABORN RELATIONAL PLOTS
  Scatter plots
  Line plots
SEABORN DISTRIBUTIONAL PLOTS
  Histograms
  Kernel Density Estimate (KDE) Plots
  Rug Plots
  Bivariate distributions
SEABORN CATEGORICAL PLOTS
  Strip plots
  Box & Whisker plots
  Violin plots
  Bar plots vs. Count plots
ADDITIONAL SEABORN PLOTS
  Heatmaps
PANDAS BUILT-IN VISUALIZATIONS
  Scatter plot
  Hexbin plot
APPENDIX I - DEEP DIVE
  Plotting a KDE by hand
  Cool Decagon Plot in pandas
APPENDIX II - ADDITIONAL RESOURCES
The following courses and resources aided in the creation of this document:
Learning Python for Data Analysis and Visualization by Jose Portilla
Python for Financial Analysis and Algorithmic Trading by Jose Portilla
Python for Data Science and Machine Learning Bootcamp by Jose Portilla
Python for Machine Learning & Data Science Masterclass by Jose Portilla
This is a Companion Guide to Python for Data Analysis Using Numpy and Pandas
Using this guide
Code is written in a fixed-width font:
  Red indicates required syntax/new concepts
  Green represents arbitrary variable assignments
  Blue indicates program output
Notes in brown indicate older, deprecated methods.
All code is Python 3.11.0, NumPy 1.24.1, pandas 1.5.2, matplotlib 3.6.2 and Seaborn 0.12.2 unless otherwise noted.
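To compare an installation against the versions listed above, something like the following sketch may help. It uses only the standard library, so it runs even when a package is missing. The names GUIDE_VERSIONS and installed_versions are hypothetical helpers made up for this illustration; they are not part of any library.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Versions this guide says its code was written against (hypothetical helper dict)
GUIDE_VERSIONS = {
    "numpy": "1.24.1",
    "pandas": "1.5.2",
    "matplotlib": "3.6.2",
    "seaborn": "0.12.2",
}


def installed_versions(packages):
    """Return {package name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print("Python", sys.version.split()[0])
for name, installed in installed_versions(GUIDE_VERSIONS).items():
    print(f"{name}: installed={installed}, guide={GUIDE_VERSIONS[name]}")
```

A newer installed version is usually fine, but small differences in defaults or deprecation warnings are possible.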
What's What
NumPy - fundamental package for scientific computing & working with array data
Pandas - offers high-performance data structures (Series, DataFrames), built-in visualization, file reading tools
Matplotlib - data visualization package
Seaborn Libraries - specialized visualizations (heatmaps et al)
Beautiful Soup - a web-scraping tool
SciPy - a scientific/technical computing library built on NumPy
SciKit-Learn - a machine learning library
Vocabulary
numpy: An Array is numpy's basic data structure. A one-dimensional array is called a vector, while a 2D array is a matrix (although it is possible for a matrix to have just one row or one column).
pandas: A Series is built on top of an array, allowing you to label the data and index it formally.
A DataFrame is built on top of Series, and is essentially many Series put together with different column names but sharing the same index.
Arrays are numpy data types while Series and DataFrame are pandas data types. They have different available methods and attributes.
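As a rough illustration of the vocabulary above, the sketch below builds a vector, a matrix, a Series and a DataFrame. The variable names, index labels and column names are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd

vector = np.array([1, 2, 3])                 # 1D array (a vector)
matrix = np.array([[1, 2, 3], [4, 5, 6]])    # 2D array (a matrix)

# A Series wraps a 1D array and adds a labeled index
s = pd.Series(vector, index=["a", "b", "c"], name="first")

# A DataFrame holds several Series as columns that share one index
df = pd.DataFrame({"first": s, "second": s * 10})

print(vector.ndim, matrix.shape)   # 1 (2, 3)
print(s["b"])                      # 2
print(df.loc["c", "second"])       # 30
print(type(s.to_numpy()))          # <class 'numpy.ndarray'>
```

The last line shows that the underlying numpy array can still be recovered from a Series when array methods are needed.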
Jupyter Notebook Tips & Tricks
Shift+Enter    run the current cell and move to the next cell (create one if needed)
Ctrl+Enter     run the current cell but remain inside it
pwd            print working directory
ls             print a list of current directory contents
To determine what version of Python is running (in jupyter and elsewhere):
    import sys
    print(sys.version)
To play a YouTube video inside a Jupyter notebook (video owner must permit playing on other websites):
    from IPython.display import YouTubeVideo
    YouTubeVideo('J0Aq44Pze-w')
Sometimes when creating a visualization, jupyter adds a scrollbar that keeps the entire plot from being viewed on one screen. To remove the scrollbar, click somewhere outside the cell and press Shift+O (letter "o") to toggle scrolling of the current outputs. This setting is also found in the Cell menu.
MATPLOTLIB
Much of matplotlib's functionality has been baked into pandas, and can be accessed from pandas directly. However, to fully appreciate the amount of customization that's available we discuss matplotlib first, and cover pandas built-in visualization in a later section.
Documentation
Matplotlib User's Guide:
Standard imports
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
Note: the magic command %matplotlib inline, used by older versions of Jupyter to show plots as cell outputs, is no longer required. Recent versions of Jupyter display plots inline by default.
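Plots, Figures and Axes
Matplotlib architecture centers around plots, which combine a Figure (the top-level container that holds everything drawn) and one or more Axes (the individual plotting areas, each with its own x-axis and y-axis, where the data is charted). Each of these elements can be modified and customized to meet specific needs.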
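The relationship between a Figure and its Axes can be inspected directly. The following is a small sketch, assuming a figure with two side-by-side Axes (an arbitrary layout chosen for illustration):

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=1, ncols=2)   # one Figure holding two Axes

print(type(fig).__name__)        # Figure
print(len(fig.axes))             # 2
print(axes[0].figure is fig)     # True
```

Each Axes knows which Figure it belongs to, and the Figure keeps a list of the Axes it contains.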
The figure/axis object-oriented approach
Working with figure and axes objects directly allows for the most control and customization.
    x, y = [1,2,3,4], [1,2.5,2.5,4]     # create some data
    fig, ax = plt.subplots()            # create a figure and an axes object (fig, ax are arbitrary names)
    ax.plot(x, y, label='some data')
    ax.set_xlabel('x-axis label')
    ax.set_ylabel('y-axis label')
    ax.set_title('Plot Title')
    ax.legend()
A few things of note:
- Matplotlib's default plot is a line chart.
- Tick mark values, axis ranges, linewidths, colors and the location of the legend were set automatically, although we can control them if we want.
- Axis labels, plot titles and legends are optional, but they're a good habit to adopt. Charts should say what they represent.
- Jupyter displays the return value of the last line in a cell (in this case the Legend object). We can suppress this by adding a semicolon to the last line of code.
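As a sketch of how the automatic settings noted above can be overridden, the example below sets the line color and width, the axis ranges, the tick positions and the legend location by hand. The specific values are arbitrary and only meant to show which call controls which setting.

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [1, 2.5, 2.5, 4]

fig, ax = plt.subplots()
ax.plot(x, y, color='green', linewidth=3, label='some data')  # line color and width
ax.set_xlim(0, 5)                    # x-axis range
ax.set_ylim(0, 5)                    # y-axis range
ax.set_xticks([0, 1, 2, 3, 4, 5])    # tick mark positions
ax.legend(loc='lower right');        # legend location; semicolon suppresses the object output
```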
The pyplot function-based approach
Matplotlib also offers a function-based approach. Pyplot does much of the heavy lifting, and more closely resembles the syntax we'll see with pandas built-in visualization.
    x, y = [1,2,3,4], [1,2.5,2.5,4]     # create some data
    plt.plot(x, y, label='some data')
    plt.xlabel('x-axis label')
    plt.ylabel('y-axis label')
    plt.title('Plot Title')
    plt.legend();                       # add a semicolon to suppress jupyter's object name output
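Part of the "heavy lifting" is that pyplot keeps track of a current Figure and current Axes, and its functions act on them. The sketch below shows one way to see this, using plt.gcf() and plt.gca() to retrieve those objects and then mixing in an object-oriented call:

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [1, 2.5, 2.5, 4]

plt.plot(x, y, label='some data')   # pyplot creates a Figure and Axes if none exist
plt.xlabel('x-axis label')

fig = plt.gcf()   # get the current Figure
ax = plt.gca()    # get the current Axes

print(ax.get_xlabel())        # x-axis label
ax.set_title('Plot Title')    # object-oriented call on the same Axes
print(plt.gca() is ax)        # True
```

The two approaches can therefore be combined, although sticking to one style within a chart tends to be easier to read.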
Running as a Python script
Jupyter automatically renders a matplotlib plot as cell output. When working outside of Jupyter, be sure to include plt.show() as the last line of code so that the plot is displayed.
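A minimal script might look like the sketch below. The build_plot function and the idea of saving this as a file such as plot_script.py are hypothetical choices for this example. With an interactive backend, plt.show() opens a window and waits until it is closed.

```python
import matplotlib.pyplot as plt


def build_plot():
    """Create the sample line chart and return its Figure and Axes."""
    x, y = [1, 2, 3, 4], [1, 2.5, 2.5, 4]
    fig, ax = plt.subplots()
    ax.plot(x, y, label='some data')
    ax.set_title('Plot Title')
    ax.legend()
    return fig, ax


if __name__ == '__main__':
    build_plot()
    plt.show()   # display the plot; needed when running outside Jupyter
```

Placing plt.show() last matters because, with an interactive backend, nothing after it runs until the plot window is closed.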