
PYTHON VISUALIZATIONS
Notes by Michael Brothers

Table of Contents

What's What
Vocabulary
Jupyter Notebook Tips & Tricks

MATPLOTLIB
    Documentation
    Standard imports
    Plots, Figures and Axes
    The figure/axis object-oriented approach
    The pyplot function-based approach
    Running as a Python script
    Saving a figure
    Anatomy of a figure
    Figure size
    Axes aspect ratio
    Legend position
    Two charts on one figure
    Two figures on one plot
    Shared axes
    Inset figures
    Default colors
    Line and marker styles

MATPLOTLIB CHARTS
    Load sample data
    Scatter plots
    Marker transparency
    Marker size and color
    Bar charts
    The problem with matplotlib
    Vertical bar charts
    Error bars
    Stacked bar charts
    Bar labels
    Grouped bar charts
    Horizontal bar charts
    Histograms
    Layered histograms
    Edgecolors
    Data as an image with imshow

SEABORN
    Documentation
    Standard imports
    Seaborn vs. matplotlib
    Load dataset
    Available Seaborn Plots


REV 0323

SEABORN RELATIONAL PLOTS
    Scatter plots
    Line plots

SEABORN DISTRIBUTIONAL PLOTS
    Histograms
    Kernel Density Estimate (KDE) Plots
    Rug Plots
    Bivariate distributions

SEABORN CATEGORICAL PLOTS
    Strip plots
    Box & Whisker plots
    Violin plots
    Bar plots vs. Count plots

ADDITIONAL SEABORN PLOTS
    Heatmaps

PANDAS BUILT-IN VISUALIZATIONS
    Scatter plot
    Hexbin plot

APPENDIX I - DEEP DIVE
    Plotting a KDE by hand
    Cool Decagon Plot in pandas

APPENDIX II - ADDITIONAL RESOURCES

The following courses and resources aided in the creation of this document:

Learning Python for Data Analysis and Visualization by Jose Portilla

Python for Financial Analysis and Algorithmic Trading by Jose Portilla

Python for Data Science and Machine Learning Bootcamp by Jose Portilla

Python for Machine Learning & Data Science Masterclass by Jose Portilla


This is a Companion Guide to Python for Data Analysis Using Numpy and Pandas

Using this guide

Code is written in a fixed-width font:

    Red indicates required syntax/new concepts
    Green represents arbitrary variable assignments
    Blue indicates program output

Notes in brown indicate older, deprecated methods.

All code is Python 3.11.0, NumPy 1.24.1, pandas 1.5.2, matplotlib 3.6.2 and Seaborn 0.12.2 unless otherwise noted.
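Since behavior can differ between releases, it may help to compare the versions installed locally against those listed above. The following is a minimal sketch that uses only the standard library; the helper name installed_versions is illustrative and not part of any package.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "pandas", "matplotlib", "seaborn")):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None   # package is not installed in this environment
    return found


print("Python", sys.version.split()[0])
for name, ver in installed_versions().items():
    print(name, ver if ver is not None else "not installed")
```

Looking up package metadata this way avoids importing each library just to read its version.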

What's What

NumPy - fundamental package for scientific computing & working with array data
Pandas - offers high-performance data structures (Series, DataFrames), built-in visualization, file reading tools
Matplotlib - data visualization package
Seaborn - specialized visualizations (heatmaps et al.)
Beautiful Soup - a web-scraping tool
SciPy - a scientific/technical computing library built on NumPy
SciKit-Learn - a machine learning library

Vocabulary

numpy:  An Array is numpy's basic data structure. A one-dimensional array is called a vector, while a 2D array is a matrix (although it is possible for a matrix to have just one row or one column).

pandas: A Series is built on top of an array, allowing you to label the data and index it formally.

        A DataFrame is built on top of Series, and is essentially many Series put together with different column names but sharing the same index.

Arrays are numpy data types, while Series and DataFrames are pandas data types. They have different available methods and attributes.
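As a brief illustration of this vocabulary, the sketch below builds a vector, a matrix, a Series and a DataFrame. The data values, index labels and column names are arbitrary.

```python
import numpy as np
import pandas as pd

# A one-dimensional array (vector) and a 2D array (matrix)
vector = np.array([10, 20, 30])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# A Series wraps an array and attaches a labeled index
series = pd.Series(vector, index=["a", "b", "c"], name="first")

# A DataFrame combines several Series that share the same index
frame = pd.DataFrame({"first": series, "second": series * 2})

print(vector.shape, matrix.shape)   # (3,) (2, 3)
print(series["b"])                  # 20
print(frame.loc["c", "second"])     # 60
print(type(series.to_numpy()))      # <class 'numpy.ndarray'>
```

The last line shows that the numpy array underneath a Series remains accessible, which is why numpy functions generally work on pandas objects.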

Jupyter Notebook Tips & Tricks

Shift+Enter    run the current cell and move to the next cell (create one if needed)

Ctrl+Enter     run the current cell but remain inside it

pwd            print working directory

ls             print a list of current directory contents

To determine what version of Python is running (in jupyter and elsewhere):

    import sys
    print(sys.version)

To play a YouTube video inside a Jupyter notebook (video owner must permit playing on other websites):

    from IPython.display import YouTubeVideo
    YouTubeVideo('J0Aq44Pze-w')

Sometimes when creating a visualization, jupyter adds a scrollbar which obscures viewing the entire plot on one screen. To remove the scrollbar, click somewhere outside the cell and hit Shift-O (letter "o") to toggle scrolling of current outputs. This setting is also found in the Cell menu.


MATPLOTLIB

Much of matplotlib's functionality has been baked into pandas, and can be accessed from pandas directly. However, to fully appreciate the amount of customization that's available we discuss matplotlib first, and cover pandas built-in visualization in a later section.

Documentation

Matplotlib User's Guide:

Standard imports

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

Note: the magic command %matplotlib inline, used by older versions of Jupyter to show plots as cell outputs, is no longer needed. Current versions display plots inline by default.

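Plots, Figures and Axes

Matplotlib's architecture centers on two kinds of objects: the Figure, which is the top-level container for everything that is drawn, and the Axes, which is an individual chart within the figure, complete with its own x- and y-axis, ticks, labels and title. Each of these elements can be modified and customized to meet specific needs.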
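As a rough illustration of how a figure and its axes relate, the sketch below creates one Figure holding two Axes and confirms that each Axes belongs to that Figure. The variable names and data are arbitrary.

```python
import matplotlib.pyplot as plt

# One Figure containing a 1x2 grid of Axes
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2)

ax_left.plot([1, 2, 3], [1, 4, 9])
ax_left.set_title('left Axes')

ax_right.plot([1, 2, 3], [9, 4, 1])
ax_right.set_title('right Axes')

print(len(fig.axes))              # 2
print(ax_left.figure is fig)      # True
```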

The figure/axis object-oriented approach

Working with figure and axes objects directly allows for the most control and customization.

    x, y = [1,2,3,4], [1,2.5,2.5,4]     # create some data
    fig, ax = plt.subplots()            # create a figure and an axes object (fig, ax are arbitrary names)
    ax.plot(x, y, label='some data')
    ax.set_xlabel('x-axis label')
    ax.set_ylabel('y-axis label')
    ax.set_title('Plot Title')
    ax.legend()

A few things of note:

- Matplotlib's default plot is a line chart.
- Tick mark values, axis ranges, linewidths, colors and the location of the legend were set automatically, although we can control them if we want.
- Axis labels, plot titles and legends are optional, but they're a good habit to adopt. Charts should say what they represent.
- Jupyter echoes the return value of the last line in a cell (in this case the Legend object). We can suppress this by adding a semicolon to the last line of code.
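To show what overriding those automatic settings can look like, here is a minimal sketch that sets the color, linewidth, axis ranges, tick values and legend location explicitly. The specific values chosen are arbitrary.

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [1, 2.5, 2.5, 4]

fig, ax = plt.subplots()
ax.plot(x, y, label='some data', color='tab:red', linewidth=3)

ax.set_xlim(0, 5)                  # axis ranges
ax.set_ylim(0, 5)
ax.set_xticks([0, 1, 2, 3, 4, 5])  # tick mark values
ax.legend(loc='lower right');      # legend location; the semicolon hides Jupyter's echo
```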

The pyplot function-based approach

Matplotlib also offers a function-based approach. Pyplot does much of the heavy lifting, and more closely resembles the syntax we'll see with pandas built-in visualization.

    x, y = [1,2,3,4], [1,2.5,2.5,4]     # create some data
    plt.plot(x, y, label='some data')
    plt.xlabel('x-axis label')
    plt.ylabel('y-axis label')
    plt.title('Plot Title')
    plt.legend();                       # add a semicolon to suppress jupyter's object name output
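One way to see how the two approaches relate: pyplot functions act on an implicit "current" Figure and Axes, which plt.gcf() and plt.gca() return. A brief sketch, with arbitrary variable names:

```python
import matplotlib.pyplot as plt

plt.figure()                       # start a fresh current figure
plt.plot([1, 2, 3, 4], [1, 2.5, 2.5, 4], label='some data')
plt.title('Plot Title')

fig = plt.gcf()                    # "get current figure"
ax = plt.gca()                     # "get current axes"

print(ax.get_title())              # Plot Title
ax.set_xlabel('x-axis label')      # object-oriented calls work on the same Axes
```

Mixing the two styles like this is possible, though keeping to one style within a given plot tends to be easier to read.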

Running as a Python script

Jupyter automatically paints a matplotlib plot as cell output. When working outside of Jupyter be sure to include plt.show() as the last line of code.
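A standalone script might be laid out roughly as follows. This is only a sketch: the file name plot_script.py and the function name build_figure are hypothetical, chosen for illustration.

```python
# plot_script.py  (hypothetical file name)
import matplotlib.pyplot as plt


def build_figure():
    """Create the example line chart and return its Figure."""
    x, y = [1, 2, 3, 4], [1, 2.5, 2.5, 4]
    fig, ax = plt.subplots()
    ax.plot(x, y, label='some data')
    ax.set_xlabel('x-axis label')
    ax.set_ylabel('y-axis label')
    ax.set_title('Plot Title')
    ax.legend()
    return fig


if __name__ == '__main__':
    build_figure()
    # With an interactive backend this opens a window and waits until it is closed
    plt.show()
```

Running python plot_script.py from a terminal would then display the chart in its own window.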
