How to Use the Python Data Analysis Library the Right Way -- Hannah Stepanek

Hannah Stepanek Portland, OR, USA

Table of Contents

About the Author
About the Technical Reviewer
Introduction

Chapter 1: Introduction
  About pandas
  How pandas helped build an image of a black hole
  How pandas helps financial institutions make more informed predictions about the future market
  How pandas helps improve discoverability of content

Chapter 2: Basic Data Access and Merging
  DataFrame creation and access
  The iloc method
  The loc method
  Combining DataFrames using the merge method
  Combining DataFrames using the join method
  Combining DataFrames using the concat method

Chapter 3: How pandas Works Under the Hood
  Python data structures
  The performance of the CPython interpreter, Python, and NumPy
  An introduction to pandas performance
  Choosing the right DataFrame

Chapter 4: Loading and Normalizing Data
  pd.read_csv
  pd.read_json
  pd.read_sql, pd.read_sql_table, and pd.read_sql_query

Chapter 5: Basic Data Transformation in pandas
  Pivot and pivot table
  Stack and unstack
  Melt
  Transpose

Chapter 6: The apply Method
  When not to use apply
  When to use apply
  Improving performance of apply using Cython

Chapter 7: Groupby
  Using groupby correctly
  Indexing
  Avoiding groupby

Chapter 8: Performance Improvements Beyond pandas
  Computer architecture
  How NumExpr improves performance
  BLAS and LAPACK



