You're reading from Pandas 1.x Cookbook - Second Edition: Practical recipes for scientific computing, time series analysis, and exploratory data analysis using Python
Product type: Book
Published in: Feb 2020
Publisher: Packt
ISBN-13: 9781839213106
Edition: 2nd Edition
Languages: Python
Concepts: Programming Language
Authors (2):
Theodore Petrou
Theodore Petrou is the founder of Dunder Data, a training company dedicated to helping teach the Python data science ecosystem effectively to individuals and corporations. Read his tutorials and attempt his data science challenges at the Dunder Data website.
Matthew Harrison
Matt Harrison has been using Python since 2000. He runs MetaSnake, which provides corporate training for Python and Data Science. He is the author of Machine Learning Pocket Reference, the bestselling Illustrated Guide to Python 3, and Learning the Pandas Library, among other books.
Table of Contents (17 Chapters)
Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Get in touch
1. Pandas Foundations
- Introduction
- The pandas DataFrame
- DataFrame attributes
- Understanding data types
- Selecting a column
- Calling Series methods
- Series operations
- Chaining Series methods
- Renaming column names
- Creating and deleting columns
2. Essential DataFrame Operations
- Selecting multiple DataFrame columns
- Selecting columns with methods
- Ordering column names
- Summarizing a DataFrame
- Chaining DataFrame methods
- DataFrame operations
- Comparing missing values
- Transposing the direction of a DataFrame operation
- Determining college campus diversity
3. Creating and Persisting DataFrames
- Creating DataFrames from scratch
- Writing CSV
- Reading large CSV files
- Using Excel files
- Working with ZIP files
- Working with databases
- Reading JSON
- Reading HTML tables
4. Beginning Data Analysis
- Developing a data analysis routine
- Data dictionaries
- Reducing memory by changing data types
- Selecting the smallest of the largest
- Selecting the largest of each group by sorting
- Replicating nlargest with sort_values
- Calculating a trailing stop order price
5. Exploratory Data Analysis
- Summary statistics
- Column types
- Categorical data
- Continuous data
- Comparing continuous values across categories
- Comparing two continuous columns
- Comparing categorical values with categorical values
- Using the pandas profiling library
6. Selecting Subsets of Data
- Selecting Series data
- Selecting DataFrame rows
- Selecting DataFrame rows and columns simultaneously
- Selecting data with both integers and labels
- Slicing lexicographically
7. Filtering Rows
- Calculating Boolean statistics
- Constructing multiple Boolean conditions
- Filtering with Boolean arrays
- Comparing row filtering and index filtering
- Selecting with unique and sorted indexes
- Translating SQL WHERE clauses
- Improving the readability of Boolean indexing with the query method
- Preserving Series size with the .where method
- Masking DataFrame rows
- Selecting with Booleans, integer location, and labels
8. Index Alignment
- Examining the Index object
- Producing Cartesian products
- Exploding indexes
- Filling values with unequal indexes
- Adding columns from different DataFrames
- Highlighting the maximum value from each column
- Replicating idxmax with method chaining
- Finding the most common maximum of columns
9. Grouping for Aggregation, Filtration, and Transformation
- Defining an aggregation
- Grouping and aggregating with multiple columns and functions
- Removing the MultiIndex after grouping
- Grouping with a custom aggregation function
- Customizing aggregating functions with *args and **kwargs
- Examining the groupby object
- Filtering for states with a minority majority
- Transforming through a weight loss bet
- Calculating weighted mean SAT scores per state with apply
- Grouping by continuous variables
- Counting the total number of flights between cities
- Finding the longest streak of on-time flights
10. Restructuring Data into a Tidy Form
- Tidying variable values as column names with stack
- Tidying variable values as column names with melt
- Stacking multiple groups of variables simultaneously
- Inverting stacked data
- Unstacking after a groupby aggregation
- Replicating pivot_table with a groupby aggregation
- Renaming axis levels for easy reshaping
- Tidying when multiple variables are stored as column names
- Tidying when multiple variables are stored as a single column
- Tidying when two or more values are stored in the same cell
- Tidying when variables are stored in column names and values
11. Combining Pandas Objects
- Appending new rows to DataFrames
- Concatenating multiple DataFrames together
- Understanding the differences between concat, join, and merge
- Connecting to SQL databases
12. Time Series Analysis
- Understanding the difference between Python and pandas date tools
- Slicing time series intelligently
- Filtering columns with time data
- Using methods that only work with a DatetimeIndex
- Counting the number of weekly crimes
- Aggregating weekly crime and traffic accidents separately
- Measuring crime by weekday and year
- Grouping with anonymous functions with a DatetimeIndex
- Grouping by a Timestamp and another column
13. Visualization with Matplotlib, Pandas, and Seaborn
- Getting started with matplotlib
- Object-oriented guide to matplotlib
- Visualizing data with matplotlib
- Plotting basics with pandas
- Visualizing the flights dataset
- Stacking area charts to discover emerging trends
- Understanding the differences between seaborn and pandas
- Multivariate analysis with seaborn Grids
- Uncovering Simpson's Paradox in the diamonds dataset with seaborn
14. Debugging and Testing Pandas
- Apply performance
- Improving apply performance with Dask, Pandarallel, Swifter, and more
- Inspecting code
- Debugging in Jupyter
- Managing data integrity with Great Expectations
- Using pytest with pandas
- Generating tests with Hypothesis
15. Other Books You May Enjoy
16. Index
This chapter covers many fundamental operations of the DataFrame. Many of the recipes will be similar to those in Chapter 1, Pandas Foundations, which primarily covered operations on a Series.
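As a rough illustration of how a DataFrame operation can mirror its Series counterpart, the sketch below applies the same method and the same arithmetic first to a single column and then to the whole frame. The data, column names, and variable names are invented for this example and are not taken from the book's recipes.

```python
import pandas as pd

# Hypothetical data, made up purely for illustration.
df = pd.DataFrame(
    {"math": [90, 80, 70], "art": [60, 75, 90]},
    index=["ann", "bob", "cat"],
)

# Series operation: selecting one column yields a Series,
# and a reduction on it returns a single scalar.
math_mean = df["math"].mean()

# DataFrame operation: the same method on the whole frame
# reduces each column and returns a Series indexed by column name.
column_means = df.mean()

# Arithmetic is likewise applied element-wise to every column at once.
curved = df + 5
```

Here `math_mean` is a single number, `column_means` holds one mean per column, and `curved` has the same shape as `df`.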