In this chapter, we’ll extend our EDA repertoire to cover bivariate data, which means studying the relationships between pairs of variables, rather than focusing only on one variable at a time. This is where most of the action is: you’ll be awed and impressed by how much more we can dig out of a data set in this chapter.
Bivariate data analysis is especially suited to the tables (in Python, DataFrames) from section 7.1 and chapters 16–18. This is because each column of a table is a variable that matches one-for-one with every other column in the table.
In the Simpsons example (p. 174), the fourth species value corresponds to Lisa, as does the fourth age value, the fourth fave value, the fourth gender value, the fourth IQ value, the fourth hair value, and the fourth salary value. This means that if we examine any two columns, we know that matching indices go together (i.e., represent the same person). This implicit connection is what allows us to meaningfully examine a pair of variables.
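To make the "matching indices" idea concrete, here is a minimal sketch in pandas. The DataFrame below is a hypothetical stand-in for the Simpsons table: only two of its columns are included, the numbers are invented placeholders rather than the values from p. 174, and using the character names as the index is an assumption made for readability.

```python
import pandas as pd

# Hypothetical stand-in for the Simpsons table. The values are invented
# placeholders for illustration, not the data from the original example.
simpsons = pd.DataFrame(
    {
        "age": [36, 34, 10, 8, 1],
        "IQ": [74, 127, 90, 156, 100],
    },
    index=["Homer", "Marge", "Bart", "Lisa", "Maggie"],
)

# Pull out two columns; each one is a variable.
ages = simpsons["age"]
iqs = simpsons["IQ"]

# The fourth entry (position 3, counting from 0) of each column
# describes the same person.
fourth_name = simpsons.index[3]
fourth_age = ages.iloc[3]
fourth_iq = iqs.iloc[3]
print(fourth_name, fourth_age, fourth_iq)

# Because the columns line up position by position, we can pair them:
# each (age, IQ) tuple belongs to one person.
pairs = list(zip(ages, iqs))
print(pairs)
```

Each tuple in `pairs` is one person's age together with that same person's IQ. That row-by-row pairing is the raw material for every bivariate technique in this chapter.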