17.3: Accessing a DataFrame's Metadata

    We can get some meta-information about a DataFrame without even looking at individual rows. If we want to know what the index values themselves are, we use .index:

    Code 17.3.1 (Python):

    print(simpsons.index)

    | Index(['Homer', 'Marge', 'Bart', 'Lisa', 'Maggie', 'SLH'],
    |       dtype='object', name='name')

    That weird-looking output tells us several things. First, the index of this DataFrame consists of strings (remember from p. 71 that’s what “dtype='object'” means). Second, the name of the index column is, ironically, “name”. (It could be named anything at all, of course.) Third, the actual index values are Homer, Marge, and all the rest.
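
The individual pieces of that output can also be pulled out one at a time. The following is a minimal sketch, assuming pandas is installed; the `toy` DataFrame, its index name "pet", and all of its values are invented for illustration and are not the simpsons data.

```python
import pandas as pd

# Hypothetical toy data: names and values are made up for illustration.
toy = pd.DataFrame(
    {"age": [3, 5, 2], "color": ["black", "orange", "gray"]},
    index=pd.Index(["Rex", "Tigger", "Smoky"], name="pet"),
)

index_name = toy.index.name         # the name of the index itself
index_values = toy.index.tolist()   # the index values as a plain Python list

print(index_name)     # pet
print(index_values)   # ['Rex', 'Tigger', 'Smoky']
```

Here .name and .tolist() give the second and third items from the list above (the index's name and its values) without any of the surrounding "Index(...)" decoration.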

    That’s the index, or the “row names,” if you will. To get the column names, we use .columns:

    Code 17.3.2 (Python):

    print(simpsons.columns)

    | Index(['species', 'age', 'gender', 'fave', 'IQ', 'hair',
    |        'salary'], dtype='object')

    Interestingly, this too is an “Index” beast, also composed of strings. Pandas treats both “axes” of a DataFrame similarly, in that both of them are the same type of thing (an “Index”). Notice that name does not appear in the list of column names: as the DataFrame’s index, it’s a different sort of thing.
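
Both claims in that paragraph can be checked directly. This is a small sketch under the same assumptions as before: `toy` and the index name "pet" are hypothetical stand-ins, not the simpsons data.

```python
import pandas as pd

# Hypothetical toy data: names and values are made up for illustration.
toy = pd.DataFrame(
    {"age": [3, 5, 2], "color": ["black", "orange", "gray"]},
    index=pd.Index(["Rex", "Tigger", "Smoky"], name="pet"),
)

# Both axes are described by the same kind of object.
rows_are_index = isinstance(toy.index, pd.Index)
cols_are_index = isinstance(toy.columns, pd.Index)

# The index's name is not one of the column names.
name_in_columns = "pet" in toy.columns

print(rows_are_index, cols_are_index)   # True True
print(name_in_columns)                  # False
```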

    How many rows does a DataFrame have? We can answer this with the len() function again:

    Code 17.3.3 (Python):

    print(len(simpsons))

    | 6

    This is our third use of len(): it can be used to find the number of characters in a string, the number of key/value pairs in a Series, and (here) the number of rows in a DataFrame.
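
To see the three uses of len() side by side, here is a brief sketch. The string, the Series, and the DataFrame are all hypothetical examples invented for this illustration.

```python
import pandas as pd

# Hypothetical examples: all names and values are made up for illustration.
word = "Springfield"
ages = pd.Series({"Rex": 3, "Tigger": 5, "Smoky": 2})
toy = pd.DataFrame(
    {"age": [3, 5, 2, 7], "color": ["black", "orange", "gray", "white"]},
    index=["Rex", "Tigger", "Smoky", "Snowball"],
)

print(len(word))   # 11 characters in the string
print(len(ages))   # 3 key/value pairs in the Series
print(len(toy))    # 4 rows in the DataFrame (columns are not counted)
```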

    Finally, we often want to get a quick sense of how large a DataFrame is, both in terms of rows and columns. The .shape attribute is handy here:

    Code 17.3.4 (Python):

    print(simpsons.shape)

    | (6, 7)

    This tells us that simpsons has six rows and seven columns. As I mentioned previously (p. 56), this is definitely not the typical case: most DataFrames will have many more rows (thousands or even millions) than columns (at most, dozens).
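
Since .shape is an ordinary Python tuple of (rows, columns), it can be unpacked into two variables. The sketch below assumes a hypothetical `toy` DataFrame with invented values, and also shows how .shape relates to len() and .columns.

```python
import pandas as pd

# Hypothetical toy data: names and values are made up for illustration.
toy = pd.DataFrame(
    {"age": [3, 5, 2], "color": ["black", "orange", "gray"]},
    index=pd.Index(["Rex", "Tigger", "Smoky"], name="pet"),
)

# Unpack the (rows, columns) tuple into two separate variables.
n_rows, n_cols = toy.shape
print(n_rows, n_cols)   # 3 2

# The same two numbers are available from len() and .columns.
same_numbers = toy.shape == (len(toy), len(toy.columns))
print(same_numbers)     # True
```

As with .columns, the index is not counted as a column here.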


    This page titled 17.3: Accessing a DataFrame's Metadata is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Stephen Davies (allthemath.org) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.