Engineering LibreTexts

9.4: Geospatial and Heatmap Data Visualization Using Python

    Learning Objectives

    By the end of this section, you should be able to:

    • 9.4.1 Describe the insights produced by spatial heatmaps based on geospatial data.
    • 9.4.2 Discuss the common features of GIS mapping.
    • 9.4.3 Use a GIS mapper to produce a visualization of geospatial data.

    Geospatial data describes the geographic location, shape, size, and other attributes of features on the Earth's surface. Geospatial data can be captured, stored, manipulated, analyzed, and visualized using various technologies and tools.

    Geospatial data visualization using Python involves the representation and analysis of data that has a geographic component, such as latitude and longitude coordinates. Python offers several libraries and utilities for geospatial data visualization and analysis, including the Pandas library discussed in What Are Data and Data Science? Pandas is a Python library specialized for data manipulation and analysis. In a similar way, Geopandas extends the capabilities of Pandas to handle geospatial data. The Geopandas library provides the ability to read, write, analyze, and visualize geospatial data. Geopandas provides a GeoDataFrame object, which is similar to a Pandas DataFrame but includes geometry information for spatial operations.
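
    As a minimal sketch of the GeoDataFrame idea described above, the following code turns an ordinary Pandas DataFrame with latitude and longitude columns into a GeoDataFrame. It assumes the geopandas package is installed; the city names and approximate coordinates are hypothetical sample data chosen for illustration, not data from this section.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical sample data: three cities with approximate coordinates
cities = pd.DataFrame({
    "city": ["New York", "Chicago", "Los Angeles"],
    "latitude": [40.71, 41.88, 34.05],
    "longitude": [-74.01, -87.63, -118.24],
})

# Build point geometries; note the (x, y) = (longitude, latitude) order
geometry = gpd.points_from_xy(cities["longitude"], cities["latitude"])

# EPSG:4326 identifies the WGS 84 latitude/longitude coordinate system
cities_gdf = gpd.GeoDataFrame(cities, geometry=geometry, crs="EPSG:4326")

print(cities_gdf.head())
```

    The result behaves like a regular DataFrame (filtering, grouping, and so on still work) but carries a geometry column that spatial operations and plotting can use.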

    Spatial and Grid Heatmaps

    Spatial heatmaps are a data visualization method that uses coloring and shading to represent the density or intensity of data points within a geographical area. These heatmaps provide a visual summary of the distribution and concentration of data points, highlighting areas of high and low density.

    For example, a spatial heatmap of New York City might show crime rate information where higher crime rates in a certain location are represented by darker shades of red and lower crime rates are represented by lighter shades of red. The reader can quickly ascertain the differences in crime rates among various neighborhoods based on the shading in the heatmap.

    Figure 9.6 provides an example of a heatmap showing how the average temperature for January 2024 differed from the 1991–2020 average. The coloring scheme represents the differences: blue represents a lower-than-average temperature and red represents a higher-than-average temperature.

    A heatmap with a map of the US labeled January 2024. The legend shows differences from average temperature from -11 represented by blue to 11 represented by red. The northeast US is mostly red and the center of the country is mostly blue. The west coast is a mixture of the two.
    Figure 9.6 Example of Heatmap Showing Differences from Average Temperature for January 2024 by Colors
    (data source: NOAA Climate.gov maps based on data from NOAA National Centers for Environmental Information. [NCEI] www.climate.gov/media/15930, accessed June 23, 2024.)

    Spatial heatmaps are generated by condensing the spatial coordinates of individual data points into a grid or a set of bins covering the geographic area of interest. The density of data points within each grid cell or bin is then calculated, and these density results are then mapped to specific colors using a color gradient map. Geographic areas with higher density values are assigned colors that represent higher intensity, while areas with lower density values are assigned colors that represent lower intensity. Finally, the colored grid cells can be overlaid on a map, and the result is a visual representation of the spatial distribution of data points.
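
    The binning and density steps described above can be sketched with NumPy's histogram2d function. This is only an illustration under stated assumptions: the points are randomly generated (hypothetical) longitude/latitude pairs, and the 10 x 10 grid size and bounding box are arbitrary choices.

```python
import numpy as np

# Hypothetical point data: 1,000 random (longitude, latitude) pairs
rng = np.random.default_rng(seed=0)
lon = rng.uniform(-74.3, -73.7, size=1000)
lat = rng.uniform(40.5, 40.9, size=1000)

# Step 1: bin the points into a 10 x 10 grid covering the area of interest
counts, lon_edges, lat_edges = np.histogram2d(
    lon, lat, bins=10, range=[[-74.3, -73.7], [40.5, 40.9]]
)

# Step 2: rescale the counts to [0, 1] so they can be mapped to a color gradient
intensity = counts / counts.max()

print(counts.shape)       # (10, 10)
print(int(counts.sum()))  # 1000, since every point falls in exactly one cell
```

    The intensity array can then be passed to a plotting function that applies a color gradient and, if desired, overlays the colored cells on a map.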

    Grid heatmaps display colors in a two-dimensional array where the x-axis and y-axis represent two different characteristics and the color of each cell is based on the value for that combination of characteristics. For example, in Figure 9.7, a matrix is created where the x-axis represents different local farmers and the y-axis represents various crops, and each cell is color coded based on the harvest (in tons/year) for that combination of farmer and crop. At a glance, the viewer can discern that the highest harvests appear to be for potatoes at BioGoods Ltd and for barley at Cornylee Corp. Note that in a grid heatmap, the cells are typically of a fixed size since the intent is to detect clustering for the characteristics of interest. In this color scheme, green and yellow shading represent higher harvest levels and blue and purple represent lower harvest levels.

    A grid heatmap labeled harvest of local farmers (in tons/year). The X axis has 7 farmer names and the Y axis has crops/vegetables. Crops include cucumber, tomato, lettuce, asparagus, potato, wheat, and barley. Color density is represented by yellow for the highest crops, then green, teal, blue, and purple in descending order.
    Figure 9.7 Grid Heatmap Showing Color Density for Crop Harvest by Local Farmer and Type of Crop
    (source: created by author using Matplotlib by J. D. Hunter, "Matplotlib: A 2D Graphics Environment", Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95, 2007. DOI: https://zenodo.org/records/13308876)

    Exploring Further

    Interactive Heatmaps

    Various interactive heatmaps are available that allow the user to preselect variables of interest and then view the corresponding geospatial map or heatmap. For example, see these interactive heatmaps utilizing crash and demographic data as well as traffic safety improvement activities in California:

    https://catsip.berkeley.edu/resources/crash-data/crash-data-tools-and-resources

    https://safetyheatmap.berkeley.edu/map///

    Using Python to Generate Heatmaps

    There are several ways to generate heatmaps in Python. Two common methods include the following:

    • The imshow() function, which is part of the Matplotlib plotting library. The imshow() function can easily display heatmaps such as the two-dimensional heatmap shown in Figure 9.7.
    • The heatmap() function, which is part of the Seaborn library. Seaborn provides a high-level interface for statistical graphics.
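
    As a rough sketch of the first method, the code below draws a small grid heatmap with imshow(). The crop names, farmer names, and harvest values are hypothetical and are not the data behind Figure 9.7; the "viridis" colormap is an assumption chosen because it runs from purple (low) to yellow (high). The figure is saved to a file rather than shown in a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical harvest values (tons/year): rows are crops, columns are farmers
crops = ["potato", "wheat", "barley"]
farmers = ["Farmer A", "Farmer B", "Farmer C", "Farmer D"]
harvest = np.array([
    [2.4, 0.0, 4.0, 1.0],
    [0.6, 0.0, 0.3, 0.0],
    [0.1, 2.0, 0.0, 1.4],
])

fig, ax = plt.subplots()

# Each array element becomes one colored cell of the grid
image = ax.imshow(harvest, cmap="viridis")

# Label the columns (farmers) and rows (crops)
ax.set_xticks(range(len(farmers)))
ax.set_xticklabels(farmers)
ax.set_yticks(range(len(crops)))
ax.set_yticklabels(crops)

# Add a color bar so the viewer can read values from the colors
fig.colorbar(image, ax=ax, label="Harvest (tons/year)")
ax.set_title("Harvest by Farmer and Crop (hypothetical data)")

fig.savefig("grid_heatmap.png")
```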

    Exploring Further

    Seaborn

    Seaborn includes a wide variety of functions for visualizing statistical relationships. This user guide and tutorial provides an excellent introduction to its uses.

    In the next example, Python is used to create a heatmap with the sns.heatmap() function from the Seaborn library. The example plots the number of airline passengers by month and year, where month is plotted on the horizontal axis, year is plotted on the vertical axis, and the color coding of the heatmap represents the number of airline passengers.

    Example 9.10

    Generate a heatmap of a dataset using the heatmap function that is part of the Seaborn library in Python.

    Answer

    As an example of generating a simple heatmap in Python using the heatmap function, we can make use of an existing dataset in the Seaborn library called “flights” that lists month, year, and number of passengers for historical airline data: Master Flights Seaborn Data.

    Note: Many such datasets are available at Data Depository for Seaborn Examples.

    The dataset is first reshaped with pivot() into a two-dimensional array with one row per year and one column per month. Once the array is created, the function sns.heatmap() can be used to generate the actual heatmap, which will color code the matrix based on the number of passengers. By default, sns.heatmap() also draws a color bar to provide the viewer with the scale of the color gradient used in the heatmap.
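
    To make the reshaping step concrete, here is a small sketch of what pivot() does to long-form data. The four rows below are hypothetical values invented for illustration (they are not taken from the flights dataset), and months are written as numbers to keep the example short.

```python
import pandas as pd

# Hypothetical long-form data: one row per (year, month) observation
long_form = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": [1, 2, 1, 2],
    "passengers": [10, 20, 30, 40],
})

# Reshape so that each year is a row and each month is a column
wide_form = long_form.pivot(index="year", columns="month", values="passengers")

print(wide_form)
```

    Each cell of the resulting two-dimensional table holds the passengers value for one year and month combination, which is the layout a heatmap function expects.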

    Python Code

        import seaborn as sns
        import matplotlib.pyplot as plt
        
        # Load the dataset from seaborn library called flights
        heatmap_data = sns.load_dataset("flights")
        
        # Create an array with heatmap data from the flights dataset 
        heatmap_data_array = heatmap_data.pivot(index="year", columns="month", values="passengers")
        
        # Create the heatmap
        sns.heatmap(heatmap_data_array, cmap = 'coolwarm') 
        
        # Set labels and title
        plt.xlabel('Month')
        plt.ylabel('Year')
        plt.title('Heatmap of Number of Passengers')
        
        # Show the plot
        plt.show()
        

    The resulting output will look like this:

    A heatmap of number of passengers with months (Jan through Dec) on the X axis and years (1960 through 1949) on the Y axis. A color key with blue (100) as the lowest and red (600) as the highest value runs along the right side of the graph. The highest values are clustered in July and August of 1959 and 1960.
    Figure 9.8 Choropleth Graph Showing Cases of COVID-19

    (data source: Data from WHO COVID-19 Dashboard. Geneva: World Health Organization, 2020. Available online: https://data.who.int/dashboards/covid19/ via OurWorldinData.org. Retrieved from: ourworldindata.org/grapher/w...y-covid-deaths [Online Resource].)

    Using Python for Geographic Mapping

    As mentioned earlier, Python provides a number of libraries and tools to facilitate the generation of geographic maps. There are several steps involved in the process:

    1. Establish and collect geospatial data: The first step is to collect the geospatial data of interest. This type of data is readily available from a number of sources, such as open data repositories and government agencies (for example, the Census Bureau). The data can be found in various formats, such as shapefiles and GeoJSON files (a shapefile is a data format that represents geographic data in a vector format).
    2. Data processing: Once the data is collected, data processing or transformation may be needed, as discussed in Collecting and Preparing Data. For example, geocoding might be needed to transform addresses into geographic coordinates such as latitude and longitude to facilitate the geographic mapping. Additional data processing might be needed to estimate or interpolate missing data, for example to generate continuous temperature maps.
    3. Map generation: Generate the map to visualize the geographic data using libraries such as Matplotlib and Geopandas. Various tools also add interactivity to the map so that the user can zoom in, zoom out, rotate, or view data when hovering over a map feature.
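
    As a small illustration of the data formats and processing mentioned in the steps above, the sketch below converts a few records that already have latitude and longitude values into a GeoJSON FeatureCollection using only the Python standard library. The records, the field names, and the to_geojson helper are hypothetical and exist only for this example.

```python
import json

# Hypothetical records, for example the result of geocoding a list of addresses
records = [
    {"name": "Site A", "latitude": 40.71, "longitude": -74.01, "value": 12},
    {"name": "Site B", "latitude": 34.05, "longitude": -118.24, "value": 7},
]

def to_geojson(rows):
    """Convert rows with latitude/longitude keys into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude]
            "geometry": {
                "type": "Point",
                "coordinates": [row["longitude"], row["latitude"]],
            },
            # Non-spatial attributes go in "properties"
            "properties": {"name": row["name"], "value": row["value"]},
        })
    return {"type": "FeatureCollection", "features": features}

collection = to_geojson(records)
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

    Text in this form can be saved to a file with a .geojson extension and then loaded by mapping tools that read GeoJSON, such as Geopandas, for the map generation step.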

    Python provides an excellent environment to allow the user to create, manipulate, and share geographic data to help visualize geospatial data and detect trends or arrive at conclusions based on the data analysis.


    This page titled 9.4: Geospatial and Heatmap Data Visualization Using Python is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform.