11.2: Appendix C: Review of Python Algorithms
This appendix summarizes the Python algorithms used in this textbook. It serves as a cross-reference for students: for each algorithm, it gives a general description and identifies the section of the text where the algorithm is first referenced.
For more details on Python functions, syntax, and usage, refer to Appendix D: Review of Python Functions or the online Python documentation.
| Chapter Title | Topic | Description of Algorithm | First Reference |
|---|---|---|---|
| What Are Data and Data Science? | Loading and viewing data | Using the Python `pandas` library to load a CSV file, describe its features, and explore its contents | Python Basics for Data Science |
| | Visualizing data using Python | Using Python to create a scatterplot for two numeric quantities | Python Basics for Data Science |
| Collecting and Preparing Data | Scraping data from a website | Using Python to extract a data table from a website | Web Scraping and Social Media Data Collection |
| | Using regular expressions in Python | Using Python to search for a selected word in a given string and output the number of times it appears | Web Scraping and Social Media Data Collection |
| | Processing and storing data | Using Python to process data and output to a CSV file | Web Scraping and Social Media Data Collection |
| | Parsing and extracting data | Using Python to parse and extract specific data from a given dataset | Web Scraping and Social Media Data Collection |
| Descriptive Statistics: Statistical Measurements and Probability Distributions | Calculate binomial probabilities | Using Python to calculate probabilities associated with the binomial distribution | Discrete and Continuous Probability Distributions |
| | Calculate probabilities from a normal distribution | Using Python to calculate probabilities associated with the normal distribution | Discrete and Continuous Probability Distributions |
| Inferential Statistics and Regression Analysis | Computing margin of error | Using the Python library `scipy.stats` to compute a margin of error using the t-distribution | Statistical Inference and Confidence Intervals |
| | Confidence interval using bootstrapping | Using Python to calculate a confidence interval using a bootstrapping approach | Statistical Inference and Confidence Intervals |
| | Confidence interval for the mean | Using Python to calculate a confidence interval for the mean using the normal and t-distributions (two examples) | Statistical Inference and Confidence Intervals |
| | Hypothesis test for the mean (one sample) | Using Python to calculate a p-value associated with a hypothesis test for the mean | Hypothesis Testing |
| | Hypothesis test for a proportion | Using Python to calculate a p-value associated with a hypothesis test for a proportion | Hypothesis Testing |
| | Hypothesis test for the mean (one sample) | Using Python to calculate a test statistic and p-value associated with a hypothesis test for the mean | Hypothesis Testing |
| | Hypothesis test for the mean (two samples) | Using Python to calculate a test statistic and p-value associated with a hypothesis test for the difference between two means | Hypothesis Testing |
| | Correlation coefficient | Using Python to calculate the Pearson correlation coefficient for two numeric variables | Correlation and Linear Regression Analysis |
| | Creating a scatterplot | Using Python to create a scatterplot for two numeric quantities | Correlation and Linear Regression Analysis |
| | Creating a linear regression model | Using Python to calculate the slope and intercept for a linear regression model | Correlation and Linear Regression Analysis |
| | One-way analysis of variance (ANOVA) | Using Python to conduct a one-way analysis of variance hypothesis test | Analysis of Variance (ANOVA) |
| | Visualizing time series data | Using the basic Python plot routine and `matplotlib.pyplot` to generate a time series graph based on a `pandas` DataFrame (two examples) | Analysis of Variance (ANOVA) |
| Time Series and Forecasting | Plotting a simple moving average (SMA) and differencing | Using Python to generate a simple moving average and first-order difference for time series data | Time Series Forecasting Methods |
| | Decomposing a time series into components | Using the Python autocorrelation function and an STL (seasonal and trend decomposition using LOESS) model to decompose time series data into its components | Time Series Forecasting Methods |
| | Exponential moving average (EMA) | Using Python to calculate an exponential moving average for time series data | Time Series Forecasting Methods |
| | Autoregressive integrated moving average (ARIMA) | Using Python to calculate an autoregressive integrated moving average model for time series data | Time Series Forecasting Methods |
| | Autoregressive integrated moving average (ARIMA) | Using Python to fit and plot an autoregressive integrated moving average model and use it to make forecasts for time series data | Time Series Forecasting Methods |
| | Forecasting methods | Using Python to create and plot a forecasting model with confidence intervals for time series data | Forecast Evaluation Methods |
| Decision-Making Using Machine Learning Basics | Logistic regression | Using Python to fit a logistic regression model and assess the accuracy of the model | Classification Using Machine Learning |
| | K-means clustering | Using Python to produce a k-means clustering model from a dataset | Classification Using Machine Learning |
| | DBSCAN clustering | Using Python to generate a density-based spatial clustering of applications with noise (DBSCAN) model | Classification Using Machine Learning |
| | Confusion matrix | Using Python to generate a confusion matrix | Classification Using Machine Learning |
| | Linear regression with bootstrapping | Using Python to generate a linear regression model using a bootstrapping method | Machine Learning in Regression Analysis |
| | Multiple regression | Using Python to perform multiple regression analysis | Machine Learning in Regression Analysis |
| | Three-dimensional scatterplot | Using Python to generate a three-dimensional plot of a multiple regression model | Machine Learning in Regression Analysis |
| | Mesh grid | Using Python to generate a mesh grid plot of a multiple regression model | Machine Learning in Regression Analysis |
| | Multiple logistic regression | Using Python to perform multiple logistic regression analysis and generate the corresponding confusion matrix | Machine Learning in Regression Analysis |
| | Decision trees | Using Python to generate decision trees | Decision Trees |
| | Random forests | Using Python to train a random forest model and analyze the importance of each feature | Other Machine Learning Techniques |
| | Gaussian naïve Bayes | Using Python to perform Gaussian naïve Bayes analysis | Other Machine Learning Techniques |
| Deep Learning and Artificial Intelligence (AI) Basics | Perceptrons | Using Python to train and test a perceptron classification model and assess the accuracy of the model | Introduction to Neural Networks |
| | Training a neural network with backpropagation | Using Python’s `TensorFlow` library to train and test a neural network classification model using backpropagation and assess the accuracy of the model | Backpropagation |
| | Recurrent neural networks | Using Python’s `TensorFlow` library to train and test a classification model using recurrent neural networks (RNN) and assess the accuracy of the model | Backpropagation |
| | Predict future values | Using Python to predict future values for a classification model using recurrent neural networks (RNN) | Backpropagation |
| | Plot predicted values | Using Python to plot future values versus original data for a classification model using recurrent neural networks | Backpropagation |
| | Deep learning | Using Python’s `TensorFlow` library to train and test a classification model using deep learning and assess the accuracy of the model | Backpropagation |
| Visualizing Data | Data visualization using boxplots | Using Python to create boxplots | Encoding Univariate Data |
| | Data visualization using histograms | Using Python to create histograms | Encoding Univariate Data |
| | Data visualization using Pareto charts | Using Python to create Pareto charts | Encoding Univariate Data |
| | Data visualization using time series charts | Using Python to create time series charts | Encoding Data That Change over Time |
| | Data visualization for binomial probabilities | Using Python to create graphs associated with the binomial distribution | Graphing Probability Distributions |
| | Data visualization for Poisson probabilities | Using Python to create graphs associated with the Poisson distribution | Graphing Probability Distributions |
| | Data visualization for normal probabilities | Using Python to create graphs associated with the normal distribution | Graphing Probability Distributions |
| | Heatmaps | Using Python to create heatmaps | Geospatial and Heatmap Data Visualization Using Python |
| | Scatterplots with colormaps | Using Python to create scatterplots with colormaps | Multivariate and Network Data Visualization Using Python |
| | Correlation heatmaps | Using Python to create correlation heatmaps | Multivariate and Network Data Visualization Using Python |
| | Data visualization using three-dimensional plots | Using Python to create three-dimensional plots | Multivariate and Network Data Visualization Using Python |
| Reporting Results | Identify data characteristics | Using Python to identify data characteristics of a dataset | Validating Your Model |
| | Decision tree | Using Python to run a decision tree and generate a visualization of the result | Validating Your Model |
| | Model validation using Bayesian information criterion (BIC) | Using Python to perform cross validation with Bayesian information criterion | Validating Your Model |
| | Monte Carlo simulation | Using Python to perform Monte Carlo simulation | Validating Your Model |
| | Executive summary | Using Python to create an executive summary report | Effective Executive Summaries |
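
To make the "Calculate binomial probabilities" entry concrete, here is a minimal sketch that uses only the standard library (`math.comb`), which may differ from the approach taken in the referenced section. The function names `binomial_pmf` and `binomial_cdf` and the coin-flip numbers are illustrative assumptions, not taken from the text.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Hypothetical example: 4 flips of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # exactly 2 heads -> 0.375
print(binomial_cdf(1, 4, 0.5))  # at most 1 head  -> 0.3125
```

Summing the probability mass function gives the cumulative probability directly, which keeps the sketch short; for large `n` a dedicated statistics library would be the more practical choice.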

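The "Confidence interval using bootstrapping" entry can be sketched as a percentile bootstrap for the mean. This version uses only the standard library; the function name `bootstrap_ci`, its default parameters, and the sample values are assumptions made for illustration and are not drawn from the text.

```python
import random
import statistics


def bootstrap_ci(data, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = 1 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


# Hypothetical sample values, for illustration only.
sample = [12.1, 9.8, 11.4, 10.2, 13.5, 10.9, 11.7, 9.5, 12.8, 10.4]
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The interval endpoints are simply the 2.5th and 97.5th percentiles of the resampled means, so no distributional assumption about the data is needed.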

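For the "Plotting a simple moving average (SMA) and differencing" entry, the two calculations can be sketched in plain Python without plotting. The helper names `simple_moving_average` and `first_difference` and the example series are hypothetical; the referenced section may well compute these with a data analysis library instead.

```python
def simple_moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def first_difference(values):
    """Change from each observation to the next."""
    return [b - a for a, b in zip(values, values[1:])]


series = [10, 12, 11, 15, 14, 18]  # hypothetical time series
print(simple_moving_average(series, 3))
print(first_difference(series))  # [2, -1, 4, -1, 4]
```

The moving average smooths short-term fluctuations, while the first-order difference removes a linear trend; both outputs are shorter than the input because the first few observations lack a full window or a predecessor.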

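The "Confusion matrix" entry can be illustrated with a small standard-library sketch, independent of whichever library the referenced section uses. The function name `confusion_matrix`, the row and column convention, and the label lists below are assumptions chosen for this example.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
matrix = confusion_matrix(actual, predicted, labels=[0, 1])
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
```

The diagonal of the matrix counts correct predictions, so accuracy equals the sum of the diagonal divided by the number of observations.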