11.3: Appendix D- Review of Python Functions


    This appendix provides a summary of Python functions used in this textbook. The intent is to provide students with a cross-reference of Python commands that includes a description of the Python functions, general syntax for usage, and a link to the section where the function is first used in the text.

    Please note this is a very high-level description of these functions. Many functions require specific libraries to be installed. For more details on Python functions, syntax, and usage, please refer to the Python documentation posted online.
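Because many of these functions come from third-party libraries, it can help to check which ones are available before running the examples. The sketch below is illustrative only: the list of import names is an assumption inferred from the prefixes and function names in the table (pd, np, plt, sns, tf, and so on), not something the appendix itself states, and the helper name missing_libraries is hypothetical.

```python
import importlib.util

# Assumption: these import names are inferred from the prefixes and function
# names used in the table; the appendix does not list them explicitly.
ASSUMED_LIBRARIES = [
    "pandas",       # pd.read_csv, DataFrame methods
    "numpy",        # np.array
    "matplotlib",   # plt.scatter, plt.hist, ...
    "scipy",        # binom, poisson, norm, t, linregress, ...
    "statsmodels",  # plot_acf, STL, adfuller, ARIMA, proportion_confint
    "sklearn",      # LogisticRegression, KMeans, ... (installed as scikit-learn)
    "seaborn",      # sns.heatmap
    "tensorflow",   # tf.keras.Sequential
]


def missing_libraries(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_libraries(ASSUMED_LIBRARIES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All assumed libraries are available.")
```

The check only looks up each top-level package without importing it, so it runs quickly and has no side effects.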

    Python Function
    Description
    Syntax
    First Reference
    What Are Data and Data Science?
    print()
    Prints a specified message or specified values to the screen or other output device
    print("text")
    print(x, y)
    Python Basics for Data Science
    pd.read_csv()
    Loads data from a CSV (comma-separated values) file and stores in a DataFrame
    pd.read_csv(path_to_csv_datafile)
    Python Basics for Data Science
    DataFrame.describe()
    Returns a table with basic statistics for a dataset including min, max, mean, count, and quartiles
    DataFrame.describe()
    Where: DataFrame is the name of the DataFrame.
    Python Basics for Data Science
    DataFrame.iloc[]
    Allows access to data in a DataFrame using row/column integer-based indexes.
    DataFrame.iloc[row, column]
    Where: DataFrame is the name of the DataFrame.
    Python Basics for Data Science
    DataFrame.loc[]
    Used to access a group of rows and columns by labels or a Boolean array
    DataFrame.loc[criteria]
    Where: DataFrame is the name of the DataFrame.
    Python Basics for Data Science
    plt.scatter()
    Generates a scatterplot for (x, y) data
    plt.scatter(x_data, y_data)
    Python Basics for Data Science
    plt.title()
    Specifies a title for a chart
    plt.title("Title")
    Python Basics for Data Science
    plt.xlabel()
    Specifies a label for the x-axis
    plt.xlabel("x-axis label")
    Python Basics for Data Science
    plt.ylabel()
    Specifies a label for the y-axis
    plt.ylabel("y-axis label")
    Python Basics for Data Science
    plt.xlim()
    Specifies limits to use for x-axis numbering
    plt.xlim(lower, upper)
    Python Basics for Data Science
    plt.ylim()
    Specifies limits to use for y-axis numbering
    plt.ylim(lower, upper)
    Python Basics for Data Science
    Collecting and Preparing Data
    pd.read_html()
    Reads HTML tables from a web page and returns them as a list of DataFrames
    pd.read_html(URL)
    Web Scraping and Social Media Data Collection
    pd.to_numeric()
    Converts strings or other data types to numeric values
    pd.to_numeric(column_name)
    Web Scraping and Social Media Data Collection
    len()
    Returns the length of an object
    len(object)
    Web Scraping and Social Media Data Collection
    re.findall()
    Returns all non-overlapping matches of a specified pattern in a string
    re.findall(pattern, string)
    Web Scraping and Social Media Data Collection
    re.search()
    Checks if a specified pattern appears in a string
    re.search(pattern, string)
    Web Scraping and Social Media Data Collection
    Descriptive Statistics: Statistical Measurements and Probability Distributions
    binom.pmf()
    Calculates the probability mass function (PMF) for a binomial distribution. It gives the probability of having exactly x successes in n trials with success probability p.
    binom.pmf(x, n, p)
    Where: x is the number of successes in the experiment, n is the number of trials in the experiment, p is the probability of success.
    Discrete and Continuous Probability Distributions
    round()
    Rounds a numeric result to a specified level of precision
    round(number, digits)
    Discrete and Continuous Probability Distributions
    poisson.pmf()
    Calculates probabilities associated with the Poisson distribution
    poisson.pmf(x, mu)
    Where: x is the number of events of interest, mu is the mean of the Poisson distribution.
    Discrete and Continuous Probability Distributions
    norm.cdf()
    Calculates probabilities associated with the normal distribution (returns the area under the normal probability density function to the left of a specified measurement)
    norm.cdf(x, mu, std)
    Where: x is the measurement of interest, mu is the mean of the normal distribution, std is the standard deviation of the normal distribution.
    Discrete and Continuous Probability Distributions
    Inferential Statistics and Regression Analysis
    t.ppf()
    Generates the value of the t-distribution corresponding to a specified area under the t-distribution curve and specified degrees of freedom
    t.ppf(area_to_left, degrees_of_freedom)
    Statistical Inference and Confidence Intervals
    bootstrap()
    Performs bootstrap process to generate confidence interval
    bootstrap(data, statistic, confidence_level, number_resamples)
    Statistical Inference and Confidence Intervals
    norm.interval()
    Calculates confidence interval for the mean when population standard deviation is known, given sample mean, population standard deviation, and sample size (uses normal distribution). Note: Standard error is the standard deviation divided by the square root of the sample size.
    norm.interval(conf_level, sample_mean, standard_error)
    Statistical Inference and Confidence Intervals
    t.interval()
    Calculates confidence interval for the mean when population standard deviation is unknown, given sample mean, sample standard deviation, and sample size (uses t-distribution). Note: Standard error is the standard deviation divided by the square root of the sample size.
    t.interval(conf_level, degrees_freedom, sample_mean, standard_error)
    Statistical Inference and Confidence Intervals
    proportion_confint()
    Calculates confidence interval for a proportion (uses normal distribution)
    proportion_confint(success, sample_size, alpha)
    Statistical Inference and Confidence Intervals
    ttest_1samp()
    Returns the value of the test statistic and the two-tailed p-value for a one-sample hypothesis test using the t-distribution
    ttest_1samp(data_array, null_hypothesis_mean)
    Hypothesis Testing
    ttest_ind_from_stats()
    Returns the value of the test statistic and the two-tailed p-value for a two-sample hypothesis test using the t-distribution
    ttest_ind_from_stats(sample_mean1, sample_standard_deviation1, sample_size1, sample_mean2, sample_standard_deviation2, sample_size2)
    Hypothesis Testing
    np.array()
    Creates a numerical array from a list-like object
    np.array(object)
    Correlation and Linear Regression Analysis
    pearsonr()
    Calculates the value of the Pearson correlation coefficient r
    pearsonr(x_data, y_data)
    Correlation and Linear Regression Analysis
    linregress()
    Generates a linear regression model and provides slope, y-intercept, and other regression-related output
    linregress(x_data, y_data)
    Correlation and Linear Regression Analysis
    f_oneway()
    Returns both the F test statistic and the p-value for the one-way ANOVA hypothesis test
    f_oneway(Array1, Array2, Array3, …)
    Analysis of Variance (ANOVA)
    Time Series and Forecasting
    plot()
    Generates a time series plot
    plot(dataframe)
    Introduction to Time Series Analysis
    rolling()
    Provides rolling window calculations
    rolling(window=window)
    Time Series Forecasting Methods
    mean()
    Computes the average of a dataset
    mean(dataset)
    Time Series Forecasting Methods
    diff()
    Computes the first-order difference of data in a window
    diff(dataframe)
    Time Series Forecasting Methods
    plot_acf()
    Plots the ACF (autocorrelation function) for a time series, up to lag L
    plot_acf(time_series_data, lags=L)
    Time Series Forecasting Methods
    STL()
    Decomposes a time series with known period P into its components
    STL(time_series_data, period=P)
    Time Series Forecasting Methods
    ewm()
    Performs exponential moving average (EMA) smoothing
    ewm(dataframe)
    Time Series Forecasting Methods
    adfuller()
    Performs the Augmented Dickey-Fuller (ADF) test, which is a statistical test for checking the stationarity of a time series
    adfuller(time_series_data)
    Time Series Forecasting Methods
    ARIMA()
    Fits an ARIMA(p, d, q) (AutoRegressive Integrated Moving Average) model to time series data
    ARIMA(time_series_data, order=(p, d, q))
    Time Series Forecasting Methods
    Decision-Making Using Machine Learning Basics
    LogisticRegression()
    Creates a logistic regression model
    LogisticRegression()
    Classification Using Machine Learning
    model.fit()
    Trains a machine learning model on a given dataset
    model.fit(feature_matrix, target_vector)
    Classification Using Machine Learning
    KMeans()
    Sets up a k-means clustering model (Use model.fit() to fit the model to a dataset.)
    KMeans(n_clusters=k)
    Classification Using Machine Learning
    DBSCAN()
    Sets up a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) model (Use model.fit() to fit the model to a dataset.)
    DBSCAN(options)
    Classification Using Machine Learning
    confusion_matrix()
    Used to visualize the performance of a model by comparing actual and predicted values
    confusion_matrix(target_values, predicted_values)
    Classification Using Machine Learning
    LinearRegression()
    Fits a linear regression model to data
    LinearRegression().fit(feature_matrix, target_vector)
    Machine Learning in Regression Analysis
    predict()
    Used on trained machine learning models to generate predictions for new data points
    predict(feature_matrix)
    Machine Learning in Regression Analysis
    DecisionTreeClassifier()
    Sets up a decision tree model (Use model.fit() to fit the model to a dataset.)
    DecisionTreeClassifier(options)
    Decision Trees
    ens.RandomForestRegressor()
    Sets up a random forest model (Use model.fit() to fit the model to a dataset.)
    ens.RandomForestRegressor(options)
    Other Machine Learning Techniques
    GaussianNB()
    Sets up a Naïve Bayes classification model (Use model.fit() to fit the model to a dataset.)
    GaussianNB()
    Other Machine Learning Techniques
    Deep Learning and Artificial Intelligence (AI) Basics
    Perceptron()
    Sets up a perceptron model (Use model.fit() to fit the model to a dataset.)
    Perceptron()
    Introduction to Neural Networks
    train_test_split()
    Splits a dataset randomly into train and test subsets, using a proportion P of the data for the test set
    train_test_split(input_data_arrays, target_data, test_size=P)
    Introduction to Neural Networks
    StandardScaler()
    Used to standardize features by removing the mean and scaling to unit variance
    StandardScaler()
    Introduction to Neural Networks
    accuracy_score()
    Calculates the accuracy of a classification model as the ratio of the number of correct predictions to the total number of predictions
    accuracy_score(y_true, y_predicted)
    Introduction to Neural Networks
    scaler.fit_transform()
    Fits a scaler to the data and then transforms the data according to the fitted scaler
    scaler.fit_transform(array)
    Introduction to Neural Networks
    scaler.transform()
    Applies a previously fitted scaler to new data
    scaler.transform(array)
    Introduction to Neural Networks
    tf.keras.Sequential()
    Creates a linear stack of layers for building a neural network model
    tf.keras.Sequential(layers, additional options)
    Backpropagation
    model.compile()
    Used to configure the learning process of a neural network model before training
    model.compile(optimizer, loss, metrics)
    Backpropagation
    Visualizing Data
    boxplot()
    Creates a box-and-whisker plot
    plt.boxplot(array)
    Encoding Univariate Data
    hist()
    Creates a histogram
    plt.hist(array)
    Encoding Univariate Data
    plot()
    Creates 2D line plots such as a time series graph
    plt.plot(x_data, y_data)
    Graphing Probability Distributions
    bar()
    Creates a bar chart
    plt.bar(x_array, heights)
    Graphing Probability Distributions
    imshow()
    Displays an image on a 2D regular raster, such as a heatmap
    plt.imshow(array)
    Geospatial and Heatmap Data Visualization Using Python
    heatmap()
    Creates a heatmap visualization
    sns.heatmap(array)
    Geospatial and Heatmap Data Visualization Using Python
    colorbar()
    Adds a colorbar (a legend for the colormap) to a figure
    plt.colorbar()
    Multivariate and Network Data Visualization Using Python
    corr()
    Calculates the pairwise correlations of columns in a DataFrame
    dataframe.corr()
    Multivariate and Network Data Visualization Using Python
    add_subplot()
    Adds a subplot to a figure stored in fig
    fig.add_subplot(position)
    Multivariate and Network Data Visualization Using Python
    ax.scatter()
    Creates a scatterplot
    ax.scatter(x_data, y_data)
    Multivariate and Network Data Visualization Using Python
    Reporting Results
    plot_tree()
    Creates a visualization of a decision tree
    plot_tree(estimator, feature_names)
    Validating Your Model
    DataFrame.info()
    Provides a concise summary of a DataFrame's structure and content
    DataFrame.info()
    Validating Your Model
    DataFrame.drop()
    Removes rows or columns from a DataFrame
    DataFrame.drop(labels, axis=rows_columns)
    Validating Your Model
    score()
    Evaluates the performance of a trained model on a given dataset
    model.score(feature_matrix, true_labels)
    Validating Your Model
    dt.get_depth()
    Retrieves the depth of the decision tree, dt
    dt.get_depth()
    Validating Your Model
    cross_val_score()
    Evaluates a model's performance using cross-validation
    cross_val_score(estimator, feature_matrix, target_variable)
    Validating Your Model
    GridSearchCV()
    Searches for the best parameters for a specified estimator, with k-fold cross-validation
    GridSearchCV(estimator, parameters, k)
    Validating Your Model
    Table D1
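
Several entries in the table are built into Python or its standard library and need no extra installation: print(), len(), round(), re.findall(), and re.search(). A minimal sketch of how they fit together follows. The sample text and variable names are invented for illustration and do not come from the textbook.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 66 shipped 3 items for $19.99 on 2024-05-01."

# re.findall(pattern, string): all non-overlapping matches, as a list of strings.
numbers = re.findall(r"\d+", text)

# re.search(pattern, string): a match object for the first match, or None.
date_match = re.search(r"\d{4}-\d{2}-\d{2}", text)
has_date = date_match is not None

# len(object): the number of items in the list of matches.
count = len(numbers)

# round(number, digits): round to the given number of decimal places.
average_price = round(19.99 / 3, 2)

# print(x, y): print several values separated by spaces.
print(numbers)
print(has_date, count, average_price)
```

Note that re.search() returns a match object rather than True or False, so comparing the result with None (or calling .group() on it) is the usual way to use it.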

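To make the probability entries more concrete, the sketch below recomputes the quantities described for binom.pmf(), poisson.pmf(), norm.cdf(), and norm.interval() using only the Python standard library. These are not the library calls from the table; the function names are hypothetical stand-ins written from the textbook definitions of each distribution, and the numbers in the usage lines are invented.

```python
import math
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """Probability of exactly x events for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def normal_cdf(x, mu, std):
    """Area under the normal density to the left of x."""
    return NormalDist(mu, std).cdf(x)


def normal_interval(conf_level, sample_mean, standard_error):
    """Confidence interval for the mean when the population std is known."""
    # Critical z-value that leaves (1 - conf_level) / 2 in each tail.
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    margin = z * standard_error
    return (sample_mean - margin, sample_mean + margin)


# Hypothetical inputs: population std 15 and sample size 36.
standard_error = 15 / math.sqrt(36)  # std divided by sqrt(sample size)

print(binomial_pmf(2, 4, 0.5))
print(poisson_pmf(0, 2.0))
print(normal_cdf(0, 0, 1))
print(normal_interval(0.95, 100, standard_error))
```

The standard error passed to normal_interval follows the note in the table: the standard deviation divided by the square root of the sample size.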
    This page titled 11.3: Appendix D- Review of Python Functions is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform.