Index
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/zz%3A_Back_Matter/10%3A_Index
12: Answer Key
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/12%3A_Answer_Key
5.4: Forecast Evaluation Methods
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/05%3A_Time_Series_and_Forecasting/5.04%3A_Forecast_Evaluation_Methods
This page covers time series forecasting, focusing on error measurement techniques like MAE, RMSE, MAPE, and sMAPE. It highlights the importance of prediction intervals in assessing future value ranges and discusses margin of error and confidence intervals. Using Python's `statsmodels.tsa.arima.model` module, it illustrates how to create prediction intervals, setting an 80% confidence level and visualizing original and forecasted values, showing that uncertainty increases with longer forecasts.
3.5: Discrete and Continuous Probability Distributions
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/03%3A_Descriptive_Statistics-_Statistical_Measurements_and_Probability_Distributions/3.05%3A_Discrete_and_Continuous_Probability_Distributions
This page covers key concepts in probability distributions, focusing on discrete (binomial and Poisson) and continuous (normal) types. It illustrates discrete models through examples, such as a binomial experiment with surgery success rates and a Poisson scenario of vehicle arrivals. Continuous probability uses the normal distribution, highlighting its characteristics and the empirical rule. Additionally, the text includes examples utilizing Python's `scipy.
2.4: Data Cleaning and Preprocessing
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/02%3A_Collecting_and_Preparing_Data/2.04%3A_Data_Cleaning_and_Preprocessing
This page discusses the significance of data cleaning and preprocessing in data science, highlighting processes such as data integration, transformation, and validation. It emphasizes the need to handle missing data and outliers and outlines techniques like imputation and robust statistical methods to maintain data integrity.
4.5: Key Terms
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/04%3A_Inferential_Statistics_and_Regression_Analysis/4.05%3A_Key_Terms
This page offers definitions and descriptions of essential statistical concepts relevant to hypothesis testing and data analysis, including alternative hypothesis, ANOVA, correlation analysis, and the central limit theorem. It addresses methods for estimating population parameters like confidence intervals, highlights potential errors in hypothesis testing (Type I and II), and explores regression analysis, residuals, and modeling techniques that illustrate variable relationships.
1.7: Group Project
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/01%3A_What_Are_Data_and_Data_Science/1.07%3A_Group_Project
This page outlines three projects aimed at enhancing data science skills for students and professionals. Project A focuses on finding and cleaning secondary data while analyzing datasets relevant to specific policies. Project B involves downloading a dataset, formulating questions, and visualizing results using Python.
1.8: Chapter Review
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/01%3A_What_Are_Data_and_Data_Science/1.08%3A_Chapter_Review
This page includes multiple-choice questions on data science. The first question addresses incorrect step and goal pairings in the data science cycle. The second contrasts local storage with cloud systems in the evolution of data management. The third emphasizes the interdisciplinary nature of data science by asking for the best example among various fields, including history, mathematics, biology, and chemistry.
6.9: Critical Thinking
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.09%3A_Critical_Thinking
This page examines the importance of the training and testing data ratio on model performance, emphasizing the risks of underfitting or overfitting. It highlights the significance of the testing set for detecting these issues. Additionally, it discusses the challenges of applying multiple linear regression in university admissions due to the correlation between SAT and ACT scores and measurement scale differences. Lastly, there is a prompt for classifying a news article using specific keywords.
6.4: Decision Trees
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.04%3A_Decision_Trees
This page outlines the fundamentals of decision tree classification, focusing on entropy as a measure of uncertainty in decision-making. It details the construction process of decision trees, emphasizing feature selection through entropy and other criteria like the Gini index to maximize information gain. The importance of testing accuracy and employing pruning methods to avoid overfitting is discussed.
6.6: Key Terms
https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.06%3A_Key_Terms
This page defines key machine learning terms such as accuracy, bias, classification, and regression techniques. It discusses methodologies for model improvement, including pruning and ensemble techniques like random forests. Additionally, it covers data processes like cleaning and mining, and makes distinctions between labeled and unlabeled data. Clustering algorithms, error metrics, and learning paradigms such as supervised and unsupervised learning are also explained.
