- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/zz%3A_Back_Matter/10%3A_Index
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/12%3A_Answer_Key
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/05%3A_Time_Series_and_Forecasting/5.04%3A_Forecast_Evaluation_Methods
  This page covers time series forecasting, focusing on error measurement techniques like MAE, RMSE, MAPE, and sMAPE. It highlights the importance of prediction intervals in assessing future value ranges and discusses margin of error and confidence intervals. Using Python's `statsmodels.tsa.arima.model` module, it illustrates how to create prediction intervals, setting an 80% confidence level and visualizing original and forecasted values, showing that uncertainty increases with longer forecasts.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/03%3A_Descriptive_Statistics-_Statistical_Measurements_and_Probability_Distributions/3.05%3A_Discrete_and_Continuous_Probability_Distributions
  This page covers key concepts in probability distributions, focusing on discrete (binomial and Poisson) and continuous (normal) types. It illustrates discrete models through examples, such as a binomial experiment with surgery success rates and a Poisson scenario of vehicle arrivals. Continuous probability uses the normal distribution, highlighting its characteristics and the empirical rule. Additionally, the text includes examples utilizing Python's `scipy.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/02%3A_Collecting_and_Preparing_Data/2.04%3A_Data_Cleaning_and_Preprocessing
  This page discusses the significance of data cleaning and preprocessing in data science, highlighting processes such as data integration, transformation, and validation. It emphasizes the need to handle missing data and outliers and outlines techniques like imputation and robust statistical methods to maintain data integrity.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/04%3A_Inferential_Statistics_and_Regression_Analysis/4.05%3A_Key_Terms
  This page offers definitions and descriptions of essential statistical concepts relevant to hypothesis testing and data analysis, including alternative hypothesis, ANOVA, correlation analysis, and the central limit theorem. It addresses methods for estimating population parameters like confidence intervals, highlights potential errors in hypothesis testing (Type I and II), and explores regression analysis, residuals, and modeling techniques that illustrate variable relationships.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/01%3A_What_Are_Data_and_Data_Science/1.07%3A_Group_Project
  This page outlines three projects aimed at enhancing data science skills for students and professionals. Project A focuses on finding and cleaning secondary data while analyzing datasets relevant to specific policies. Project B involves downloading a dataset, formulating questions, and visualizing results using Python.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/01%3A_What_Are_Data_and_Data_Science/1.08%3A_Chapter_Review
  This page includes multiple-choice questions on data science. The first question addresses incorrect step and goal pairings in the data science cycle. The second contrasts local storage with cloud systems in the evolution of data management. The third emphasizes the interdisciplinary nature of data science by asking for the best example among various fields, including history, mathematics, biology, and chemistry.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.09%3A_Critical_Thinking
  This page examines how the ratio of training data to testing data affects model performance, emphasizing the risks of underfitting or overfitting. It highlights the significance of the testing set for detecting these issues. Additionally, it discusses the challenges of applying multiple linear regression in university admissions due to the correlation between SAT and ACT scores and measurement scale differences. Lastly, there is a prompt for classifying a news article using specific keywords.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.04%3A_Decision_Trees
  This page outlines the fundamentals of decision tree classification, focusing on entropy as a measure of uncertainty in decision-making. It details the construction process of decision trees, emphasizing feature selection through entropy and other criteria like the Gini index to maximize information gain. The importance of testing accuracy and employing pruning methods to avoid overfitting is discussed.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/06%3A_Decision-Making_Using_Machine_Learning_Basics/6.06%3A_Key_Terms
  This page defines key machine learning terms such as accuracy, bias, classification, and regression techniques. It discusses methodologies for model improvement, including pruning and ensemble techniques like random forests. Additionally, it covers data processes like cleaning and mining, and makes distinctions between labeled and unlabeled data. Clustering algorithms, error metrics, and learning paradigms such as supervised and unsupervised learning are also explained.