Search
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/02%3A_Collecting_and_Preparing_Data/2.04%3A_Data_Cleaning_and_Preprocessing
  This page discusses the significance of data cleaning and preprocessing in data science, highlighting processes such as data integration, transformation, and validation. It emphasizes the need to handle missing data and outliers and outlines techniques like imputation and robust statistical methods to maintain data integrity.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/01%3A_What_Are_Data_and_Data_Science/1.07%3A_Group_Project
  This page outlines three projects aimed at enhancing data science skills for students and professionals. Project A focuses on finding and cleaning secondary data while analyzing datasets relevant to specific policies. Project B involves downloading a dataset, formulating questions, and visualizing results using Python.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/02%3A_Collecting_and_Preparing_Data/2.03%3A_Web_Scraping_and_Social_Media_Data_Collection
  This page provides an overview of web scraping and social media data collection, emphasizing Python techniques like web crawling, XPath, and APIs for data extraction. It introduces libraries such as Pandas, Beautiful Soup, and NLTK for data manipulation. The text also covers natural language processing with spaCy and the use of regular expressions for text parsing.
- https://eng.libretexts.org/Bookshelves/Computer_Science/Programming_Languages/Python_Programming_(OpenStax)/15%3A_Data_Science/15.03%3A_Pandas
  This page provides an overview of the Pandas library, a powerful Python tool for data cleaning and analysis, detailing its main data structures: Series and DataFrame. It highlights key functions like `info()`, `describe()`, `value_counts()`, and `unique()`, which facilitate data exploration and summary. Examples are given for creating DataFrames from various sources. The text also includes practice questions and recommends consulting the Pandas user guide for further learning.
- https://eng.libretexts.org/Bookshelves/Data_Science/Principles_of_Data_Science_(OpenStax)/02%3A_Collecting_and_Preparing_Data
  This page presents a structured guide on data collection methods, covering survey design, web scraping, data cleaning, and managing large datasets. It highlights the significance of critical thinking and includes key terms and group projects to facilitate learning. Further references are provided for deeper exploration of these topics.