
15: Data Science


  • 15.0: Introduction
    This page introduces data science, emphasizing its importance in decision-making across sectors like healthcare, business, and education. It details the data science life cycle and provides resources for using Python to further explore the field.
  • 15.1: Introduction to Data Science
    This page provides an overview of data science, detailing its definition, lifecycle stages (data acquisition, exploration, analysis, reporting), and essential tools (Python, R, Jupyter Notebook, Google Colaboratory, Kaggle Kernels, Microsoft Excel). It also includes practical exercises for using Google Colaboratory, highlighting the significance of these tools in data analysis and visualization.
  • 15.2: NumPy
    This page states the learning objectives for the NumPy library and describes its features for numerical operations on multi-dimensional arrays in Python. It explains how to create ndarray objects and covers mathematical functions, array manipulation, and linear algebra. It includes practice questions, points to the NumPy user guide, and encourages practice in Google Colaboratory.
  • 15.3: Pandas
    This page provides an overview of the Pandas library, a powerful Python tool for data cleaning and analysis, detailing its main data structures: Series and DataFrame. It highlights key functions like `info()`, `describe()`, `value_counts()`, and `unique()`, which facilitate data exploration and summary. Examples are given for creating DataFrames from various sources. The text also includes practice questions and recommends consulting the Pandas user guide for further learning.
  • 15.4: Exploratory Data Analysis
    This page covers exploratory data analysis (EDA), focusing on data inspection, indexing, and handling missing values with Pandas in Python. Key concepts include label-based and integer-based indexing, along with functions like `isnull()`, `dropna()`, and `fillna()` that help maintain data quality. It also covers methods for replacing null values and addresses common misconceptions about how these functions are used.
  • 15.5: Data Visualization
    This page highlights the significance of data visualization in data science and discusses visualization types such as bar plots and scatter plots, each suited to a particular kind of analysis. It underscores the role of visualization in data exploration, trend identification, and reporting within the data science life cycle.
  • 15.6: Chapter Summary
    This page provides an overview of data science fundamentals, highlighting its multidisciplinary nature and lifecycle, which involves data acquisition, exploration, analysis, and reporting. It introduces key Python libraries, including NumPy for numerical tasks and Pandas for data management. The importance of Exploratory Data Analysis (EDA) and data visualization techniques is emphasized, along with functions for data structure manipulation in Python.
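
The Pandas functions named in the summaries for 15.3 and 15.4 can be sketched together in a few lines. This is a minimal illustration, not an excerpt from the chapter: the DataFrame, its column names (`city`, `temp`), and its values are hypothetical and invented for the example, and filling missing temperatures with the column mean is only one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", None],
    "temp": [30.5, np.nan, 28.0, 22.0],
})

# Data exploration (functions named in 15.3)
df.info()                            # prints column dtypes and non-null counts
summary = df.describe()              # summary statistics for numeric columns
counts = df["city"].value_counts()   # frequency of each non-missing value
cities = df["city"].unique()         # distinct values, including the missing one

# Missing-value handling (functions named in 15.4)
missing = df.isnull().sum()          # number of missing values per column
dropped = df.dropna()                # keep only rows with no missing values
filled = df.fillna({"city": "unknown", "temp": df["temp"].mean()})

# Label-based versus integer-based indexing
first_by_label = df.loc[0, "city"]   # row label 0, column label "city"
first_by_position = df.iloc[0, 0]    # first row, first column by position
```

Note that `dropna()` and `fillna()` return new objects here and leave `df` unchanged, so the original data remains available for comparison.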


This page titled 15: Data Science is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform.
