Simple Data Science

This project compiles simple and practical examples for common data science use cases with tabular data.

You can access the complete examples using the following links:

Setup

Environment

This repository uses uv, a Python package and project manager. To install uv, follow the official installation instructions.

To set up the environment and install the required dependencies, run the following commands in your terminal:

cd simple-data-science     # change to the project's directory
uv venv --python 3.12      # create a python 3.12 virtual environment
source .venv/bin/activate  # activate virtual environment
uv sync                    # synchronize dependencies
pre-commit install         # install pre-commit hooks

If you want to deactivate and delete the virtual environment, run:

deactivate                 # deactivate virtual environment
rm -rf .venv               # delete virtual environment

Data

The examples in this project use the publicly available Fetal Health Dataset and Medical Insurance Payout Dataset.

Because the datasets are small, they are available as .zip files in the repository's data/ folder. You can unzip them with your preferred software or simply run make unzip-datasets in your terminal.
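If make is not available on your system, the same step can be approximated in Python. The sketch below is only an illustration: the function name unzip_datasets is hypothetical, and it assumes the archives sit directly inside data/ and should be extracted in place. The actual Makefile target may behave differently.

```python
from pathlib import Path
import zipfile


def unzip_datasets(data_dir: str = "data") -> list[Path]:
    """Extract every .zip archive found directly inside data_dir.

    Hypothetical helper: assumes the archives should be unpacked
    next to themselves. Returns the archives that were extracted.
    """
    extracted = []
    for archive in sorted(Path(data_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Extract into the folder that holds the archive.
            zf.extractall(archive.parent)
        extracted.append(archive)
    return extracted
```

Calling unzip_datasets() from the project root extracts each archive in data/ and returns the list of archives it processed.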

Data Science Use Cases

Code

The code for the data science use cases described above is located in the src/ directory. To run a use case, open the corresponding notebook file, select your preconfigured virtual environment, and execute the cells. Have fun!

To update the HTML files with the latest notebook outputs, run the command make convert-notebooks-to-html in your terminal.
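For readers curious about what such a conversion step might involve, here is a minimal sketch built on the jupyter nbconvert command line. It rests on several assumptions: that the notebooks live under src/, that the HTML files are written to docs/, and that nbconvert is installed in the active environment. The helper names are hypothetical, and the real Makefile target may use different paths or options.

```python
from pathlib import Path
import subprocess


def find_notebooks(src_dir: str = "src") -> list[Path]:
    """Return all notebooks under src_dir, skipping Jupyter checkpoint copies."""
    return sorted(
        p for p in Path(src_dir).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def build_nbconvert_command(notebook: Path, output_dir: Path) -> list[str]:
    """Build the nbconvert call that renders one notebook to HTML."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def convert_notebooks(src_dir: str = "src", output_dir: str = "docs") -> None:
    """Convert every notebook in src_dir (assumed layout) to HTML in output_dir."""
    for notebook in find_notebooks(src_dir):
        # check=True raises an error if nbconvert fails for a notebook.
        subprocess.run(build_nbconvert_command(notebook, Path(output_dir)), check=True)
```

Building the command in its own function makes it easy to inspect before anything is executed; convert_notebooks() then runs it once per notebook.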

Contributions

We welcome contributions of all kinds. Whether you have questions, spot a bug, or want to enhance the code, documentation, or tests, please feel free to start a discussion or open a pull request. Your feedback, ideas, and fixes are vital in making this project better for everyone!

License

MIT
