Build a Reproducible and Maintainable Data Science Project
1. How to read this book
2. Structure Project
2.1. How to Structure a Data Science Project for Readability and Transparency
2.2. Best Practices for Writing Python Functions
3. Test Code
3.1. Pytest for Data Scientists
3.2. 4 Useful Tips for Pytest
4. Test Data
4.1. Great Expectations: Always Know What to Expect From Your Data
4.2. Validate Your pandas DataFrame with Pandera
4.3. Introduction to Schema: A Python Library to Validate your Data
5. Build Pipelines
5.1. Orchestrate a Data Science Project in Python With Prefect
6. Experiment Tracking
6.1. Configure your Data Science Projects with Hydra
6.2. Introduction to Weights & Biases: Track and Visualize your Machine Learning Experiments in 3 Lines of Code
7. Version Control
7.1. Introduction to DVC: Data Version Control Tool for Machine Learning Projects
7.2. DagsHub: a GitHub Supplement for Data Scientists and ML Engineers
7.3. 4 pre-commit Plugins to Automate Code Reviewing and Formatting in Python
8. Deploy Models
8.1. BentoML: Create an ML Powered Prediction Service in Minutes
8.2. GitHub Actions in MLOps: Automatically Check and Deploy Your ML Model
6. Experiment Tracking