
khuyentran1401/production-ready-data-science-code


Production-Ready Data Science Code Examples

Code examples from the Production-Ready Data Science book by Khuyen Tran.

Enhance your data science workflow with scalable, production-ready practices through hands-on examples.

🔗 Get the Book

What You'll Gain

Transform your data science workflow with these production-ready skills:

  • 📁 Organization: Transform messy notebooks into organized, maintainable code
  • 🔄 Reproducibility: Create reproducible environments across teams and deployments
  • 🧪 Quality: Write modular, reusable, and testable Python code
  • 🔍 Testing: Implement automated testing to catch bugs early
  • 📊 Version Control: Leverage version control for code and data integrity
  • 🚀 Production: Deploy bulletproof systems that scale

Examples by Chapter

Chapter 1-3: Foundation

  1. Version Control - Git workflows
  2. Dependency Management - Environment setup
  3. Modules & Packages - Project organization

Chapter 4-6: Code Quality

  4. Variables - Clean code practices
  5. Functions - Function design
  6. Classes - Object-oriented programming

Chapter 7-9: Testing & Operations

  7. Unit Testing - Automated testing
  8. Configuration Management - Settings management
  9. Logging - Monitoring and debugging

Chapter 10-11: Data

  10. Data Validation - Input validation
  11. Data Version Control - Dataset tracking

Chapter 12-14: Production

  12. Continuous Integration - Automated deployment
  13. Package Your Project - Package distribution
  14. Notebooks in Production - Production notebooks

Getting Started

Fork and Clone

  1. Click the "Fork" button at the top of this page
  2. This creates your own copy at: github.com/YOUR_USERNAME/production-ready-data-science-code
  3. Clone your fork:
git clone https://github.com/YOUR_USERNAME/production-ready-data-science-code.git
cd production-ready-data-science-code

Prerequisites

  • Python 3.10.11 or higher
  • uv - Fast Python package manager
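Before installing, it can help to confirm both prerequisites are met. The snippet below is a minimal sketch, not part of the repository: the helper names `python_is_supported` and `uv_is_installed` are hypothetical, and the check only looks for a `uv` executable on your PATH.

```python
import shutil
import sys

# Minimum interpreter version stated in the prerequisites above.
MIN_PYTHON = (3, 10, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor, micro) version meets the minimum."""
    return tuple(version_info[:3]) >= minimum


def uv_is_installed():
    """Return True if a `uv` executable can be found on PATH."""
    return shutil.which("uv") is not None


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("uv found on PATH:", uv_is_installed())
```

If either check prints `False`, upgrade Python or install uv before running the install commands.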

Install Dependencies

Option A: Install Everything (Recommended)

uv sync --all-groups

Option B: Install Specific Chapters Only

uv sync --group chapter7   # Testing examples
uv sync --group chapter9   # Logging examples  
uv sync --group chapter10  # Data validation
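To work on several chapters at once, you may prefer to request their groups in a single command. By default `uv sync` makes the environment match exactly what was requested, so running the commands above one after another may leave only the last group installed. The sketch below builds one combined command. It relies on two assumptions: that every chapter's group is named `chapterN` (only chapters 7, 9, and 10 are shown here), and that `--group` is repeated once per group. The helper name `uv_sync_command` is hypothetical.

```python
import shlex


def uv_sync_command(chapters):
    """Build a single `uv sync` argument list covering the given chapters.

    Assumes each dependency group is named `chapter<N>`, as in the
    examples above; check pyproject.toml for the actual group names.
    """
    command = ["uv", "sync"]
    for number in chapters:
        command += ["--group", f"chapter{number}"]
    return command


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(shlex.join(uv_sync_command([7, 9, 10])))
```

To execute the command from Python instead of copying it, pass the list to `subprocess.run(..., check=True)`.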

Ready to get started? Browse the examples above or get the book.

Author: Khuyen Tran | Website: https://codecut.ai/

About

Transform messy data science notebooks into production-ready code. Examples covering testing, CI/CD, MLOps, and scalable deployment practices.
