Development Workflow
Project Structure
The project follows a standard MLOps structure with clear separation of concerns:
ml_ops102/
├── src/mlops_project/ # Source code
│ ├── data.py # Data preprocessing
│ ├── train.py # Model training
│ ├── evaluate.py # Model evaluation
│ ├── models.py # Model architecture
│ ├── api.py # FastAPI service
│ └── visualize.py # Visualization utilities
├── tests/ # Unit and integration tests
├── configs/ # Configuration files
├── data/ # Data storage (raw/processed)
├── models/ # Trained model checkpoints
├── dockerfiles/ # Docker configurations
└── .github/workflows/ # CI/CD pipelines
Development Lifecycle
1. Data Preprocessing
Process raw news data into PyTorch tensors:
uv run invoke preprocess-data
This reads from data/raw/News.csv and creates train/val/test splits in data/processed/.
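The core of this step is splitting the raw records into train/val/test sets. A minimal sketch of that logic (plain Python for illustration; the split ratios and the seed are assumptions, not the project's actual values):

```python
import random

def split_records(records, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle records and split them into train/val/test lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = split_records(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing the seed makes the split reproducible across runs, which matters when the processed tensors are versioned separately from the raw CSV.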
2. Model Training
Train the LSTM model with configurable hyperparameters:
uv run invoke train
Training uses:
- Hydra for configuration management (configs/config.yaml)
- PyTorch Lightning for training orchestration
- Weights & Biases for experiment tracking
- Lightning checkpointing for model saving
Checkpoints are saved to models/ and logs to logs/training/.
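Lightning's checkpoint callback handles saving automatically; conceptually, the "keep the best checkpoint" behavior reduces to tracking the lowest validation loss. A plain-Python sketch (class and path format are illustrative, not Lightning internals):

```python
class BestCheckpoint:
    """Track the checkpoint path with the lowest validation loss seen so far."""
    def __init__(self):
        self.best_loss = float("inf")
        self.best_path = None

    def update(self, epoch, val_loss):
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_path = f"models/epoch={epoch}-val_loss={val_loss:.3f}.ckpt"
        return self.best_path

ckpt = BestCheckpoint()
for epoch, loss in enumerate([0.9, 0.6, 0.7, 0.5]):
    ckpt.update(epoch, loss)
print(ckpt.best_path)  # models/epoch=3-val_loss=0.500.ckpt
```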
3. Model Evaluation
Evaluate trained models on the test set:
uv run invoke evaluate
Generates metrics and saves results to the reports directory.
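At its core, evaluation compares model predictions against test labels. A minimal accuracy computation, for illustration only:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if not labels:
        raise ValueError("empty label list")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

preds = [1, 0, 1, 1, 0]
labels = [1, 0, 0, 1, 0]
print(accuracy(preds, labels))  # 0.8
```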
4. Testing
Run the full test suite with coverage:
uv run invoke test
Test categories:
- Unit tests: tests/test_data.py, tests/test_model.py
- Integration tests: tests/integrationtests/test_api.py
- Performance tests: tests/performancetests/locustfile.py
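Unit tests in this style typically assert small invariants of the pipeline. A hypothetical example in the spirit of tests/test_data.py (the function under test is made up for illustration):

```python
def split_ratios(n_train, n_val, n_test):
    """Return the fraction of the dataset in each split."""
    total = n_train + n_val + n_test
    return n_train / total, n_val / total, n_test / total

def test_split_ratios_sum_to_one():
    ratios = split_ratios(800, 100, 100)
    assert abs(sum(ratios) - 1.0) < 1e-9

def test_train_is_largest_split():
    train, val, test = split_ratios(800, 100, 100)
    assert train > val and train > test

test_split_ratios_sum_to_one()
test_train_is_largest_split()
print("all tests passed")  # all tests passed
```

Under pytest, the `test_*` functions are discovered and run automatically; the explicit calls at the bottom only make the sketch self-contained.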
5. Code Quality
The project uses pre-commit hooks for code quality:
uv run ruff format .
uv run ruff check . --fix
uv run pre-commit run --all-files
Configuration Management
Hydra manages configurations in the configs/ directory:
- config.yaml - Main configuration
- config_cloud.yaml - Cloud-specific settings
- config_cpu.yaml / config_gpu.yaml - Device-specific configs
Override parameters via command line:
uv run python src/mlops_project/train.py training.epochs=20 training.batch_size=64
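Hydra resolves each dotted override against the nested config loaded from YAML. The mechanism can be sketched roughly as follows (a simplified illustration, not Hydra's actual implementation):

```python
def apply_override(config, override):
    """Apply a single 'a.b.c=value' override to a nested dict."""
    key_path, raw = override.split("=", 1)
    keys = key_path.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    # Coerce the value to int or float when possible, else keep the string.
    for cast in (int, float):
        try:
            raw = cast(raw)
            break
        except ValueError:
            pass
    node[keys[-1]] = raw
    return config

cfg = {"training": {"epochs": 10, "batch_size": 32}}
apply_override(cfg, "training.epochs=20")
apply_override(cfg, "training.batch_size=64")
print(cfg)  # {'training': {'epochs': 20, 'batch_size': 64}}
```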
Version Control
Data Versioning
DVC tracks large files:
dvc pull
dvc push
Tracked files:
- data/raw/News.csv.dvc
- models/model.pt.dvc
Code Versioning
Git tracks code with automated workflows:
- CI: Linting and testing on push
- CD: Docker builds and cloud deployment on merge
Task Management
Use invoke for common tasks:
uv run invoke --list
Available tasks:
- preprocess-data - Preprocess raw data
- train - Train model
- evaluate - Evaluate model
- test - Run tests
- docker-build - Build Docker images
- build-docs - Build documentation
- serve-docs - Serve documentation locally
Monitoring & Logging
Training Monitoring
Experiments are tracked with:
- Weights & Biases: Real-time metrics and visualizations
- PyTorch Lightning logs: Local CSV logs in logs/training/
- Hydra outputs: Per-run outputs in outputs/YYYY-MM-DD_HH-MM-SS/
API Monitoring
The FastAPI service includes:
- Prometheus metrics: /metrics endpoint
- Health checks: /health endpoint
- Request logging: Structured JSON logs
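The structured JSON request logs can be illustrated with a small standard-library formatter (a sketch; the field names are assumptions, not the service's actual schema):

```python
import json
import time

def format_request_log(method, path, status, duration_ms):
    """Render one HTTP request as a single-line JSON log record."""
    return json.dumps({
        "ts": round(time.time(), 3),
        "level": "INFO",
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
    })

print(format_request_log("POST", "/predict", 200, 12.4))
```

One JSON object per line keeps the logs machine-parseable, which is what log aggregators expect.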
Docker Workflow
Build containers locally:
uv run invoke docker-build
Run training in Docker:
docker run -v $(pwd)/data:/app/data -v $(pwd)/models:/app/models train:latest
Run API in Docker:
docker run -p 8000:8000 -v $(pwd)/models:/app/models api:latest
Continuous Integration
GitHub Actions automatically:
- Run linting on every push
- Run tests on every push
- Build Docker images on merge to main
- Deploy to GCP Cloud Run on merge to main
Workflow status is visible via the badges in the repository README.