Machine Learning Dev Ops Guideline
or
git init
4. DVC (Data Version Control)
Split Jupyter Notebook into Python Scripts (Modular Approach):
- data_ingestion.py
- data_preprocessing.py
- feature_engineering.py
- model_building.py
- model_evaluation.py
- Output metrics.json
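As a rough illustration of the last two items, the sketch below shows one way model_evaluation.py might compute a few metrics and write them to metrics.json. The function names (compute_metrics, write_metrics), the choice of metrics, and the binary 0/1 label convention are assumptions made for this example, not part of the guideline.

```python
import json
from pathlib import Path


def compute_metrics(y_true, y_pred):
    """Compute accuracy, precision and recall for binary 0/1 labels.

    Hypothetical helper: the metric set is an assumption for illustration.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    return {
        "accuracy": correct / len(pairs),
        # Fall back to 0.0 when a denominator would be zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def write_metrics(metrics, path="metrics.json"):
    """Write the metrics dictionary as JSON so DVC can track it."""
    Path(path).write_text(json.dumps(metrics, indent=2))
```

Calling write_metrics(compute_metrics(y_true, y_pred)) at the end of the evaluation script would produce the metrics.json file that dvc metrics show reads.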
Create dvc.yaml (configuration file)
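To make the shape of dvc.yaml concrete, here is a minimal sketch that renders a pipeline definition for the five scripts listed earlier. The stage names follow the script names, but every data and model path (data/raw, data/processed, data/features, model.pkl) is a hypothetical placeholder, and the helper render_dvc_yaml exists only for this example. The keys it emits (stages, cmd, deps, outs, metrics, cache) are standard dvc.yaml fields.

```python
# (stage name, script, extra dependencies, outputs); all paths are hypothetical.
STAGES = [
    ("data_ingestion", "data_ingestion.py", [], ["data/raw"]),
    ("data_preprocessing", "data_preprocessing.py", ["data/raw"], ["data/processed"]),
    ("feature_engineering", "feature_engineering.py", ["data/processed"], ["data/features"]),
    ("model_building", "model_building.py", ["data/features"], ["model.pkl"]),
    ("model_evaluation", "model_evaluation.py", ["model.pkl", "data/features"], []),
]


def render_dvc_yaml(stages):
    """Render the stage list as dvc.yaml text (hypothetical helper)."""
    lines = ["stages:"]
    for name, script, deps, outs in stages:
        lines.append(f"  {name}:")
        lines.append(f"    cmd: python {script}")
        # Each stage depends on its own script plus upstream outputs.
        lines.append("    deps:")
        for dep in [script, *deps]:
            lines.append(f"      - {dep}")
        if outs:
            lines.append("    outs:")
            for out in outs:
                lines.append(f"      - {out}")
        if name == "model_evaluation":
            # Assumption: metrics.json is kept in Git rather than the DVC cache.
            lines.append("    metrics:")
            lines.append("      - metrics.json:")
            lines.append("          cache: false")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dvc_yaml(STAGES))
```

Because each stage lists the previous stage's output as a dependency, DVC can infer the execution order and skip stages whose inputs have not changed. In practice the file is usually written by hand; generating it here just keeps the example in one language.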
git init & dvc init, dvc repro (this will execute dvc.yaml), dvc dag, dvc metrics show
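The command sequence above could also be driven from a small script, for example in a CI job. The sketch below is one possible wrapper, assuming git and dvc are installed and on the PATH; run_workflow and its runner parameter are hypothetical names introduced here so the sequence can be exercised without invoking the real tools.

```python
import subprocess

# Commands in the order given above. dvc init expects an existing Git
# repository, which is why git init comes first.
COMMANDS = [
    ["git", "init"],
    ["dvc", "init"],
    ["dvc", "repro"],            # runs the stages defined in dvc.yaml
    ["dvc", "dag"],              # prints the stage dependency graph
    ["dvc", "metrics", "show"],  # prints the tracked metrics files
]


def run_workflow(commands=COMMANDS, runner=subprocess.run):
    """Run each command in order, stopping at the first failure.

    `runner` is injectable so the sequence can be tested with a stub.
    """
    executed = []
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit code.
        runner(cmd, check=True)
        executed.append(" ".join(cmd))
    return executed
```

Calling run_workflow() from the project root would execute the five commands for real; passing a stub as runner records them instead.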
See here: MLflow Repo
Dockerfile