Platform for Complete Machine
Learning Lifecycle
Jules S. Damji
@2twitme
San Francisco | May 20, 2020: Part 3 of a 3-part series
Outline – Introduction to MLflow: Model Registry Workflows Explained – Part 3
§ Review & Recap Part 2: MLflow Projects & Models
§ Concepts and Motivations
§ MLflow Component
▪ MLflow Model Registry
▪ Model Registry UI & API Workflow
▪ Managed MLflow Model Registry Demo
▪ Tutorials on local host
§ Q&A
https://dbricks.co/mlflow-part-3
MLflow Components
§ Tracking: Record and query experiments: code, data, config, and results
§ Projects: Package data science code in a format that enables reproducible runs on any platform
§ Models: Deploy machine learning models in diverse serving environments
§ Model Registry (new): Store, annotate and manage models in a central repository
databricks.com/mlflow mlflow.org github.com/mlflow twitter.com/MLflow
MLflow Projects Motivation
Challenge: ML results difficult to reproduce
§ Diverse set of tools
§ Diverse set of environments
Projects: Package data science code in a format that enables reproducible runs on any platform
MLflow Projects
Project Spec: Code, Config, Dependencies, Data
Execution: Local Execution, Remote Execution
Example MLflow Project
my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py
    ...

MLproject:
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lambda: {type: float, default: 0.1}
    command: python main.py {training_data} {lambda}

Run from the command line:
$ mlflow run git://<my_project> -P lambda=0.2
$ mlflow run . -e main -P lambda=0.2

Run from Python:
mlflow.run("git://<my_project>", ...)
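The command-line and Python invocations on this slide launch the same project. The sketch below shows one way to build the CLI argument list and to start the run from Python. It assumes MLflow is installed and that mlflow.projects.run takes uri, entry_point and parameters as in the MLflow 1.x Python API. The helper names build_run_command and run_project are hypothetical and not part of MLflow.

```python
from typing import Dict, List, Optional


def build_run_command(uri: str, entry_point: str = "main",
                      params: Optional[Dict[str, object]] = None) -> List[str]:
    """Build the argument list for the `mlflow run` CLI (hypothetical helper)."""
    cmd = ["mlflow", "run", uri, "-e", entry_point]
    for key, value in (params or {}).items():
        # Each project parameter is passed as a separate -P key=value pair.
        cmd += ["-P", f"{key}={value}"]
    return cmd


def run_project(uri: str, entry_point: str = "main",
                params: Optional[Dict[str, object]] = None):
    """Launch the project through the Python API (requires MLflow)."""
    import mlflow.projects  # lazy import so build_run_command works without MLflow
    return mlflow.projects.run(uri=uri, entry_point=entry_point,
                               parameters=params or {})


print(build_run_command(".", params={"lambda": 0.2}))
```

The argument list can be handed to subprocess.run, while run_project stays inside the current Python process.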
Example MLflow Project
my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py
    ...

conda.yaml:
name: mlflow-env
channels:
  - defaults
dependencies:
  - python=3.7.3
  - scikit-learn=0.20.3
  - pip:
    - mlflow
    - cloudpickle==0.8.0
MLflow Model Motivations
ML Frameworks | Serving Tools: Inference Code, Batch & Stream Scoring
NxM combination of model support for all serving tools
MLflow Model Motivation
MLflow Models: Standard for ML models
Model Format: Flavor 1, Flavor 2
ML Frameworks | Serving Tools: Inference Code, Batch & Stream Scoring
Example MLflow Model
mlflow.tensorflow.log_model(...)
my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
        ...

MLmodel:
run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:        # Usable by tools that understand TensorFlow model format
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:   # Usable by any tool that can run Python (Docker, Spark, etc!)
    loader_module: mlflow.tensorflow
Model Flavors Example
model = mlflow.pyfunc.load_model(model_uri)
model.predict(input_dataframe)  # input_dataframe is a pandas DataFrame
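To make the flavor idea concrete, here is a minimal sketch of how a tool might choose a flavor from the MLmodel metadata in the example slide. The metadata is written as a plain dict, and pick_flavor is a hypothetical helper, not MLflow's own loader logic. predict_as_pyfunc assumes MLflow is installed and uses mlflow.pyfunc.load_model, which returns an object with a predict method.

```python
from typing import Dict, Iterable, Optional

# The MLmodel metadata from the example slide, as a plain dict for illustration.
MLMODEL = {
    "run_id": "769915006efd4c4bbd662461",
    "time_created": "2018-06-28T12:34",
    "flavors": {
        "tensorflow": {"saved_model_dir": "estimator",
                       "signature_def_key": "predict"},
        "python_function": {"loader_module": "mlflow.tensorflow"},
    },
}


def pick_flavor(mlmodel: Dict, understood: Iterable[str]) -> Optional[str]:
    """Return the first flavor, in the tool's preference order, that the model offers."""
    flavors = mlmodel.get("flavors", {})
    for name in understood:
        if name in flavors:
            return name
    return None


def predict_as_pyfunc(model_uri: str, input_dataframe):
    """Score a pandas DataFrame through the generic python_function flavor."""
    import mlflow.pyfunc  # lazy import: requires MLflow
    model = mlflow.pyfunc.load_model(model_uri)
    return model.predict(input_dataframe)


print(pick_flavor(MLMODEL, ["tensorflow", "python_function"]))  # TensorFlow-aware tool
print(pick_flavor(MLMODEL, ["python_function"]))                # generic Python tool
```

A tool that only knows python_function can still serve the model, which is the point of saving more than one flavor.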
The Model Management Problem
When you’re working on one ML app alone, storing your models in files is manageable.
MODEL DEVELOPER
classifier_v1.h5
classifier_v2.h5
classifier_v3_sept_19.h5
classifier_v3_new.h5
…
The Model Management Problem
When you work in a large organization with many models and many data teams, management becomes a major challenge:
• Where can I find the best version of this model?
• How was this model trained?
• How can I track docs for each model?
• How can I review models?
• How can I integrate with CI/CD?
Roles: MODEL DEVELOPER, MODEL USER, REVIEWER ???
Model Registry
VISION: Centralized and collaborative model lifecycle management
Tracking Server: Parameters, Metrics, Artifacts, Metadata, Models
Model Registry: Staging, Production, Archived
Data Scientists, Deployment Engineers
Downstream Users: Automated Jobs, REST Serving
Reviewers + CI/CD Tools
MLflow Model Registry
• Repository of named, versioned models with comments & tags
• Track each model’s stage: none, staging, production, or archived
• Easily inspect a specific version and its run info
• Easily load a specific version
• Provides model description, lineage and activities
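Loading or inspecting a specific version comes down to addressing it by name plus either a version number or a stage. The sketch below assumes the models:/<name>/<version> and models:/<name>/<stage> URI forms and the MlflowClient.get_model_version call from the MLflow 1.x API. The helper names registry_model_uri and describe_version are hypothetical.

```python
from typing import Optional


def registry_model_uri(name: str, version: Optional[int] = None,
                       stage: Optional[str] = None) -> str:
    """Build a models:/ URI that pins either a version number or a stage."""
    if (version is None) == (stage is None):
        raise ValueError("pass exactly one of version or stage")
    return f"models:/{name}/{version if version is not None else stage}"


def describe_version(name: str, version: int) -> dict:
    """Look up one registered version and the run that produced it."""
    from mlflow.tracking import MlflowClient  # lazy import: requires MLflow
    mv = MlflowClient().get_model_version(name=name, version=str(version))
    return {"version": mv.version, "stage": mv.current_stage, "run_id": mv.run_id}


print(registry_model_uri("WeatherForecastModel", version=5))
print(registry_model_uri("WeatherForecastModel", stage="Production"))
```

The run_id returned by describe_version links a registered version back to its tracking run, which is where the parameters and metrics live.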
Model Registry Workflow UI
[UI screenshots: model developer, reviewer, downstream users (automated jobs, REST serving)]
Model Registry Workflow API
MODEL DEVELOPER: register a logged model
mlflow.register_model(model_uri, "WeatherForecastModel")

REVIEWERS, CI/CD TOOLS: transition a version's stage
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(name="WeatherForecastModel",
                                      version=5,
                                      stage="Production")

DOWNSTREAM USERS, AUTOMATED JOBS, REST SERVING: load the production model
model_uri = "models:/{model_name}/production".format(model_name="WeatherForecastModel")
model_prod = mlflow.sklearn.load_model(model_uri)
model_prod.predict(data)
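The register and transition calls on this slide can be chained into one step. This is a minimal sketch, assuming MLflow is installed with a database-backed tracking store and that the model was logged under a run, so that a runs:/<run_id>/<artifact_path> URI points at it. The helpers runs_uri, normalize_stage and register_and_promote are hypothetical. The stage names follow the four stages listed in this talk.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def runs_uri(run_id: str, artifact_path: str) -> str:
    """URI of a model logged under a tracking run."""
    return f"runs:/{run_id}/{artifact_path}"


def normalize_stage(stage: str) -> str:
    """Map e.g. 'production' to 'Production'; reject unknown stage names."""
    for valid in VALID_STAGES:
        if stage.lower() == valid.lower():
            return valid
    raise ValueError(f"unknown stage: {stage!r}")


def register_and_promote(run_id: str, artifact_path: str, name: str, stage: str):
    """Register a logged model as a new version, then move it to `stage`."""
    import mlflow  # lazy import: requires MLflow
    from mlflow.tracking import MlflowClient
    target = normalize_stage(stage)  # fail early, before touching the registry
    version = mlflow.register_model(runs_uri(run_id, artifact_path), name)
    MlflowClient().transition_model_version_stage(
        name=name, version=version.version, stage=target)
    return version.version


print(normalize_stage("production"))
```

Checking the stage name before registering avoids leaving a new version behind when the requested stage is misspelled.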
MLflow Backend Registry Stores
1. Entity (Metadata) Store and Models
§ SQLStore (via SQLAlchemy)
▪ PostgreSQL, MySQL, SQLite
▪ Default is mlruns.db file locally
§ Set programmatically for local use
▪ mlflow.set_tracking_uri("sqlite:///mlruns.db")
▪ sqlite3 ./mlruns.db (on local host)
§ Managed MLflow on Databricks
▪ MySQL on AWS and Azure
2. Artifact Store
§ Local Filesystem
▪ mlruns directory
§ S3 backed store
§ Azure Blob storage
§ Google Cloud Storage
§ DBFS artifact repo
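For the local SQLite case, the store can be inspected from Python as well as from the sqlite3 shell. The sketch below uses only the standard library. sqlite_tracking_uri and list_tables are hypothetical helpers, and the demo database contains a made-up table rather than a real MLflow schema.

```python
import os
import sqlite3
import tempfile
from typing import List


def sqlite_tracking_uri(db_path: str) -> str:
    """Build the SQLAlchemy-style URI passed to mlflow.set_tracking_uri()."""
    return "sqlite:///" + db_path


def list_tables(db_path: str) -> List[str]:
    """Return table names in a SQLite file, similar to .tables in the sqlite3 shell."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo on a throwaway database with a made-up table, not a real MLflow schema.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_db)
    conn.execute("CREATE TABLE demo_models (name TEXT)")
    conn.commit()
    conn.close()
    demo_tables = list_tables(demo_db)

print(sqlite_tracking_uri("mlruns.db"), demo_tables)
```

Pointing list_tables at a local mlruns.db would list whatever tables the store has created there.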
MLflow Model Registry Recap
• Central Repository: Uniquely named registered models for discovery across data teams
• Model Registry Workflow: Provides UI and API for registry operations
• Model Versioning: Allows multiple versions of a model in different stages
• Model Stages: Allows stage transitions: none, staging, production, or archived
• CI/CD Integration: Easily load a specific version for testing and inspection
• Model Lineage: Provides model description, lineage and activities
Recap of all parts: What Did We Talk About?
§ Modular Components greatly simplify the ML lifecycle
§ Easy to install & Great Developer experience
§ Develop & Deploy locally; track locally or remotely
§ Available APIs: Python, Java & R (Soon Scala)
§ REST APIs and CLI tools
§ Visualize experiments and compare runs
§ Centrally register and manage model lifecycle
Model Registry Demo
Tutorials: https://github.com/dmatrix/mlflow-workshop-part-3
Thank you!
Q&A
@2twitme
https://www.linkedin.com/in/dmatrix/