Reproducibility

This example highlights one of the main reasons kaos was developed: reproducibility.

This example assumes you are a Data Scientist using kaos with a running endpoint.

Prerequisites

The following steps are required before being able to train the MNIST model.

Initialization

The kaos ML platform is fully functional when initialized with a running endpoint from a System Administrator. See Workflows for more information regarding different kaos personas.

$ kaos init -e <running_endpoint>

Create a workspace

A workspace is required within kaos for organizing multiple environments and code. Refer to Workspaces for additional information.

$ kaos workspace create -n mnist
Successfully set mnist workspace

Load the MNIST template

kaos ships with various templates (including MNIST) that simplify training and serving your own models.

$ kaos template get --name mnist
Successfully loaded mnist template

Deploy an "initial" training job

The training pipeline requires at least a valid source and data bundle. Refer to Train Pipeline for additional information.

$ kaos train deploy -s templates/mnist/model-train \
-d templates/mnist/data
Submitting source bundle: templates/mnist/model-train
Compressing source bundle: 100%|███████████████████████████|
✔ Setting source bundle: /mnist:e23a2
Submitting data bundle: templates/mnist/data
Compressing data bundle: 100%|███████████████████████████|
✔ Setting data bundle: /features:9fd9d
CURRENT TRAINING INPUTS
+------------+-----------------+-------------+
|   Image    |      Data       | Hyperparams |
+------------+-----------------+-------------+
|     ⨂      |        ✔        |      ✗      |
| <building> | /features:9fd9d |             |
+------------+-----------------+-------------+

The prerequisites are complete when the training job state is JOB_SUCCESS. Below is the desired output.

$ kaos train list
+------------------------------------------------------------------------------------------------------------+
|                                                  TRAINING                                                  |
+-----+----------+----------+----------------------------------+-------------------------------+-------------+
| ind | duration | hyperopt |              job_id              |            started            |    state    |
+-----+----------+----------+----------------------------------+-------------------------------+-------------+
|  0  |    71    |  False   | 1ed80bcad9a6465db67652255f904377 | Mon, 29 Jul 2019 12:38:39 GMT | JOB_SUCCESS |
+-----+----------+----------+----------------------------------+-------------------------------+-------------+
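The check for JOB_SUCCESS can also be scripted. The following is a minimal sketch, assuming kaos train list prints a plain-text table shaped like the one above; the helper names parse_ascii_table and all_jobs_succeeded are hypothetical and not part of kaos.

```python
def parse_ascii_table(text):
    """Parse a '|'-delimited ASCII table into a list of row dicts.

    Assumption: the first multi-cell row is the header, and any
    single-cell row (e.g. the TRAINING title) can be ignored.
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 1:
            continue  # skip the title row
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def all_jobs_succeeded(text):
    """Return True if at least one job is listed and every job is JOB_SUCCESS."""
    rows = parse_ascii_table(text)
    return bool(rows) and all(row.get("state") == "JOB_SUCCESS" for row in rows)
```

The text passed in could be captured with subprocess.run(["kaos", "train", "list"], capture_output=True, text=True).stdout, provided the CLI writes the table to standard output.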

1. Reproduce training

kaos allows any user (within the correct workspace) to retrieve a previously trained model. The following approach is the ideal method for inspecting artifacts or collaborative training (i.e. retraining a colleague's model). kaos keeps redeployment simple because it maintains the original bundle structure.

Retrieve train artifacts

Code

kaos is able to retrieve the source bundle used for training with the following command.

$ kaos train get -i 0 -c
Extracting train bundle: /mnist/1ed80bcad9a6465db67652255f904377
Extracting train bundle: 100%|███████████████████████████|

The source bundle (i.e. code) is extracted in the following structure.

$ tree mnist
mnist
└── 1ed80bcad9a6465db67652255f904377
    └── code
        └── mnist:e23a2
            ├── Dockerfile
            └── model
                ├── params.json
                ├── requirements.txt
                ├── train
                └── utils.py

kaos simplifies retraining: run kaos train deploy after kaos train get.
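The retraining step can be sketched as a small helper that locates the extracted source bundle and builds the matching kaos train deploy -s command. This assumes the extracted layout shown above (a single directory under code/ containing a Dockerfile) and that this directory is a valid source bundle; find_source_bundle and redeploy_command are hypothetical names, not part of kaos.

```python
from pathlib import Path


def find_source_bundle(extract_root):
    """Return the one directory under <extract_root>/code that holds a Dockerfile."""
    candidates = sorted(p.parent for p in Path(extract_root).glob("code/*/Dockerfile"))
    if len(candidates) != 1:
        raise ValueError(
            f"expected exactly one source bundle, found {len(candidates)}"
        )
    return candidates[0]


def redeploy_command(extract_root):
    """Build (but do not run) the argument list for redeploying the bundle."""
    bundle = find_source_bundle(extract_root)
    # Only the -s flag shown in this example is used; a data bundle can be
    # supplied separately with -d as in the initial deployment.
    return ["kaos", "train", "deploy", "-s", str(bundle)]
```

For the job above, extract_root would be mnist/1ed80bcad9a6465db67652255f904377, and the returned list can be passed to subprocess.run once any code changes have been made.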

Data

kaos is able to retrieve the data bundle used for training with the following command.

$ kaos train get -i 0 -d
Extracting train bundle: /mnist/1ed80bcad9a6465db67652255f904377
Extracting train bundle: 100%|███████████████████████████|

The data bundle is extracted based on the supplied input format.

$ tree mnist
mnist
└── 1ed80bcad9a6465db67652255f904377
    └── data
        └── features:9fd9d
            ├── test
            │   └── test_mini.csv
            ├── training
            │   └── training_mini.csv
            └── validation
                └── validation_mini.csv

Model

kaos is able to retrieve all output from training with the following command.

$ kaos train get -i 0 -m
Extracting train bundle: /mnist/1ed80bcad9a6465db67652255f904377
Extracting train bundle: 100%|███████████████████████████|

Outputs are extracted according to the template's output directories, /model and /metrics.

$ tree mnist
mnist
└── 1ed80bcad9a6465db67652255f904377
    └── models
        └── b82151
            ├── metrics
            │   └── metrics.json
            └── model
                └── model.pkl
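Once extracted, the metrics can be inspected programmatically. The sketch below assumes the directory layout shown above and that metrics.json is ordinary JSON; the keys it contains are not documented here, so the helper (a hypothetical load_metrics) simply returns whatever is stored.

```python
import json
from pathlib import Path


def load_metrics(job_dir):
    """Return {model_hash: parsed_metrics} for each metrics.json under <job_dir>/models."""
    found = {}
    for path in sorted(Path(job_dir).glob("models/*/metrics/metrics.json")):
        model_hash = path.parts[-3]  # .../models/<model_hash>/metrics/metrics.json
        with path.open() as fh:
            found[model_hash] = json.load(fh)
    return found
```

For the job above, load_metrics("mnist/1ed80bcad9a6465db67652255f904377") would be expected to return a single entry keyed by b82151.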

kaos ensures full reproducibility of any previous training job.

Identify model provenance

Reproducibility of a model also depends on understanding how the output was created, i.e. its provenance. This information is readily available to any user (within the correct workspace). The directed acyclic graph (DAG) associated with a specific training job can be visualized via kaos train provenance. Note that the required model_id is found with kaos train info.

$ kaos train info -i 0
Job ID: 1ed80bcad9a6465db67652255f904377
Process time: 67s
State: JOB_SUCCESS
Available metrics: ['accuracy_train', 'metrics_test', 'metrics_train', 'accuracy_validation', 'metrics_validation', 'accuracy_test']
Page count: 1
Page ID: 0
+-----+--------------------+-----------------------+--------------------+-------------+
| ind |        Code        |          Data         |      Model ID      | Hyperparams |
+-----+--------------------+-----------------------+--------------------+-------------+
|  0  | Author: jfriedman  |   Author: jfriedman   | e23a2_9fd9d:b82151 |     None    |
|     | Path: /mnist:e23a2 | Path: /features:9fd9d |                    |             |
+-----+--------------------+-----------------------+--------------------+-------------+
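In the output above, the Model ID e23a2_9fd9d:b82151 appears to combine the code bundle hash (/mnist:e23a2), the data bundle hash (/features:9fd9d) and the model output hash (models/b82151). The sketch below splits an ID on that assumption. The format is inferred from this one example only and may differ, for instance when hyperparameters are supplied; ModelId and split_model_id are hypothetical names.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    code: str   # hash of the source bundle, e.g. e23a2
    data: str   # hash of the data bundle, e.g. 9fd9d
    model: str  # hash of the trained output, e.g. b82151


def split_model_id(model_id):
    """Split '<code>_<data>:<model>' into its parts (assumed format)."""
    inputs, colon, model = model_id.partition(":")
    code, underscore, data = inputs.partition("_")
    if not (colon and underscore and code and data and model):
        raise ValueError(f"unexpected model_id format: {model_id!r}")
    return ModelId(code, data, model)
```

Splitting the ID makes it easy to cross-check a model against the Path entries in the Code and Data columns.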

In short, running kaos train provenance -m e23a2_9fd9d:b82151 yields the visual overview (below) of the entire training provenance.

mnist/provenance/model-e23a2_9fd9d:b82151.pdf

kaos tracks training provenance to keep all processing fully transparent.

2. Reproduce serving

Deploying a running endpoint requires the model_id from above (e.g. e23a2_9fd9d:b82151). Refer to Serve Pipeline for additional information.

$ kaos serve deploy -s templates/mnist/model-serve \
-m e23a2_9fd9d:b82151
Submitting source bundle: templates/mnist/model-serve
Compressing source bundle: 100%|███████████████████████████|
✔ Adding trained model_id: e23a2_9fd9d:b82151
Compressing Source Bundle: 100%|███████████████████████████|
✔ Setting source bundle: /mnist:ff06b

The status of deploying the endpoint can be queried with kaos serve list.

Retrieve serve artifacts

Code

kaos is able to retrieve the source bundle used for serving with the following command.

$ kaos serve get -i 0
Extracting serve bundle: /mnist/serve-mnist-ae6466
Extracting serve bundle: 100%|███████████████████████████|

The source bundle (i.e. code) is extracted in the following structure.

$ tree mnist
mnist
└── serve-mnist-ae6466
    └── code
        └── mnist:ff06b
            ├── Dockerfile
            └── model
                ├── model.pkl
                ├── model.py
                ├── nginx.conf
                ├── predict.py
                ├── requirements.txt
                ├── serve
                ├── web-requirements.txt
                └── wsgi.py

kaos simplifies updating endpoints: run kaos serve deploy after kaos serve get.
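Since the extracted serve bundle contains a model.pkl, one way to confirm that an endpoint serves the model produced by a given training job is to compare file digests. This is a minimal sketch, assuming the model file is bundled byte-for-byte at serve deployment (not confirmed here); sha256_of and same_file_content are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_content(first, second):
    """Return True if both files have identical content."""
    return sha256_of(first) == sha256_of(second)
```

For this example, the two paths would be mnist/1ed80bcad9a6465db67652255f904377/models/b82151/model/model.pkl (from kaos train get -i 0 -m) and mnist/serve-mnist-ae6466/code/mnist:ff06b/model/model.pkl (from kaos serve get -i 0).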

Identify endpoint provenance

Reproducing a running endpoint is difficult because of the processing chain from training to serving. kaos simplifies this by exposing the full provenance of any running endpoint.

$ kaos serve list
+------------------------------------------------------------------------------------------------------------------------------------+
|                                                              RUNNING                                                               |
+-----+-------------------------------+--------------------+------------------+------------------------------------------+-----------+
| ind |           created_at          |        name        |      state       |                   url                    |    user   |
+-----+-------------------------------+--------------------+------------------+------------------------------------------+-----------+
|  0  | Mon, 29 Jul 2019 13:20:22 GMT | serve-mnist-ae6466 | PIPELINE_RUNNING | localhost/serve-mnist-ae6466/invocations | jfriedman |
+-----+-------------------------------+--------------------+------------------+------------------------------------------+-----------+

Running kaos serve provenance -e serve-mnist-ae6466 yields a visual overview (below) of the entire endpoint provenance (i.e. both training and serving and their respective inputs).

mnist/provenance/serve-mnist-ae6466.pdf

kaos tracks endpoint provenance to keep all processing fully transparent.