6. Candidate Model Evaluation
The model evaluation stage is also always performed "offline". The
"predictive performance" of a model is measured by comparing the predictions
it generates on the testing dataset against the actual data values, using
several key performance indicators and metrics. The "best" model according to
the testing subset is then preferred for generating predictions on future
input data. An evaluator library consisting of several evaluators can be
designed to generate accuracy metrics such as the "ROC curve" or "PR curve",
which can also be stored in a data store against the model. Once more,
the same techniques are applied so that evaluators can be flexibly combined
and switched.
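As an illustration, the sketch below shows how such an evaluator library might compute ROC and PR metrics for a candidate model on the held-out testing dataset. The class and function names, the `evaluate_candidate` helper, and the use of scikit-learn are assumptions made for the example, not details prescribed by the architecture described above.

```python
# Minimal sketch of an evaluator library, assuming scikit-learn is available.
# All names here are illustrative, not prescribed by the text.
from sklearn.metrics import roc_auc_score, average_precision_score

class RocEvaluator:
    name = "roc_auc"
    def evaluate(self, y_true, y_scores):
        return roc_auc_score(y_true, y_scores)

class PrEvaluator:
    name = "pr_auc"
    def evaluate(self, y_true, y_scores):
        return average_precision_score(y_true, y_scores)

def evaluate_candidate(model, X_test, y_test, evaluators):
    """Apply each evaluator to the model's predictions on the testing dataset."""
    y_scores = model.predict_proba(X_test)[:, 1]
    # The resulting metrics could be stored against the model
    # in the Model Candidate repository.
    return {ev.name: ev.evaluate(y_test, y_scores) for ev in evaluators}
```

Combining or switching evaluators then only requires changing the list passed to `evaluate_candidate`, which mirrors the flexible combination described above.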
The "Model Evaluation Service" will request the testing dataset from the
"Data Segregation API" to orchestrate the training and testing of the model.
Moreover, the corresponding evaluators will be applied for the model
originating from the "Model Candidate repository". The findings of the test
will be returned to and saved in the repository. In order to develop the final
machine learning model, an incremental procedure, hyper-parameter
optimization, as well as regularization methods, would be used. The best
model would be deemed as deployable to the production environment and
eventually released in the market. The deployment information will be
published by the "notification service".
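The hyper-parameter optimization and regularization mentioned above can, for instance, be carried out with a grid search over an L2-regularized model. The sketch below assumes scikit-learn, a logistic-regression candidate, and synthetic data standing in for what the "Data Segregation API" would provide; all of these are illustrative choices.

```python
# Illustrative sketch: grid search over the regularization strength of an
# L2-regularized logistic regression (assumed candidate model).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for data obtained from the "Data Segregation API".
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Grid search over the L2 regularization strength (C is its inverse).
search = GridSearchCV(
    LogisticRegression(penalty="l2", max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_   # candidate forwarded to evaluation and, later, deployment
```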
7. Model Deployment
The machine learning model with the highest performance will be marked
for deployment for "offline (asynchronous)" and "online (synchronous)"
prediction generation. It is recommended to deploy multiple models at the
same time so that the transfer from the obsolete model to the current one is
made smoothly; this implies that the services must continue to respond to
prediction requests without any lapse while the new model is being deployed.
Historically, the biggest issue concerning deployment has been that the
coding language required to operate the models is not the same as the coding
language used to build them. It is difficult to operationalize a
"Python or R" based model in production using languages such as "C++, C#
or Java". This can also lead to a major reduction in the performance, in
terms of speed and accuracy, of the model being deployed. This problem can be
dealt with in a few ways, as listed below:
• Rewriting the code in a new language, for example translating
"Python to C#".
• Creating a customized "DSL (Domain Specific Language)" to
define the model.
• Creating a "micro-service" that is accessible through "RESTful
APIs" (see the sketch after this list).
• Implementing an "API first" approach throughout the
deployment.
• Creating containers to hold the code independently.
• Serializing the models and loading them into an "in-
memory key-value storage".
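As one concrete illustration of the micro-service option above, a serialized model could be exposed behind a small REST endpoint. The framework (Flask), the `/predict` route, and the pickle-based loading of a `model.pkl` file are assumptions chosen for brevity, not a prescribed implementation.

```python
# Minimal sketch of exposing a trained model through a RESTful micro-service.
# Flask, the /predict route, and model.pkl are illustrative assumptions.
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:   # model serialized at training time
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # e.g. {"features": [[0.1, 2.3, ...]]}
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

Such a service can then be packaged into a container, which also addresses the language-mismatch problem because the calling system only needs to speak HTTP rather than the language the model was built in.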
In practice, the deployment activities required to put an actual model into
operation are automated through "continuous delivery implementation",
which ensures the packaging of the necessary files, validation of the model
via a robust test suite, and the final deployment into a running
container. An automated build pipeline can be used to execute the tests,
making sure that the short, self-contained and stateless unit tests are
run first. When the model has passed these tests, its quality is
evaluated in larger integration and regression tests. If both
test phases have been cleared, the model is deemed ready for deployment in
the production environment.
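A minimal example of such a quality gate, written as a unit test, might assert that the candidate model meets a metric threshold on the held-out testing dataset before the pipeline promotes it. The pytest-style test, the file names, and the 0.85 ROC-AUC threshold below are illustrative assumptions.

```python
# Illustrative pytest-style quality gate, assuming a pickled candidate model
# and a held-out test set; the 0.85 ROC-AUC threshold is an arbitrary example.
import pickle

import numpy as np
from sklearn.metrics import roc_auc_score

def load_candidate(path="model.pkl"):
    with open(path, "rb") as f:
        return pickle.load(f)

def test_candidate_meets_quality_bar():
    model = load_candidate()
    X_test = np.load("X_test.npy")   # produced by the data segregation step
    y_test = np.load("y_test.npy")
    scores = model.predict_proba(X_test)[:, 1]
    assert roc_auc_score(y_test, scores) >= 0.85
```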