Copyright © 2018 DataScience.US All Rights Reserved.
Train and evaluate custom machine learning models of Watson Developer Cloud
IBM Watson Developer Cloud (WDC) services put the power of machine learning technology in the hands of developers to extract insights from unstructured data (text, speech, and images). To serve developers and enable them to tackle a wide spectrum of applications ranging from general consumer applications to various enterprise-specific applications, the IBM Watson team offers several pre-trained services as well as a rich set of customization capabilities.
Watson Developer Cloud Customization Capabilities
For the pre-trained services, the IBM Watson team has taken on the responsibility of acquiring the right training data, generating trained machine learning (ML) models, and providing out-of-the-box functionality for developers. Natural Language Understanding (NLU), Personality Insights (PI), Tone Analyzer (TA), Speech to Text (STT), Language Translator (LT), and Visual Recognition (VR) are some of the pre-trained WDC services. Developers like these services because they are intuitive, easy to use, require no extra ML training effort, and work well for general-domain applications such as enriching web URLs, tagging images, or analyzing the sentiment of social media posts.
However, for applications that involve specific domains (such as legal or medical) or private enterprise data, developers need to train and deploy custom ML models. To address that need, the WDC services offer several customization capabilities. The Natural Language Classifier (NLC), Watson Conversation, and Visual Recognition services allow developers to train custom ML models by providing example text utterances (NLC and Conversation) or example images (VR) for a defined set of classes (or intents). The Watson Speech to Text (STT) service has a beta offering for training custom language models. Furthermore, for custom entity and relation extraction from text, IBM Watson offers Watson Knowledge Studio (WKS), a SaaS solution designed to enable Subject Matter Experts (SMEs) to train custom statistical machine learning models that extract domain-specific entities and relations from text. Once a custom model is trained with WKS, it can be deployed to NLU or Watson Discovery Service, which developers can then call at runtime to extract relevant entities and relations.
Performance Evaluation of Trained ML Models
When dealing with custom models, developers commonly ask “how much training should I do?” and “when is my model trained well enough to release?” To help address these questions and enable our partners and clients to exercise the full power of WDC customization capabilities, we’ve published WDC Jupyter notebooks that report commonly used machine learning performance metrics for judging the quality of a trained model. Specifically, the notebooks report accuracy, precision, recall, F1-score, and the confusion matrix. If you’re interested in more details on these metrics, please consult the “Is your chatbot ready for primetime?” blog.
These WDC Jupyter notebooks for NLC, Conversation and Visual Recognition help developers evaluate how well their trained models are performing before releasing their application updates to production.
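To make the metrics named above concrete, here is a minimal sketch in plain Python of how accuracy, per-class precision, recall, F1-score, and a confusion matrix can be derived from a list of correct labels and a list of predicted labels. This is not the code from the WDC notebooks; the function name `evaluate` and the intent labels (`billing`, `sales`, `support`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return accuracy, per-class precision/recall/F1, and a confusion matrix."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")

    labels = sorted(set(true_labels) | set(predicted_labels))
    # confusion[(actual, predicted)] counts how often each pairing occurred.
    confusion = Counter(zip(true_labels, predicted_labels))

    correct = sum(confusion[(label, label)] for label in labels)
    accuracy = correct / len(true_labels) if true_labels else 0.0

    per_class = {}
    for label in labels:
        true_positives = confusion[(label, label)]
        predicted_total = sum(confusion[(actual, label)] for actual in labels)
        actual_total = sum(confusion[(label, predicted)] for predicted in labels)
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / actual_total if actual_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    # Rows are actual labels, columns are predicted labels, both in `labels` order.
    matrix = [[confusion[(actual, predicted)] for predicted in labels]
              for actual in labels]
    return {"accuracy": accuracy, "labels": labels,
            "per_class": per_class, "confusion_matrix": matrix}


# Hypothetical test-set labels and model predictions (illustrative data only).
true_labels = ["billing", "billing", "support", "support", "support", "sales"]
predicted_labels = ["billing", "support", "support", "support", "sales", "sales"]

report = evaluate(true_labels, predicted_labels)
print("accuracy:", round(report["accuracy"], 3))
for label, scores in report["per_class"].items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
print("confusion matrix (rows = actual, columns = predicted):")
for label, row in zip(report["labels"], report["confusion_matrix"]):
    print(label, row)
```

In practice the predicted labels would come from sending each test input to the trained model and keeping the top-scoring class. A library such as scikit-learn offers equivalent metric functions; the hand-rolled version is used here only to show what each number means.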
To use these notebooks, you need a custom trained model as well as a test dataset. The generally recommended approach is to start with the ground truth, a dataset that consists of example input data and the corresponding correct labels. For NLC and Conversation, the input data consists of example text utterances, and the label is the intent (or class) that best represents each utterance. For Visual Recognition, the input data consists of example images, and the label is the class that best represents each image.
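As an illustration of what such a ground truth might look like for a text classifier, the sketch below represents it as a list of (utterance, intent) pairs and holds out part of it as a test dataset. The text above does not specify how the test set is obtained, so the per-label 80/20 holdout, the function name `split_ground_truth`, and the sample utterances and intents are all assumptions made for this example.

```python
import random
from collections import defaultdict


def split_ground_truth(examples, test_fraction=0.2, seed=42):
    """Split (input, label) pairs into train and test sets, label by label.

    Splitting per label keeps every class represented in the test set,
    provided the class has at least two examples.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = list(by_label[label])
        rng.shuffle(items)
        if len(items) > 1:
            n_test = max(1, round(len(items) * test_fraction))
        else:
            n_test = 0  # a single example stays in the training set
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Hypothetical ground truth: example utterances with their correct intent.
ground_truth = [
    ("I was charged twice this month", "billing"),
    ("Why is my invoice higher than usual", "billing"),
    ("Can I get a refund for last month", "billing"),
    ("Update my credit card details", "billing"),
    ("Where can I download my receipt", "billing"),
    ("The app crashes when I log in", "support"),
    ("I cannot reset my password", "support"),
    ("My device will not connect to wifi", "support"),
    ("The screen stays black after the update", "support"),
    ("How do I reinstall the software", "support"),
]

train_set, test_set = split_ground_truth(ground_truth)
print(len(train_set), "training examples,", len(test_set), "test examples")
```

The training portion would be used to train the custom model and the held-out portion to evaluate it, so that the reported metrics reflect inputs the model has not seen. For Visual Recognition, the same structure would apply with image file paths in place of utterances.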