How to Prepare for an Automated Future: 7 Steps to Machine Learning
The increasingly digital economy requires boards and executives to have a solid understanding of the rapidly changing digital landscape. Naturally, artificial intelligence (AI) is an important stakeholder. Those organisations that want to prepare for an automated future should have a thorough understanding of AI. However, AI is an umbrella term that covers multiple disciplines, each affecting the business in a slightly different way.
When we look at artificial intelligence, it can be divided into three different domains:
- Robotics, which deals with the physical world and it can directly interact with humans. Robotics can be used to improve our work in various ways. Including Ford’s exoskeleton or Boston Dynamics’ helping robots.
- Cognitive systems, which deal with the human world. A great example of a cognitive system as part of AI are chatbots. Chatbots are a very tangible example where humans and machines work together to achieve a goal. A chatbot is a communication interface that helps individuals and organisations have conversations.
- Machine learning, which deals with the information world. Machines use data to learn, and machine learning aims to derive meaning from that data. Machine learning uses statistical methods to enable machines to improve with machines. A subset of machine learning is deep learning, which enables multi-layer neural networks.
Artificial intelligence consists of the seamless integration of robotics, cognitive systems and machine learning.
Figure 1: Artificial Intelligence – adapted from Goel & Davies, 2019
7 Steps to Machine Learning
Let’s dive a little bit deeper into one of these domains: machine learning. The objective of machine learning is to derive meaning from data. Therefore, data is the key to unlock machine learning. There are seven steps to machine learning, and each step revolves around data:
Figure 2: 7 Steps to Machine Learning
1. Data collection
Machine learning requires training data, a lot of it (either labelled, meaning supervised learning or not labelled, meaning unsupervised learning). Data collection, or datafication, is also the first step in my new D2 + A2 model.
2. Data preparation
Raw data alone is not very useful. The data needs to be prepared, normalized, de-duplicated and errors and bias need to be removed. Visualisation of the data can be used to look for patterns and outliers to see if the right data has been collected or if data is missing.
3. Choosing a model
The third step consists of selecting the right model. There are many models that can be used for many different purposes. Upon selecting the model, you need to make sure that the model meets the business goal. In addition, you should know how much preparation the model requires, how accurate it is and how scalable the model is. A more complex model does not always constitute a better model. Commonly used machine learning algorithms include linear regression, logistic regression, decision trees, K-means, principal component analysis (PCA), Support Vector Machines (SVM), Naïve Bayes, Random Forest and Neural Networks.
Training your model is the bulk of machine learning. The objective is to use your training data and incrementally improve the predictions of the model. Each cycle of updating the weights and biases is one training step. In supervised machine learning, the model is built using labelled sample data, while unsupervised machine learning tries to draw inferences from non-labelled data (without references to known or labelled outcomes).
After training the model comes evaluating the model. This entails testing the machine learning against an unused control dataset to see how it performs. This might be representative of how the model works in the real world, but this does not have to be the case. The larger the number of variables in the real world, the bigger to training and test data should be.
6. Parameter tuning
After evaluating your model, you should test the originally set parameters to improve the AI. Increasing the number of training cycles can lead to more accurate results. However, you should define when a model is good enough as otherwise, you will continue to tweak the model. This is an experimental process.
Once you have gone through the process of collecting data, preparing the data, selecting the model, training and evaluating the model and tuning the parameters, it is time to answer questions using predictions. These can be all kinds of predictions, ranging from image recognition to semantics to predictive analytics.
Machine learning allows software to become accurate in predicting outcomes. It will augment many, if not all, business processes in the coming years. As such, machine learning will become an integral part of the automated organisation of tomorrow. Thanks to increasingly faster hardware, we will see more powerful models offering better predictions.
Unfortunately, the challenge of biased models thanks to biased data and biased data scientists is never far away. Therefore, for organisations to truly benefit from AI, they should ensure that their models and data are bias-free, well-trained and evaluated and properly tuned. Only then, will organisations really benefit from machine learning.