The two biggest barriers to the use of machine learning (both classical and deep learning) are skills and computing resources. You can solve the second problem by throwing money at it, either for the purchase of hardware (such as computers with high-end GPUs) or for the rental of compute resources in the cloud (such as instances with attached GPUs, TPUs, and FPGAs).
On the other hand, solving the skills problem is harder. Data scientists often command hefty salaries and may still be hard to recruit. Google was able to train many of its employees on its own framework, but most companies barely have people skilled enough to build machine learning and deep learning models themselves, much less teach others how.
What is AutoML?
Automated machine learning, or AutoML, aims to reduce or eliminate the need for skilled data scientists to build machine learning and deep learning models. Instead, an AutoML system allows you to provide the labeled training data as input and receive an optimized model as output.
There are several ways of going about this. One approach is for the software to simply train every kind of model on the data and pick the one that works best. A refinement of this would be for it to build one or more ensemble models that combine the other models, which sometimes (but not always) gives better results.
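The "train everything and keep the winner" idea can be sketched in a few lines. This is a minimal illustration that assumes scikit-learn and its bundled iris data set; the four candidate models and the soft-voting ensemble are arbitrary choices for the example, not what any particular AutoML product does.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate model families (an assumed, deliberately short list).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Score every candidate with 5-fold cross-validation.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Refinement: also score an ensemble that averages the candidates' predicted
# class probabilities. It may or may not beat the best single model.
ensemble = VotingClassifier(estimators=list(candidates.items()), voting="soft")
scores["voting_ensemble"] = cross_val_score(ensemble, X, y, cv=5).mean()

# Keep whichever entry has the highest mean accuracy.
best_name = max(scores, key=scores.get)
print(best_name, round(scores[best_name], 3))
```

A real system would sweep far more model families and use a held-out test set for the final comparison.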
A second technique is to optimize the hyperparameters (explained below) of the best model or models to train an even better model. Feature engineering (also explained below) is a valuable addition to any model training. One way of de-skilling deep learning is to use transfer learning, essentially customizing a well-trained general model for specific data.
What is hyperparameter optimization?
All machine learning models have parameters, meaning the weights for each variable or feature in the model. In neural networks, these are usually determined by back-propagation of the errors, plus iteration under the control of an optimizer such as stochastic gradient descent.
Hyperparameters, by contrast, are settings such as the learning rate that control the training process and are set before training begins rather than learned from the data. Hyperparameter tuning or hyperparameter optimization (HPO) is an automatic way of sweeping or searching through one or more of the hyperparameters of a model to find the set that results in the best trained model. This can be time-consuming, since you need to train the model again (the inner loop) for each set of hyperparameter values in the sweep (the outer loop). If you train many models in parallel, you can reduce the time required at the expense of using more hardware.
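The outer and inner loops can be written out explicitly. The sketch below assumes scikit-learn, a support vector classifier, and a small hand-picked grid for its `C` and `gamma` hyperparameters; all of those choices are illustrative.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hyperparameter values to sweep (an assumed grid, chosen only for illustration).
grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}

best_score, best_params = -1.0, None

# Outer loop: one iteration per combination of hyperparameter values.
for C, gamma in product(grid["C"], grid["gamma"]):
    # Inner loop: retrain and evaluate the model (here, 5 times via cross-validation).
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5).mean()
    if score > best_score:
        best_score, best_params = score, {"C": C, "gamma": gamma}

print(best_params, round(best_score, 3))
```

scikit-learn's `GridSearchCV` wraps the same sweep, and its `n_jobs` parameter trains the combinations in parallel, which is the time-for-hardware trade described above.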
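What is feature engineering?
One feature engineering technique is principal component analysis (PCA), which converts correlated variables into a set of linearly uncorrelated variables.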
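Converting correlated variables into linearly uncorrelated ones can be shown with a small principal component analysis done by hand. This sketch assumes PCA is the technique meant here, uses only NumPy, and runs on synthetic data invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic features; x2 is mostly a scaled copy of x1, so they are correlated.
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)
X = np.column_stack([x1, x2])

# Center the data, then find the eigenvectors of its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Order the components from largest to smallest explained variance.
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order]

# Project the data onto the principal components.
Z = Xc @ components

corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(Z, rowvar=False)[0, 1]
print(corr_before, corr_after)
```

The original features are almost perfectly correlated, while the projected columns have a correlation that is zero up to floating-point error.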
To use categorical data for machine classification, you need to encode the text labels into another form. There are two common encodings.
One is label encoding, which means that each text label value is replaced with a number. The other is one-hot encoding, which means that each text label value is turned into a column with a binary value (1 or 0). Most machine learning frameworks have functions that do the conversion for you. In general, one-hot encoding is preferred, as label encoding can sometimes confuse the machine learning algorithm into thinking that the encoded column is ordered.
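The two encodings look like this in practice. The example assumes pandas and a made-up `color` column; other frameworks offer equivalent functions.

```python
import pandas as pd

# A made-up categorical column.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Label encoding: each distinct label becomes an integer code.
# pandas assigns codes in sorted label order: blue=0, green=1, red=2.
df["color_label"] = df["color"].astype("category").cat.codes

# One-hot encoding: each distinct label becomes its own 0/1 column.
one_hot = pd.get_dummies(df["color"], prefix="color", dtype=int)

print(df)
print(one_hot)
```

The integer codes imply an order (blue < green < red) that the colors do not actually have, which is the source of the confusion mentioned above. The one-hot columns carry no such order.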
Some of the transformations that people use to construct new features or reduce the dimensionality of feature vectors are simple. For example, subtract Year of Birth from Year of Death and you construct Age at Death, which is a prime independent variable for lifetime and mortality analysis. In other cases, feature construction may not be so obvious.
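The Age at Death construction is a one-line column operation. The sketch assumes pandas; the records and column names are hypothetical.

```python
import pandas as pd

# Hypothetical records, invented for the example.
people = pd.DataFrame(
    {
        "person_id": [1, 2, 3],
        "year_of_birth": [1900, 1925, 1950],
        "year_of_death": [1980, 1990, 2020],
    }
)

# Construct the new feature from the two existing ones.
people["age_at_death"] = people["year_of_death"] - people["year_of_birth"]

print(people)
```

Because only years are used, the result can be off by one from the true age, depending on whether the death fell before or after the birthday in that year.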
What is transfer learning?
Transfer learning is sometimes called custom machine learning, and sometimes called AutoML (mostly by Google). Rather than starting from scratch when training models from your data, Google Cloud AutoML implements automatic deep transfer learning (meaning that it starts from an existing deep neural network trained on other data) and neural architecture search (meaning that it finds the right combination of extra network layers) for language pair translation, natural language classification, and image classification.
That’s a different process than what’s usually meant by AutoML, and it doesn’t cover as many use cases. On the other hand, if you need a customized deep learning model in a supported area, transfer learning will often produce a superior model.
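The core idea of transfer learning, reusing what a network learned on one data set as the starting point for another, can be shown at toy scale. The sketch below is not how any cloud service implements it. It assumes scikit-learn, pretrains a small network on digits 0 to 4, freezes its hidden layer, and trains only a new output layer on 100 examples of digits 5 to 9.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this data set range from 0 to 16

# Source task: digits 0-4. Target task: digits 5-9.
is_source = y < 5
X_src, y_src = X[is_source], y[is_source]
X_tgt, y_tgt = X[~is_source], y[~is_source]

# Step 1: train a general model on the source data.
base = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
base.fit(X_src, y_src)


def hidden_features(model, inputs):
    """Apply the frozen hidden layer (ReLU is MLPClassifier's default activation)."""
    return np.maximum(0.0, inputs @ model.coefs_[0] + model.intercepts_[0])


# Step 2: pretend the target task has only 100 labeled examples.
X_train, X_test, y_train, y_test = train_test_split(
    X_tgt, y_tgt, train_size=100, random_state=0, stratify=y_tgt
)

# Step 3: train only a new output layer on top of the reused features.
head = LogisticRegression(max_iter=1000)
head.fit(hidden_features(base, X_train), y_train)

accuracy = head.score(hidden_features(base, X_test), y_test)
print(round(accuracy, 3))
```

Production transfer learning typically starts from a much larger pretrained network and may also fine-tune some of the reused layers instead of keeping them all frozen.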
There are many implementations of AutoML that you can try. Some are paid services, and some are free source code. The lists below are by no means complete or final.
All of the big three cloud services have some kind of AutoML. One does hyperparameter tuning but doesn’t automatically try multiple models or perform feature engineering. Another has both AutoML, which sweeps through features and algorithms, and hyperparameter tuning, which you typically run on the best algorithm chosen by AutoML. The third, Google Cloud AutoML, as I discussed earlier, is deep transfer learning for language pair translation, natural language classification, and image classification.
A number of smaller companies offer AutoML services as well. One, which claims to have invented AutoML, has a strong reputation in the market. Another has a tiny market share and a mediocre UI, but it has strong feature engineering capabilities and covers many enterprise use cases. A third, which I reviewed in 2017, can help a data scientist turn out models like a Kaggle master, doing feature engineering, algorithm sweeps, and hyperparameter optimization in a unified way.
AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. Auto-Keras is an open source software library for automated machine learning, developed at Texas A&M, that provides functions to automatically search for architecture and hyperparameters of deep learning models. NNI (Neural Network Intelligence) is a toolkit from Microsoft to help users design and tune machine learning models (e.g., hyperparameters), neural network architectures, or a complex system’s parameters in an efficient and automatic way.
You can find additional AutoML projects and a fairly complete and current list of AutoML resources on GitHub.