What is Machine Learning? | IBM

Machine learning focuses on applications that learn from experience and improve their decision-making or predictive accuracy over time.

Machine learning is a branch of artificial intelligence (AI) focused on building applications that learn from data and improve their accuracy over time without being explicitly programmed to do so.

In data science, an algorithm is a sequence of statistical processing steps. In machine learning, algorithms are 'trained' to find patterns and features in massive amounts of data in order to make decisions and predictions based on new data. The better the algorithm, the more accurate the decisions and predictions will become as it processes more data.

Today, examples of machine learning are all around us. Digital assistants search the web and play music in response to our voice commands. Websites recommend products and movies and songs based on what we bought, watched, or listened to before. Robots vacuum our floors while we do . . . something better with our time. Spam detectors stop unwanted emails from reaching our inboxes. Medical image analysis systems help doctors spot tumors they might have missed. And the first self-driving cars are hitting the road.

We can expect more. As big data keeps getting bigger, as computing becomes more powerful and affordable, and as data scientists keep developing more capable algorithms, machine learning will drive greater and greater efficiency in our personal and work lives.

There are four basic steps for building a machine learning application (or model). These are typically performed by data scientists working closely with the business professionals for whom the model is being developed.

Training data is a data set representative of the data the machine learning model will ingest to solve the problem it's designed to solve. In some cases, the training data is labeled data, tagged to call out features and classifications the model will need to identify. Other data is unlabeled, and the model will need to extract those features and assign classifications on its own.

In either case, the training data needs to be properly prepared: randomized, de-duped, and checked for imbalances or biases that could impact the training. It should also be divided into two subsets: the training subset, which will be used to train the application, and the evaluation subset, used to test and refine it.
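The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the helper names `prepare_split` and `label_counts`, the 80/20 split, and the fixed seed are assumptions made for the example, not something the article prescribes.

```python
import random
from collections import Counter


def prepare_split(rows, eval_fraction=0.2, seed=0):
    """De-duplicate, shuffle, and split (features, label) rows.

    Hypothetical helper: the name, the 80/20 default, and the seed
    are illustrative assumptions.
    """
    # De-dupe while keeping first-seen order (rows must be hashable).
    unique_rows = list(dict.fromkeys(rows))
    # Randomize so the split does not depend on collection order.
    rng = random.Random(seed)
    rng.shuffle(unique_rows)
    # Hold out the tail of the shuffled rows as the evaluation subset.
    n_eval = max(1, int(len(unique_rows) * eval_fraction))
    return unique_rows[:-n_eval], unique_rows[-n_eval:]


def label_counts(rows):
    """Count rows per label so that class imbalance is easy to spot."""
    return Counter(label for _, label in rows)


# Made-up rows: 20 distinct examples plus 5 injected duplicates.
rows = [((i, i % 3), "spam" if i % 4 == 0 else "ham") for i in range(20)]
rows += rows[:5]

train, evaluation = prepare_split(rows)
print(len(train), len(evaluation))
print(label_counts(train))
```

Printing the label counts is a quick, informal imbalance check; a heavily skewed count would suggest collecting more examples of the rarer class before training.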

Again, an algorithm is a set of statistical processing steps. The type of algorithm depends on the type (labeled or unlabeled) and amount of data in the training data set and on the type of problem to be solved.

Common types of machine learning algorithms for use with labeled data include the following:

Algorithms for use with unlabeled data include the following:

Training the algorithm is an iterative process: it involves running variables through the algorithm, comparing the output with the results it should have produced, adjusting weights and biases within the algorithm that might yield a more accurate result, and running the variables again until the algorithm returns the correct result most of the time. The resulting trained, accurate algorithm is the machine learning model. This is an important distinction to note, because 'algorithm' and 'model' are often incorrectly used interchangeably, even by machine learning mavens.
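As one concrete instance of this loop, the sketch below fits a straight line by gradient descent. The choice of a linear model, mean squared error, the learning rate, and the function name `train_linear` are all assumptions for illustration; the article does not name a specific algorithm here.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Run the inputs through the current version of the model.
        predictions = [w * x + b for x in xs]
        # Compare the output with the results it should have produced.
        errors = [p - y for p, y in zip(predictions, ys)]
        # Adjust the weight and bias in the direction that reduces the error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

w, b = train_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

Here the loop is the algorithm, and the returned pair `(w, b)` is the trained model: on this toy data the values settle very close to the slope 2 and intercept 1 used to generate it.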

The final step is to use the model with new data and, in the best case, for it to improve in accuracy and effectiveness over time. Where the new data comes from will depend on the problem being solved. For example, a machine learning model designed to identify spam will ingest email messages, whereas a machine learning model that drives a robot vacuum cleaner will ingest data resulting from real-world interaction with moved furniture or new objects in the room.

Machine learning methods (also called machine learning styles) fall into three primary categories: supervised, unsupervised, and semi-supervised learning.

Supervised machine learning trains itself on a labeled dataset. That is, the data is labeled with information that the machine learning model is being built to determine and that may even be classified in ways the model is supposed to classify data. For example, a computer vision model designed to identify purebred German Shepherd dogs might be trained on a data set of various labeled dog images.

Supervised machine learning requires less training data than other machine learning methods and makes training easier because the results of the model can be compared to actual labeled results. However, properly labeled data is expensive to prepare, and there's the danger of overfitting, or creating a model so closely tied and biased to the training data that it doesn't handle variations in new data accurately.
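To make the idea of comparing model results against labeled results concrete, here is a small sketch of a nearest-centroid classifier. The classifier choice, the function names, and the made-up height and weight measurements are assumptions for illustration only; a real computer vision model would work on images rather than two numbers.

```python
def fit_centroids(samples):
    """Compute the mean feature vector for each label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == label for f, label in samples)
    return hits / len(samples)


# Made-up (height_cm, weight_kg) measurements with breed labels.
train = [((60, 30), "shepherd"), ((63, 34), "shepherd"), ((65, 38), "shepherd"),
         ((25, 6), "terrier"), ((28, 7), "terrier"), ((30, 9), "terrier")]
evaluation = [((62, 33), "shepherd"), ((27, 8), "terrier"), ((58, 29), "shepherd")]

centroids = fit_centroids(train)
print(accuracy(centroids, train), accuracy(centroids, evaluation))
```

Scoring the model on both subsets is a simple way to watch for overfitting: accuracy that is high on the training subset but much lower on the evaluation subset suggests the model is too closely tied to its training data.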


Unsupervised machine learning ingests unlabeled data (lots and lots of it) and uses algorithms to extract meaningful features needed to label, sort, and classify the data in real time, without human intervention. Unsupervised learning is less about automating decisions and predictions, and more about identifying patterns and relationships in data that humans would miss. Take spam detection, for example: people generate more email than a team of data scientists could ever hope to label or classify in their lifetimes. An unsupervised learning algorithm can analyze huge volumes of emails and uncover the features and patterns that indicate spam (and keep getting better at flagging spam over time).
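One common way to find structure in unlabeled data is clustering. The sketch below runs a basic k-means (Lloyd's algorithm) on one made-up feature, the number of links per email. The feature, the data, the function name `k_means`, and the deterministic starting centers are assumptions for illustration; the article does not say which algorithm a spam detector would use.

```python
def k_means(points, k, iterations=10):
    """Group 1-D points into k clusters with Lloyd's algorithm."""
    # Deterministic start: take the midpoint of each of k equal slices
    # of the sorted data (an illustrative choice, not a standard one).
    ordered = sorted(points)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Made-up feature: number of links found in each of eight emails.
links_per_email = [1, 15, 0, 18, 2, 14, 1, 17]

centers, clusters = k_means(links_per_email, k=2)
print(centers)
print(clusters)
```

No labels were supplied, yet the emails separate into a low-link group and a high-link group. Deciding which group, if either, corresponds to spam still takes outside knowledge.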


Semi-supervised learning offers a happy medium between supervised and unsupervised learning. During training, it uses a smaller labeled dataset to guide classification and feature extraction from a larger, unlabeled data set. Semi-supervised learning can solve the problem of having not enough labeled data (or not being able to afford to label enough data) to train a supervised learning algorithm.
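A simple way to picture this is self-training with pseudo-labels: fit on the few labeled examples, use that fit to label the unlabeled ones, then refit on everything. The sketch below does this with per-label means on a single made-up feature. Self-training is only one semi-supervised technique, and the function names and data are assumptions for illustration.

```python
def centroids_of(labeled):
    """Mean value per label for 1-D (value, label) pairs."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def self_train(labeled, unlabeled, rounds=3):
    """Self-training: repeatedly pseudo-label unlabeled data and refit."""
    centers = centroids_of(labeled)
    for _ in range(rounds):
        # Pseudo-label every unlabeled value with its nearest centroid.
        pseudo = [(v, min(centers, key=lambda lab: abs(v - centers[lab])))
                  for v in unlabeled]
        # Refit on the true labels plus the pseudo-labels.
        centers = centroids_of(labeled + pseudo)
    return centers


# Two hand-labeled examples guide six unlabeled ones (made-up values).
labeled = [(1.0, "ham"), (9.0, "spam")]
unlabeled = [0.0, 2.0, 3.0, 10.0, 11.0, 12.0]

centers = self_train(labeled, unlabeled)
print(centers)
```

With only two labeled points the initial centers sit at 1.0 and 9.0; after the unlabeled values are folded in, they shift to reflect the larger data set.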

Reinforcement machine learning is a behavioral machine learning model that is similar to supervised learning, but the algorithm isn't trained using sample data. This model learns as it goes by using trial and error. A sequence of successful outcomes will be reinforced to develop the best recommendation or policy for a given problem.

The IBM Watson system that won the Jeopardy! challenge in 2011 makes a good example. The system used reinforcement learning to decide whether to attempt an answer (or question, as it were), which square to select on the board, and how much to wager, especially on Daily Doubles.
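The trial-and-error idea can be illustrated with a generic epsilon-greedy strategy on a multi-armed bandit. This is a textbook toy problem and not a description of how Watson worked; the payout probabilities, the exploration rate, and the function name `run_bandit` are assumptions for the example.

```python
import random


def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error on a multi-armed bandit."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payout_probs)  # estimated reward per action
    pulls = [0] * len(payout_probs)        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(payout_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(estimates)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < payout_probs[action] else 0.0
        pulls[action] += 1
        # Incremental mean: successful outcomes raise the action's estimate.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls


# Hypothetical environment: three actions with hidden success rates.
estimates, pulls = run_bandit([0.2, 0.5, 0.8])
print([round(e, 2) for e in estimates])
print(pulls)
```

The learner is never told which action is best. It only sees rewards, and the action that pays off most often ends up being chosen most of the time.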


Deep learning is a subset of machine learning (all deep learning is machine learning, but not all machine learning is deep learning). Deep learning algorithms define an artificial neural network that is designed to learn the way the human brain learns. Deep learning models require large amounts of data that pass through multiple layers of calculations, applying weights and biases in each successive layer to continually adjust and improve the outcomes.
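The phrase "multiple layers of calculations, applying weights and biases in each successive layer" can be made concrete with a forward pass through a tiny network. The sketch below uses hand-picked, untrained parameters and a 2-2-1 layout chosen purely for illustration; the ReLU and sigmoid activations are likewise assumptions, and real deep learning models are far larger and learn their parameters from data.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias for each unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Set negative values to zero, leave positive values unchanged."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs, layers):
    """Pass inputs through each layer in turn, with ReLU between layers."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])


# Hand-picked (not trained) parameters for a 2-2-1 network.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer: 2 units
    ([[2.0, 2.0]], [-1.0]),                    # output layer: 1 unit
]

print(forward([1.0, 0.0], layers))
print(forward([1.0, 1.0], layers))
```

With these parameters the network scores inputs whose two values differ higher than inputs whose values match. Training would consist of adjusting the numbers in `layers` until the outputs agree with the desired results.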

Deep learning models are typically unsupervised or semi-supervised. Reinforcement learning models can also be deep learning models. Certain types of deep learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), are driving progress in areas such as computer vision, natural language processing (including speech recognition), and self-driving cars.

See the blog post "AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What's the Difference?" for a closer look at how the different concepts relate.


As noted at the outset, machine learning is everywhere. Here are just a few examples of machine learning you might encounter every day:

IBM Watson Machine Learning supports the machine learning lifecycle end to end. It is available in a range of offerings that let you build machine learning models wherever your data lives and deploy them anywhere in your hybrid multicloud environment.

IBM Watson Machine Learning on IBM Cloud Pak for Data helps enterprise data science and AI teams speed AI development and deployment anywhere, on a cloud native data and AI platform. IBM Watson Machine Learning Cloud, a managed service in the IBM Cloud environment, is the fastest way to move models from experimentation on the desktop to deployment for production workloads. For smaller teams looking to scale machine learning deployments, IBM Watson Machine Learning Server offers simple installation on any private or public cloud.

To get started, sign up for an IBMid and create your IBM Cloud account.
