How is the Expectation-Maximization algorithm used in machine learning? – Analytics India Magazine

The expectation-maximization (EM) algorithm is an elegant algorithm that maximizes the likelihood function for problems with latent, or hidden, variables. As the name suggests, it alternates between two steps: expectation and maximization. This article explains the math behind the EM algorithm and walks through an implementation. The topics covered are latent variables, the EM algorithm itself, the Gaussian Mixture Model, and an implementation in Python.

Let's look at how the combination of expectation and maximization helps decide the number of clusters to form. Before that, we need to understand the concept of a latent variable.

A latent variable is a random variable that cannot be observed in either the training or the test phase. Such variables cannot be measured directly on a quantitative scale. There are two reasons to use latent variables:

The latent variable is treated as the direct cause of all the parameters. The resulting model is much simpler to work with and just as efficient, without any loss of flexibility. The one drawback of latent variables is that these models are harder to train.

The observed variables arise from a general form of probability distribution. For variables that aren't directly observable, also known as latent variables, the expectation-maximization algorithm estimates their values from the values of the observed variables. This algorithm is the building block of many unsupervised clustering algorithms in machine learning. It has two major computational steps, expectation and maximization:

A high-level view of how the EM algorithm works is given below.
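As a rough illustration of the two steps, here is a minimal sketch of EM for a one-dimensional mixture of two Gaussians, written in plain Python. The function names (`normal_pdf`, `em_two_gaussians`), the initial guesses, the iteration count, and the synthetic data are all assumptions made for this example and are not taken from the article.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    # Crude initial guesses (an assumption of this sketch): place the two
    # means at the smallest and largest observations.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    log_likelihoods = []

    for _ in range(n_iter):
        # E-step: with the parameters held fixed, compute each component's
        # responsibility (posterior probability) for every data point.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([j / total for j in joint])
        log_likelihoods.append(log_lik)

        # M-step: with the responsibilities held fixed, re-estimate the
        # weight, mean and variance of each component.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            variances[k] = sum(r[k] * (x - means[k]) ** 2
                               for r, x in zip(resp, data)) / n_k
            variances[k] = max(variances[k], 1e-6)  # guard against collapse

    return weights, means, variances, log_likelihoods


# Synthetic data: two well-separated clusters (made up for illustration).
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(200)]
        + [rng.gauss(6.0, 1.0) for _ in range(200)])

weights, means, variances, log_likelihoods = em_two_gaussians(data)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

The log-likelihood recorded at each iteration should not decrease from one pass to the next, which is a convenient sanity check when experimenting with a sketch like this.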

We now have an understanding of how the EM algorithm works, but to implement it in Python we need to understand the model that uses it to form clusters. Let's talk about the Gaussian Mixture Model.

The Gaussian Mixture Model (GMM) is an important concept in machine learning that relies on expectation-maximization. A Gaussian mixture is composed of several Gaussians, each identified by an index k that runs from 1 to K, where K is the number of clusters to be formed. For each Gaussian k in the mixture, the following parameters are present:
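In the standard textbook formulation, each Gaussian k carries a mixing weight, a mean, and a variance (a covariance matrix in more than one dimension), and the mixture density is the weighted sum of the component densities. The sketch below shows this for the one-dimensional case. The `Component` class, the `mixture_pdf` function, and the numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class Component:
    """Parameters of one Gaussian k in a 1-D mixture (hypothetical helper)."""
    weight: float  # mixing probability of component k; the weights sum to 1
    mean: float    # centre of the component
    std: float     # spread (square root of the variance)


def mixture_pdf(x, components):
    """Mixture density: the weighted sum of the component densities."""
    return sum(c.weight * NormalDist(c.mean, c.std).pdf(x) for c in components)


# Illustrative values only; they are not taken from the article's data.
components = [Component(0.3, 0.0, 1.0), Component(0.7, 5.0, 2.0)]
density_at_zero = mixture_pdf(0.0, components)
print(density_at_zero)
```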

The plot above shows the Gaussian distribution for data with a mean of 4 and a variance of 0.25, in other words a normal distribution. Through an iterative process, the model arrives at the final number of clusters with the help of these parameters, which determine cluster stability.
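The distribution described here can be reproduced numerically with Python's standard library. This is a small sketch; the grid of x values is an arbitrary choice for illustration.

```python
from statistics import NormalDist

# A variance of 0.25 corresponds to a standard deviation of sqrt(0.25) = 0.5.
dist = NormalDist(mu=4.0, sigma=0.25 ** 0.5)

# Evaluate the density on a small grid around the mean.
xs = [2.5 + 0.5 * i for i in range(7)]  # 2.5, 3.0, ..., 5.5
densities = [dist.pdf(x) for x in xs]

for x, d in zip(xs, densities):
    print(f"x = {x:.1f}  density = {d:.4f}")
```

The density peaks at the mean and falls off symmetrically on both sides, which is the bell shape such a plot would show.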

Let's implement expectation-maximization in Python.

Import necessary libraries

Reading and analyzing the data

This implementation uses the well-known wine dataset.
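The article's loading code is not reproduced here. As a sketch, and assuming that the wine data refers to the dataset bundled with scikit-learn (the original may have used a different file), it could be loaded like this:

```python
from sklearn.datasets import load_wine

# Assumption: the "wine data" is scikit-learn's bundled wine dataset.
wine = load_wine()
X, y = wine.data, wine.target

print(X.shape)                 # (n_samples, n_features)
print(wine.feature_names[:3])  # first few column names
```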

Plotting a distribution

This plot shows how the dependent variable is distributed over the independent variable.

Fitting the GMM
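A minimal sketch of this step with scikit-learn's `GaussianMixture` might look as follows. It assumes the bundled wine dataset and standardized features. The `covariance_type="diag"` and `random_state=0` settings are choices made for this sketch, not the article's settings, so the numbers it prints need not match the ones discussed in the text.

```python
from sklearn.datasets import load_wine
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled wine dataset, with standardized features.
X = StandardScaler().fit_transform(load_wine().data)

# n_components=6 mirrors the cluster count discussed in the text;
# covariance_type and random_state are choices made for this sketch.
gmm = GaussianMixture(n_components=6, covariance_type="diag", random_state=0)
gmm.fit(X)  # runs the EM iterations internally

labels = gmm.predict(X)      # hard cluster assignment for each sample
avg_log_lik = gmm.score(X)   # average log-likelihood per sample
print(avg_log_lik)
```

After fitting, the estimated parameters of each Gaussian are available as `gmm.weights_`, `gmm.means_`, and `gmm.covariances_`.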

The score function returns the average log-likelihood per sample; the higher it is, the better the fit. The value is negative here because the likelihood is a product of density values evaluated at the observations, and when those values are smaller than one their logarithm is negative. The score of about -0.73 is close to zero, which indicates that the model is good and that the number of clusters should be 6.
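One hedged way to compare candidate cluster counts is to fit a model for each count and look at the average log-likelihood from `score` alongside the Bayesian information criterion from `bic`, where lower values are preferred. The sketch below assumes scikit-learn's bundled wine dataset with standardized features. The candidate range of 1 to 8 and the model settings are arbitrary choices for illustration, so the count it selects may differ from the article's result.

```python
from sklearn.datasets import load_wine
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled wine dataset, with standardized features.
X = StandardScaler().fit_transform(load_wine().data)

results = {}
for k in range(1, 9):  # candidate numbers of clusters (arbitrary range)
    gmm = GaussianMixture(n_components=k, covariance_type="diag",
                          random_state=0).fit(X)
    # Store the average log-likelihood per sample and the BIC for this k.
    results[k] = (gmm.score(X), gmm.bic(X))

# BIC penalizes extra parameters, so the lowest value is preferred.
best_k = min(results, key=lambda k: results[k][1])

for k, (avg_log_lik, bic) in results.items():
    print(f"k = {k}  score = {avg_log_lik:.3f}  BIC = {bic:.1f}")
print("lowest BIC at k =", best_k)
```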

The expectation-maximization algorithm rests on the idea of computing the latent variables while treating the parameters as fixed and known, and then updating the parameters in turn. The algorithm is simple to run because it doesn't depend on computing gradients, although it can take many iterations to converge. With the hands-on implementation in this article, we can understand how the expectation-maximization algorithm is used in machine learning.
