# Implement the EM algorithm for fitting a Gaussian mixture model for the MNIST dataset

Implementing the EM algorithm for the MNIST dataset.

Implement the EM algorithm for fitting a Gaussian mixture model (GMM) to the MNIST dataset. We reduce the dataset to two classes only, the digits "2" and "6". Thus, you will fit a GMM with C = 2 components. Use the data file data.dat. The true labels of the data are provided in label.dat.

The matrix images is of size 784-by-1990, i.e., there are 1990 images in total, and each column of the matrix corresponds to one image of size 28-by-28 pixels (the image is vectorized; the original image can be recovered, e.g., using the MATLAB code reshape(images(:,1), 28, 28)).
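For readers working in Python rather than MATLAB, the reshape step above can be sketched as follows. This is a minimal sketch, not part of the assignment: it assumes NumPy, and it uses a random stand-in matrix because the on-disk layout of data.dat is not specified here. The one point it illustrates is that MATLAB fills arrays column by column, which corresponds to `order="F"` in NumPy.

```python
import numpy as np

# Stand-in for the 784-by-1990 matrix; the real values would come from data.dat,
# whose file layout is not specified in this text (assumption).
rng = np.random.default_rng(0)
images = rng.random((784, 1990))

# MATLAB's reshape fills column by column, so order="F" reproduces
# reshape(images(:,1), 28, 28) for the first column.
first_image = images[:, 0].reshape(28, 28, order="F")

# NumPy's default row-major reshape gives the transpose of the same image.
assert np.array_equal(first_image.T, images[:, 0].reshape(28, 28))
```

If a displayed digit looks mirrored or rotated, switching between the two orders (or transposing) is usually the first thing to check.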

(a) Select from the data one raw image of a "2" and one raw image of a "6", and visualize each of them.

(b) Use random Gaussian vectors with zero mean as the initial means, and the identity matrix as the initial covariance matrix for each cluster. Plot the log-likelihood function versus the number of iterations to show that your algorithm is converging.

(c) Report the fitted GMM when EM terminates: the weights of each component and the mean vectors (please reformat the vectors into 28-by-28 images and show these images in your submission). Ideally, you should be able to see that these means correspond to "average" images. There is no need to report the covariance matrices.

(d) (Optional) Use the posterior probabilities p_ic (the probability that image i belongs to component c) to infer the labels of the images, and compare them with the true labels. Report the misclassification rate for digits "2" and "6" respectively. Then perform K-means clustering with K = 2, find the misclassification rate for digits "2" and "6" respectively, and compare with the GMM. Which one achieves better performance?
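Parts (b) through (d) can be sketched roughly as follows. This is one possible outline under stated assumptions, not a reference solution: the function names (`log_gaussian`, `fit_gmm`, `misclassification_rates`) and the parameters `tol`, `ridge`, and `seed` are hypothetical, samples are stored one per row (the transpose of the 784-by-1990 matrix), and a small synthetic 2-D dataset stands in for data.dat. The `ridge` term added to each covariance is an assumption to keep the Cholesky factorization stable; on the real 784-dimensional images the covariances are likely to be singular, so a larger ridge or a dimensionality-reduction step may be needed.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of N(mean, cov) at each row of X, using a Cholesky factor."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def fit_gmm(X, n_components=2, n_iter=100, tol=1e-8, ridge=1e-6, seed=0):
    """EM for a full-covariance GMM; X holds one sample per row."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(n_components, 1.0 / n_components)
    means = rng.standard_normal((n_components, d))             # zero-mean Gaussian init
    covs = np.stack([np.eye(d) for _ in range(n_components)])  # identity init
    log_liks = []
    for _ in range(n_iter):
        # E-step: log(weight * density) per component, normalised via log-sum-exp.
        log_p = np.stack(
            [np.log(weights[c]) + log_gaussian(X, means[c], covs[c])
             for c in range(n_components)], axis=1)
        top = log_p.max(axis=1, keepdims=True)
        log_norm = top + np.log(np.exp(log_p - top).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)  # posterior p_ic, shape (n, C)
        log_liks.append(float(log_norm.sum()))
        if len(log_liks) > 1 and log_liks[-1] - log_liks[-2] < tol * abs(log_liks[-2]):
            break
        # M-step: re-estimate weights, means, and covariances from the posteriors.
        n_c = resp.sum(axis=0)
        weights = n_c / n
        means = (resp.T @ X) / n_c[:, None]
        for c in range(n_components):
            diff = X - means[c]
            covs[c] = (resp[:, c, None] * diff).T @ diff / n_c[c] + ridge * np.eye(d)
    return weights, means, resp, log_liks

def misclassification_rates(true_labels, clusters, classes=(2, 6)):
    """Per-digit error rate under the better of the two cluster-to-digit mappings."""
    best = None
    for first, second in ((classes[0], classes[1]), (classes[1], classes[0])):
        pred = np.where(clusters == 0, first, second)
        rates = {c: float(np.mean(pred[true_labels == c] != c)) for c in classes}
        total = float(np.mean(pred != true_labels))
        if best is None or total < best[0]:
            best = (total, rates)
    return best[1]

# Synthetic stand-in for the real data: two 2-D blobs labelled 2 and 6.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal([-3.0, 0.0], 1.0, size=(200, 2)),
               data_rng.normal([3.0, 0.0], 1.0, size=(200, 2))])
labels = np.repeat([2, 6], 200)

weights, means, resp, log_liks = fit_gmm(X)
rates = misclassification_rates(labels, resp.argmax(axis=1))
```

Here `log_liks` holds one value per iteration and can be plotted against the iteration index for part (b), while each row of `means` would be reshaped to 28-by-28 for part (c). Because EM numbers its components arbitrarily, the helper tries both cluster-to-digit mappings and keeps the one with fewer errors; the same helper can be applied to K-means assignments for the comparison in part (d).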

