Classifying Handwritten Digits Using EM and PCA

February 3, 2019 in machine learning

In this post, we’ll take the Semeion Handwritten Digits data set and cluster the digits using the EM algorithm, with a principal components step within each maximization. First, we’ll load the additional libraries and read the data into a data table.

    library("mvtnorm")
    library("data.table")

    # Read the data and convert it to a data table
    setwd("C:/Users/Josh/Documents/GitHub/joshuahancock.github.io/data_sets/")
    data <- fread("C:/Users/Josh/Documents/GitHub/joshuahancock.github.io/data_sets/semeion.csv", header = FALSE)

Each row of the data represents one handwritten digit, which was digitally scanned and stretched into a 16x16 pixel box.
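To make the row layout concrete, here is a minimal sketch of how one flattened row could be folded back into a 16x16 grid. The helper name `row_to_image` is hypothetical, and the sketch assumes that the first 256 values of a row are the pixels, stored row by row. A synthetic row stands in for the real data so the snippet runs on its own.

```r
# Hypothetical helper (not from the original post): reshape one flattened
# digit into a side x side matrix.
# Assumption: the first side^2 values are pixels in row-major order.
row_to_image <- function(pixels, side = 16) {
  pixels <- as.numeric(unlist(pixels))
  stopifnot(length(pixels) >= side * side)
  matrix(pixels[seq_len(side * side)], nrow = side, ncol = side, byrow = TRUE)
}

# Synthetic stand-in for one row of the data: a vertical stroke,
# i.e. the 8th pixel of each of the 16 image rows is switched on.
fake_row <- rep(0, 256)
fake_row[seq(8, 256, by = 16)] <- 1

img <- row_to_image(fake_row)
dim(img)       # 16 16
sum(img[, 8])  # 16: every entry in column 8 is on
```

If the assumption about column order holds for the file read above, something like `row_to_image(data[1, 1:256])` should give the first digit as a matrix, which can then be inspected or plotted.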


Josh Hancock
