Thursday, November 18, 2010

Missing Value Imputation

An interesting paper in Neural Networks: "Missing value imputation on missing completely at random data using multilayer perceptrons" by Esther-Lydia Silva-Ramírez, Rafael Pino-Mejías, Manuel López-Coello, and María-Dolores Cubiles-de-la-Vega.

In short, the authors demonstrate that a multilayer perceptron (MLP) can be used to impute values that are missing completely at random from a data set. They also examine which learning algorithms and network architectures give the best results.
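
To make the idea concrete, here is a minimal sketch of MLP-based imputation in plain Python. It is not the paper's setup: the toy data set, the single tanh hidden layer of 6 units, the plain SGD training loop, the 20% missing rate, and every name (make_data, mask_mcar, TinyMLP, impute_with_mlp) are my own assumptions for illustration. Values in one column are deleted completely at random, a small network is trained on the complete rows to predict that column from the others, and its predictions fill the gaps. Mean imputation is included only as a simple baseline.

```python
import math
import random

def make_data(n, rng):
    """Toy data (an assumption): column 2 depends on columns 0 and 1."""
    rows = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append([x0, x1, math.sin(x0) + 0.5 * x1])
    return rows

def mask_mcar(rows, col, rate, rng):
    """Delete each value in `col` with probability `rate`, independent of the data."""
    return [[None if (j == col and rng.random() < rate) else v
             for j, v in enumerate(r)] for r in rows]

class TinyMLP:
    """One tanh hidden layer, linear output, trained by SGD on squared error."""
    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
                for ws, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hk for w, hk in zip(self.w2, h)) + self.b2

    def fit(self, xs, ys, epochs=200, lr=0.05):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = sum(w * hk for w, hk in zip(self.w2, h)) + self.b2 - y
                for k, hk in enumerate(h):
                    # Backpropagate through tanh using the pre-update output weight.
                    grad_h = err * self.w2[k] * (1.0 - hk * hk)
                    self.w2[k] -= lr * err * hk
                    self.b1[k] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[k][i] -= lr * grad_h * xi
                self.b2 -= lr * err

def impute_with_mlp(rows, col, rng, n_hidden=6):
    """Train on rows where `col` is observed, then predict it where it is missing."""
    others = [j for j in range(len(rows[0])) if j != col]
    complete = [r for r in rows if r[col] is not None]
    net = TinyMLP(len(others), n_hidden, rng)
    net.fit([[r[j] for j in others] for r in complete],
            [r[col] for r in complete])
    return [r if r[col] is not None
            else r[:col] + [net.predict([r[j] for j in others])] + r[col + 1:]
            for r in rows]

rng = random.Random(0)
truth = make_data(200, rng)
observed = mask_mcar(truth, col=2, rate=0.2, rng=rng)
imputed = impute_with_mlp(observed, col=2, rng=rng)

# Score both imputations only on the cells that were actually deleted.
missing = [i for i, r in enumerate(observed) if r[2] is None]
col_mean = (sum(r[2] for r in observed if r[2] is not None)
            / (len(observed) - len(missing)))

def rmse(predict_cell):
    return math.sqrt(sum((predict_cell(i) - truth[i][2]) ** 2 for i in missing)
                     / len(missing))

mlp_rmse = rmse(lambda i: imputed[i][2])
mean_rmse = rmse(lambda i: col_mean)
print("MLP imputation RMSE: ", mlp_rmse)
print("mean imputation RMSE:", mean_rmse)
```

Because the true values are known in this toy setting, the error on the deleted cells can be measured directly. With real data you would have to mask values that are actually observed in order to get a comparable estimate.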

One thing I would be interested in finding out, though, is how well an artificial neural network (ANN) trained on the imputed data would perform. In other words, if you had to impute data in order to train an ANN, how well would that ANN perform compared to one trained on the complete data set?