We frequently encounter datasets with missing values (represented as NAs in the data frame). Missing values make part of the data unusable. Why those values are missing is a different story and is beyond the scope of this article; here we only talk about how to treat them.

The primary treatment is either to delete the rows with missing values (reducing the number of observations) or to remove the columns with missing values (giving up some information). Some people are not happy with a reduced dataset, so they instead replace missing values with summary statistics, such as the mean or median of the available values…
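To make the three treatments concrete, here is a minimal sketch in plain Python. The toy table, its column names, and the helper functions `drop_rows`, `drop_columns`, and `impute` are all hypothetical and exist only for illustration; `None` stands in for NA, and a dict of columns stands in for a data frame.

```python
from statistics import mean, median

# Hypothetical toy data: a dict of columns, with None standing in for NA.
data = {
    "age":    [25, None, 40, 31],
    "income": [50000, 62000, None, 58000],
    "visits": [3, 5, 2, 4],
}

def drop_rows(table):
    """Keep only the rows in which every column has a value."""
    cols = list(table)
    n_rows = len(table[cols[0]])
    keep = [i for i in range(n_rows)
            if all(table[c][i] is not None for c in cols)]
    return {c: [table[c][i] for i in keep] for c in cols}

def drop_columns(table):
    """Keep only the columns that contain no missing values."""
    return {c: values for c, values in table.items()
            if all(x is not None for x in values)}

def impute(table, stat=mean):
    """Fill each missing value with a summary statistic (mean by default)
    computed from the observed values in the same column."""
    filled = {}
    for c, values in table.items():
        observed = [x for x in values if x is not None]
        fill = stat(observed)
        filled[c] = [fill if x is None else x for x in values]
    return filled

rows_dropped = drop_rows(data)          # fewer observations, all columns kept
cols_dropped = drop_columns(data)       # all observations, fewer columns
mean_filled = impute(data, mean)        # nothing dropped, gaps filled with means
median_filled = impute(data, median)    # nothing dropped, gaps filled with medians
```

In this toy table, dropping rows keeps two of the four observations, dropping columns keeps only `visits`, and imputation keeps the full shape while filling the gaps with values that were never observed.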


I am a marketing professor and I teach BDMA (big data and marketing analytics) at the Lazaridis School of Business, Wilfrid Laurier University, in Waterloo, Canada.

