R functions – is.na – cleaning up missing values

As we’ve noted elsewhere, missing values can be a significant annoyance in real-world data collection. Surveys come back incomplete or illegible, meter readings are indeterminate, and tick sheets are lost. The variable in question might even occur only sparsely, in combination with other factors. In any event, we’re going to need to identify and clean up missing values.

R and the is.na function – finding missing values

The first step of the process is detecting missing values in our data when they occur. This is accomplished via the is.na function.

# is.na in R example
test <- c(1, 2, 3, NA)
is.na(test)
# [1] FALSE FALSE FALSE  TRUE

This function returns a logical vector of TRUE / FALSE values, one per element, indicating whether each value of the input vector is missing. That logical vector can then be used to filter out or replace the missing values.
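As a minimal sketch of filtering and replacing, reusing the test vector from the example above: the names missing, kept, filled and n_missing are just illustrative, and substituting 0 for NA is an assumption made for the demo, not a recommendation for every data set.

```r
# Filtering and replacing missing values with is.na (illustrative sketch)
test <- c(1, 2, 3, NA)

# Logical mask: TRUE where a value is missing
missing <- is.na(test)

# Filter: keep only the values that are not missing
kept <- test[!missing]

# Replace: overwrite the missing values with 0 (assumed placeholder value)
filled <- test
filled[is.na(filled)] <- 0

# Count: TRUE counts as 1, so sum() gives the number of missing values
n_missing <- sum(missing)
```

Whether to drop or replace missing values, and what to replace them with, depends on the analysis; counting them first with sum(is.na(x)) is a cheap way to see how big the problem is.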

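To select entire rows of a data frame that include at least one missing value, consider using the complete.cases function (complete cases function reference). It returns TRUE for each row that has no missing values, so negate its result to pick out the incomplete rows.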
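A short sketch of the row-level case follows. The data frame df and its columns id, score and grade are made up for illustration; only complete.cases itself comes from the text above.

```r
# Selecting rows with and without missing values (illustrative sketch)
# df is a made-up data frame; rows 2 and 3 each contain one NA
df <- data.frame(
  id    = 1:4,
  score = c(10, NA, 30, 40),
  grade = c("a", "b", NA, "d")
)

# TRUE for rows with no missing values in any column
complete <- complete.cases(df)

# Rows that include at least one missing value
rows_with_na <- df[!complete, ]

# Rows that are fully populated
rows_clean <- df[complete, ]
```

Keeping the logical vector in its own variable makes it easy to inspect both groups of rows before deciding whether to discard the incomplete ones.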