Data Cleanup – How to Reorder Levels of a Factor in R

Today we’re going to show you how to handle a common data cleanup task in R: sorting the levels of a factor into the order in which you wish to display them.

You have the option of specifying the order when you define a factor. This usually doesn’t affect the analysis (unless you are merging adjacent levels of a factor together to boost significance). However, it often affects the graphs and reports you intend to present to others, because they list the levels in the order the factor stores them.

# reorder factor levels r example
# setting up initial factor
> icecream <- factor(c('vanilla','chocolate','peach','mint','mint','mint'))
> icecream
[1] vanilla   chocolate peach     mint      mint      mint     
Levels: chocolate mint peach vanilla

In this example, the factor is unordered, and R sorts the levels alphabetically by default, which places chocolate first. The observant reader may detect I have a slight preference for mint ice cream. It would be pleasing to see mint ice cream listed first in the reports.

We can accomplish this by passing the desired order to the levels argument of factor(). This example uses the existing factor (icecream) as a starting point.

# reorder factor levels r example
# step where we actually reorder factor levels in r
> icecream <- factor(icecream, levels=c('mint','vanilla','chocolate','peach'))
> icecream
[1] vanilla   chocolate peach     mint      mint      mint     
Levels: mint vanilla chocolate peach

Ah… much better. Mint is first, vile peach ice cream is banished to the end!
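As a quick check, the sketch below confirms that reordering changes only the level order, not the data, and that summaries such as table() follow the new order. It also shows one thing to watch for: any value missing from the levels vector becomes NA. The variable names flavors, reordered, and typo are made up for this illustration.

```r
# illustrative sketch: reordering changes the level order, not the values
flavors <- factor(c('vanilla','chocolate','peach','mint','mint','mint'))
reordered <- factor(flavors, levels=c('mint','vanilla','chocolate','peach'))

# the underlying values are the same before and after
same_values <- all(as.character(flavors) == as.character(reordered))

# summaries such as table() now list the levels in the new order
counts <- table(reordered)
print(counts)

# caution: a value left out of (or misspelled in) levels becomes NA
# 'vanila' is a deliberate typo here
typo <- factor(flavors, levels=c('mint','vanila','chocolate','peach'))
n_missing <- sum(is.na(typo))
```

Checking sum(is.na(...)) after a reorder is a cheap way to catch a misspelled level before it quietly drops data from a report.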

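We could also have moved mint to the front using the relevel() function. It moves a single level to the first position and leaves the others in their existing order, so the result differs slightly from the full reordering above. Note that the example prints the result without assigning it, so icecream itself is unchanged.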

# reorder factor levels r example
# alternative way to reorder factors in R using relevel()
> icecream <- factor(c('vanilla','chocolate','peach','mint','mint','mint'))
> icecream
[1] vanilla   chocolate peach     mint      mint      mint     
Levels: chocolate mint peach vanilla
> relevel(icecream, 'mint')
[1] vanilla   chocolate peach     mint      mint      mint     
Levels: mint chocolate peach vanilla
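To keep the releveled factor, assign the result. The short sketch below assumes the same ice cream data; the names flavors and moved are chosen only for illustration.

```r
# illustrative sketch: keep the result of relevel() by assigning it
flavors <- factor(c('vanilla','chocolate','peach','mint','mint','mint'))

# relevel() returns a new factor; flavors itself is not modified
moved <- relevel(flavors, ref='mint')

print(levels(flavors))  # original factor keeps its alphabetical levels
print(levels(moved))    # mint first, the rest in their previous order
```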

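The levels argument works the same way for ordered factors, which you can create with ordered(). (relevel() is the exception: it only accepts unordered factors.)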

# reorder factor levels r example
# ordered factors
> icecream <- c('vanilla','chocolate','peach','mint','mint','mint')
> icecream <- ordered(icecream, levels=c('mint','vanilla','chocolate','peach'))
> icecream
[1] vanilla   chocolate peach     mint      mint      mint     
Levels: mint < vanilla < chocolate < peach
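With an ordered factor, the level order carries meaning beyond display: comparisons and functions such as min() and max() follow it. The sketch below is only an illustration; the names flavors, before_chocolate, top, and bottom are not part of the original example.

```r
# illustrative sketch: the level order drives comparisons on ordered factors
flavors <- ordered(c('vanilla','chocolate','peach','mint','mint','mint'),
                   levels=c('mint','vanilla','chocolate','peach'))

# TRUE for values whose level comes before chocolate in the ordering
before_chocolate <- flavors < 'chocolate'
print(before_chocolate)

# min() and max() return the lowest and highest level present in the data
top <- min(flavors)
bottom <- max(flavors)
```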

You can also apply vector operations to the levels themselves, for example to reverse their order or to build a more intricate ordering, and then pass the result to the levels argument.
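Here is a minimal sketch of two such orderings, assuming the same ice cream data: reversing the current levels with rev(), and ordering the levels by how often each value occurs. The names flavors, reversed, and by_count are made up for this example.

```r
# illustrative sketch: computing a level order with vector operations
flavors <- factor(c('vanilla','chocolate','peach','mint','mint','mint'))

# reverse the current (alphabetical) level order
reversed <- factor(flavors, levels=rev(levels(flavors)))
print(levels(reversed))

# order levels by how often each one occurs, most frequent first
freq <- sort(table(flavors), decreasing=TRUE)
by_count <- factor(flavors, levels=names(freq))
print(levels(by_count))
```

Because levels() returns an ordinary character vector, any function that rearranges a vector can produce the new order.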