How to Sort an R Data Frame

We’re going to walk through how to sort a data frame using R.

This article continues the examples started in our data frame tutorial. We’re using the ChickWeight data frame, which is included in the standard R distribution. You can load it by typing data(ChickWeight) in the R console. This data frame captures the weight of chickens that were fed different diets over a period of 21 days. If you can imagine someone walking around a research farm with a clipboard for an agricultural experiment, you’ve got the right idea.
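If you want to get your bearings before sorting anything, a quick look at the data helps. This is just a minimal sketch using base R functions (data, str, and head); nothing here is specific to this tutorial.

```r
# Load the built-in dataset (it ships with base R's datasets package).
data(ChickWeight)

# Show the column names and types: weight, Time, Chick, and Diet.
str(ChickWeight)

# Peek at the first few rows.
head(ChickWeight)
```

The weight and Time columns are numeric, while Chick and Diet are factors. That distinction matters later, because the minus-sign trick for reversing a sort only applies to numeric columns.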

This series has a couple of parts – feel free to skip ahead to the most relevant parts.

Sorting an R Data Frame

Continuing the example in our R data frame tutorial, let us look at how we might be able to sort the data frame into an appropriate order. We will be using the order() function to accomplish this.

The order function’s default sort is in ascending order (from lowest to highest value). A quick hack to reverse this is to add a minus sign in front of a numeric sorting variable to indicate you want the results sorted in descending order. Here are a couple of examples.
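It helps to see what order() actually returns before applying it to a data frame. The sketch below uses a small made-up vector x (hypothetical, not part of ChickWeight) to show that order() gives back row positions rather than sorted values, and that the minus sign flips the direction.

```r
# A small hypothetical vector, used only for illustration.
x <- c(30, 10, 20)

# order() returns the positions that would put x in ascending order.
idx <- order(x)   # 2 3 1
x[idx]            # 10 20 30

# Negating a numeric vector flips the sort to descending.
x[order(-x)]      # 30 20 10

# decreasing = TRUE gives the same result and also works for
# non-numeric vectors such as character strings.
x[order(x, decreasing = TRUE)]   # 30 20 10
```

Because order() returns positions, you can drop it into the row slot of a data frame index, which is the pattern used throughout the rest of this article.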

Returning to our feathered subjects (the chickens) for a moment, let’s start by selecting the chickens that were measured on the final day of the study (day 21). We’re going to use conditional indexing to do this quickly.

birds <- ChickWeight[ChickWeight$Time == 21, ]

We’ve got a total of 45 birds in the set, by the way. Let’s start by sorting them by weight.


birds[order(birds$weight),]

And as you can see, it does a lovely job of sorting the results from smallest to largest, since order sorts in ascending order by default.
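If you would rather confirm the direction of the sort than eyeball the output, here is a small sketch. The name sorted is my own choice for illustration; head, tail, and is.unsorted are all base R.

```r
data(ChickWeight)
birds <- ChickWeight[ChickWeight$Time == 21, ]
nrow(birds)   # number of birds measured on day 21

# Default sort: ascending by weight.
sorted <- birds[order(birds$weight), ]

# Lightest birds come first, heaviest last.
head(sorted, 3)
tail(sorted, 3)

# is.unsorted() returns FALSE when a vector is already in ascending order.
is.unsorted(sorted$weight)
```

Note that the original row names travel with each row, so you can still trace a sorted row back to its position in ChickWeight.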

I’d like to be a bit more picky, however. Suppose I want the largest birds, and only the top 5 of them. Easy enough.

birds[order(-birds$weight),][1:5,]

We use two techniques to zero in on the results we’re interested in. First, we put a negative sign in front of the variable to sort the results in descending order. Next, we select the first five rows of the sorted data frame for inspection. This returns the five heaviest birds, which is exactly what we are looking for.
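The same top-five selection can also be written with head(), which some people find easier to read than chained brackets. This is only an alternative spelling, sketched under the assumption that birds is built as described earlier; top5 and top5_alt are names I made up for the example.

```r
data(ChickWeight)
birds <- ChickWeight[ChickWeight$Time == 21, ]

# Same idea as the [1:5, ] indexing, written with head().
top5 <- head(birds[order(-birds$weight), ], 5)
top5

# Alternative spelling of the descending sort, without the minus sign.
top5_alt <- head(birds[order(birds$weight, decreasing = TRUE), ], 5)
top5_alt
```

One practical difference: head() simply returns fewer rows when the data frame has fewer than five, whereas [1:5, ] pads the result with rows of NA.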

Sorting by Multiple Factors

Moving along, what if we wanted to sort the entire list so that the largest birds come first within each diet? Easy enough: the order function can sort on multiple variables. We pass the diet as the first sort key and the negated weight as the second.


birds[order(birds$Diet, -birds$weight),]

This yields an utterly lovely result, which satisfies our goal: the birds are grouped by diet, and within each diet the heaviest bird is listed first.
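Here is a slightly fuller sketch of a two-key sort, with Diet as the first key and descending weight as the second. The names by_diet, heaviest, and by_diet_alt are hypothetical, chosen for this example. It also shows one way to pull out the single heaviest bird per diet, and a way to mix sort directions without the minus sign.

```r
data(ChickWeight)
birds <- ChickWeight[ChickWeight$Time == 21, ]

# First key: Diet (ascending). Second key: weight (descending, via the minus sign).
by_diet <- birds[order(birds$Diet, -birds$weight), ]

# After the sort, the first row seen for each diet is that diet's heaviest bird.
heaviest <- by_diet[!duplicated(by_diet$Diet), ]
heaviest

# Mixed directions without the minus sign: with method = "radix",
# decreasing accepts one flag per sort key.
by_diet_alt <- birds[order(birds$Diet, birds$weight,
                           decreasing = c(FALSE, TRUE), method = "radix"), ]
```

The radix form is handy when the key you want reversed is not numeric, since a minus sign cannot be applied to a factor or a character column.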

Summary: Sorting a Data Frame in R

As you can see from the examples above, the order function provides you with the essential tool you need to sort a data frame in R. By changing the sign of a numeric sorting variable, you can control the direction of the sort.

Up next…adding and removing columns from a data frame. Or if you want to skip ahead, see below….