Getting Picky – How To Use Which in R

When To Use Which in R

Need to find a needle in a haystack? Or, more specifically for R programming, need to find which elements of a vector, data frame, or matrix meet a specific set of conditions? You’ve come to the right place.

Use of Which in R

The Syntax: which(x, arr.ind = FALSE)

  • x is a logical vector or array, usually the result of a test such as items == 'needle'
  • arr.ind is an optional parameter primarily used when working with matrices and other multi-dimensional arrays; setting it to TRUE returns the array indices (the row and column of each match, for example) instead of a single index per match, which is what you get for one-dimensional data.
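
To make the arr.ind parameter concrete, here is a minimal sketch using a small made-up matrix (the name m and its values are just for illustration). It compares the default output with the arr.ind = TRUE output for the same test.

```r
# A small 3 x 3 matrix, filled column by column (R's default)
m <- matrix(1:9, nrow = 3)

# Default: one linear index per match, counted down the columns
linear_idx <- which(m > 6)
print(linear_idx)

# arr.ind = TRUE: one row per match, with "row" and "col" columns
pos <- which(m > 6, arr.ind = TRUE)
print(pos)
```

The linear indices are handy for vector-style subsetting such as m[linear_idx], while the row and column form is easier to read when you need to know where in the matrix a match sits.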

You can use the which function in R to scan a data structure and identify the elements that meet a specific condition. The function returns the indices of the matching items in the data object, not the items themselves.

which() Examples in R

Let's dive right into some examples of the which function.

# which function in R - basic examples
> items <- c('hay','hay','hay','more hay','needle','hay','hay')
> which(items=='needle')
[1] 5
# needle is indeed at position 5 in the vector

> which(items=='goat')
integer(0)
# apparently we didn't get their goat; no match returns an empty integer vector, not a zero
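
Since a failed search comes back as an empty integer vector, a script usually needs to check for that case before using the result. One way to do it, sketched here with the same haystack, is to test the length of what which returns (the variable name hits is just for illustration).

```r
items <- c('hay', 'hay', 'hay', 'more hay', 'needle', 'hay', 'hay')

# Nothing matches, so hits is an empty integer vector
hits <- which(items == 'goat')

# length() is 0 when there were no matches
if (length(hits) == 0) {
  message("no goat in this haystack")
}
```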

Which Function in R Data Frame

The which function works for R data frames as well. In the example below, we’re going to identify the row indices of the ChickWeight data frame that meet a certain criterion (Time equal to day 20). ChickWeight is one of the built-in data sets within R. It records the weight of a group of chicks fed various diets over time.

# which function in r data frame
> which(ChickWeight$Time==20)
 [1]  11  23  35  47  59  71  83  95 106 118 130 142 154 166 193 207 219 231 243
[20] 255 267 279 291 303 315 327 339 351 363 375 387 399 411 423 435 447 459 471
[39] 483 495 517 529 541 553 565 577

As we can see, the which function grabbed the indices for all of the data points where the time was equal to 20.
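
Indices on their own are rarely the end goal. As a rough sketch of the usual next step, the indices can be passed back to the data frame to pull out the matching rows (the names day20_idx and day20 are made up for this example).

```r
# Row indices of every observation recorded on day 20
day20_idx <- which(ChickWeight$Time == 20)

# Use the indices to pull the matching rows out of the data frame
day20 <- ChickWeight[day20_idx, ]

# Peek at the first few day-20 observations
print(head(day20))

# One row per index returned by which()
print(nrow(day20))
```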

Broader Applications

The which function in R gives you some flexibility in how you use it.

  • The function can work with any logical test; it doesn’t just have to be checking for equality.
  • As mentioned, the function can be used for multi-dimensional matrices.
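
To illustrate the first point, here is a short sketch with a made-up numeric vector (weights is a hypothetical name and the values are invented). Any expression that yields TRUE and FALSE values can be handed to which.

```r
# Made-up values for illustration
weights <- c(42, 51, 59, 64, 76, 93, 106)

# Greater-than test
heavy <- which(weights > 60)

# Two conditions combined with &
mid_range <- which(weights >= 50 & weights < 80)

# Membership test with %in%
extremes <- which(weights %in% c(42, 106))

# Arithmetic test: positions of the odd values
odd <- which(weights %% 2 != 0)

print(list(heavy = heavy, mid_range = mid_range,
           extremes = extremes, odd = odd))
```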

Which in R – Why This Matters

Which is actually an extremely useful function within the R tool kit, especially if you are creating procedures to clean data and implement statistical algorithms. This function encapsulates the operation of peeking inside an existing data structure and tagging which elements might be required for additional processing.

It can also be used as the prelude to a filtering operation. Use which to spot the observations within a data frame that need to be included in or excluded from the analysis, get their positions back as a vector of indices, and process them accordingly. In effect, this is the basic operation behind “filter” in functional programming terms, except that which returns the positions of the matching elements rather than a new array containing them.
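
A minimal sketch of that include and exclude pattern follows. The data frame obs, its values, and the cutoff of 100 are all invented for the example, so treat it as one possible approach rather than a recipe.

```r
# Hypothetical data: a handful of measurements, some of them suspect
obs <- data.frame(id = 1:6, value = c(12, 250, 15, NA, 14, 300))

# Tag the rows above an assumed cutoff of 100.
# which() skips the NA, because NA > 100 is neither TRUE nor FALSE.
flagged <- which(obs$value > 100)

# Include: keep only the flagged rows
outliers <- obs[flagged, ]

# Exclude: keep everything else. setdiff() still works when nothing is
# flagged, whereas obs[-flagged, ] would return zero rows in that case.
keep <- setdiff(seq_len(nrow(obs)), flagged)
cleaned <- obs[keep, ]

print(outliers)
print(cleaned)
```

Working from the index vector means the same set of flagged rows can be reused for both the include and the exclude step without repeating the test.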