Loops are one of the fundamental components of any programming language, and they hold a special place in languages like R that are centered around data science. R even provides a number of functions that automate looping for you. For example, you might use members of the apply family to process every element in a data frame. But automated functionality can’t anticipate every usage scenario, and there will always be moments when you need to iterate manually over items in a data frame. So how would you actually go about using a for loop to accomplish that task?
An Overview of for Loops
R loops operate in much the same way as loops in other popular programming languages. However, that’s not to say that there’s no learning curve involved with using a loop in R for the first time. The main point of divergence comes from the fact that R has its own special data types. A data frame or vector used in a loop keeps all of the special properties of its type. Likewise, because R is heavily focused on statistics, variables often behave differently than you might expect after using less specialized languages. Of course, this general concept is a lot easier to understand after simply jumping into some functional code.
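As a small illustration of how R’s data types shape a loop, the sketch below iterates directly over a data frame. Because a data frame is stored as a list of columns, each pass yields a whole column vector rather than a single cell. The frame and its names (demoFrame, a, b, columnSums) are made up for this example.

```r
# Hypothetical two-column frame used only for illustration
demoFrame <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Looping over the frame itself visits one column per iteration
columnSums <- c()
for (column in demoFrame) {
  print(column)                             # a full numeric vector, not a single value
  columnSums <- c(columnSums, sum(column))  # collect each column's total
}
print(columnSums)
```

This is why the examples in this article index into the frame, as in ourDataFrame[,1], when they want individual values.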
An Initial Example of For Loops in R
We can begin by looking at a simple example that illustrates how for loops in R operate. Take a look at the following code.
ourDataFrame <- data.frame(
  first = c(99, 2, 5, 8, 11, 14, 17, 20, 23, 1111),
  second = c(88, 3, 6, 9, 12, 15, 18, 21, 24, 2222),
  third = c(77, 4, 7, 10, 13, 16, 19, 22, 25, 3333))
for (x in ourDataFrame[, 1]) {
  print(x)
}
We begin by creating a data frame instance called ourDataFrame. Next, we create the for loop which will move through that frame. We provide it with ourDataFrame[,1], which leaves the row index empty and so selects the entirety of the first column.
Note that we define a new variable, called x, in the loop’s header. The loop variable in a for loop contains whatever item we’re currently iterating over. This makes the process more like iterator methods in other languages than a typical counter-based implementation of for loops. We can still form a numerical iteration, though, by using a function like nrow to build a sequence of row numbers and passing that sequence to the loop.
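The numerical style of iteration mentioned above can be sketched as follows. This is a minimal example under assumptions: the frame scores and the variables current and seen are hypothetical, and seq_len(nrow(...)) is used here in place of 1:nrow(...) because it also behaves sensibly for a frame with zero rows.

```r
# Hypothetical frame used only for illustration
scores <- data.frame(value = c(99, 2, 5))

# seq_len(nrow(scores)) produces 1, 2, 3, so i is a row position
seen <- numeric(0)
for (i in seq_len(nrow(scores))) {
  current <- scores[i, "value"]   # look the value up by row index and column name
  print(paste("row", i, "holds", current))
  seen <- c(seen, current)        # keep the values in the order visited
}
```

Here the loop variable holds a position rather than a value, so the value has to be fetched from the frame inside the loop body.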
In the first example, x is reassigned on every pass to whichever value we’re currently iterating over. We can see that in effect when the print statement outputs the contents of x to the screen. This process prints the ten values from 99 through 1111 as we iterate from the first value to the end of the column. Note that indexing into a frame is even tolerant of out-of-range positions. Replace the loop with the following index-based version and see what happens.
for (i in 1:nrow(ourDataFrame)) print(ourDataFrame[i + 1, 1])
On its final pass the new code tries to access row 11, which is outside the frame’s ten rows. Rather than stopping with an error, R simply returns the missing value as “NA”. It’s also quite easy to insert additional calculations, control statements, or criteria into the loop’s structure. For example, replace the loop with the following code.
for (x in ourDataFrame[, 1]) {
  print(x)
  if (x == 8) {
    print("Break triggered")
    break
  }
}
This time around we begin at the start of the column and add a secondary conditional that checks the value of x against 8. If x does equal 8 then the conditional will print “Break triggered” and execute a break statement, so the loop prints 99, 2, 5, and 8 before stopping. The break statement exits the enclosing loop immediately when called. This is especially useful when we create a repeat loop, which would otherwise run forever. We can even nest multiple logical decisions within each other and use break to escape when the conditions are reached. Take a look at the following R code.
breakValue <- 5
repeat {
  print(breakValue)
  breakValue <- breakValue + 5
  if (breakValue == 100) {
    print("Break value 100 reached")
    break
  } else {
    print("Still looping")
  }
}
In this example we create a variable, breakValue, that grows by 5 on each pass and ends the loop once it reaches 100. Because the check uses ==, the loop only stops if breakValue lands exactly on 100. This is a valuable tool within the context of frame iteration. Huge data sets may well lock up a process for longer than is reasonable. But a combination of variable checks and break statements gives you the ability to give up on a process if it hasn’t proven successful after a reasonable number of iterations. Now that we’re adding some more logic into the loops, we can move on to some of the more complex permutations of this general idea.
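The idea of giving up after a reasonable number of iterations can be sketched against a data frame like this. Everything named here (readings, target, maxIterations, foundAt) is a hypothetical stand-in chosen for the example, and the limit of 4 is arbitrary.

```r
# Hypothetical data and limits used only for illustration
readings <- data.frame(value = c(3, 7, 12, 18, 25, 31))
target <- 100          # a value that never appears in the column
maxIterations <- 4     # give up after checking this many rows
foundAt <- NA          # stays NA if the search fails

for (i in seq_len(nrow(readings))) {
  if (i > maxIterations) {
    print("Giving up: iteration limit reached")
    break
  }
  if (readings[i, "value"] == target) {
    foundAt <- i       # remember the matching row and stop early
    break
  }
}
print(foundAt)
```

Keeping the limit in its own variable makes it easy to tune for larger frames without touching the loop body.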
More Advanced Scenarios With For Loops
Loops are essentially control structures that work based on a central logical statement or condition. And like any structure, we can always add more components to it. We’ve seen how to add a simple if into a loop’s structure. But we can also create a nested loop that consists of an inner loop and an outer loop. In fact, this is also how we can use iteration to move over every component in a data frame’s structure. Try running the following R code.
ourDataFrame <- data.frame(
  first = c(99, 2, 5, 8, 11, 14, 17, 20, 23, 1111),
  second = c(88, 3, 6, 9, 12, 15, 18, 21, 24, 2222),
  third = c(77, 4, 7, 10, 13, 16, 19, 22, 25, 3333))
for (x in 1:nrow(ourDataFrame)) {
  for (y in 1:ncol(ourDataFrame)) {
    print(paste('row', x, 'column', y, '=', ourDataFrame[x, y]))
  }
}
We begin in a similar way to the previous examples and define our frame as ourDataFrame. However, things change considerably when we actually create the looping logic even though it might not seem that way at first glance.
We declare two for loops, one within the other. The first of our loops creates an x value to hold the current item of the iteration. But we’re not directly assigning frame contents to x as we iterate. We instead build the sequence 1:nrow(ourDataFrame) and pass it to for. The leading 1 and the colon tell for that we want to step through a progression starting at 1 and ending at the number of rows in ourDataFrame. The important point to remember is that by providing numbers we’re looping through that set of values and assigning them to x. This gives us numerical indexes that correspond to the row positions we need within ourDataFrame.
We do the same with ncol and the ourDataFrame columns. This time around we use y for our current iteration value when moving over the columns. Finally, we print everything to screen using paste to format the data in an easier-to-understand manner. And, as you can see, we’ve now successfully iterated across data frames using for loops.
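A nested loop like this can also accumulate results instead of only printing them. The sketch below is one possible variation, not part of the original example: it sums each row cell by cell and then compares the totals with base R’s rowSums. The names smallFrame and rowTotals are hypothetical, and seq_len is used in place of the 1: form.

```r
# Hypothetical 3 x 2 frame used only for illustration
smallFrame <- data.frame(first = c(1, 2, 3), second = c(10, 20, 30))

# Preallocate one slot per row to hold the running totals
rowTotals <- numeric(nrow(smallFrame))

for (x in seq_len(nrow(smallFrame))) {      # outer loop: rows
  for (y in seq_len(ncol(smallFrame))) {    # inner loop: columns
    rowTotals[x] <- rowTotals[x] + smallFrame[x, y]
  }
}
print(rowTotals)

# The built-in rowSums should agree with the manual totals
print(isTRUE(all.equal(rowTotals, unname(rowSums(smallFrame)))))
```

When a built-in such as rowSums covers the task it is usually the shorter option, while the manual loop remains available for logic that the automated functions can’t express.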