How To Use Mapply in R To Apply a Function to Multiple Lists or Vectors

Loops are a fundamental part of any programming language. It’s almost a given that you’ll use loops in even fairly simple code. If you’re not writing a loop yourself, you’ll typically still benefit from loops running inside libraries behind the scenes. This is even more apparent in data-focused programming languages like R, which give you special functions to simplify the process. In R this is often done with the apply function and its relatives. Apply lets you apply a function across the rows or columns of a matrix or data frame, while variants like mapply bring more specialized functionality. As you’ll soon see, mapply in particular makes it easy to apply a function to multiple lists or vectors.

The Basics of Mapply

Before we start using mapply we need to take note of why it’s such a useful part of the R programming language. R can be used for a wide variety of different tasks. But it’s highly optimized around statistics and data science. All of the objects and data types in R revolve around data manipulation. You can certainly use R for basic arithmetic centered around a couple of numbers. But R shines when it’s applying advanced processing to large data sets.

The fact that you’ll often juggle thousands, or even hundreds of thousands, of numbers at once shapes most of R’s fundamental design. And this is also where mapply shines. Imagine that you have multiple lists of data to work through. Doing so might entail running everything through multiple convoluted loops. Or you might need to join some or all of those lists together before looping through them. But a single call to mapply will often accomplish the same thing. The idea is easiest to see by putting it into practice.

Turning Theory Into Practice

One of the best ways to see what a unique set of functions can do is to simply contrast it with the alternatives. We’ve considered a situation where we need to loop through multiple lists in R and we can see how that might work out in the following code.

ourFirstList <- list(c(1, 4, 7, 10, 13))
ourSecondList <- list(c(2, 5, 8, 11, 14))
ourThirdList <- list(c(3, 6, 9, 12, 15))

ourFinalValue <- 0
for (x in ourFirstList) {
  for (y in x) {
    ourFinalValue <- ourFinalValue + y
  }
}

print(ourFinalValue)

In this example, we’re going to try and total the numbers found within multiple list instances. We begin by defining and populating three list variables. Next, we’ll create a variable called ourFinalValue with a value of 0. This is the variable we’ll be using to hold our results as we work with the list data.

The next stage involves setting up two loops. The outer loop walks through the elements of the list, which in this case is a single vector, and the inner loop walks through the numbers in that vector. As we move through the inner loop we add each value to our ourFinalValue variable. And, finally, we print the result, which comes to 35 for ourFirstList. Of course, if we wanted this to handle all three list instances we’d need to repeat the for loop two more times.
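To make the cost of that repetition concrete, here is a minimal sketch of one way to cover all three lists with loops alone. The allOurLists variable is a hypothetical helper introduced only for this illustration, and the extra loop level is an assumption about how you might structure it rather than the only option.

```r
ourFirstList <- list(c(1, 4, 7, 10, 13))
ourSecondList <- list(c(2, 5, 8, 11, 14))
ourThirdList <- list(c(3, 6, 9, 12, 15))

# allOurLists is a hypothetical helper that groups the three lists together
allOurLists <- list(ourFirstList, ourSecondList, ourThirdList)

ourFinalValue <- 0
for (currentList in allOurLists) {  # each of the three lists
  for (x in currentList) {          # the single vector inside each list
    for (y in x) {                  # each number inside that vector
      ourFinalValue <- ourFinalValue + y
    }
  }
}

print(ourFinalValue)
```

This totals all fifteen numbers, but it takes three levels of nesting to get there.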

We could also move the loop into a new function. But while that would be a little tidier, it’d still necessitate calling the function multiple times. And it’d still be an overly verbose and wasteful design if we were dealing with a larger number of lists. You might also wonder why we’re not just using sum. The problem there becomes apparent if you add the following line above the print statement.
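As a rough sketch of the function-based approach, the loop could be wrapped up along these lines. The totalOneList name is hypothetical and exists only for this illustration.

```r
ourFirstList <- list(c(1, 4, 7, 10, 13))
ourSecondList <- list(c(2, 5, 8, 11, 14))
ourThirdList <- list(c(3, 6, 9, 12, 15))

# totalOneList is a hypothetical helper wrapping the nested loop from earlier
totalOneList <- function(someList) {
  runningTotal <- 0
  for (x in someList) {
    for (y in x) {
      runningTotal <- runningTotal + y
    }
  }
  runningTotal
}

# The function still has to be called once per list
ourFinalValue <- totalOneList(ourFirstList) +
  totalOneList(ourSecondList) +
  totalOneList(ourThirdList)

print(ourFinalValue)
```

The loop is only written once, but every extra list still means another call and another term in the addition.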

ourFinalValue <- sum(ourFirstList)

We’d receive a total of the data sent to sum if everything was working as we might wish. Unfortunately, sum expects numeric, complex, or logical vectors, and a list isn’t one of them. As a result the interpreter simply gives us an error message to alert us to that fact. We need to unpack our data before sending it to sum or many other functions. And this brings us right back to reliance on for loops. However, this also returns us to the subject of mapply. We’d ideally want to find a way to loop through all of our lists while ensuring the contents are properly fed into sum. And it turns out that we can do this with considerable ease.
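If you’d like to see the failure without halting your script, the sketch below captures the error with base R’s tryCatch. It also shows unlist, which is one base R way to unpack a single list by hand. The sumAttempt and unpackedTotal names are hypothetical. Unpacking one list like this works, but it still leaves us repeating the step for every list.

```r
ourFirstList <- list(c(1, 4, 7, 10, 13))

# Capture the error message as text instead of stopping the script
sumAttempt <- tryCatch(
  sum(ourFirstList),
  error = function(e) conditionMessage(e)
)
print(sumAttempt)

# unlist flattens the list into a plain numeric vector that sum accepts
unpackedTotal <- sum(unlist(ourFirstList))
print(unpackedTotal)
```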

Practical Use of Mapply

We’ve already touched on the fact that mapply in R can make your code considerably more concise and efficient. But you might be surprised by the extent of it. Take a look at how much smaller the previous example is after converting it to use mapply.

ourFirstList <- list(c(1, 4, 7, 10, 13))
ourSecondList <- list(c(2, 5, 8, 11, 14))
ourThirdList <- list(c(3, 6, 9, 12, 15))

ourFinalValue <- mapply(sum, ourFirstList, ourSecondList, ourThirdList)
print(ourFinalValue)

We once again declare three lists with the same data as in the first example. But we go on to encapsulate all of the previous functionality into a single line of code. The ourFinalValue variable returns as the final receptacle of our calculations, and this time it holds 120, the total of all fifteen numbers across the three lists. But those calculations are actually performed in the mapply function.

Our use of mapply begins by passing sum as the first parameter. This is the same sum that we tried to use with ourFirstList. Note that we’re only passing the word sum rather than calling it as we normally would. Mapply takes over for sum’s parameter management. Instead of passing data directly to sum we do so with the values following it inside the mapply call.
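One way to picture that parameter management is to compare three roughly equivalent spellings of the same calculation. This is only an illustrative sketch. The viaName, viaWrapper, and direct variables are hypothetical names, and the wrapper function is there purely to make the argument passing visible.

```r
ourFirstList <- list(c(1, 4, 7, 10, 13))
ourSecondList <- list(c(2, 5, 8, 11, 14))
ourThirdList <- list(c(3, 6, 9, 12, 15))

# Passing the function by name, as in the article
viaName <- mapply(sum, ourFirstList, ourSecondList, ourThirdList)

# Passing an explicit wrapper shows what mapply hands to the function:
# one element from each list per call
viaWrapper <- mapply(
  function(a, b, c) sum(a, b, c),
  ourFirstList, ourSecondList, ourThirdList
)

# The single call that mapply ends up making for these one-element lists
direct <- sum(ourFirstList[[1]], ourSecondList[[1]], ourThirdList[[1]])

print(c(viaName, viaWrapper, direct))
```

All three arrive at the same total.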

We can essentially chain as many list instances together as we’d like. While we’re using three, this is simply to keep things concise. Part of what makes mapply so useful is its internal data management. Mapply handles a lot of the work normally required for data formatting. And we can see that in action on the next line when we pass the now populated ourFinalValue to print. Mapply has taken the first element of each list, which here is the single vector each one holds, and handed all three to sum without us needing to set up a separate loop. What’s more, we can even extend this to vectors. Try running the slightly altered code below.
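The lists above each hold a single vector, so mapply only makes one call to sum. As a small sketch of what happens when the lists hold more than one element, consider the made-up data below. The firstBatch, secondBatch, and perPosition names are hypothetical.

```r
# Hypothetical lists with two vectors each
firstBatch <- list(c(1, 2), c(10, 20))
secondBatch <- list(c(3, 4), c(30, 40))

# mapply calls sum once per position:
# sum(c(1, 2), c(3, 4)) and then sum(c(10, 20), c(30, 40))
perPosition <- mapply(sum, firstBatch, secondBatch)

print(perPosition)  # 10 100
```

Each position in the lists gets its own result, and mapply collects those results into a vector.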

ourFirstNonList <- c(1, 4, 7, 10, 13)
ourSecondNonList <- c(2, 5, 8, 11, 14)
ourThirdNonList <- c(3, 6, 9, 12, 15)

ourEndValue <- sum(mapply(sum, ourFirstNonList, ourSecondNonList, ourThirdNonList))
print(ourEndValue)

Moving to a straight vector format doesn’t require many changes. The main difference is that the result needs a little extra handling. Mapply works element by element: it calls sum on the first element of every vector, then on the second element of every vector, and so on. If you picture the three vectors stacked as rows, that means sum is applied down each column rather than along each row. Mapply then tries to simplify the results into a vector, so instead of one grand total we get five values: 6, 15, 24, 33, and 42.
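To check the column picture for yourself, the following sketch prints mapply’s result before the outer sum is applied. It then compares that result with stacking the vectors as rows and totalling the columns with base R’s rbind and colSums. The perPosition and stacked names are hypothetical.

```r
ourFirstNonList <- c(1, 4, 7, 10, 13)
ourSecondNonList <- c(2, 5, 8, 11, 14)
ourThirdNonList <- c(3, 6, 9, 12, 15)

# One sum per position across the three vectors
perPosition <- mapply(sum, ourFirstNonList, ourSecondNonList, ourThirdNonList)
print(perPosition)  # 6 15 24 33 42

# Stack the vectors as rows and total each column for comparison
stacked <- rbind(ourFirstNonList, ourSecondNonList, ourThirdNonList)
print(colSums(stacked))
```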

We can get around this by either formatting things for mapply or simply converting the results as needed. In this case, the latter is the simplest solution, so we wrap the mapply call in another call to sum. The interior sum call through mapply generates one result per column and the outer sum adds them up into a single value, 120.
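For comparison, here is a minimal sketch of the other route, formatting the input for mapply. It assumes the simplest form of that formatting: wrapping each vector in list() so that mapply sees a single element per argument, just as in the earlier list example.

```r
ourFirstNonList <- c(1, 4, 7, 10, 13)
ourSecondNonList <- c(2, 5, 8, 11, 14)
ourThirdNonList <- c(3, 6, 9, 12, 15)

# Wrapping each vector in list() makes mapply pass the whole vector at once
ourEndValue <- mapply(
  sum,
  list(ourFirstNonList), list(ourSecondNonList), list(ourThirdNonList)
)

print(ourEndValue)
```

Both routes reach the same total, so the choice mostly comes down to which reads more clearly in your code.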
