Sometimes, in data science, it is necessary to sum up the rows of a data set. In many programming languages this would be a tedious task, particularly with large data sets. Here is a case where R comes to the rescue: it has a function that does the job in a single line of code.
The rowSums Function.
Summing rows in R is done with the rowSums function, which takes the form rowSums(x) and returns the sum of each row in the data set. It accepts some additional parameters, the most useful of which is the logical parameter na.rm, which tells the function whether to skip NA (missing) values.
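To see what na.rm changes, here is a minimal sketch. The data frame m is made up for illustration and is not part of the examples that follow.

```r
# Hypothetical data with one missing value in row 2
m <- data.frame(a = c(1, 2, 3), b = c(4, NA, 6))

# Default (na.rm = FALSE): any row containing an NA sums to NA
default_sums <- rowSums(m)

# na.rm = TRUE drops the missing values before summing each row
clean_sums <- rowSums(m, na.rm = TRUE)

print(default_sums)
print(clean_sums)
```

With na.rm = TRUE the second row is summed from its remaining value only, so decide whether a partial total is really what you want before switching it on.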
# data for rowsums in R examples
> a = c(1:5)
> b = c(1:5*2)
> c = c(1:5*3)
> d = c(1:5/2)
> e = c(1:5/4)
>
> x = data.frame(a,b,c,d,e)
>
> x
  a  b  c   d    e
1 1  2  3 0.5 0.25
2 2  4  6 1.0 0.50
3 3  6  9 1.5 0.75
4 4  8 12 2.0 1.00
5 5 10 15 2.5 1.25
>
# rowsum in R example / results
> rowSums(x)
[1] 6.75 13.50 20.25 27.00 33.75
If you manually add each row together, you will see that the totals match the numbers returned by rowSums in one simple step.
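If you would rather not do that arithmetic by hand, a quick check along these lines should confirm it. This is only a sketch: it rebuilds the same data frame x, and the variable names manual_first, totals, and via_apply are my own.

```r
# Rebuild the example data frame used above
x <- data.frame(a = 1:5, b = 1:5 * 2, c = 1:5 * 3, d = 1:5 / 2, e = 1:5 / 4)

# Add the first row by hand
manual_first <- 1 + 2 + 3 + 0.5 + 0.25

# Compare with the first value returned by rowSums
totals <- rowSums(x)
first_matches <- isTRUE(all.equal(unname(totals[1]), manual_first))
print(first_matches)

# apply() with sum over rows (MARGIN = 1) is the long-hand equivalent
via_apply <- apply(x, 1, sum)
print(isTRUE(all.equal(unname(totals), unname(via_apply))))
```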
Applications of The rowSums Function.
The applications for rowSums in R are numerous; being able to easily add up all the rows in a data set provides a lot of useful information. In the example below, we have the number of phones (in thousands) in different parts of the world in different years. By adding up the rows, you get the total number of phones worldwide for each year.
# rowsums in R - phone data
> head(WorldPhones)
     N.Amer Europe Asia S.Amer Oceania Africa Mid.Amer
1951  45939  21574 2876   1815    1646     89      555
1956  60423  29990 4708   2568    2366   1411      733
1957  64721  32510 5230   2695    2526   1546      773
1958  68484  35218 6662   2845    2691   1663      836
1959  71799  37598 6856   3000    2868   1769      911
1960  76036  40341 8220   3145    3054   1905     1008
>
# rowsums in R example / results
> rowSums(WorldPhones)
  1951   1956   1957   1958   1959   1960   1961
 74494 102199 110001 118399 124801 133709 141700
Summing the rows of a data set is extremely useful and simple to do. This single function returns the sums for all the rows of the data set being worked on, and the amount of work it saves makes it a good example of the power of R.
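You are not limited to summing every column. As a rough sketch of one variation, the snippet below totals only the three American regions in WorldPhones and sets that next to the worldwide total. The names americas and summary_tbl are just illustrative choices.

```r
# WorldPhones is a matrix that ships with base R (datasets package)
totals <- rowSums(WorldPhones)

# Sum only selected columns, here the three American regions
americas <- rowSums(WorldPhones[, c("N.Amer", "S.Amer", "Mid.Amer")])

# Put both results side by side, one row per year
summary_tbl <- cbind(Americas = americas, World = totals)
print(head(summary_tbl))
```

Because WorldPhones has year labels as row names, both results keep those labels, which makes the combined table easy to read.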
Potential Errors
There are a couple of potential errors you can run into with this function. For example, the R rowSums() function isn’t tolerant of non-numeric data. (Missing values do not raise an error, but any row containing one sums to NA unless you set na.rm = TRUE.) You can easily generate lovely errors such as…
Error in rowSums(x, na.rm = TRUE) : 'x' must be numeric
Should this lovely fail-whale appear, the cause is simple enough. Check the data you’ve fed into your process. Something in there isn’t numeric, and the rowSums function throws a little tantrum to communicate that to you. My best suggestion is to filter the non-numeric columns or incorrect data points out of your data and proceed from there.
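One way to do that filtering, sketched under the assumption that the offending values sit in whole non-numeric columns (the data frame df and its id column are hypothetical):

```r
# Hypothetical data frame mixing a text column with numeric columns
df <- data.frame(id = c("r1", "r2", "r3"),
                 a = c(1, 2, 3),
                 b = c(10, 20, 30))

# rowSums(df) would fail with "'x' must be numeric" because of the id column,
# so flag the numeric columns and keep only those before summing
numeric_cols <- sapply(df, is.numeric)
safe_sums <- rowSums(df[, numeric_cols, drop = FALSE])
print(safe_sums)
```

If the problem is a numeric column that was read in as text, converting it with as.numeric() may be a better fix than dropping it, but that depends on your data.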
You may also get:
Error in rowSums(x) : 'x' must be an array of at least two dimensions
This occurs when you feed a vector (a one-dimensional series of values) into a function that expects an array with at least two dimensions.
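A common way to hit this by accident is selecting a single column from a matrix, which R quietly turns into a vector. The sketch below uses a made-up matrix m to show one way around it.

```r
# Hypothetical 2 x 3 matrix
m <- matrix(1:6, nrow = 2)

# m[, 1] is a plain vector, so rowSums(m[, 1]) would raise the
# "at least two dimensions" error. drop = FALSE keeps the matrix shape.
one_col <- m[, 1, drop = FALSE]
one_col_sums <- rowSums(one_col)
print(one_col_sums)

# For a genuine vector, sum() is the right tool
v <- c(1, 2, 3)
print(sum(v))
```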
Related Functions & Broader Usage
There are several functions designed to help you calculate the total and average value of columns and rows in R. In addition to rowSums, this family of functions includes rowMeans, colSums, and colMeans. Note that R is case-sensitive, so the capital letter matters. Here are some specifics on where you use them…
- colMeans – calculate the mean of each column in R
- colSums – sum each column in R
- rowSums – sum each row in R
These functions are extremely useful when you’re doing advanced matrix manipulation or implementing a statistical function in R. They form the building blocks of many basic statistical operations and linear algebra procedures. This is why you sometimes see an error message from this cluster of functions show up as part of a higher-level package.
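Here is a short sketch of the rest of the family, run on the same small data frame x from earlier in the article (rebuilt so the snippet stands alone):

```r
# Same example data as in the rowSums section
x <- data.frame(a = 1:5, b = 1:5 * 2, c = 1:5 * 3, d = 1:5 / 2, e = 1:5 / 4)

row_means <- rowMeans(x)  # mean of each row
col_sums  <- colSums(x)   # total of each column
col_means <- colMeans(x)  # mean of each column

print(row_means)
print(col_sums)
print(col_means)

# A row mean is just the row sum divided by the number of columns
print(all.equal(row_means, rowSums(x) / ncol(x)))
```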
In the event you need them, there are also functions such as rowMedians (solves for the median of each row in R) and rowSds (solves for the standard deviation of each row in R). These are not part of base R; they come from add-on packages such as matrixStats. Given the existence of the above, be sure to do a quick search of the various R packages if you need anything more exotic, since it most likely exists…
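If you would rather not install a package, base R's apply() can stand in for row-wise medians and standard deviations. This is a sketch of that fallback rather than the packaged functions themselves, and it will generally be slower on large data.

```r
# Same example data as in the rowSums section
x <- data.frame(a = 1:5, b = 1:5 * 2, c = 1:5 * 3, d = 1:5 / 2, e = 1:5 / 4)

# MARGIN = 1 applies the function to each row
row_medians <- apply(x, 1, median)
row_sds     <- apply(x, 1, sd)

print(row_medians)
print(row_sds)
```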
If you are looking to solve for rowMeans or rowSums by group, check out the aggregate function (one of the items we addressed in our article about descriptive statistics).
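As a rough illustration of the by-group idea, the sketch below computes a per-row total with rowSums and then sums those totals within each group using aggregate. The data frame df and its columns group, v1, and v2 are invented for the example.

```r
# Hypothetical data: two measurements per row plus a group label
df <- data.frame(group = c("a", "a", "b", "b"),
                 v1 = c(1, 2, 3, 4),
                 v2 = c(10, 20, 30, 40))

# Step 1: per-row total across the measurement columns
df$total <- rowSums(df[, c("v1", "v2")])

# Step 2: sum those row totals within each group
by_group <- aggregate(total ~ group, data = df, FUN = sum)
print(by_group)
```

Swapping FUN = sum for FUN = mean gives the average row total per group instead.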