Averages are an important part of data science because they can tell you a lot about what is going on in a dataset. Being able to find the mean value of each row in a data set is useful, and doing it by hand can be tedious. R has a single function that does this for you.
To find row means in R, use the rowMeans function, which has the form rowMeans(data_set) and returns the mean value of each row in the data set. It has several optional parameters, including the logical na.rm parameter, which tells the function whether to omit NA (missing) values before averaging.
    # dataset - rowMeans in R example
    > a = c(1:5)
    > b = c(1:5*2)
    > c = c(1:5*3)
    > d = c(1:5/2)
    > e = c(1:5/4)
    >
    > x = data.frame(a,b,c,d,e)
    >
    > x
      a  b  c   d    e
    1 1  2  3 0.5 0.25
    2 2  4  6 1.0 0.50
    3 3  6  9 1.5 0.75
    4 4  8 12 2.0 1.00
    5 5 10 15 2.5 1.25
    > # using rowMeans to get the mean of each row in R
    > rowMeans(x)
    1.35 2.70 4.05 5.40 6.75
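To see what the na.rm parameter does, here is a minimal sketch. The matrix `m` and its values are made up for illustration and are not part of the data set above.

```r
# Illustrative matrix with one missing value (hypothetical data)
m <- matrix(c(1, 2, NA,
              4, 5, 6), nrow = 2, byrow = TRUE)

# Default (na.rm = FALSE): a row containing NA gets NA as its mean
default_means <- rowMeans(m)

# na.rm = TRUE: NA values are dropped and the mean uses what is left
clean_means <- rowMeans(m, na.rm = TRUE)

print(default_means)  # first row is NA, second row is 5
print(clean_means)    # first row is 1.5 (mean of 1 and 2), second row is 5
```

Note that with na.rm = TRUE each row is averaged over a different number of values when the missing entries are unevenly spread, which may or may not be what you want.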
rowMeans has many applications in R: it lets you average values across categories in a data set. One benefit is being able to determine whether a given value is above or below the mean. The example below uses the number of telephones in various parts of the world in different years.
    # data for rowMeans in R example
    > head(WorldPhones)
         N.Amer Europe Asia S.Amer Oceania Africa Mid.Amer
    1951  45939  21574 2876   1815    1646     89      555
    1956  60423  29990 4708   2568    2366   1411      733
    1957  64721  32510 5230   2695    2526   1546      773
    1958  68484  35218 6662   2845    2691   1663      836
    1959  71799  37598 6856   3000    2868   1769      911
    1960  76036  40341 8220   3145    3054   1905     1008
    > # using rowMeans in R to get the average across regions for the dataset
    > # granted, this would work better if we used per-capita data
    > rowMeans(WorldPhones)
        1951     1956     1957     1958     1959     1960     1961
    10642.00 14599.86 15714.43 16914.14 17828.71 19101.29 20242.86
Once the rowMeans function is applied to this data set, we get the mean value across regions for each year. For example, in 1951, North America was above the mean number of telephones, while Africa was well below it.
rowMeans is a useful tool for finding the mean value of each row in a data set, and it is one of the many useful tools offered by R.
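The above/below comparison can be sketched in code. This is one possible approach, not the only one: it uses sweep() to subtract each year's mean from that year's row. The variable names are chosen for illustration.

```r
# WorldPhones is a matrix that ships with R (datasets package)
year_means <- rowMeans(WorldPhones)

# Subtract each year's mean from every region's count for that year.
# MARGIN = 1 works row by row; the default operation of sweep() is "-".
diff_from_mean <- sweep(WorldPhones, 1, year_means)

# TRUE where a region is above that year's cross-region mean
above_mean <- diff_from_mean > 0

print(above_mean["1951", c("N.Amer", "Africa")])
```

For 1951 this flags North America as above the mean and Africa as below it, matching the reading of the table above.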
There are a couple of errors you can run into with this function. For example, the R rowMeans() function isn’t tolerant of non-numeric data, and missing values produce NA results unless you set na.rm = TRUE. You can easily generate lovely errors such as…
Error in rowMeans(x, na.rm = TRUE) : ‘x’ must be numeric
Should this lovely fail-whale appear, the cause is simple enough. Check the data you’ve fed into your process. Something in there isn’t numeric, and the rowMeans function throws a little tantrum to communicate that to you. My best suggestion is to filter the non-numeric or incorrect data out of your data and proceed from there.
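One way to do that filtering is to keep only the numeric columns before calling rowMeans. The sketch below assumes the offending values sit in a whole non-numeric column; the data frame `df` and its column names are hypothetical.

```r
# Hypothetical data frame with one non-numeric column
df <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1, 2, 3),
  y  = c(4, 5, 6),
  stringsAsFactors = FALSE
)

# rowMeans(df) would stop with "'x' must be numeric" because of the id column

# Find the numeric columns, then average across only those
numeric_cols <- sapply(df, is.numeric)
row_avgs <- rowMeans(df[, numeric_cols, drop = FALSE])

print(row_avgs)  # 2.5 3.5 4.5
```

The drop = FALSE argument keeps the result a data frame even when only one numeric column remains, which avoids the second error described below.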
You may also get:
Error in rowMeans: ‘x’ must be an array of at least two dimensions
This occurs when you feed a vector (a one-dimensional series of values) into a function that expects an array with at least two dimensions.
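A short sketch of how this error tends to arise and two ways around it. The values are illustrative. A common trigger is selecting a single row of a matrix, which R silently simplifies to a vector.

```r
v <- c(2, 4, 6, 8)

# rowMeans(v) would fail here: a plain vector has no dimensions

# If you just want the average of one vector, mean() is enough
vector_mean <- mean(v)

# If the vector really is a single row, give it dimensions first
as_row <- matrix(v, nrow = 1)
single_row_mean <- rowMeans(as_row)

# When subsetting one row of a matrix, drop = FALSE keeps it two-dimensional
m <- matrix(1:6, nrow = 2)
one_row <- m[1, , drop = FALSE]   # still a 1 x 3 matrix
one_row_mean <- rowMeans(one_row)

print(c(vector_mean, single_row_mean, one_row_mean))  # 5 5 3
```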
Related Functions & Broader Usage
There are several functions designed to help you calculate the total and average value of columns and rows in R. In addition to rowMeans, this family of functions includes colMeans, rowSums, and colSums. Here are some specifics on where you use them…
- colMeans – calculates the mean of each column in R.
- colSums – sums each column in R.
- rowSums – sums each row in R.
These functions are extremely useful when you’re doing advanced matrix manipulation or implementing a statistical function in R. They form the building blocks of many basic statistical operations and linear algebra procedures. This is why you sometimes see an error message from this cluster of functions show up as part of a higher-level package.
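The related functions are called the same way as rowMeans. Here is a brief sketch that reuses the a, b, and c columns from the first example:

```r
# Same a, b, c columns as the first example in this article
x <- data.frame(a = 1:5, b = 1:5 * 2, c = 1:5 * 3)

col_means <- colMeans(x)  # mean of each column
col_sums  <- colSums(x)   # sum of each column
row_sums  <- rowSums(x)   # sum of each row

print(col_means)  # a = 3, b = 6, c = 9
print(col_sums)   # a = 15, b = 30, c = 45
print(row_sums)   # 6 12 18 24 30
```

All four functions accept the same na.rm argument as rowMeans.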
In the event you need them, there are also add-on functions such as rowMedians (the median of each row in R) and rowSds (the standard deviation of each row in R), which come from packages such as matrixStats rather than base R. Given the existence of the above, be sure to do a quick search of the various R packages if you need anything more exotic, since it most likely exists…
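If you would rather not install a package, base R's apply() can compute row medians and row standard deviations. This is a substitute technique, not the package functions mentioned above, and the matrix is made up for illustration.

```r
# Illustrative matrix (hypothetical values)
m <- matrix(c(1, 2, 9,
              4, 5, 6), nrow = 2, byrow = TRUE)

# MARGIN = 1 applies the function to each row
row_medians <- apply(m, 1, median)
row_sds     <- apply(m, 1, sd)

print(row_medians)  # 2 5
print(row_sds)      # second row is 1; first row is sqrt(19)
```

apply() is generally slower than dedicated row functions on large matrices, so a package may still be worth it for heavy workloads.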
If you are looking to calculate row means by group, check out the aggregate function (one of the items we addressed in our article about descriptive statistics).
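Here is a minimal sketch of one way to combine the two. It assumes "row means by group" means averaging each row first and then averaging those row means within each group. The data frame `scores` and its columns are hypothetical.

```r
# Hypothetical data: two measurements per row plus a grouping column
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  x     = c(1, 3, 5, 7),
  y     = c(2, 4, 6, 8)
)

# Step 1: mean across the numeric columns for each row
scores$row_mean <- rowMeans(scores[, c("x", "y")])

# Step 2: average those row means within each group
by_group <- aggregate(row_mean ~ group, data = scores, FUN = mean)

print(by_group)  # group a: 2.5, group b: 6.5
```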