How to Count The Number of Trues in R (For a Data Frame Column)

When doing data science, you often encounter Boolean values. These true and false values are usually answers to simple questions. In some situations, you will want to know how many true values a vector or data frame column contains.

Description

The R programming language offers two ways to find the number of true values in a Boolean data frame column or vector. The first has the format sum(vector), where “vector” is the name of the vector or data frame column you wish to evaluate. The second has the format table(vector)["TRUE"], where “vector” again names the vector or data frame column. In the second case, "FALSE" can be substituted for "TRUE" to get the number of false values.
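As a minimal sketch of both formats, the snippet below uses a small hand-written vector so the counts are easy to check by eye. The name `flags` and its values are made up for illustration and do not come from the examples later in this article.

```r
# Hypothetical logical vector used only for illustration.
flags <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# sum() treats TRUE as 1 and FALSE as 0, so the total is the number of TRUEs.
n_true_sum <- sum(flags)

# table() counts every distinct value; the name in brackets picks one count.
counts <- table(flags)
n_true_table <- counts["TRUE"]
n_false_table <- counts["FALSE"]

print(n_true_sum)
print(n_true_table)
print(n_false_table)
```

Both approaches should agree on the number of true values here; only the table version also reports the false count.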

Explanation

While either approach gives the number of true values, they differ in significant ways. The sum function counts only the true values, and it returns NA when the vector contains an NA value unless you pass na.rm = TRUE. The table function counts the true, false, and any other values, leaving out NA values by default, and indexing its result returns NA when the value searched for does not occur in the vector.
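The NA behavior of each function is easier to see on small inputs. The sketch below uses two invented vectors, `with_na` and `all_false`, which are assumptions for illustration only. It also shows the `na.rm` argument of sum and the `useNA` argument of table, two standard base R options for handling missing values.

```r
# Hypothetical vectors used only for illustration.
with_na <- c(TRUE, NA, FALSE, TRUE)
all_false <- c(FALSE, FALSE, FALSE)

# sum() returns NA as soon as the input contains an NA...
na_total <- sum(with_na)
# ...unless na.rm = TRUE tells it to drop missing values first.
clean_total <- sum(with_na, na.rm = TRUE)

# table() leaves NA out by default; useNA = "ifany" adds an NA count.
default_counts <- table(with_na)
full_counts <- table(with_na, useNA = "ifany")

# Indexing by a name that is absent from the table yields NA, not 0.
missing_true <- table(all_false)["TRUE"]

# sum() has no such gap: a vector with no TRUE values simply gives 0.
zero_true <- sum(all_false)

print(na_total)
print(clean_total)
print(full_counts)
print(missing_true)
print(zero_true)
```

If a count of zero is needed when no true values are present, sum is the simpler choice, since the table lookup has to be checked for NA first.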

Examples

Here is an example of each function in action using a data frame with a column of randomly generated Boolean values.

> a = c(FALSE, TRUE)
> df = data.frame(A = c(1, 2, 3, 4, 5, 6, 7),
+ B = a[as.integer(abs(rnorm(7))+1)],
+ C = c("A", "B", "C", "D", "E", "F", "G"),
+ D = as.integer(abs(rnorm(7)*10)))
> df
  A     B C  D
1 1 FALSE A 10
2 2  TRUE B 19
3 3 FALSE C  1
4 4 FALSE D 14
5 5 FALSE E  0
6 6  TRUE F 13
7 7  TRUE G  2
> x = sum(df$B)
> x
[1] 3

The example above uses the sum function. In this particular instance the column contains three true values, so sum returned a value of three.

> a = c(FALSE, TRUE)
> df = data.frame(A = c(1, 2, 3, 4, 5, 6, 7),
+ B = a[as.integer(abs(rnorm(7))+1)],
+ C = c("A", "B", "C", "D", "E", "F", "G"),
+ D = as.integer(abs(rnorm(7)*10)))
> df
  A     B C  D
1 1 FALSE A  0
2 2  TRUE B  8
3 3 FALSE C 21
4 4 FALSE D  8
5 5 FALSE E  5
6 6 FALSE F  5
7 7 FALSE G  9
> t = table(df$B)["TRUE"]
> f = table(df$B)["FALSE"]
> t
TRUE
   1
> f
FALSE
    6

The example above uses the table function. In this particular instance it returned one and six for the number of true and false values, respectively.

Application

One application of these functions is tallying a vote. With the sum function, the values need to be true or false. The table function, however, can count any value you put in the brackets, including candidate names in an election. This makes it particularly useful whenever you need the number of occurrences of a particular value within a data frame column or vector, as with poll results.
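A vote tally along these lines might look like the sketch below. The ballots and candidate names are invented for illustration. The last step is an assumption about how you might adapt sum to the same job: a comparison first turns the names into true and false values, which sum can then count.

```r
# Hypothetical ballots; the candidate names are invented for illustration.
votes <- c("Alice", "Bob", "Alice", "Carol", "Alice", "Bob")

# table() tallies every candidate in one call.
tally <- table(votes)
print(tally)

# Pull out a single candidate's count by putting the name in the brackets.
alice_votes <- tally["Alice"]
print(alice_votes)

# sum() can count one candidate as well, once a comparison has turned
# the character vector into TRUE/FALSE values.
alice_votes_sum <- sum(votes == "Alice")
print(alice_votes_sum)
```

The table call is the better fit when you want every candidate's total at once, while the comparison with sum is enough when only one candidate matters.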

When you are looking to tally up the number of true values within a data frame column, these two functions come in handy. Each of these functions has its advantages and disadvantages, and the usefulness of each one depends upon the situation.
