The “attributes are not identical across measure variables; they will be dropped” message, which R reports as a warning, occurs when using the dplyr and tidyr packages to reshape a data frame. It appears when gather stacks columns whose attributes differ, such as factors with different levels, into a single column, and it usually signals that the gather and spread functions are not set up to work together properly.
Description of the error.
This message appears when the dplyr and tidyr packages are used to reshape a data frame containing two factors associated with a group of numbers. The goal of the analysis is to find out which numbers are associated with specific pairs of the factors. On the surface this looks like a straightforward process, but gather and spread produce the message if they are not set up to work together properly. Fixing the problem is not difficult, but it takes more steps: it uses the pivot_wider function and goes through a second data frame created specifically for this purpose. The fix is less intuitive than the original approach but much easier to get right.
Explanation of the error.
Here is an example of code that produces our message. Note that the dplyr and tidyr packages must be installed and loaded; otherwise, instead of our message, you will get an error telling you that the functions could not be found.
> library(dplyr)
> library(tidyr)
> # Three factor columns with different levels, plus one numeric column
> df = data.frame(
+   A = as.factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "d")),
+   B = as.factor(c("aa", "aa", "aa", "bb", "bb", "bb", "cc", "cc", "cc", "dd", "dd", "dd")),
+   C = as.factor(c("xx", "yy", "zz", "xx", "yy", "zz", "xx", "yy", "zz", "xx", "yy", "zz")),
+   D = c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)
+ )
> # gather() is given no columns to stack here, so it stacks A, B, C and D together
> df %>%
+   gather(A, B) %>%
+   spread(C)
Error: Must extract column with a single valid subscript.
x Subscript `var` has the wrong type `function`.
i It must be numeric or character.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
attributes are not identical across measure variables;
they will be dropped
As you can tell from this output, the code produces an error in addition to the warning we are looking for. Because gather is given no columns to stack, gather(A, B) stacks all four columns into a key column named A and a value column named B. The factors have different levels and D is numeric, so their attributes are dropped and the warning appears. Column C no longer exists after that step, so spread(C) picks up the base R function C instead of a column and fails. Mistakes like these are easy to make with gather and spread. So, although the fix we propose takes more steps, it is actually the better way to perform this task.
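To see the warning on its own, without the error from spread, the following minimal sketch stacks just two factor columns whose levels differ. The names `small`, `long`, and `warnings_seen` are invented for this illustration and are not part of the original example.

```r
library(tidyr)

# Hypothetical two-column data frame: A and B are factors with different levels.
small <- data.frame(
  A = factor(c("a", "b")),
  B = factor(c("aa", "bb"))
)

# Record any warnings raised while gather() stacks A and B into one column.
warnings_seen <- character(0)
long <- withCallingHandlers(
  gather(small, key = "column", value = "value", A, B),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(warnings_seen)      # the text of the warning, if one was raised
print(class(long$value))  # the stacked column is no longer a factor
```

The stacked column comes back as plain character because the two factors cannot share one set of levels, which is what the phrase about attributes being dropped refers to.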
How to fix the error.
Here is an example of code that fixes the problem. It does the job with a different function from the same packages, pivot_wider from tidyr.
> library(dplyr)
> library(tidyr)
> df = data.frame(
+   A = as.factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "d")),
+   B = as.factor(c("aa", "aa", "aa", "bb", "bb", "bb", "cc", "cc", "cc", "dd", "dd", "dd")),
+   C = as.factor(c("xx", "yy", "zz", "xx", "yy", "zz", "xx", "yy", "zz", "xx", "yy", "zz")),
+   D = c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)
+ )
> # Build a helper data frame whose tC column holds the future column names
> df2 = group_by(df,
+                A,
+                B,
+                C) %>%
+   mutate(rn = row_number() - 1,                        # 0 for the first row of each group
+          rn2 = ifelse(rn == 0, "", as.character(rn)),  # suffix only for repeated rows
+          tC = paste0(C, rn2)) %>%
+   ungroup() %>%
+   select(-C, -rn, -rn2)
> # Spread the values of D into one column per value of tC
> df2 %>% pivot_wider(names_from = tC,
+                     values_from = D)
# A tibble: 4 x 5
  A     B        xx    yy    zz
  <fct> <fct> <dbl> <dbl> <dbl>
1 a     aa        5    10    15
2 b     bb       20    25    30
3 c     cc       35    40    45
4 d     dd       50    55    60
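The row_number step in df2 only changes anything when the same combination of A, B, and C appears more than once. As a rough sketch of that case, the example below uses a hypothetical data frame `dup` (not part of the original data) in which one combination is repeated, and applies the same steps to it.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the combination a / aa / xx appears twice.
dup <- data.frame(
  A = factor(c("a", "a", "a")),
  B = factor(c("aa", "aa", "aa")),
  C = factor(c("xx", "xx", "yy")),
  D = c(5, 6, 10)
)

# Same steps as for df2: number the rows within each A/B/C group and
# append that number to C for every row after the first.
dup2 <- dup %>%
  group_by(A, B, C) %>%
  mutate(
    rn  = row_number() - 1,
    rn2 = ifelse(rn == 0, "", as.character(rn)),
    tC  = paste0(C, rn2)
  ) %>%
  ungroup() %>%
  select(-C, -rn, -rn2)

# The repeated combination gets its own column instead of colliding.
wide_dup <- pivot_wider(dup2, names_from = tC, values_from = D)
print(wide_dup)
```

Here the second xx row is renamed xx1, so pivot_wider has a distinct column name for each value and does not need to combine duplicates.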
This program is a little more complicated because it sets up a second data frame, but it does the job without producing our message.
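In this particular data frame every combination of A, B, and C occurs exactly once, so the suffix built for tC is always empty and tC is the same as C. Under that assumption, which would not hold for data with repeated combinations, a shorter sketch can pass the original data frame straight to pivot_wider. The name `wide` is invented for this illustration.

```r
library(tidyr)

# Same data as in the article, written more compactly with rep() and seq().
df <- data.frame(
  A = factor(rep(c("a", "b", "c", "d"), each = 3)),
  B = factor(rep(c("aa", "bb", "cc", "dd"), each = 3)),
  C = factor(rep(c("xx", "yy", "zz"), times = 4)),
  D = seq(5, 60, by = 5)
)

# Assumption: each A/B/C combination is unique, so no helper column is needed.
wide <- pivot_wider(df, names_from = C, values_from = D)
print(wide)
```

This should give the same four rows as the longer version; the helper data frame earns its keep only when duplicates are possible.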
Fixing this message takes extra work because it requires a different function and a second data frame. However, it supplies the comparison between the factors in the original data frame that we are looking for. It is an excellent example of the programming adage that a program is right if it works.