Bar plots are a common way of presenting data in data science, and R provides the tools to make them. The error message discussed here occurs when you pass the barplot function a data frame rather than a vector or a matrix. Fixing it involves giving the function a numeric vector.
Description of the error
The barplot function accepts either a numeric vector or a numeric matrix for the height of the bars. Consequently, if you pass it a data frame or a non-numeric vector, you will get the “‘height’ must be a vector or a matrix” error message. The easiest way to trigger this error is to pass a data frame that holds both the heights and their labels. You can also trigger it by passing a non-numeric column, such as the one containing the labels. To get the labels on your barplot, you need to use a named vector. A named vector takes an extra step to set up, but it does produce a better-looking chart. The labels can come from either a character vector or a logical vector, so it is easy to get good-looking bar plots.
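As a quick illustration of the two accepted input types, the sketch below passes a numeric vector and then a numeric matrix to barplot. The values and the object names (heights_vec, heights_mat) are invented for this example and do not come from the data used later in the article.

```r
# Hypothetical heights, invented for illustration
heights_vec <- c(3, 4, 10, 7)

# A numeric vector gives one bar per element
mids_vec <- barplot(heights_vec)

# A numeric matrix gives one stacked bar per column by default
heights_mat <- matrix(c(3, 4, 10, 7), nrow = 2,
                      dimnames = list(c("r1", "r2"), c("A", "B")))
mids_mat <- barplot(heights_mat)
```

barplot invisibly returns the midpoints of the bars it drew, which is why the results are captured here: four midpoints for the vector and two for the two-column matrix.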
Explanation of the error
The “‘height’ must be a vector or a matrix” error message is caused by passing the wrong data type to the barplot function. In the example given here, we use a data frame that contains both the numeric values and the labels. This is not the proper format for including labels on your barplot.
> t = as.numeric(Sys.time())
> set.seed(t)
> df = data.frame(x = LETTERS[1:7],
+ y = as.integer(abs(rnorm(7)*10)))
> df
x y
1 A 3
2 B 4
3 C 10
4 D 7
5 E 14
6 F 1
7 G 4
> barplot(df)
Error in barplot.default(df) : ‘height’ must be a vector or a matrix
In this example of creating the error message, the labels are the letters in the x column and the heights are the numeric values in the y column. A data frame is not a format that the function can read, so it produces an error message. To fix this error, you need to pass the y column as a numeric vector, and make it a named vector if you want to include the labels.
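If you are unsure what you are about to pass to barplot, a few base R checks make the problem visible before the call fails. The sketch below uses the same column layout as the example above, but with fixed values instead of random ones so that it is reproducible; the variable msg is just a name chosen for this illustration.

```r
# Same layout as the data frame above, with fixed values for illustration
df <- data.frame(x = LETTERS[1:7],
                 y = c(3L, 4L, 10L, 7L, 14L, 1L, 4L))

class(df)          # "data.frame"
is.vector(df)      # FALSE
is.matrix(df)      # FALSE

is.numeric(df$y)   # TRUE
is.vector(df$y)    # TRUE

# Capture the error message instead of stopping the script
msg <- tryCatch(barplot(df), error = function(e) conditionMessage(e))
msg
```

The data frame as a whole is neither a vector nor a matrix, while its y column on its own is a plain numeric vector, which is exactly what the height argument expects.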
How to fix the error
Here are three examples of fixing this error message. It is important to note that the barplot function will not pick up labels from a column of a data frame. It takes them from the names of a vector, or from the column names of a matrix.
> t = as.numeric(Sys.time())
> set.seed(t)
> df = data.frame(x = LETTERS[1:7],
+ y = as.integer(abs(rnorm(7)*10)))
> df
x y
1 A 9
2 B 2
3 C 4
4 D 7
5 E 15
6 F 11
7 G 3
> barplot(df$y)
In this first fix, we simply pass the y column as a numeric vector to provide the height of each bar. Because the vector has no names, the bars are drawn without labels.
> t = as.numeric(Sys.time())
> set.seed(t)
> df = data.frame(x = LETTERS[1:7],
+ y = as.integer(abs(rnorm(7)*10)))
> df
x y
1 A 1
2 B 7
3 C 7
4 D 3
5 E 23
6 F 12
7 G 15
> df_bar = df$y
> names(df_bar) = df$x
> df_bar
A B C D E F G
1 7 7 3 23 12 15
> barplot(df_bar)
In this example, we combine the x and y columns into a named vector. In a named vector, each element carries its own label, much like a column name attached to a single value, and barplot prints that label under the corresponding bar.
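The two-step construction above can also be written in one step with setNames, and barplot additionally accepts labels through its names.arg argument. Neither variant is used in the article's own examples; the sketch below only shows, with fixed values in place of the random ones, that they lead to the same labelled chart.

```r
# Same layout as the data frame above, with fixed values for illustration
df <- data.frame(x = LETTERS[1:7],
                 y = c(1L, 7L, 7L, 3L, 23L, 12L, 15L))

# One-step equivalent of assigning names() afterwards
df_bar <- setNames(df$y, as.character(df$x))
names(df_bar)   # "A" "B" "C" "D" "E" "F" "G"

mids_named <- barplot(df_bar)

# Alternative: leave the heights unnamed and pass the labels via names.arg
mids_arg <- barplot(df$y, names.arg = as.character(df$x))
```

The as.character call is a precaution: in R versions before 4.0, data.frame stores character columns as factors by default, and converting explicitly keeps the labels predictable either way.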
> t = as.numeric(Sys.time())
> set.seed(t)
> df = data.frame(x = c(TRUE, FALSE),
+ y = as.integer(abs(rnorm(2)*10)))
> df
x y
1 TRUE 3
2 FALSE 6
> df_bar = df$y
> names(df_bar) = df$x
> df_bar
TRUE FALSE
3 6
> barplot(df_bar)
Here is an example that uses a logical vector instead of a character vector for the labels. The barplot function works with logical values as labels just as well as with character values.
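The reason logical labels work is that names are always stored as character strings, so the logical values are converted when they are assigned. A minimal sketch, using made-up heights and a hypothetical variable called heights:

```r
# Hypothetical heights, invented for illustration
heights <- c(3L, 6L)

# Assigning logical values as names converts them to character strings
names(heights) <- c(TRUE, FALSE)

names(heights)                # "TRUE" "FALSE"
is.character(names(heights))  # TRUE

mids <- barplot(heights)
```

By the time barplot sees the vector, the labels are the strings "TRUE" and "FALSE", so they are handled exactly like the letter labels in the previous example.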
Although this error message is very easy to trigger, it is also easy to fix. Getting the best results does involve creating a named vector from the data frame, but this is an extremely simple process, and the extra step makes for a plot that is easier to understand.