The “x must be numeric” error from R's hist function is a data type problem and not necessarily a coding mistake. It can result from a coding mistake if you created the dataset yourself, but if you obtained the data from an outside file, the cause may lie in the data itself. If you control the data source, you can fix the problem there; if not, you need some extra code to convert the data.
The circumstances of this error.
This problem usually occurs when using the hist function to create a histogram, and you can get similar messages from other plotting functions. It occurs when the variable you are trying to plot is not of a numeric type. One way it can result from a coding mistake is by selecting the wrong column of a data frame. If you intend to access a column of numbers but instead select a column of names, or a column holding nothing but missing values, you will get this message. However, even if your code is perfect, a problem with the dataset will still produce this message.
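One way to rule out the wrong-column mistake is to look at the type of every column before plotting. The sketch below uses a made-up data frame (the names `df`, `height`, and `name` are only for illustration) and base R functions to list which columns hist could accept.

```r
# Hypothetical data frame: one column of numbers and one column of names.
df <- data.frame(
  height = c(1.62, 1.75, 1.80),
  name   = c("Ann", "Bob", "Cy"),
  stringsAsFactors = FALSE
)

# Class of every column, returned as a named character vector.
col_classes <- sapply(df, class)
print(col_classes)

# Names of the columns that are numeric and therefore safe to pass to hist().
numeric_cols <- names(df)[sapply(df, is.numeric)]
print(numeric_cols)
```

If the column you meant to plot is missing from `numeric_cols`, the issue is in the data type rather than in the hist call itself.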
What is causing this error?
Whether you are trying to make a histogram, a density plot, or another plot of numeric data, the cause of the problem is the same. The function is looking for numeric values and you are giving it something else.
# x must be numeric error in r histogram
> df = data.frame("number" = c("3.14159", "2.71828", "1.41421"), "Type" = c("r", "r", "i"))
> re = df[,1]
> hist(re)
Error in hist.default(re) : 'x' must be numeric
In this example, we start with a data frame and extract the first column to make a histogram of that column. The problem is that the hist function is looking for a numeric vector, but if you look closely you will see that the numbers are wrapped in quotes, so we are giving it text instead. Depending on your R version, the column is stored as a character vector or as a factor (the default before R 4.0.0), and neither is numeric. It does not matter how many values are affected, because a single text entry makes the whole column non-numeric. The key to understanding this error message is that the function is looking for numeric values but is getting a different type.
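To confirm that this is what is happening in your own session, you can inspect the vector and capture the error text without stopping the script. This is a minimal sketch built on the same kind of data as the example; `stringsAsFactors = FALSE` is set explicitly here so the result does not depend on the R version, and `plot = FALSE` is used only so that nothing is drawn during the check.

```r
# Numbers stored as text, as in the example above.
df <- data.frame(
  number = c("3.14159", "2.71828", "1.41421"),
  Type   = c("r", "r", "i"),
  stringsAsFactors = FALSE
)
re <- df[, 1]

# What is the column really made of?
col_class  <- class(re)       # the storage type of the vector
numeric_ok <- is.numeric(re)  # FALSE means hist() will refuse it

# Capture the error message instead of halting.
msg <- tryCatch(
  {
    hist(re, plot = FALSE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(col_class)
print(numeric_ok)
print(msg)
```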
How to fix this error.
There are two main ways that you can fix this problem. The first is only useful if you have direct access to the data, so that you can correct the mistake at the source. The second works in either case.
> df = data.frame("number" = c(3.14159, 2.71828, 1.41421), "Type" = c("r", "r", "i"))
> re = df[,1]
> hist(re)
In this case, we corrected the data frame so that the first column contains numeric values. With that change, the problem is fixed and we get a histogram.
> df = data.frame("number" = c("3.14159", "2.71828", "1.41421"), "Type" = c("r", "r", "i"))
> re = as.numeric(as.character(df[,1]))
> hist(re)
In this case, the values in the first column were left as text, but extra code was added to change them to numeric values. The as.character() function first makes sure we are working with the text itself, which matters when the column is stored as a factor and does no harm when it is already a character vector. The as.numeric() function then converts the character values to numeric values. This allows the hist function to create the histogram.
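The reason for converting through text is easiest to see with a factor. The following sketch assumes a hypothetical factor `f` holding the same three numbers; calling as.numeric() on it directly returns the internal level codes rather than the numbers themselves.

```r
# Hypothetical column stored as a factor, as R versions before 4.0.0 create by default.
f <- factor(c("3.14159", "2.71828", "1.41421"))

# Pitfall: as.numeric() on a factor returns the integer level codes.
codes <- as.numeric(f)

# Converting to character first recovers the real numbers.
values <- as.numeric(as.character(f))

print(codes)   # level codes: 3 2 1
print(values)  # the actual numbers: 3.14159 2.71828 1.41421
```

A histogram of `codes` would draw without any error but show the wrong data, which is why the conversion through as.character() is the safer habit.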
This is a fairly tricky message because if you do not have direct access to a dataset it may be hard to figure out exactly what is going on. In practice, fixing this problem is likely to involve some trial and error. However, understanding what the problem is and knowing how to check your data for the exact cause, be it a column stored as text or as a factor, a stray character string, or missing values, will help lead you to the solution for your particular case.
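When the data comes from an outside file, part of that trial and error can be replaced by asking R which entries refuse to convert. This is a rough sketch with an invented vector `raw` containing a few typical offenders; the names `converted`, `bad`, and `clean` are only for illustration.

```r
# Hypothetical raw column with some entries that are not valid numbers.
raw <- c("3.14159", "2.71828", "n/a", "", "1,41421")

# as.numeric() turns anything it cannot parse into NA (and warns about it).
converted <- suppressWarnings(as.numeric(raw))

# Entries that were present in the raw data but failed to convert.
bad <- raw[is.na(converted) & !is.na(raw)]
print(bad)

# Keep only the values that converted cleanly.
clean <- converted[!is.na(converted)]

# plot = FALSE only computes the bins; drop it to draw the chart.
h <- hist(clean, plot = FALSE)
print(h$counts)
```

Looking at `bad` usually points to the cause, such as a placeholder string, an empty cell, or a decimal comma, and tells you whether to repair the source or clean the values in code.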