How To Fix R Error Message: some 'x' not counted; maybe 'breaks' do not span range of 'x'

While error messages are a nuisance, they let you know when there is a real problem with your code. The “some ‘x’ not counted; maybe ‘breaks’ do not span range of ‘x’” error message looks worse than it really is, and it is easy to fix.

The circumstances of this error.

This error message can occur when using the hist() function, which dispatches to hist.default() in base R's graphics package (it is not part of ggplot2). It occurs when you supply your own breaks and some of your data falls outside the range those breaks cover.

> df = read.table(text = '
+ A B C
+ 1 3 20 81
+ 2 4 34 13
+ 3 5 46 18
+ 4 6 42 16
+ 5 7 65 26
+ 6 8 71 28
+ 7 9 79 31', header=TRUE)
> hist(df$B,breaks=seq(0,70,by=1))
Error in hist.default(df$B, breaks = seq(0, 70, by = 1)) :
some ‘x’ not counted; maybe ‘breaks’ do not span range of ‘x’

As you can see in this illustration, rows 6 and 7 of column B hold 71 and 79. However, the breaks passed to hist() stop at 70. Because those two values lie beyond the last break, they cannot be placed in any bin, and that is what triggers the error.
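The same message appears when a value sits below the lowest break, not only above the highest one. The following is a minimal sketch of that case; the vector x_low is made-up data for illustration, and plot = FALSE is used only so that no graph is drawn.

```r
# Hypothetical data: one value (-5) sits below the first break (0).
x_low <- c(-5, 10, 20)

# tryCatch captures the error message as a string instead of stopping.
msg <- tryCatch(
  {
    hist(x_low, breaks = seq(0, 70, by = 1), plot = FALSE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should be the same "not counted" error shown above, because -5 cannot be placed in any bin.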

What is causing this error?

This error message occurs when the data contains values outside the range covered by the breaks, either above the largest break or below the smallest one. It is most likely to occur when you hard-code the breaks without knowing the contents of the data you will be using.

> df = read.table(text = '
+ A B C
+ 1 3 20 81
+ 2 4 34 13
+ 3 5 46 18
+ 4 6 42 16
+ 5 7 65 26
+ 6 8 71 28
+ 7 9 79 31', header=TRUE)
> hist(df$B,breaks=seq(0,70,by=1))

This illustration highlights the problem. Here, we have two values (71 and 79) that exceed the largest break (70). The result is our error message.
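When the offending values are not obvious, comparing the range of the data with the range of the breaks shows them directly. This is a small sketch using the values of column B from the example; the variable names x, breaks, and outside are chosen here for illustration.

```r
x <- c(20, 34, 46, 42, 65, 71, 79)   # column B from the example
breaks <- seq(0, 70, by = 1)

print(range(x))        # smallest and largest data values
print(range(breaks))   # smallest and largest break points

# Values that lie outside the breaks and therefore cannot be counted
outside <- x[x < min(breaks) | x > max(breaks)]
print(outside)
```

Any value listed in outside is one that hist() cannot assign to a bin.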

How to fix this error.

Fixing this error is simple. All you need to do is make sure that the breaks you pass to hist() span your data: the largest break must be at least as large as the highest value in your data, and the smallest break must be no larger than the lowest value.

> df = read.table(text = '
+ A B C
+ 1 3 20 81
+ 2 4 34 13
+ 3 5 46 18
+ 4 6 42 16
+ 5 7 65 26
+ 6 8 71 28
+ 7 9 79 31', header=TRUE)
> hist(df$B,breaks=seq(0,80,by=1))

This is easy when you know the data you are working with. If you do not know the data, then some trial and error, or a quick look at range(), should help you find the limits. If this is not an option because the program will be using more than one data source, you can find the smallest and largest values dynamically with min() and max() (no for-loop is needed in R). After this, you can add a buffer to get your range.
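One way to build the breaks from the data itself is sketched below. The helper make_breaks() and its by and buffer arguments are hypothetical names introduced for this example, not part of R; the sketch assumes numeric data with at least one finite value.

```r
# Hypothetical helper: build breaks that always cover the data.
# `by` is the bin width; `buffer` is extra room added at each end.
make_breaks <- function(x, by = 1, buffer = 0) {
  x <- x[is.finite(x)]                          # ignore NA and Inf values
  lower <- floor((min(x) - buffer) / by) * by   # round down to a multiple of `by`
  upper <- ceiling((max(x) + buffer) / by) * by # round up to a multiple of `by`
  if (upper == lower) upper <- lower + by       # guarantee at least one bin
  seq(lower, upper, by = by)
}

x <- c(20, 34, 46, 42, 65, 71, 79)   # column B from the example
b <- make_breaks(x, by = 5)
h <- hist(x, breaks = b, plot = FALSE)

# Every value was placed in a bin, so no error is raised.
print(sum(h$counts) == length(x))
```

Note that if you leave the breaks argument out altogether, hist() chooses breaks that cover the data on its own, so the error only arises when you supply explicit break points.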