Convert Character to Numeric in R

Some coder humor: character strings are like opinions – almost every data source has them. More flexible than a numeric variable, they’re a great holding pen for data when you don’t yet know what you’re going to use it for. The good news? We’ve got all that data available… the bad news? To unlock that treasure trove, we’re going to need to be able to change character to numeric in R. This tutorial will help you do this.

Meet the character data type:

Characters are the default data type used to store text data in many languages. They happily accept both numbers and letters. In fact, if you’re the type who likes categorical variables (and who doesn’t? I love them), those are rarely anything other than a character string.

The same applies if you’re going to do factor variable analysis. Most factor columns start out as character strings, and R keeps those strings as the factor’s labels (its levels).

The catch of all this wonderful fun? When you store a value in a character variable, R treats everything in it as text. So that 1 gets stored as the character “1”.

Why is this a problem? Because while 1 + 1 = 2, “1” + “1” yields the far more sought after:

Error in "1" + "1" : non-numeric argument to binary operator

In essence, R refuses to add the two character strings together, throws the error above, and your script immediately halts and catches fire. Which delays finishing up your work and grabbing a beer. A very bad outcome indeed.
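Here is a minimal sketch of that failure in base R. The variable names `x`, `y`, `result` and `total` are made up for illustration, and tryCatch() is only there so the error is captured as text instead of stopping the script. The last step uses as.numeric(), the conversion function covered later in this tutorial.

```r
# Two values that look like numbers but are stored as character strings
x <- "1"
y <- "1"
print(class(x))  # "character"

# Arithmetic on character strings raises an error; tryCatch() captures the
# error message here so the rest of the script keeps running
result <- tryCatch(x + y, error = function(e) conditionMessage(e))
print(result)

# Converting both values first makes the addition work
total <- as.numeric(x) + as.numeric(y)
print(total)  # 2
```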

Hopefully this shows the utility of learning how to convert character string variables in your dataframe into a numeric value. You’ll need this for any statistical analysis or numeric modeling.

How to Convert a Character to a Numeric in R?

We’ll start by creating our character data. I draw 100 random values from the set 1, 3, 6 and 9, and use the as.character() command to store them as characters in the variable ‘myData’.

> myData <- as.character(sample(c(1, 3, 6, 9), 100, replace = TRUE))

If I now print myData, this is what I see.

# how to convert character to numeric in r
> myData
  [1] "9" "3" "9" "1" "3" "9" "3" "9" "1" "1" "3" "6" "3" "6" "1"
 [16] "6" "9" "6" "3" "9" "3" "1" "6" "3" "3" "9" "3" "6" "1" "9"
 [31] "1" "6" "3" "1" "1" "6" "1" "9" "9" "1" "1" "6" "3" "6" "1"
 [46] "3" "3" "3" "9" "9" "1" "3" "6" "6" "3" "9" "6" "9" "9" "6"
 [61] "3" "3" "3" "3" "6" "6" "3" "3" "3" "9" "1" "9" "1" "6" "1"
 [76] "1" "9" "1" "9" "1" "1" "3" "6" "6" "6" "6" "3" "1" "3" "1"
 [91] "3" "3" "6" "3" "6" "6" "1" "3" "3" "9"

An important observation is that the data values have been stored as characters, which is why you can see all the data values enclosed in inverted commas. This kind of data has its limitations. Suppose you wanted to add 5 to each observation in this data. Let’s try that in R.

> 5 + myData
Error in 5 + myData : non-numeric argument to binary operator

I get an error when I try to perform computations on the character variable. However, we’ll now convert these data values into numeric. Probably one of the easiest ways to do this in R is by using the as.numeric() command. It also works on other data types, such as logical values, although factors need extra care (more on that below). It does not come as part of a package; it is a native R function that you can use directly.

> NumericalData <- as.numeric(myData)

The variable ‘NumericalData’ now stores the values of myData as numeric values. It’s as simple as that.

You can verify this by printing the variable.

> NumericalData
  [1] 9 3 9 1 3 9 3 9 1 1 3 6 3 6 1 6 9 6 3 9 3 1 6 3 3 9 3 6 1 9 1
 [32] 6 3 1 1 6 1 9 9 1 1 6 3 6 1 3 3 3 9 9 1 3 6 6 3 9 6 9 9 6 3 3
 [63] 3 3 6 6 3 3 3 9 1 9 1 6 1 1 9 1 9 1 1 3 6 6 6 6 3 1 3 1 3 3 6
 [94] 3 6 6 1 3 3 9

The values of ‘NumericalData’ are not enclosed in inverted commas, which shows that these are numeric values. We can now also perform computational tasks on this data.

> 5 + NumericalData
  [1] 14  8 14  6  8 14  8 14  6  6  8 11  8 11  6 11 14 11  8 14
 [21]  8  6 11  8  8 14  8 11  6 14  6 11  8  6  6 11  6 14 14  6
 [41]  6 11  8 11  6  8  8  8 14 14  6  8 11 11  8 14 11 14 14 11
 [61]  8  8  8  8 11 11  8  8  8 14  6 14  6 11  6  6 14  6 14  6
 [81]  6  8 11 11 11 11  8  6  8  6  8  8 11  8 11 11  6  8  8 14
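Missing quote marks in the printed output are a handy visual cue, but you can also ask R directly. The sketch below repeats the steps above and checks the result with class() and is.numeric(). The set.seed(42) call is my own addition so the random sample is reproducible; your values will differ from the printout above.

```r
set.seed(42)  # assumed seed, only so the random sample is reproducible

# Same steps as in the article: character data, then the numeric conversion
myData <- as.character(sample(c(1, 3, 6, 9), 100, replace = TRUE))
NumericalData <- as.numeric(myData)

print(class(myData))              # "character"
print(class(NumericalData))       # "numeric"
print(is.numeric(NumericalData))  # TRUE

# Every converted value should still match its original text label
print(all(as.character(NumericalData) == myData))  # TRUE
```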

Convert Entire Data Frame to Numeric

Previously we worked with a single character variable, but now that you have some basic understanding of how numeric conversion works, I can extend the discussion to data frames that use multiple columns of characters. I’ll begin by creating a data frame.

> a <- c("1", "3", "5", "8")                      
> b <- c("12", "13", "15", "19") 
> c <- as.factor(c("25", "30", "30", "31")) 
> d <- c(1, 12, 13, 27)
> myData2 <- data.frame(a, b, c, d, stringsAsFactors = FALSE)

Printing ‘myData2’ gives us the data frame that you can see below.

[Image: the printed myData2 data frame, with columns a, b, c and d]
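Before converting anything, it helps to see how each column is actually stored. This small sketch rebuilds the same data frame (I pass the columns as named arguments rather than creating separate variables, which is just a stylistic choice) and reports the class of every column.

```r
# Rebuild the example data frame: two character columns, one factor,
# and one column that is already numeric
myData2 <- data.frame(
  a = c("1", "3", "5", "8"),
  b = c("12", "13", "15", "19"),
  c = as.factor(c("25", "30", "30", "31")),
  d = c(1, 12, 13, 27),
  stringsAsFactors = FALSE
)

# Class of each column: a and b are "character", c is "factor", d is "numeric"
column_classes <- sapply(myData2, class)
print(column_classes)
```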

Using the as.numeric() command again, we’ll convert the columns of this data set that have been stored as characters, i.e., columns a and b. Notice that column c is not a character column; it is a factor. The method here converts the factor to numeric as well, because apply() first turns the data frame into a character matrix, so as.numeric() sees the factor’s labels rather than its internal codes. For more on this, I encourage you to read about the conversion of factors to numeric.

> NumericalData2 <- as.data.frame(apply(myData2, 2, as.numeric))

as.numeric() cannot be applied to a whole data frame directly, which is why the code above uses apply() to run it on each column (the 2 means “by column”) and as.data.frame() to turn the result back into a data frame. To learn more about the syntax, I encourage you to read the R documentation on data frames. Using the above code, we can convert all columns into numeric. You can verify this using the sapply() function; the code below gives the class of each column in our data frame.

> sapply(NumericalData2, class)  

Therefore, we have converted all character columns of the data frame into numeric.
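For comparison, here is an alternative sketch that converts column by column with lapply() instead of going through a matrix. It also shows why factors deserve care: calling as.numeric() directly on a factor returns its internal level codes, not the labels. The helper `to_numeric` and the name `NumericalData3` are hypothetical and not part of the original code.

```r
# Rebuild the example data frame from the article
myData2 <- data.frame(
  a = c("1", "3", "5", "8"),
  b = c("12", "13", "15", "19"),
  c = as.factor(c("25", "30", "30", "31")),
  d = c(1, 12, 13, 27),
  stringsAsFactors = FALSE
)

# Directly converting a factor gives its internal level codes
print(as.numeric(myData2$c))                # 1 2 2 3
# Going through as.character() recovers the labels as numbers
print(as.numeric(as.character(myData2$c)))  # 25 30 30 31

# Hypothetical helper: convert one column, routing factors through character
to_numeric <- function(col) {
  if (is.factor(col)) col <- as.character(col)
  as.numeric(col)
}

# lapply() works column by column; assigning into [] keeps the data frame shape
NumericalData3 <- myData2
NumericalData3[] <- lapply(myData2, to_numeric)

print(sapply(NumericalData3, class))  # every column reports "numeric"
```

One reason you might prefer this shape: each column is handled on its own, so the helper is a natural place to add per-column rules later.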

Troubleshooting:

At some point, you’re going to encounter data encoded with leading zeros, such as “007”. These strings can be easily converted into a numeric column or vector using the as.numeric() function above, and as.numeric() will drop the leading zeros as part of the change. Keep in mind that as.numeric() returns decimal (double) values; if you need whole numbers, as.integer() is the alternative, but it truncates anything after the decimal point.
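A short sketch of both points, using made-up ID strings. The sprintf() line at the end is an optional extra, in case you need to print the zero-padded form again after doing your arithmetic.

```r
# Made-up example values with leading zeros
ids <- c("007", "042", "100")

ids_num <- as.numeric(ids)  # 7 42 100, stored as double
ids_int <- as.integer(ids)  # 7 42 100, stored as integer
print(class(ids_num))       # "numeric"
print(class(ids_int))       # "integer"

# Decimal strings: as.numeric() keeps the fraction, as.integer() truncates it
print(as.numeric("3.90"))   # 3.9
print(as.integer("3.90"))   # 3

# Optional: restore the zero-padded text form for display
padded <- sprintf("%03d", ids_int)
print(padded)               # "007" "042" "100"
```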

Unicode character issues are beyond the scope of this article. You can often encounter these if you’re working with international data sources, particularly ones which don’t use American or EU standard systems. Consult Stack Overflow (generally someone has posted a question for that specific data type and system).

Pay careful attention to missing values in the conversion process. If the conversion function can’t make sense of a character string, it returns NA for that element and warns “NAs introduced by coercion”.
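One way to track down the culprits is to convert first and then look at which original strings turned into NA. The input values below are invented for illustration, and suppressWarnings() is used only to keep the demo output quiet.

```r
# Invented messy input: some values convert cleanly, some do not
raw <- c("12", "7.5", "N/A", "", "1,200", " 3 ")

# Without suppressWarnings(), R would warn: NAs introduced by coercion
converted <- suppressWarnings(as.numeric(raw))
print(converted)  # 12.0 7.5 NA NA NA 3.0

# Which original strings failed to convert?
bad_values <- raw[is.na(converted)]
print(bad_values)  # "N/A" "" "1,200"
```

Note that surrounding spaces are tolerated (" 3 " becomes 3), while thousands separators and placeholder text are not.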

While many old school programmers like to use a regular expression (regex) helper function to extract character values from a string, these are often a serious challenge to maintain and share with others. There are easier ways to typecast a character column into a numeric vector.
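As one example of an easier route, base R’s type.convert() guesses a sensible type for a character vector, so you don’t have to write any patterns yourself. This is only a sketch with made-up vectors. With as.is = TRUE, anything that can’t be read as a number stays character instead of becoming a factor.

```r
# Made-up character vectors
whole   <- c("1", "3", "6", "9")
decimal <- c("1.5", "2")
mixed   <- c("1", "three")

whole_conv   <- type.convert(whole, as.is = TRUE)
decimal_conv <- type.convert(decimal, as.is = TRUE)
mixed_conv   <- type.convert(mixed, as.is = TRUE)

print(class(whole_conv))    # "integer"
print(class(decimal_conv))  # "numeric"
print(class(mixed_conv))    # "character" (left alone, nothing is forced to NA)
```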

This trick for converting characters into numeric values is a simple one, yet extremely helpful in reading and manipulating your data. It can be used for converting character vectors, data frame columns, and more! We hope you found this page useful, and encourage you to check out the rest of ProgrammingR for more helpful tips!
