Using replace in R to replace values in a data object

When doing data science, you sometimes need to replace values in R. Fortunately, base R supplies a replace() function that covers many situations. It is designed to work on vectors and data frames.

Description – replace in R

Whenever you replace values by position in R, you can use the replace() function. Its name is not repl; the call has the form replace(x, list, values), where “x” is the vector being worked on, “list” is an index vector giving the locations to change, and “values” holds the replacement values. If you apply this function to a data frame, “x” is the data frame, “list” names the columns, and “values” supplies the replacement values. It does not handle individual characters within a character string; the str_replace() function from the stringr package does that job.
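As a rough mental model, replace() behaves like an indexed assignment carried out on a copy of “x”. The sketch below uses a hypothetical helper, my_replace, purely to illustrate that idea; it is not part of R and is not needed in practice.

```r
# Hypothetical helper that mimics the behaviour of replace():
# assign into a copy of x at the given locations and return the copy.
my_replace <- function(x, list, values) {
  x[list] <- values
  x
}

x <- c(10, 20, 30, 40)
y <- replace(x, 2, 99)      # location 2 is overwritten, not the value 2
z <- my_replace(x, 2, 99)

print(y)                    # the modified copy
print(x)                    # the original vector is left unchanged
print(identical(y, z))      # both approaches agree
```

Because the result is a copy, it has to be assigned back (for example x <- replace(x, 2, 99)) if the change should persist.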

Explanation – Replace in R

The replace() function does not select specific values; it selects locations within the vector. If you try to substitute a value within a factor with a new value that is not already a level of the factor, you will get a warning message and the result will be a missing value (NA). Missing data in a selected location is replaced like any other value. A single call can replace several locations with one value or with several values. In a data frame, you can use column names as the target, replacing entire columns, or you can extract a column and treat it as a vector. In both cases, you can substitute an existing string value with a replacement string.
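Since the function works on locations, a condition has to be turned into locations before it can be used. A minimal sketch, assuming a made-up numeric vector with missing entries, of how a logical test or which() can play the role of the “list” argument:

```r
x <- c(5, NA, 7, NA, 12)

# A logical vector is accepted as the "list" argument, so is.na(x)
# marks the locations that hold missing data.
no_missing <- replace(x, is.na(x), 0)

# To target a particular value, convert the comparison into locations.
# which() returns only the positions where the test is TRUE.
sevens_fixed <- replace(x, which(x == 7), 70)

print(no_missing)
print(sevens_fixed)
```

This keeps the location-based design of replace() while still letting the data decide which locations are touched.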

Examples of Replace in R

Here are code examples illustrating replacement in R. They cover vectors and data frames in several different situations.

> x = c(1,2,3,7,8,9,10,11,12)
> x
[1]  1  2  3  7  8  9 10 11 12
> replace(x, c(4, 8), 22)
[1]  1  2  3 22  8  9 10 22 12

In this example, we swap out two values in a vector for the same substitute value. Note that the numbers in the “list” argument represent locations rather than values. Because selection is by location, the same call works on a character vector as well.
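To illustrate the point about character vectors, here is a short sketch with invented data. It also shows that, for a named vector, the “list” argument can be a name instead of a number, which is the same mechanism that lets column names select columns in a data frame.

```r
fruit <- c("apple", "banana", "cherry", "date")

# Locations 2 and 4 both receive the same replacement string.
relabelled <- replace(fruit, c(2, 4), "unknown")
print(relabelled)

# For a named vector, a name can identify the location.
stock <- c(apples = 3, pears = 0, plums = 7)
restocked <- replace(stock, "pears", 12)
print(restocked)
```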

> x = c(1,2,3,7,8,9,10,11,12)
> x
[1]  1  2  3  7  8  9 10 11 12
> replace(x, c(4, 8), c(22, 33))
[1]  1  2  3 22  8  9 10 33 12

In this example, we replace multiple locations with multiple new values, matched in order: location 4 receives 22 and location 8 receives 33.

> df=data.frame(Z=c("A","A","B","B","A","B"),
+ X=c("F","A","E","C","D","B"),
+ Y=c(44,22,33,29,31,16),
+ stringsAsFactors=TRUE)
> df
  Z X  Y
1 A F 44
2 A A 22
3 B E 33
4 B C 29
5 A D 31
6 B B 16
> replace(df, "A", c("A","B","C","D","E","F"))
  Z X  Y A
1 A F 44 A
2 A A 22 B
3 B E 33 C
4 B C 29 D
5 A D 31 E
6 B B 16 F
> replace(df, "X", c("A","B","C","D","E","F"))
  Z X  Y
1 A A 44
2 A B 22
3 B C 33
4 B D 29
5 A E 31
6 B F 16
> df$Y = replace(df$Y,3,77)
> df$Z = replace(df$Z,3,"Q")
Warning message:
In `[<-.factor`(`*tmp*`, list, value = "Q") :
  invalid factor level, NA generated
> df
     Z X  Y
1    A F 44
2    A A 22
3 <NA> E 77
4    B C 29
5    A D 31
6    B B 16

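In this example, we make several replacements in an R data frame. In the first case, we supply a new column name and values, so the result has a new column. The second case replaces an entire column while keeping the column name. Neither of these results is assigned back, so df itself is unchanged by them. The third case replaces a specific location in a selected column. The fourth replaces a selected location of a factor with a value that is not one of its levels, which produces a warning message and an NA value. The fourth case relies on Z being a factor, which was the default for string columns in data.frame() before R 4.0.0.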

> df=data.frame(Z=c("A","A","B","B","A","B"),
+ X=c("F","A","E","C","D","B"),
+ Y=c(44,22,33,29,31,16),
+ stringsAsFactors=TRUE)
> df
  Z X  Y
1 A F 44
2 A A 22
3 B E 33
4 B C 29
5 A D 31
6 B B 16
> x = as.character(df$Z)
> x = replace(x,3,"Q")
> df$Z = as.factor(x)
> df
  Z X  Y
1 A F 44
2 A A 22
3 Q E 33
4 B C 29
5 A D 31
6 B B 16

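In this final example, we show the proper way of replacing values within a factor: convert the factor to a character vector, make the replacement, and convert the result back to a factor. This works for either a data frame column or a standalone vector.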
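An alternative that avoids the round trip through a character vector is to add the new level to the factor before replacing. This is a sketch of that different technique, not the method used in the example above, and it uses a standalone factor with the same values as column Z.

```r
f <- factor(c("A", "A", "B", "B", "A", "B"))

# Register "Q" as a valid level first; existing values are untouched.
levels(f) <- c(levels(f), "Q")

# The replacement now succeeds without a warning or an NA.
f <- replace(f, 3, "Q")

print(f)
print(levels(f))
```

One difference worth noting: this approach appends the new level to the end of the existing level order, whereas as.factor() on a character vector rebuilds the levels in sorted order.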

Application – Replace in R

There are three main applications of replacement in R, all of which are associated with data analysis. The first is correcting errors in the original data set. The second is updating a data set because new information is available, which could mean adding new information or updating old values. The third is updating content while running a series of calculations on a vector or data frame.
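As a small illustration of the first two applications, the sketch below uses an invented data frame, readings, in which -999 is assumed to be a placeholder for a failed measurement. Both the data and the meaning of -999 are hypothetical.

```r
# Hypothetical sensor data; -999 is assumed to mark a bad reading.
readings <- data.frame(id = 1:5,
                       temp = c(21.5, -999, 22.1, -999, 23.0))

# Correcting errors: overwrite every placeholder with NA.
readings$temp <- replace(readings$temp, readings$temp == -999, NA)

# Updating: a corrected value for row 2 has become available.
readings$temp <- replace(readings$temp, 2, 21.9)

print(readings)

# Later calculations can then skip whatever is still missing.
print(mean(readings$temp, na.rm = TRUE))
```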

Replacing data can be necessary for correcting mistakes, updating old values, and running calculations. In R this is a simple process with an easy-to-use function. Once you understand how it works, you should have no problems with it.
