When working with factors in R, you sometimes need to remove a level from a factor. R handles this with a simple two-step process.
Ways To Remove Levels From a Factor In R
Removing a level from a factor takes two steps. The first has the form factor[-position], which removes the value at the given position from the factor. The second calls droplevels(factor), which drops any levels that no longer hold values. Although it takes two steps, the process is simple: all you need is the name of the factor and the positions of the values you want to remove.
How This Works
When removing a level from a factor, you first remove the values in that level by position. The droplevels function then removes any levels left empty. If the value you remove also appears elsewhere in the factor, droplevels will not remove its level, because that level is still occupied.
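The difference between removing a unique value and removing a duplicated one can be sketched as follows. The factor contents here are made up for illustration and do not come from the examples in this article.

```r
# Illustrative factor (assumed data): "a" occurs twice, "b" and "c" once.
f <- factor(c("a", "b", "a", "c"))

# Remove position 2, the only "b". Its level is now empty.
g <- f[-2]
levels(g)              # still "a" "b" "c": indexing alone keeps the level
levels(droplevels(g))  # "a" "c": the empty level is dropped

# Remove position 1, one of the two "a" values. Level "a" is still occupied.
h <- droplevels(f[-1])
levels(h)              # "a" "b" "c": nothing to drop
```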
Two Examples of Removing Levels from a Factor in R
Here are two examples of removing a level from a factor in R. The first uses the numbers one through eight, and the second uses a factor of random numbers.
> x = factor(1:8)
> x
[1] 1 2 3 4 5 6 7 8
Levels: 1 2 3 4 5 6 7 8
> x = x[- 1]
> x
[1] 2 3 4 5 6 7 8
Levels: 1 2 3 4 5 6 7 8
> x = droplevels(x)
> x
[1] 2 3 4 5 6 7 8
Levels: 2 3 4 5 6 7 8
This factor holds the numbers one through eight. The first level is removed in two steps: the first removes the value at position 1, which leaves level 1 empty, and the second drops that empty level.
> t = as.numeric(Sys.time())
> set.seed(t)
> x = factor(as.integer(abs(rnorm(10)*10)))
> x
[1] 4 6 0 4 8 4 11 9 1 5
Levels: 0 1 4 5 6 8 9 11
> x = x[- 1]
> x
[1] 6 0 4 8 4 11 9 1 5
Levels: 0 1 4 5 6 8 9 11
> x = droplevels(x)
> x
[1] 6 0 4 8 4 11 9 1 5
Levels: 0 1 4 5 6 8 9 11
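This factor holds random numbers, and because the seed comes from the current time, your values will differ from run to run. The same two steps remove the value at position 1, which is 4, but here the levels do not change: 4 still occurs at two other positions, so its level is not empty and droplevels has nothing to drop.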
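To remove a level that occurs more than once, every value in that level has to go before droplevels is called. Here is a minimal sketch, assuming the values printed in the output above are typed in by hand rather than regenerated at random:

```r
# Values copied from the random factor shown above (hard-coded for reproducibility).
x <- factor(c(4, 6, 0, 4, 8, 4, 11, 9, 1, 5))

# Find every position holding the value 4, then remove them all.
pos <- which(x == "4")
x <- x[-pos]

# Level 4 is now empty, so droplevels removes it.
x <- droplevels(x)
levels(x)  # "0" "1" "5" "6" "8" "9" "11"
```

Note that if no value matches, which returns an empty vector and x[-pos] selects nothing at all. Filtering with x[x != "4"] avoids that edge case.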
Applications of Dropping Levels from a Factor In R
Dropping levels is useful in any situation where removing values from a factor has left unused levels behind. Unused levels still take up a small amount of memory and still show up when you list or count the factor's levels, so removing them keeps the data tidy.
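One quick way to check whether a factor carries unused levels is to count the values in each level. This is a small sketch with made-up data; table and nlevels are base R functions.

```r
# Illustrative factor (assumed data). Levels sort alphabetically: high, low, mid.
f <- factor(c("low", "mid", "high"))
f <- f[-3]            # removes "high"; its level stays behind

counts <- table(f)
counts                # "high" is listed with a count of 0
before <- nlevels(f)  # 3

f <- droplevels(f)
after <- nlevels(f)   # 2
```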
Being able to remove unused levels, once the data that occupied them is gone, makes droplevels a handy tool for your programming toolbox.