How to Drop Levels in R: Cleaning Up Unused Factor Levels

The function droplevels is used to drop unused levels from a factor or, more generally, from factors in a data frame.

Usage

## S3 method for class 'factor'
droplevels(x, exclude = if(anyNA(levels(x))) NULL else NA, ...)
## S3 method for class 'data.frame'
droplevels(x, except, exclude, ...)

x
an object from which to drop unused factor levels.

exclude
passed to factor(); factor levels which should be excluded from the result even if present. Note that this was implicitly NA in R <= 3.3.1, which did drop NA levels even when present in x, contrary to the documentation. The current default is compatible with x[ , drop=TRUE].

...
further arguments passed to methods

except
indices of columns from which not to drop levels
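The effect of the exclude default is easiest to see on a factor that has NA as an explicit level. The following is a minimal sketch, assuming a recent R version (later than 3.3.1); the factor f is made up for illustration.

```r
# Hypothetical factor: levels "a", "b", "c" plus an explicit NA level (via addNA);
# level "c" is never used
f <- addNA(factor(c("a", "b", NA), levels = c("a", "b", "c")))
levels(f)

# Because NA is among the levels, the default exclude is NULL:
# the unused level "c" is dropped, the NA level is kept
kept_na <- droplevels(f)
levels(kept_na)

# Passing exclude = NA explicitly removes the NA level as well
no_na <- droplevels(f, exclude = NA)
levels(no_na)
```

In most everyday cases the factor has no NA level, so the default simply resolves to exclude = NA and only unused levels are removed.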

How droplevels works

The method for class "factor" is currently equivalent to factor(x, exclude=exclude). For the data frame method, you should rarely specify exclude "globally" for all factor columns; rather, the default uses the same factor-specific exclude as the factor method itself.

The except argument follows the usual indexing rules.
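As a rough illustration of what "usual indexing rules" means here, the sketch below selects the protected column once by position and once by name. The data frame and its column names (g1, g2) are invented for this example.

```r
# Hypothetical data frame with two factor columns
df <- data.frame(g1 = factor(c("a", "b", "c")),
                 g2 = factor(c("x", "y", "z")))
sub <- df[1:2, ]   # row 3 is removed, but both columns keep 3 levels

# Protect the second column from level dropping, by position ...
by_pos <- droplevels(sub, except = 2)
# ... or by name
by_name <- droplevels(sub, except = "g2")

nlevels(by_pos$g1)   # unused level dropped
nlevels(by_pos$g2)   # levels left alone
```

Negative indices such as except = -1 should work in the same way, since except is used to index the set of columns.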

Value

droplevels returns an object of the same class as x.
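A quick way to check this is to compare classes before and after the call. This is only a sketch with made-up data; it assumes the factor method behaves like factor(x, exclude = exclude), as described above, so that an ordered factor stays ordered.

```r
# Hypothetical ordered factor with one unused level ("mid")
f <- factor(c("lo", "hi"), levels = c("lo", "mid", "hi"), ordered = TRUE)
df <- data.frame(f = f)

f_clean <- droplevels(f)
df_clean <- droplevels(df)

class(f_clean)    # same class as f
class(df_clean)   # same class as df
levels(f_clean)   # remaining levels keep their original order
```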

How to Use the droplevels Function in R

The droplevels() function in R can be used to drop unused factor levels.

This function is particularly useful if we want to drop factor levels that are no longer used after subsetting a vector or a data frame.

This function uses the following syntax:

droplevels(x)

where x is an object from which to drop unused factor levels.

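# convert Month to a factor labelled May to Sep, then remove all July rows
aq <- transform(airquality, Month = factor(Month, labels = month.abb[5:9]))
aq <- subset(aq, Month != "Jul")
# "Jul" is still a level, with a count of zero
table(aq$Month)
# after droplevels, "Jul" no longer appears
table(droplevels(aq)$Month)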

droplevels: Drop Unused Levels from Factors

Note

It is mainly intended for cases where one or more factors in a data frame contain only elements from a reduced level set after subsetting. (Note that subsetting does not in general drop unused levels.) By default, levels are dropped from all factors in a data frame, but the except argument allows you to specify columns for which this is not wanted.
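The sketch below illustrates both points from the note on a small invented data frame (the column names group, site and value are assumptions): subsetting leaves the level set alone, and a plain droplevels() call then cleans every factor column while leaving non-factor columns untouched.

```r
# Hypothetical data frame with two factor columns and one numeric column
df <- data.frame(group = factor(c("a", "b", "c", "a")),
                 site  = factor(c("n", "s", "n", "s")),
                 value = c(1.5, 2.0, 3.5, 4.0))

# Subsetting removes the rows for group "c" but not the level itself
kept <- df[df$group != "c", ]
vapply(kept[c("group", "site")], nlevels, integer(1))

# droplevels() processes all factor columns at once
cleaned <- droplevels(kept)
vapply(cleaned[c("group", "site")], nlevels, integer(1))
```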

See also: subset for subsetting data frames, factor for the definition of factors, drop for dropping array dimensions, drop1 for dropping terms from a model, and [.factor for subsetting of factors.

Dropping Unused Levels from a Factor in R Programming - droplevels() Function
The droplevels() function in R programming is used to remove unused levels from a factor.

Syntax:
# For a factor (vector) object
droplevels(x, exclude = if(anyNA(levels(x))) NULL else NA, ...)

# For a data frame object
droplevels(x, except, exclude)

Parameter values:
x represents the object from which unused levels are to be dropped
exclude represents factor levels that must be excluded even if present
except represents indices of columns from which levels must not be dropped

Below are a couple of examples of how to use this function in practice.

Example 1: Drop Unused Factor Levels in a Vector

Suppose we create a vector of data with five factor levels. Then suppose we define a new vector of data that contains only three of the original five factor levels.

#define data with 5 factor levels
data <- factor(c(1, 2, 3, 4, 5))

#define new data as original data minus the 4th and 5th elements
new_data <- data[-c(4, 5)]

#view new data
new_data

[1] 1 2 3
Levels: 1 2 3 4 5

Although the new data contains only three values, we can see that it still carries the original five factor levels.

To remove these unused factor levels, we can use the droplevels() function:

#drop unused factor levels
new_data <- droplevels(new_data)

#view data
new_data

[1] 1 2 3
Levels: 1 2 3

The new data now contains only three factor levels.
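For a single factor, the same result can usually be reached in two other ways mentioned earlier: subsetting with drop = TRUE, or calling factor() on the subset again. The comparison below is a small sketch; the variable names are chosen only for this illustration.

```r
data <- factor(c(1, 2, 3, 4, 5))

# Three ways to end up without the unused levels "4" and "5"
via_droplevels <- droplevels(data[-c(4, 5)])
via_subset     <- data[-c(4, 5), drop = TRUE]
via_factor     <- factor(data[-c(4, 5)])

levels(via_droplevels)
levels(via_subset)
levels(via_factor)
```

droplevels() tends to be the most readable choice when the subsetting happened elsewhere, and it is the only one of the three that also accepts a whole data frame.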

Example 2: Drop Unused Factor Levels in a Data Frame

Suppose we create a data frame in which one of the variables is a factor with five levels. Then suppose we define a new data frame that happens to exclude two of these factor levels:

#create data frame
df <- data.frame(region=factor(c('A', 'B', 'C', 'D', 'E')),
                 sales = c(13, 16, 22, 27, 34))

#view data frame
df

  region sales
1      A    13
2      B    16
3      C    22
4      D    27
5      E    34

#define new data frame
new_df <- subset(df, sales < 25)

#view new data frame
new_df

  region sales
1      A    13
2      B    16
3      C    22

#check levels of region variable
levels(new_df$region)

[1] "A" "B" "C" "D" "E"

Although the new data frame contains only three values in the region column, it still carries the original five factor levels. This would create some issues if we attempted to create any plots using this data.
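One way to see the kind of issue meant here is to tabulate the column: the unused levels still show up as empty categories, and summaries or plots built on factor levels generally inherit them. The sketch rebuilds the example data so that it runs on its own.

```r
# Rebuild the example data so this sketch is self-contained
df <- data.frame(region = factor(c("A", "B", "C", "D", "E")),
                 sales = c(13, 16, 22, 27, 34))
new_df <- subset(df, sales < 25)

# "D" and "E" no longer occur, but they are still counted as categories
counts <- table(new_df$region)
counts
```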

To remove the unused factor levels from the region variable, we can use the droplevels() function:

#drop unused factor levels
new_df$region <- droplevels(new_df$region)

#check levels of region variable
levels(new_df$region)

[1] "A" "B" "C"

Now the region variable only contains three factor levels.
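Since droplevels() also has a data frame method, the column-by-column assignment can be replaced by a single call on the whole data frame. A minimal sketch, rebuilding the same example data:

```r
df <- data.frame(region = factor(c("A", "B", "C", "D", "E")),
                 sales = c(13, 16, 22, 27, 34))

# Subset and drop unused levels from every factor column in one step
new_df <- droplevels(subset(df, sales < 25))

levels(new_df$region)
```

This form is convenient when a data frame has several factor columns; the except argument can be used to leave specific columns as they are.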