I love dates. Not really, they tend to be a common cause of programming problems and random frustration.
If you’re working with time-series data in R, you might need to extract the year from a date value to analyze trends and patterns over time. However, extracting the year from a date value can be tricky, especially when the date is stored in different formats or has missing or invalid values. In this article, we’ll guide you through the different ways to extract the year from a date value under multiple circumstances.
This article explains how to extract the year from a date value stored in a vector or a dataframe column using the format() function. By following these tips and examples, you’ll be able to extract the year from a date value in R with ease, regardless of the circumstances.
A Basic Example of Extracting the Year From a Date in R
Extracting the year is particularly useful if you need to roll up daily data into a higher-level aggregate, such as a yearly total. Consider the following list of orders from various customers for some cute teddy bears at Christmas:
# How To Extract Year From Date in R - setup
> bears <- data.frame(
+   orderdate = c('2019-12-01','2019-12-03','2019-12-04','2019-12-06',
+                 '2019-12-09','2019-12-10','2019-12-11','2019-12-12',
+                 '2019-12-12','2019-12-12','2019-12-12','2019-12-12',
+                 '2019-12-15','2019-12-19','2019-12-23','2019-12-24',
+                 '2020-01-01','2020-01-15','2020-01-21','2020-01-21',
+                 '2020-01-21','2020-01-21'),
+   amount = c(1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,10,3,5,15))
> head(bears)
   orderdate amount
1 2019-12-01      1
2 2019-12-03      2
3 2019-12-04      1
4 2019-12-06      1
5 2019-12-09      1
6 2019-12-10      1
So we sell a trickle of bears (presumably at high prices) in 2019 and then need to dump the rest (at a greatly reduced price) in 2020. Typical holiday product, in other words. So – how did we do, in terms of total sales?
For this example, we’re going to do a couple of things. First, we’re going to show you how to extract the year from a date in R using the as.Date() and format() functions. Next, we’re going to roll up the number of units sold by year using the aggregate() function, which can group and sum data by the levels of a variable. The final result will be a short report on our bear sales by year: that is, how many units were ordered in each calendar year.
How To Extract Year From Date in R – The Code
Of all the date-time operations you can perform in R, extracting the year is one of the most useful. Base R has no dedicated extract-year function; the year comes from calling format() on a Date object with the "%Y" code. The same pattern covers the other common extractions: "%m" returns the month number, "%B" the month name, and "%U" the week number. Sys.Date() returns the current date, and as.Date() converts other values into dates. Between them, these cover a wide range of data analysis tasks on a single date or a whole range of dates.
# How To Extract Year From Date in R - example
# make sure the date is indeed a date (note the capital D in as.Date)
> bears$orderdate <- as.Date(bears$orderdate)
# extract the year and convert to numeric format
> bears$year <- as.numeric(format(bears$orderdate, "%Y"))
# spot check for quality; yes, showing the year
> head(bears)
   orderdate amount year
1 2019-12-01      1 2019
2 2019-12-03      2 2019
3 2019-12-04      1 2019
4 2019-12-06      1 2019
5 2019-12-09      1 2019
6 2019-12-10      1 2019
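If you would rather not go through a character string, base R offers another route. The sketch below is an alternative to the format() approach, not a replacement for it: as.POSIXlt() breaks a date into components, and its year component counts years since 1900, so 1900 has to be added back. The vector name order_dates is made up for this illustration.

```r
# Alternative sketch: pull the year out of a Date via its POSIXlt components.
# order_dates is a hypothetical vector used only for this illustration.
order_dates <- as.Date(c("2019-12-01", "2019-12-24", "2020-01-21"))

# POSIXlt stores the year as an offset from 1900, so add 1900 back.
years_lt <- as.POSIXlt(order_dates)$year + 1900

# The format() approach from this article, for comparison.
years_fmt <- as.numeric(format(order_dates, "%Y"))

print(years_lt)
print(all(years_lt == years_fmt))
```

Both routes give a numeric year; format() is usually easier to read, while the POSIXlt route avoids the trip through text.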
We’re prepped. Time to tally the results.
How To Extract Year From Date in R – Denouement
We’re going to use the aggregate() function in R to roll up the data by year.
> aggregate(bears$amount, by=list(year=bears$year), FUN=sum)
  year  x
1 2019 17
2 2020 35
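The same roll-up can also be written with aggregate()'s formula interface, which keeps the original column name in the output instead of the generic x. This is a minimal sketch on a four-row subset of the bears data, rebuilt here so the block runs on its own; the result name yearly is an assumption for illustration.

```r
# A four-row subset of the bears data, so the sketch is self-contained.
bears <- data.frame(
  orderdate = as.Date(c("2019-12-01", "2019-12-03", "2020-01-15", "2020-01-21")),
  amount    = c(1, 2, 1, 10)
)
bears$year <- as.numeric(format(bears$orderdate, "%Y"))

# Formula interface: read it as "amount, grouped by year".
yearly <- aggregate(amount ~ year, data = bears, FUN = sum)
print(yearly)
```

On the full data set the totals match the by= version; only the column label changes.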
Hmmm. We sold 17 toy bears in the month before Christmas and 35 toy bears when we liquidated the remaining inventory in the month after the holidays. Not a great year for bears. I’m thinking we should try another product next year. We may have made decent margins on the first 17 bears sold during the holiday season, but dumping 2/3 of our inventory at the end of the season has got to hurt…
Anyone want to give the Christmas Ferret a go?
Examples of Extracting the Year from a Date in Different Contexts
Extracting the Year from a Date Stored in a Vector or a Dataframe Column
To extract the year from a date stored in a vector, use the format() function, which works the same way for vectors and data frame columns. This function takes a date object as input and returns a character string in the specified format.
> dates <- c("2022-10-31", "2023-01-01", "2023-06-30")
> years <- format(as.Date(dates), "%Y")
> years
[1] "2022" "2023" "2023"
To extract the year from a date stored in a dataframe column, use the same approach. For example, to extract the year from a “Date” column in a dataframe called “mydata”, use the following code:
> mydata$Year <- format(as.Date(mydata$Date), "%Y")
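Here is a runnable version of that one-liner, as a sketch. The names mydata and Date come from the text above; the Value column and all of the values are invented for illustration. Since format() returns text, the sketch wraps it in as.integer() in case you want to sort or do arithmetic on the year.

```r
# Hypothetical data: "mydata" and "Date" come from the text,
# the Value column and all values are invented for illustration.
mydata <- data.frame(
  Date  = c("2022-10-31", "2023-01-01", "2023-06-30"),
  Value = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# format() returns character; as.integer() turns the year into a number.
mydata$Year <- as.integer(format(as.Date(mydata$Date), "%Y"))

str(mydata)
```

Leave out the as.integer() wrapper if you only need the year as a label, for example as a grouping key.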
Extracting the Year from a Date that is Stored as a Character String
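If the date is stored as a character string, you can still extract the year by converting the string to a date object using the as.Date() function and then passing the result to format():
> date <- "2022-10-31"
> year <- format(as.Date(date), "%Y")
> year
[1] "2022"
If the string is not in the default year-month-day layout, tell as.Date() how to read it with the format argument:
> date <- "10/31/2022"
> year <- format(as.Date(date, format = "%m/%d/%Y"), "%Y")
> year
[1] "2022"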
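Real data often includes missing or unparseable entries. The sketch below shows one way to handle them, under the assumption that every valid entry uses the year-month-day layout: with an explicit format argument, as.Date() returns NA for strings it cannot parse, and format() passes those NA values through. The vector raw_dates and its contents are made up for illustration.

```r
# Hypothetical input: one valid date, one missing value, one junk string.
raw_dates <- c("2022-10-31", NA, "not a date")

# With an explicit format, strings that do not parse become NA.
parsed <- as.Date(raw_dates, format = "%Y-%m-%d")

# format() keeps NA as NA, so the years line up with the input positions.
years <- as.integer(format(parsed, "%Y"))

print(years)
# Positions of entries that need a closer look.
print(which(is.na(parsed)))
```

Checking which(is.na(parsed)) before aggregating is a cheap way to find out how many rows would otherwise drop out of a yearly total.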