I love dates. Not really. They tend to be a common cause of programming problems and random frustration. In a data frame, a value can easily end up with the wrong type: a decimal number gets treated as a timestamp, or a date gets stored as a plain character string, and any datetime function you run on it fails or returns nonsense. When you are dealing with actual dates, it is important to be able to split up the date and extract the specific piece you need for your calculations. A common question you may ask is, “How can you extract the year from a date in R?”
This is particularly useful if you need to roll up daily data into a higher level aggregate, such as a yearly number. Consider the following list of orders from various customers for some cute teddy bears at Christmas:
    # How To Extract Year From Date in R - setup
    > bears <- data.frame(
    +   orderdate = c('2019-12-01','2019-12-03','2019-12-04','2019-12-06',
    +                 '2019-12-09','2019-12-10','2019-12-11','2019-12-12',
    +                 '2019-12-12','2019-12-12','2019-12-12','2019-12-12',
    +                 '2019-12-15','2019-12-19','2019-12-23','2019-12-24',
    +                 '2020-01-01','2020-01-15','2020-01-21','2020-01-21',
    +                 '2020-01-21','2020-01-21'),
    +   amount = c(1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,10,3,5,15))
    > head(bears)
       orderdate amount
    1 2019-12-01      1
    2 2019-12-03      2
    3 2019-12-04      1
    4 2019-12-06      1
    5 2019-12-09      1
    6 2019-12-10      1
So we sell a trickle of bears (presumably at high prices) in 2019 and then need to dump the rest (at a greatly reduced price) in 2020. Typical holiday product, in other words. So – how did we do, in terms of total sales?
For this example, we’re going to do a couple of things. First, we’re going to show you how to extract the year from a date in R using the as.Date and format functions. Next, we’re going to roll up the number of units sold by year using the aggregate function in R. The aggregate function can group and sum data by the levels of a variable. The final result will be a report on our sales of bears by year: that is, how many units were sold in each calendar year.
How To Extract Year From Date in R – The Code
Of all the date operations you can perform in R, extracting the year is one of the most useful. There are many related operations, including extracting the month name, month number, or week number, finding the current date, and converting a value to a date. Each of these pulls something different out of a date object, and together they support a wide variety of data analysis tasks on a single date or a date range.
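As a rough sketch of those related operations, the base R calls below pull several components out of a single date. The sample date `d` and the variable names are made up for illustration and are not part of the bears data.

```r
# A hypothetical sample date, used only for illustration
d <- as.Date("2019-12-24")

# format() returns character strings, so convert to numeric where needed
year_num  <- as.numeric(format(d, "%Y"))  # four-digit year
month_num <- as.numeric(format(d, "%m"))  # month number, 1 to 12
day_num   <- as.numeric(format(d, "%d"))  # day of the month
week_num  <- as.numeric(format(d, "%U"))  # week of the year, weeks starting on Sunday

# months() returns the month name in the current locale
month_name <- months(d)

# Sys.Date() returns the current date as a Date object
today <- Sys.Date()
```

The format codes (`%Y`, `%m`, `%d`, `%U`) are the ones documented under `?strptime`, so the same pattern extends to any other component listed there.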
    # How To Extract Year From Date in R - example
    # make sure the date is indeed a date (note the capital D in as.Date)
    > bears$orderdate <- as.Date(bears$orderdate)
    # extract the year and convert to numeric format
    > bears$year <- as.numeric(format(bears$orderdate, "%Y"))
    # spot check for quality; yes, showing the year
    > head(bears)
       orderdate amount year
    1 2019-12-01      1 2019
    2 2019-12-03      2 2019
    3 2019-12-04      1 2019
    4 2019-12-06      1 2019
    5 2019-12-09      1 2019
    6 2019-12-10      1 2019
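Two variations on this step may be worth knowing; treat the following as a sketch, not the only way to do it. If your dates are not in the year-month-day layout used here, `as.Date` can be given a `format` argument describing the layout. The year can also be read from a `POSIXlt` object, which stores it as an offset from 1900. The sample values below are assumptions made up for illustration.

```r
# A few sample dates in the same layout as the bears data
orderdate <- as.Date(c("2019-12-01", "2019-12-24", "2020-01-21"))

# Approach used in this article: format the date, then convert to numeric
year_fmt <- as.numeric(format(orderdate, "%Y"))

# Alternative: POSIXlt stores the year as years since 1900
year_lt <- as.POSIXlt(orderdate)$year + 1900

# Both approaches should agree
same_years <- all(year_fmt == year_lt)

# A hypothetical US-style date string needs an explicit format
us_date <- as.Date("12/24/2019", format = "%m/%d/%Y")
us_year <- as.numeric(format(us_date, "%Y"))
```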
We’re prepped. Time to tally the results.
How To Extract Year From Date in R – Denouement
We’re going to use the aggregate function in R to roll up the data by year.
    > aggregate(bears$amount, by=list(year=bears$year), FUN=sum)
      year  x
    1 2019 17
    2 2020 35
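The total column is labeled `x` in that output because aggregate was handed a bare vector. As an alternative sketch, the formula interface keeps the column name, and `tapply` gives the same totals as a named vector. The block rebuilds the bears data from the setup step so it runs on its own; `by_year`, `units`, and `share` are names chosen here for illustration.

```r
# Rebuild the bears data from the setup step
bears <- data.frame(
  orderdate = as.Date(c(
    "2019-12-01", "2019-12-03", "2019-12-04", "2019-12-06",
    "2019-12-09", "2019-12-10", "2019-12-11", "2019-12-12",
    "2019-12-12", "2019-12-12", "2019-12-12", "2019-12-12",
    "2019-12-15", "2019-12-19", "2019-12-23", "2019-12-24",
    "2020-01-01", "2020-01-15", "2020-01-21", "2020-01-21",
    "2020-01-21", "2020-01-21")),
  amount = c(1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
             1, 1, 10, 3, 5, 15)
)
bears$year <- as.numeric(format(bears$orderdate, "%Y"))

# Formula interface: the result keeps the column names year and amount
by_year <- aggregate(amount ~ year, data = bears, FUN = sum)

# tapply returns the same totals, named by year
units <- tapply(bears$amount, bears$year, sum)

# Share of all units sold in each year
share <- prop.table(units)
```

With this data, `by_year` holds the same totals as the report above: 17 units for 2019 and 35 for 2020.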
Hmmm. We sold 17 toy bears in the month before Christmas and 35 toy bears when we liquidated the remaining inventory in the month after the holidays. Not a great year for bears. I’m thinking we should try another product next year. We may have made decent margins on the first 17 bears sold during the holiday season, but dumping 2/3 of our inventory at the end of the season has got to hurt…
Anyone want to give the Christmas Ferret a go?