When a data scientist needs to lag a time series in R, there is a simple function for the task. Lagging a time series shifts its observations to earlier dates, and the size of the lag sets how far they move. The lag can be any integer value.
Description – Lag in R
When lagging a time series, you use the lag function, which has the format lag(ts, k), where “ts” is the time series and “k” is the lag. Only the time series is required; the lag is optional and has a default value of one. The lag should be an integer, and it pushes the values back by that number of periods, which means months for the monthly series used here. It is a simple function to use, but it only provides meaningful results when dealing with a time series. In other cases, you can get an error message or meaningless results.
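As a quick illustration of the call format described above, here is a minimal sketch. The series name `sales` and its values are made up for the example, and the function is called as `stats::lag` so that it still works if another package (dplyr, for instance) has loaded its own `lag`.

```r
# Hypothetical monthly series: 12 observations starting in January 2020
sales <- ts(1:12, start = c(2020, 1), frequency = 12)

# k defaults to 1, so these two calls give the same result
lagged_default <- stats::lag(sales)
lagged_one     <- stats::lag(sales, k = 1)
identical(lagged_default, lagged_one)   # TRUE

# The values are untouched; only the dates move one month earlier
start(sales)            # 2020 1
start(lagged_default)   # 2019 12
```

The observations are the same before and after; what changes is the date attached to each one.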
Explanation
The lag function takes the value of the lag and shifts the dates of the series by that number of periods (months, for monthly data). The observations themselves do not change; only the time attached to each one does. By using a negative number for the lag, you reverse the direction and move the series to later dates, which produces a lead in the data. Because all you are doing is shifting the data a certain number of periods, the function is easy to understand. It is not meant for a vector, data frame, or any variable other than a time series. If you have the wrong data type, you either get an error message or meaningless results, so do not use the function that way.
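To see the effect of a negative lag, the sketch below shifts a short, made-up monthly series by k = -2 and lines the two versions up by date with `ts.union`. The names `x`, `shifted_later`, and `aligned` are illustrative only.

```r
# Hypothetical four-month series starting in January 2021
x <- ts(c(10, 20, 30, 40), start = c(2021, 1), frequency = 12)

# A negative k moves the series to later dates
shifted_later <- stats::lag(x, k = -2)
start(x)               # 2021 1
start(shifted_later)   # 2021 3

# ts.union pads with NA wherever one of the series has no observation
aligned <- ts.union(x, shifted_later)
aligned
```

Printing `aligned` shows six months, January through June 2021, with the original values in the first four rows of one column and the same values in the last four rows of the other.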
Examples of Using Lag in R
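Here are two examples using the lag function. The lag argument can be any integer value and is not limited to twelve. We use twelve so that each series moves back by exactly one year and the printed tables stay easy to compare. Note that both series cover 36 months: in the first example the ten values of x are repeated to fill them, and in the second only the first 36 of the 60 random values are used.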
> x = 1:10
> ts1 = ts(x, start=c(2010, 1), end=c(2012, 12), frequency=12)
> ts1
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2010   1   2   3   4   5   6   7   8   9  10   1   2
2011   3   4   5   6   7   8   9  10   1   2   3   4
2012   5   6   7   8   9  10   1   2   3   4   5   6
> lag(ts1, 12)
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2009   1   2   3   4   5   6   7   8   9  10   1   2
2010   3   4   5   6   7   8   9  10   1   2   3   4
2011   5   6   7   8   9  10   1   2   3   4   5   6
> set.seed(13579)
> N = 60
> r = as.integer(rnorm(N)*100)
> ts1 = ts(r, start=c(2017, 1), end=c(2019, 12), frequency=12)
> ts1
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
2017 -123 -125  -25 -152  109  248   77   18 -102  -25   74   47
2018  -76  -43  -47  -56 -121   24  -86   65   -3  -95 -169   89
2019   72  -16  -76  181   23   -9   90    3   24 -124 -147  -84
> lag(ts1, 12)
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
2016 -123 -125  -25 -152  109  248   77   18 -102  -25   74   47
2017  -76  -43  -47  -56 -121   24  -86   65   -3  -95 -169   89
2018   72  -16  -76  181   23   -9   90    3   24 -124 -147  -84
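Since the lag does not have to be twelve, here is a short sketch of the first series lagged by three months instead. It rebuilds the same series as the first example; the name `ts3` is just a label for the shifted copy.

```r
# Same construction as the first example: the ten values repeat to fill 36 months
x <- 1:10
ts1 <- ts(x, start = c(2010, 1), end = c(2012, 12), frequency = 12)

# Lag by three months instead of twelve
ts3 <- stats::lag(ts1, 3)
start(ts3)   # 2009 10
end(ts3)     # 2012 9

# The observations are identical; only the dates have moved
all(as.numeric(ts3) == as.numeric(ts1))   # TRUE
```

With a lag of three, the printed table no longer lines up row for row with the original, which is why the examples above use twelve.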
Applications of Lag in R
The main application of the lag function is to correct a misalignment between data and dates in a time series. Each observation needs to be matched to the correct date to show a legitimate trend, if there is one. It can also be used to correct a misalignment between a model and reality. In this case, the realignment could show that the model is essentially correct but simply offset in time by a fixed amount that can be corrected. Unfortunately, one other possible application of the lag function is dishonestly manipulating data. Fortunately, it also provides a way to catch such manipulation.
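The model-versus-reality case can be sketched as follows. Everything here is invented for illustration: `observed` stands in for real measurements, and `model` is a hypothetical model output that has the right values but is stamped two months late. `ts.intersect` keeps only the dates both series share, so the two can be compared directly.

```r
# Hypothetical observed data, January to June 2022
observed <- ts(c(5, 7, 9, 11, 13, 15), start = c(2022, 1), frequency = 12)

# Hypothetical model output: same values, but dated two months too late
model <- ts(c(5, 7, 9, 11, 13, 15), start = c(2022, 3), frequency = 12)

# Before realignment: compare on the dates the two series share
before <- ts.intersect(observed, model)
mean(abs(before[, 1] - before[, 2]))   # 4

# Shift the model two months earlier, then compare again
realigned <- stats::lag(model, k = 2)
after <- ts.intersect(observed, realigned)
mean(abs(after[, 1] - after[, 2]))     # 0
```

In this toy case the error disappears entirely after the shift, which is the pattern that suggests a model is right in shape and wrong only in timing. Real data would rarely line up this cleanly.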
Lagging a time series in R is a simple process. It can be used dishonestly, but a quick look at the code would show what was done. It can also be used to make honest and needed adjustments that correct a legitimate problem.