Time Series Forecasting with Harmonic Regression


Recently I have been looking into time series forecasting. The most common approaches are exponential smoothing and autoregressive methods. A less well known approach is to treat the series purely as a function of time, and fit a parameterized function of time to the data. An interesting question then arises: how do we model the series as a function of time if it contains complex seasonal relationships? I investigate that question here; code for this post can be found here.

One way to do this is via Harmonic Regression. We model the trend and the seasonal component of the series separately. In particular, we break the seasonal component down into a sum of periodic basis functions, where each basis function represents one of the seasonal frequencies we have identified. In mathematical terms, $$ h(t) = g(t) + s(t) $$ $$ g(t) = b+at$$ $$ s(t) = \sum_{i=1}^{N} \alpha_i \phi_i(t) $$ Here $h(t)$ is our prediction, $g(t)$ is the trend (assumed linear) and $s(t)$ is the seasonal component. The $\phi_i(t)$ are the periodic basis functions we use to model the seasonality of the time series. For example, we could use one basis function to model a yearly seasonal component and another to model a monthly component.

We still need to pick a class of functions to use for the $\phi_i(t)$. An obvious choice of periodic functions are the trigonometric functions $\sin(\frac{2\pi}{k_i}t)$ and $\cos(\frac{2\pi}{k_i}t)$. Here $\frac{2\pi}{k_i}$ is the angular frequency and $k_i$ is the period, i.e. the number of time steps needed to complete one seasonal cycle, e.g. 365 for a yearly seasonal component and 30 for a monthly component.

Putting this all together, our model becomes $$ h(t) = b+at+\sum_{i=1}^N \alpha_i \sin(\frac{2\pi}{k_i}t) + \sum_{i=1}^N \beta_i \cos(\frac{2\pi}{k_i}t)$$ where $a,\ b,\ \alpha_i$ and $\beta_i$ are coefficients we will have to fit to our data.

The question now is: how do we determine the coefficient values from the data we are given? Typically, for time-series data, we have a dataset of historical values $$ D = \{(y_1,t_1),\dots,(y_T,t_T)\} $$ and we want to find coefficient values such that our predictions are accurate, i.e. $h(t_i) \approx y_i$. As the name Harmonic Regression suggests, we can formulate this as a regression problem.

Let $x(t)$ and $w$ be vectors with the form $$ x(t) = \begin{bmatrix} 1 \\ t \\ \sin(\frac{2\pi}{k_1}t) \\ \vdots \\ \sin(\frac{2\pi}{k_N}t) \\ \cos(\frac{2\pi}{k_1}t) \\ \vdots \\ \cos(\frac{2\pi}{k_N}t) \end{bmatrix},\ w = \begin{bmatrix} b \\ a \\ \alpha_1 \\ \vdots \\ \alpha_N \\ \beta_1 \\ \vdots \\ \beta_N \end{bmatrix}. $$ We can then write $$ h(t) = w^T x(t), $$ which is exactly the standard linear regression problem. In fact, this makes clear that Harmonic Regression is really just linear regression on a collection of derived features computed from the time variable, used to model the seasonal component of the output. Fortunately there are plenty of existing tools for solving linear regression problems, which makes Harmonic Regression easy to try out.
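As a concrete illustration, here is a minimal sketch of how such a feature vector could be computed with NumPy. The helper name `harmonic_features` and its argument names are my own, not from any library.

```python
import numpy as np

def harmonic_features(t, periods):
    """Build the feature matrix with columns
    [1, t, sin(2*pi*t/k_1), ..., sin(2*pi*t/k_N), cos(2*pi*t/k_1), ..., cos(2*pi*t/k_N)]
    for an array of time points t and a list of seasonal periods k_i."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    cols += [np.sin(2 * np.pi * t / k) for k in periods]
    cols += [np.cos(2 * np.pi * t / k) for k in periods]
    return np.column_stack(cols)
```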

Testing it out

Consider the example time series $$ y(t) = \frac{1}{2} + \frac{1}{5}t + 2\sin(\frac{2\pi}{25}t) + 2\sin(\frac{2\pi}{7}t) + \epsilon_t $$ where $\epsilon_t \sim N(0,\sigma^2)$ is normally distributed measurement noise. If we plot an instance of the time series for 100 time steps, it looks like this.
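For reference, a series like this can be generated along the following lines; the noise level `sigma` and the random seed are my own choices, as they are not specified above.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed chosen arbitrarily
t = np.arange(100)
sigma = 0.5                       # assumed noise standard deviation

# y(t) = 1/2 + t/5 + 2 sin(2*pi*t/25) + 2 sin(2*pi*t/7) + noise
y = (0.5 + 0.2 * t
     + 2 * np.sin(2 * np.pi * t / 25)
     + 2 * np.sin(2 * np.pi * t / 7)
     + rng.normal(0.0, sigma, size=t.shape))
```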

Despite the noise, we can see there is roughly a linear trend and some seasonality. To get a better idea of the seasonal modes, we can detrend the series and compute the Fourier power spectrum.

It is important to note that the power spectrum considers the energy of modes of the form $\sin(\omega t)$, so to convert to our earlier parameterization we need to compute $$ k = \frac{2\pi}{\omega}. $$
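One way to carry out the detrending and spectrum computation, continuing from the series generated above (the straight-line detrend via `np.polyfit` is my choice of method, which may differ from the one used to make the plot):

```python
import numpy as np

# Remove a least-squares linear trend before looking at the spectrum.
trend = np.polyval(np.polyfit(t, y, deg=1), t)
detrended = y - trend

# Power spectrum; rfftfreq returns frequencies in cycles per time step.
power = np.abs(np.fft.rfft(detrended)) ** 2
freqs = np.fft.rfftfreq(len(t), d=1.0)

# Pick the two strongest non-zero frequencies and convert them to periods
# k = 1/f (equivalently k = 2*pi/omega for angular frequency omega).
top = np.argsort(power[1:])[::-1][:2] + 1
print(1.0 / freqs[top])   # should come out close to 25 and 7
```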

Now let's build our model and fit it to the data. We will use the first half of the series as training data and the second half as test data. In general we won't know exactly which seasonal modes are present in our data, so to emulate this we include some terms in the model that are not in the true time series and see how they affect accuracy. Here I used the model $$ h(t) = b + at + \alpha_1\sin(\frac{2\pi}{25}t) + \alpha_2\sin(\frac{2\pi}{7}t) + \beta_1\cos(\frac{2\pi}{25}t) + \beta_2\cos(\frac{2\pi}{7}t) $$

To fit the model, I calculated the $x$ vector as described above, $$ x(t) = \begin{bmatrix} 1 \\ t \\ \sin(\frac{2\pi}{25}t) \\ \sin(\frac{2\pi}{7}t) \\ \cos(\frac{2\pi}{25}t) \\ \cos(\frac{2\pi}{7}t) \end{bmatrix}, $$ and used scikit-learn's linear regression module to fit the coefficients.
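A sketch of that fitting step, reusing the `harmonic_features` helper and the synthetic series from the snippets above (the variable names are my own):

```python
from sklearn.linear_model import LinearRegression

X = harmonic_features(t, periods=[25, 7])
split = len(t) // 2
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# The feature matrix already contains a constant column, so turn off the
# extra intercept that LinearRegression would otherwise add.
model = LinearRegression(fit_intercept=False)
model.fit(X_train, y_train)
print(model.coef_)   # [b, a, alpha_1, alpha_2, beta_1, beta_2]
```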

The coefficients I got were $$ a=0.18,\ \alpha_1=2,\ \alpha_2=2.25,\ \beta_1=0.14,\ \beta_2=0.024 $$

We can now compute predictions for the training and test sets. If we plot them alongside the original time series, this is what it looks like.
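For example, the predictions and the comparison plot could be produced like this (the matplotlib usage is just one way to recreate such a figure):

```python
import matplotlib.pyplot as plt

y_pred = model.predict(X)   # predictions over both halves

plt.plot(t, y, label="observed")
plt.plot(t[:split], y_pred[:split], label="prediction (train)")
plt.plot(t[split:], y_pred[split:], label="prediction (test)")
plt.axvline(t[split], color="grey", linestyle="--")   # train/test boundary
plt.xlabel("t")
plt.legend()
plt.show()
```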

As we can see, the predictions are not bad, with maxima and minima roughly matching those of the actual series.