We’ve explored using a regression for time series forecasting, but what if there are seasonal or cyclical patterns in the data?
Let’s explore an example of how to use regression to identify cyclical patterns and perform seasonality analysis with time series data.
11.1 Data Loading
For a time series dataset that exemplifies cyclical patterns, let’s consider this dataset of U.S. employment over time, from the Federal Reserve Economic Data (FRED).
“All Employees: Total Nonfarm, commonly known as Total Nonfarm Payroll, is a measure of the number of U.S. workers in the economy that excludes proprietors, private household employees, unpaid volunteers, farm employees, and the unincorporated self-employed.”
“Generally, the U.S. labor force and levels of employment and unemployment are subject to fluctuations due to seasonal changes in weather, major holidays, and the opening and closing of schools.”
“The Bureau of Labor Statistics (BLS) adjusts the data to offset the seasonal effects to show non-seasonal changes: for example, women’s participation in the labor force; or a general decline in the number of employees, a possible indication of a downturn in the economy.
To closely examine seasonal and non-seasonal changes, the BLS releases two monthly statistical measures: the seasonally adjusted All Employees: Total Nonfarm (PAYEMS) and All Employees: Total Nonfarm (PAYNSA), which is not seasonally adjusted.”
This “PAYNSA” data is expressed in “Thousands of Persons”, and is “Not Seasonally Adjusted”.
The dataset frequency is “Monthly”.
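One way to load this dataset is to fetch the series by its FRED ID. Here is a minimal sketch using the pandas_datareader package (an assumption; any FRED client would work), storing the series ID in a DATASET_NAME variable that the wrangling code below refers to:

from pandas_datareader.data import DataReader

# FRED series ID for non-seasonally adjusted total nonfarm payrolls:
DATASET_NAME = "PAYNSA"

# fetch the full monthly history from FRED
# (passing a start date so we get the series from its beginning):
df = DataReader(DATASET_NAME, "fred", start="1939-01-01")
df.head()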
Wrangling the data, including renaming columns and converting the date index to be datetime-aware, may make it easier for us to work with this data:
from pandas import to_datetime

# rename the data column and make the index a datetime-aware "date" index:
df.rename(columns={DATASET_NAME: "employment"}, inplace=True)
df.index.name = "date"
df.index = to_datetime(df.index)
df
            employment
date
1939-01-01       29296
1939-02-01       29394
1939-03-01       29804
...                ...
2025-06-01      160256
2025-07-01      159210
2025-08-01      159410

[1040 rows x 1 columns]
11.2 Data Exploration
Visualizing the data:
import plotly.express as px

px.line(df, y="employment", height=450,
        title="US Employment by month (non-seasonally adjusted)",
        labels={"employment": "Employment (in thousands of persons)"},
)
Cyclical Patterns
Exploring cyclical patterns in the data:
px.line(df[(df.index.year >= 1970) & (df.index.year <= 1980)], y="employment",
        title="US Employment by month (selected years)", height=450,
        labels={"employment": "Employment (in thousands)"},
)
Tip: Interactive dataviz
Hover over the dataviz to see which month(s) typically have higher employment, and which month(s) typically have lower employment.
Trend Analysis
Exploring trends:
import plotly.express as px

px.scatter(df, y="employment", height=450,
           title="US Employment by month (vs Trend)",
           labels={"employment": "Employment (in thousands)"},
           trendline="ols", trendline_color_override="red",
)
Looks like evidence of a possible linear relationship. Let’s perform a more formal regression analysis.
11.3 Data Encoding
Because we need numeric features to perform a regression, we convert the dates to a linear time step of integers (after sorting the data first for good measure):
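Here is a minimal sketch of this encoding (the time_step column name is illustrative; the month column will be used later in the seasonality analysis):

# sort chronologically, then add an integer time step (1, 2, 3, ...):
df.sort_index(ascending=True, inplace=True)
df["time_step"] = range(1, len(df) + 1)

# store the calendar month number for the seasonality analysis later on:
df["month"] = df.index.month

df[["employment", "time_step", "month"]].head()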
Assuming we have split the data and fit a linear regression of employment on the time step, we can merge the model's fitted values and residuals back into the training data:

from pandas import DataFrame

# get all rows from the original dataset that wound up in the training set:
training_set = df.loc[x_train.index].copy()
print(len(training_set))

# create a dataset for the predictions and the residuals:
training_preds = DataFrame({
    "prediction": results.fittedvalues,
    "residual": results.resid
})

# merge the training set with the results, so each datapoint
# has its prediction and its error (residual):
training_set = training_set.merge(training_preds,
    how="inner", left_index=True, right_index=True
)
training_set
Regression Trends
Plotting the trend line:
px.line(df, y=["employment", "prediction"], height=350,
        title="US Employment (monthly) vs linear trend",
        labels={"value": ""},
)
Regression Residuals
Removing the trend, plotting just the residuals:
px.line(df, y="residual", title="US Employment (monthly) vs linear trend residuals", height=350)
There seem to be some periodic movements in the residuals.
11.5.0.1 Seasonality via Means of Periodic Residuals
We can observe possible cyclical patterns in the residuals by calculating periodic means.
Here we group the data by quarter and calculate the average residual. This shows us, for each quarter, whether employment tends to run above or below the trend on average:
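A minimal sketch of that grouping, assuming the residual column computed above:

# average residual per calendar quarter
# (positive means above trend, negative means below trend):
df.groupby(df.index.quarter)["residual"].mean()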
11.5.0.2 Seasonality via Regression on Periodic Residuals
Let’s perform a regression using months as the features and the trend residuals as the target. This can help us understand the degree to which employment will be over or under trend for a given month.
# https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html
# "one hot encode" the monthly values:
from pandas import get_dummies as one_hot_encode

x_monthly = one_hot_encode(df["month"])
x_monthly.columns = ["Jan", "Feb", "Mar", "Apr",
                     "May", "Jun", "Jul", "Aug",
                     "Sep", "Oct", "Nov", "Dec"]
x_monthly = x_monthly.astype(int)
x_monthly
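The model fitting itself can be sketched as follows, assuming the statsmodels package and the residual column computed earlier (this is what produces the summary shown below):

import statsmodels.api as sm

# regress the trend residuals on the one-hot encoded months
# (the twelve dummies jointly act as the intercept, so each
#  coefficient is that month's average deviation from trend):
y_monthly = df["residual"]

model = sm.OLS(y_monthly, x_monthly)
print(type(model))

results = model.fit()
print(type(results))

print(results.summary())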
<class 'statsmodels.regression.linear_model.OLS'>
<class 'statsmodels.regression.linear_model.RegressionResultsWrapper'>
OLS Regression Results
==============================================================================
Dep. Variable: residual R-squared: 0.021
Model: OLS Adj. R-squared: 0.010
Method: Least Squares F-statistic: 1.997
Date: Mon, 10 Nov 2025 Prob (F-statistic): 0.0257
Time: 01:07:36 Log-Likelihood: -10292.
No. Observations: 1040 AIC: 2.061e+04
Df Residuals: 1028 BIC: 2.067e+04
Df Model: 11
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Jan -1285.7225 518.285 -2.481 0.013 -2302.739 -268.706
Feb -1066.3405 518.285 -2.057 0.040 -2083.357 -49.324
Mar -654.6366 518.285 -1.263 0.207 -1671.653 362.380
Apr -343.4614 518.285 -0.663 0.508 -1360.478 673.555
May 151.7252 518.285 0.293 0.770 -865.291 1168.742
Jun 644.5325 518.285 1.244 0.214 -372.484 1661.549
Jul -179.7406 518.285 -0.347 0.729 -1196.757 837.276
Aug -43.1747 518.285 -0.083 0.934 -1060.191 973.842
Sep 386.4793 521.289 0.741 0.459 -636.433 1409.392
Oct 727.1486 521.289 1.395 0.163 -295.764 1750.061
Nov 830.0620 521.289 1.592 0.112 -192.850 1852.974
Dec 865.4173 521.289 1.660 0.097 -157.495 1888.330
==============================================================================
Omnibus: 6.053 Durbin-Watson: 0.025
Prob(Omnibus): 0.048 Jarque-Bera (JB): 5.541
Skew: 0.127 Prob(JB): 0.0626
Kurtosis: 2.749 Cond. No. 1.01
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
The coefficients tell us how each month contributes to the trend residuals. In other words, for each month, to what degree does the model expect employment to be above or below trend?
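To make the seasonal pattern easier to read, we could chart these monthly coefficients (a sketch, assuming the fitted results object from above):

# each coefficient is that month's average residual, so charting them
# shows the seasonal swing around the employment trend:
px.bar(x=results.params.index, y=results.params.values, height=350,
       title="Average deviation from the employment trend, by month",
       labels={"x": "month", "y": "residual (thousands of persons)"},
)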