Time Series Cross Validation#
import pandas as pd
import numpy as np
#suppress ARIMA warnings
import warnings
warnings.filterwarnings('ignore')
import sys
# if running in Google Colab install forecast-tools
if 'google.colab' in sys.modules:
    !pip install forecast-tools
Up to now we have used a single validation period to select our best model. The weakness of that approach is that it gives you a sample size of one (better than nothing, but generally poor statistics!). Time series cross validation is an approach that provides more data points when comparing models. In the classical time series literature, time series cross validation is called a Rolling Forecast Origin: the point the forecast is made from rolls forward as the training set grows. There may also be benefit in taking a sliding window approach to cross validation. This second approach maintains a fixed-size training set; that is, it drops older values from the time series during validation.
Rolling Forecast Origin#
The following code and output provide a simplified view of how a rolling forecast origin works in practice.
def rolling_forecast_origin(train, min_train_size, horizon):
    '''
    Rolling forecast origin generator.

    Each split grows the training set by one observation and moves
    the validation period one step forward.
    '''
    for i in range(len(train) - min_train_size - horizon + 1):
        split_train = train[:min_train_size+i]
        split_val = train[min_train_size+i:min_train_size+i+horizon]
        yield split_train, split_val
full_series = [2502, 2414, 2800, 2143, 2708, 1900, 2333, 2222, 1234, 3456]
test = full_series[-2:]
train = full_series[:-2]
print('full training set: {0}'.format(train))
print('hidden test set: {0}'.format(test))
full training set: [2502, 2414, 2800, 2143, 2708, 1900, 2333, 2222]
hidden test set: [1234, 3456]
cv_rolling = rolling_forecast_origin(train, min_train_size=4, horizon=2)
cv_rolling
<generator object rolling_forecast_origin at 0x7fc04d75cc80>
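Note that the function returns a generator: splits are produced lazily, on demand. A quick demonstration, using a fresh generator so that cv_rolling above is not consumed before the loop below:
#generators are lazy - each split is only created when requested
cv_demo = rolling_forecast_origin(train, min_train_size=4, horizon=2)
first_train, first_val = next(cv_demo)
print(first_train)
print(first_val)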
i = 0
for cv_train, cv_val in cv_rolling:
    print(f'CV[{i+1}]')
    print(f'Train:\t{cv_train}')
    print(f'Val:\t{cv_val}')
    print('-----')
    i += 1
CV[1]
Train: [2502, 2414, 2800, 2143]
Val: [2708, 1900]
-----
CV[2]
Train: [2502, 2414, 2800, 2143, 2708]
Val: [1900, 2333]
-----
CV[3]
Train: [2502, 2414, 2800, 2143, 2708, 1900]
Val: [2333, 2222]
-----
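To make the mechanics concrete, the splits can be consumed in a manual validation loop. The sketch below reuses the rolling_forecast_origin generator and train list from above; naive_forecast is a hypothetical helper (it simply repeats the last observed value) included purely for illustration.
import numpy as np

def naive_forecast(history, horizon):
    '''Hypothetical helper: repeat the last observed value.'''
    return np.full(horizon, history[-1])

scores = []
for cv_train, cv_val in rolling_forecast_origin(train, min_train_size=4,
                                                horizon=2):
    preds = naive_forecast(cv_train, horizon=len(cv_val))
    scores.append(np.mean(np.abs(np.array(cv_val) - preds)))

print(f'MAE per split: {scores}')
print(f'CV mean MAE: {np.mean(scores):.1f}')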
Sliding Window Cross Validation#
def sliding_window(train, window_size, horizon, step=1):
    '''
    Sliding window generator.

    Parameters:
    --------
    train: array-like
        training data for time series method
    window_size: int
        lookback - how much data to include.
    horizon: int
        forecast horizon
    step: int, optional (default=1)
        step=1 means the window slides forward by a single data point
        on each split. Increase step to run fewer splits.

    Returns:
    --------
    array-like, array-like
        split_training, split_validation
    '''
    for i in range(0, len(train) - window_size - horizon + 1, step):
        split_train = train[i:window_size+i]
        split_val = train[i+window_size:window_size+i+horizon]
        yield split_train, split_val
This code tests it with step=1.
cv_sliding = sliding_window(train, window_size=4, horizon=1)
print('full training set: {0}\n'.format(train))
i = 0
for cv_train, cv_val in cv_sliding:
    print(f'CV[{i+1}]')
    print(f'Train:\t{cv_train}')
    print(f'Val:\t{cv_val}')
    print('-----')
    i += 1
full training set: [2502, 2414, 2800, 2143, 2708, 1900, 2333, 2222]
CV[1]
Train: [2502, 2414, 2800, 2143]
Val: [2708]
-----
CV[2]
Train: [2414, 2800, 2143, 2708]
Val: [1900]
-----
CV[3]
Train: [2800, 2143, 2708, 1900]
Val: [2333]
-----
CV[4]
Train: [2143, 2708, 1900, 2333]
Val: [2222]
-----
The following code tests it with step=2. Note that you get fewer splits: validation is less computationally expensive, at the cost of fewer data points. In many cases that trade-off is acceptable.
cv_sliding = sliding_window(train, window_size=4, horizon=1, step=2)
print('full training set: {0}\n'.format(train))
i = 0
for cv_train, cv_val in cv_sliding:
    print(f'CV[{i+1}]')
    print(f'Train:\t{cv_train}')
    print(f'Val:\t{cv_val}')
    print('-----')
    i += 1
full training set: [2502, 2414, 2800, 2143, 2708, 1900, 2333, 2222]
CV[1]
Train: [2502, 2414, 2800, 2143]
Val: [2708]
-----
CV[2]
Train: [2800, 2143, 2708, 1900]
Val: [2333]
-----
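The number of splits follows from simple arithmetic: with n observations, window size w, horizon h and step s, a sliding window yields ceil((n - w - h + 1) / s) splits. The helper below checks this against the runs above; n_splits is illustrative only and not part of forecast-tools.
import math

def n_splits(n, window_size, horizon, step=1):
    '''Illustrative helper: number of splits sliding_window will yield.'''
    return math.ceil((n - window_size - horizon + 1) / step)

print(n_splits(8, window_size=4, horizon=1, step=1))  # 4 splits, as above
print(n_splits(8, window_size=4, horizon=1, step=2))  # 2 splits, as above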
Parallel Cross Validation Example using Naive1#
from forecast_tools.baseline import SNaive, Naive1
from forecast_tools.datasets import load_emergency_dept
#optimised versions of the functions above...
from forecast_tools.model_selection import (rolling_forecast_origin,
                                            sliding_window,
                                            cross_validation_score)
from sklearn.metrics import mean_absolute_error
train = load_emergency_dept()
model = Naive1()
#%%timeit runs the code multiple times to get an estimate of runtime.
#comment it out to run the code only once.
Run on a single core
%%time
cv = sliding_window(train, window_size=14, horizon=7, step=1)
results_1 = cross_validation_score(model, train, cv, mean_absolute_error,
                                   n_jobs=1)
CPU times: user 812 ms, sys: 0 ns, total: 812 ms
Wall time: 809 ms
Run across multiple cores by setting n_jobs=-1
%%time
cv = sliding_window(train, window_size=14, horizon=7, step=1)
results_2 = cross_validation_score(model, train, cv, mean_absolute_error,
                                   n_jobs=-1)
CPU times: user 369 ms, sys: 71.5 ms, total: 440 ms
Wall time: 990 ms
results_1.shape
(324, 1)
results_2.shape
(324, 1)
print(results_1.mean(), results_1.std())
26.653439153439148 10.346957901088238
Just to illustrate that the results are identical: the only difference is runtime. (For a model as cheap as Naive1, the overhead of distributing work across cores can actually outweigh the gains, as the wall times above show.)
print(results_2.mean(), results_2.std())
26.653439153439148 10.346957901088238
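A benefit of having a sample of scores, rather than a single number, is that you can attach an interval to the estimate. Below is a minimal sketch, assuming scipy is available, that computes a 95% t-interval for the mean MAE of the Naive1 results.
import numpy as np
from scipy import stats

scores = results_1.flatten()
#standard error of the mean
se = scores.std(ddof=1) / np.sqrt(len(scores))
#95% confidence interval for the mean MAE using the t distribution
lower, upper = stats.t.interval(0.95, df=len(scores) - 1,
                                loc=scores.mean(), scale=se)
print(f'mean MAE: {scores.mean():.2f} (95% CI: {lower:.2f} to {upper:.2f})')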
Cross validation with multiple forecast horizons#
horizons = [7, 14, 21]
cv = sliding_window(train, window_size=14, horizon=max(horizons), step=1)
#note that we now pass the horizons list to cross_validation_score
results_h = cross_validation_score(model, train, cv, mean_absolute_error,
                                   horizons=horizons, n_jobs=-1)
#results are returned as numpy array - easy to cast to dataframe and display
pd.DataFrame(results_h, columns=['7days', '14days', '21days']).head()
|   | 7days | 14days | 21days |
|---|-------|--------|--------|
| 0 | 21.142857 | 22.357143 | 23.190476 |
| 1 | 14.857143 | 18.285714 | 19.000000 |
| 2 | 17.857143 | 22.357143 | 25.047619 |
| 3 | 17.142857 | 18.571429 | 19.285714 |
| 4 | 23.142857 | 19.500000 | 20.142857 |
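Averaging over the splits gives a headline score per horizon; as you might expect, the error typically grows as the horizon lengthens.
df_h = pd.DataFrame(results_h, columns=['7days', '14days', '21days'])
#mean MAE per forecast horizon, averaged across all CV splits
print(df_h.mean())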
Cross validation example using ARIMA - does it speed up when CV is run in parallel?#
#use ARIMA from pmdarima as that has a similar interface to baseline models.
from pmdarima import ARIMA, auto_arima

#auto_model = auto_arima(train, suppress_warnings=True, n_jobs=-1, m=7)
#auto_model

#create an ARIMA model - reasonably complex:
#order=(1, 1, 2), seasonal_order=(2, 0, 2, 7)
args = {'order': (1, 1, 2), 'seasonal_order': (2, 0, 2, 7)}

model = ARIMA(order=args['order'], seasonal_order=args['seasonal_order'],
              enforce_stationarity=False, suppress_warnings=True)
%%time
cv = rolling_forecast_origin(train, min_train_size=320, horizon=7)
results_1 = cross_validation_score(model, train, cv, mean_absolute_error,
                                   n_jobs=1)
CPU times: user 39.8 s, sys: 373 ms, total: 40.1 s
Wall time: 13.4 s
As before, comment out %%timeit (if you use it) to run the code only once. You should see a big improvement in performance when running in parallel: mine went from 12.3 seconds to 2.4 seconds, though exact timings will vary between machines and runs.
%%time
cv = rolling_forecast_origin(train, min_train_size=320, horizon=7)
results_2 = cross_validation_score(model, train, cv, mean_absolute_error,
                                   n_jobs=-1)
CPU times: user 1.63 s, sys: 252 ms, total: 1.88 s
Wall time: 4.32 s
results_1.shape
(18, 1)
results_2.shape
(18, 1)
results_1.mean()
15.58663509219668
results_2.mean()
15.586662939018318
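The serial and parallel runs agree to several significant figures; the tiny difference in the means is presumably a numerical artefact of fitting across separate processes. A quick sanity check:
import numpy as np

#the two runs should agree to well within 0.01%
print(np.isclose(results_1.mean(), results_2.mean(), rtol=1e-4))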