Automatically selecting a naive model to use as a benchmark#
forecast-tools provides an auto_naive
function that uses point-forecast cross validation to select the ‘best’ naive model to use as a benchmark.
The function tests all of the naive Forecast
methods.
This notebook covers how to use auto_naive
and also how to troubleshoot its use if there are conflicts between parameters.
Imports#
import sys
# if running in Google Colab install forecast-tools
if 'google.colab' in sys.modules:
    !pip install forecast-tools
import numpy as np
from forecast_tools.datasets import load_emergency_dept
from forecast_tools.model_selection import auto_naive
help(auto_naive)
Help on function auto_naive in module forecast_tools.model_selection:
auto_naive(y_train, horizon=1, seasonal_period=1, min_train_size='auto', method='cv', step=1, window_size='auto', metric='mae')
Automatic selection of the 'best' naive benchmark on a 'single' series
The selection process uses out-of-sample cv performance.
By default auto_naive uses cross validation to estimate the mean
point forecast performance of all naive methods. It selects the method
with the lowest point forecast metric on average.
If there is limited data for training, a basic holdout sample could be
used.
Dev note: the plan is to update this to work with multiple series.
It would be best to use MASE for multiple series comparison.
Parameters:
----------
y_train: array-like
Training data, typically in a pandas.Series, pandas.DataFrame
or numpy.ndarray format.
horizon: int, optional (default=1)
Forecast horizon.
seasonal_period: int, optional (default=1)
Frequency of the data. E.g. 7 for a weekly pattern, 12 for monthly,
365 for daily.
min_train_size: int or str, optional (default='auto')
The size of the initial training set (if method=='ro' or 'sw').
If 'auto' then min_train_size is set to len(y_train) // 3
If min_train_size='auto' and method='holdout' then
min_train_size = len(y_train) - horizon.
method: str, optional (default='cv')
out of sample selection method.
'ro' - rolling forecast origin
'sw' - sliding window
'cv' - scores from both ro and sw
'holdout' - single train/test split
Methods 'ro' and 'sw' are similar, however, 'sw' has a fixed
window_size and drops older data from training.
step: int, optional (default=1)
The stride/step of the cross-validation. I.e. the number
of observations to move forward between folds.
window_size: str or int, optional (default='auto')
The window_size if using sliding window cross validation
When 'auto' and method='sw' then
window_size=len(y_train) // 3
metric: str, optional (default='mae')
The metric to measure out of sample accuracy.
Options: mase, mae, mape, smape, mse, rmse, me.
Returns:
--------
dict
'model': baseline.Forecast
f'{metric}': float
Contains the model and its CV performance.
Raises:
-------
ValueError
For invalid method, metric, window_size parameters
See Also:
--------
forecast_tools.baseline.Naive1
forecast_tools.baseline.SNaive
forecast_tools.baseline.Drift
forecast_tools.baseline.Average
forecast_tools.baseline.EnsembleNaive
forecast_tools.baseline.baseline_estimators
forecast_tools.model_selection.rolling_forecast_origin
forecast_tools.model_selection.sliding_window
forecast_tools.model_selection.mase_cross_validation_score
forecast_tools.metrics.mean_absolute_scaled_error
Examples:
---------
Measuring MAE and taking the best method using both
rolling origin and sliding window cross validation
of a 56 day forecast.
>>> from forecast_tools.datasets import load_emergency_dept
>>> y_train = load_emergency_dept()
>>> best = auto_naive(y_train, seasonal_period=7, horizon=56)
>>> best
{'model': Average(), 'mae': 19.63791579700355}
Take a step of 7 days between cv folds.
>>> from forecast_tools.datasets import load_emergency_dept
>>> y_train = load_emergency_dept()
>>> best = auto_naive(y_train, seasonal_period=7, horizon=56,
... step=7)
>>> best
{'model': Average(), 'mae': 19.675635558539383}
Load the training data#
y_train = load_emergency_dept()
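Before selecting a benchmark it is worth taking a quick look at the data. A minimal check, assuming the dataset loads as a pandas object:
# quick look at the training data: type, length and first few rows
# (assumes load_emergency_dept returns a pandas Series/DataFrame)
print(type(y_train))
print(len(y_train))
y_train.head()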
Select the best naive model for an h-step horizon of 7 days.#
Let’s select a method for the emergency department daily level data to predict 7 days ahead. By default, the function uses the mean absolute error to evaluate forecast accuracy.
best = auto_naive(y_train, horizon=7, seasonal_period=7)
best
{'model': Average(), 'mae': 19.679856211931035}
y_preds = best['model'].fit_predict(y_train, horizon=7)
y_preds
array([221.06395349, 221.06395349, 221.06395349, 221.06395349,
221.06395349, 221.06395349, 221.06395349])
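As a quick sanity check of the selected benchmark, you can hold back the final week of the training data, refit on the remainder and compute the error by hand. This is only a minimal sketch using fit_predict and NumPy; it is not the cross-validated score that auto_naive reports.
# manual holdout check of the selected model (rough sketch, not the CV score)
# assumes y_train holds a single series
train = y_train[:-7]
holdout = np.asarray(y_train[-7:]).ravel()
y_preds = best['model'].fit_predict(train, horizon=7)
manual_mae = np.mean(np.abs(holdout - np.asarray(y_preds).ravel()))
manual_mae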
Using a different forecasting error metric#
best = auto_naive(y_train, horizon=7, seasonal_period=7, metric='mape')
best
{'model': Average(), 'mape': 8.69955926909263}
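The supported metrics are listed in the docstring above (mase, mae, mape, smape, mse, rmse, me). If you are unsure which to use, one option is to loop over several of them and check whether they agree on the chosen benchmark; a short sketch:
# compare the benchmark selected under several error metrics
for metric in ['mae', 'mape', 'smape', 'rmse']:
    result = auto_naive(y_train, horizon=7, seasonal_period=7, metric=metric)
    print(metric, result)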
Using a single train-test split when data are limited.#
If your forecast horizon means that h-step cross-validation is infeasible, you can instead select a benchmark automatically using a single holdout sample.
best = auto_naive(y_train, horizon=7, seasonal_period=7, method='holdout')
best
{'model': Average(), 'mae': 30.182280627384486}
Troubleshooting the use of auto_naive
#
Problem 1: Training data is shorter than the min_train_size
+ horizon
For any validation to take place, including a simple holdout, the time series used must be long enough to allow at least one train-test split. This can be a problem when seasonal_period is set to a length similar to that of the time series.
# generate a synthetic daily time series of exactly one year in length.
y_train = np.random.randint(100, 250, size=365)
Let’s set seasonal period to seasonal_period=365
(the length of the time series) and horizon=7
.
We will also manually set min_train_size=365.
This will generate a ValueError
reporting that “The training data is shorter than min_train_size + horizon No validation can be performed.”
best = auto_naive(y_train, horizon=7, seasonal_period=365, method='ro',
min_train_size=365, metric='mae')
best
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-d9dc4e172979> in <module>
----> 1 best = auto_naive(y_train, horizon=7, seasonal_period=365, method='ro',
2 min_train_size=365, metric='mae')
3
4 best
~/opt/anaconda3/envs/forecast_dev/lib/python3.8/site-packages/forecast_tools/model_selection.py in auto_naive(y_train, horizon, seasonal_period, min_train_size, method, step, window_size, metric)
432 msg = f"The training data is shorter than {min_train_size=} + {horizon=}" \
433 + " No validation can be performed. "
--> 434 raise ValueError(msg)
435 elif min_train_size < seasonal_period and (method == 'cv' or method == 'ro'):
436 msg = "Seasonal period is longer than the minimum training size for" \
ValueError: The training data is shorter than min_train_size=365 + horizon=7 No validation can be performed.
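The check that triggers this error is visible in the traceback: the series must contain at least min_train_size + horizon observations. You can test this requirement yourself before calling auto_naive; a small guard sketch (check_enough_data is a hypothetical helper, not part of forecast-tools):
# guard: auto_naive needs at least one train/test split,
# i.e. len(y_train) >= min_train_size + horizon
def check_enough_data(y_train, min_train_size, horizon):
    n = len(y_train)
    if n < min_train_size + horizon:
        raise ValueError(f'need at least {min_train_size + horizon} '
                         f'observations, got {n}.')

check_enough_data(y_train, min_train_size=7, horizon=7)      # passes silently
# check_enough_data(y_train, min_train_size=365, horizon=7)  # would raise ValueError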
A longer time series or a shorter seasonal period will fix this problem.
# a longer synthetic time series.
y_train = np.random.randint(100, 250, size=365+7)
best = auto_naive(y_train, horizon=7, seasonal_period=365, method='ro',
min_train_size=365, metric='mae')
best
{'model': Average(), 'mae': 43.29549902152642}
# a shorter seasonal period and minimum training size
y_train = np.random.randint(100, 250, size=365)
best = auto_naive(y_train, horizon=7, seasonal_period=7, method='ro',
min_train_size=7, metric='mae')
best
{'model': Average(), 'mae': 37.50786553941686}