Forecasting Paediatric Emergency Department Attendances¶

HPDM097 – Making a Difference with Health Data

Executive Summary¶

This project was undertaken to support an NHS acute trust in forecasting daily paediatric emergency department attendances over a 28-day horizon, with the aim of informing staffing and planning decisions.

Historical daily attendance data were analysed to understand underlying patterns and sources of variability. The series showed strong weekly seasonality, with higher attendances on certain days of the week, as well as seasonal effects over the year. Day-to-day variability was substantial, but there was little evidence of a sustained long-term trend over the study period.

Forecasting approaches were evaluated using a rolling-origin cross-validation framework. Simple naïve methods were first assessed to establish a baseline. These were then compared with a seasonal ARIMA (SARIMAX) model, which explicitly accounts for weekly patterns and short-term temporal dependencies. Forecast accuracy was assessed using mean absolute error across the full 28-day forecast horizon.

The SARIMAX model consistently outperformed the naïve benchmark. While uncertainty increases further into the forecast horizon, the SARIMAX model provided more accurate and reliable forecasts overall.

Based on these findings, a SARIMAX model is recommended to support paediatric ED staffing over the next four weeks. A final 28-day forecast, including prediction intervals to reflect uncertainty, was produced to inform planning. All analyses were implemented in reproducible Python code, allowing forecasts to be updated easily in future.

1.1 Introduction¶

Demand forecasting helps to support service delivery: short-term forecasts allow hospitals to plan staffing levels and manage operational pressures. Paediatric emergency attendances are influenced by factors such as seasonal illness and day-of-week effects, which can make demand variable and difficult to forecast accurately.

The aim of this project is to develop and evaluate a short-term forecasting approach for daily paediatric ED attendances. Using historical attendance data, the task is to identify a suitable forecasting method and generate a 28-day forecast to support staffing decisions within an NHS acute trust.

1.2 Objectives¶

The objectives of this project are to:

  • Explore and describe patterns in daily paediatric emergency department attendances
  • Evaluate naïve and statistical forecasting methods for short-term demand
  • Select a forecasting approach suitable for supporting staffing decisions
  • Produce a reproducible 28-day forecast of paediatric ED attendances

2. Setup¶

This analysis was conducted in Python using the HDS_forecast environment provided for the module. All results are fully reproducible by running the notebook from top to bottom. Key scientific computing and time-series libraries were used for data handling, modelling, and visualisation.

The input data file path is defined once at the top of the notebook, and all subsequent analyses reference this variable. This structure is intended to support reproducibility and allow NHS analysts to rerun or adapt the workflow easily using updated data.

In [1]:
# ------------------------------------------------------------
# 2. Import libraries and data path
# ------------------------------------------------------------

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from IPython.display import display


DATA_FILE = "paediatrics_train.csv"

3. Data Description and Initial Analysis¶

The dataset consists of counts of paediatric emergency department attendances between April 2014 and February 2017. Observations are recorded at daily frequency.

In [2]:
# ------------------------------------------------------------
# 3.1 Load and inspect data
# ------------------------------------------------------------

df = pd.read_csv(DATA_FILE, parse_dates=["date"], index_col="date")
df = df.asfreq("D")

print(df.head(), "\n")
print(df.info())
            paed_ed_attends
date                       
2014-04-01               47
2014-04-02               46
2014-04-03               47
2014-04-04               48
2014-04-05               52 

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1056 entries, 2014-04-01 to 2017-02-19
Freq: D
Data columns (total 1 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   paed_ed_attends  1056 non-null   int64
dtypes: int64(1)
memory usage: 16.5 KB
None

3.2 Data Quality Checks¶

The data were checked for missing values, duplicate or missing dates, and negative or non-integer counts. None were identified.

In [3]:
# ------------------------------------------------------------
# 3.2 Data Quality Checks
# ------------------------------------------------------------

# Missing values
missing_values = df.isna().sum()

# Duplicate dates
duplicate_dates = df.index.duplicated().sum()

# Check for continuous daily frequency
full_date_range = pd.date_range(
    start=df.index.min(), end=df.index.max(), freq="D"
)
missing_dates = full_date_range.difference(df.index)

# Validity of attendance counts
negative_counts = (df["paed_ed_attends"] < 0).sum()
non_integer_counts = (df["paed_ed_attends"] % 1 != 0).sum()

# Display results
print("Missing values per column:")
print(missing_values, "\n")

print(f"Number of duplicate dates: {duplicate_dates}")
print(f"Number of missing dates in daily sequence: {len(missing_dates)}")
print(f"Negative attendance values: {negative_counts}")
print(f"Non-integer attendance values: {non_integer_counts}")
Missing values per column:
paed_ed_attends    0
dtype: int64 

Number of duplicate dates: 0
Number of missing dates in daily sequence: 0
Negative attendance values: 0
Non-integer attendance values: 0

3.3 Overview of Time Series Behaviour¶

Attendances were plotted as a time series chart. They show day-to-day variability with no strong long-term trend, but with periodic peaks that may reflect winter respiratory illnesses.

In [4]:
# ------------------------------------------------------------
# 3.3.1 Visualise the data
# ------------------------------------------------------------

# Split into train and test sets (final 28 days held out; see Section 4.2)
H = 28  # forecast horizon in days
train = df.iloc[:-H]
test = df.iloc[-H:]

# Line plot of training data only
plt.figure(figsize=(12, 4))
plt.plot(train.index, train["paed_ed_attends"])
plt.xlabel("Date")
plt.ylabel("Daily paediatric ED attendances")
plt.title("Daily paediatric ED attendances over time (training data only)")
plt.tight_layout()
plt.show()

3.4 Seasonal effects¶

To explore longer-term seasonal effects, mean daily attendances per calendar month were aggregated across years. Peaks in November and March suggest seasonal effects consistent with winter respiratory illness.

In [5]:
# ------------------------------------------------------------
# 3.4.1 Visualise seasonality by calendar month
# ------------------------------------------------------------

# Add month information to training data
train_month = train.copy()
train_month["month"] = train_month.index.month
train_month["month_name"] = train_month.index.month_name()

# Calculate mean daily attendances per calendar month
monthly_mean = (
    train_month.groupby(["month", "month_name"])["paed_ed_attends"]
    .mean()
    .reset_index()
    .sort_values("month")
)

# Plot mean daily attendances by month
plt.figure(figsize=(8, 4))
plt.plot(
    monthly_mean["month_name"], monthly_mean["paed_ed_attends"], marker="o"
)
plt.xlabel("Month")
plt.ylabel("Mean daily attendances")
plt.title("Mean daily paediatric ED attendances by month (training data)")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

3.5 Short-term effects¶

To examine within-week variation, attendances were grouped by day of the week and summarised using a boxplot. A weekly pattern is suggested, with higher attendances on Sundays and Mondays.

In [6]:
# ------------------------------------------------------------
# 3.5.1 Visualise seasonality by day of week
# ------------------------------------------------------------

# Add day-of-week to training data
train_dayofweek = train.copy()
train_dayofweek["day_name"] = train_dayofweek.index.day_name()

# Order for days of the week
weekday_order = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]

train_dayofweek["day_name"] = pd.Categorical(
    train_dayofweek["day_name"], categories=weekday_order, ordered=True
)


# Boxplot by day of week (figsize passed to boxplot directly, so pandas
# does not create an empty extra figure)
train_dayofweek.boxplot(
    column="paed_ed_attends",
    by="day_name",
    grid=False,
    showfliers=False,
    figsize=(8, 4),
)
plt.suptitle("")
plt.title("Paediatric ED attendances by day of week (training data)")
plt.xlabel("Day of week")
plt.ylabel("Daily attendances")
plt.tight_layout()
plt.show()

3.6 Distribution and variability¶

The distribution of daily attendances was examined.

It is slightly right-skewed, as expected for count data, with most observations clustered around a central range and occasional higher-demand days. There are no extreme outliers, and inspection of rolling statistics suggests variability is broadly stable over time.

These characteristics support the use of absolute-error-based evaluation metrics such as mean absolute error (MAE).
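
For forecasts $\hat{y}_1, \dots, \hat{y}_H$ of observed attendances $y_1, \dots, y_H$ over an $H$-day horizon, MAE is defined as

$$\mathrm{MAE} = \frac{1}{H} \sum_{h=1}^{H} \lvert y_h - \hat{y}_h \rvert,$$

which weights all errors linearly and is expressed directly in attendances per day.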

In [7]:
# ------------------------------------------------------------
# 3.6.1 Visualise distribution of daily attendances
# ------------------------------------------------------------

# Histogram of daily attendances (distribution)
plt.figure(figsize=(8, 4))
plt.hist(train["paed_ed_attends"], bins=30)
plt.xlabel("Daily attendances")
plt.ylabel("Frequency")
plt.title("Distribution of daily paediatric ED attendances (training data)")
plt.tight_layout()
plt.show()
In [8]:
# ------------------------------------------------------------
# 3.6.2 Visualise rolling mean and variability
# ------------------------------------------------------------

# Rolling mean and variability of daily attendances (line plot)
rolling_mean = train["paed_ed_attends"].rolling(30).mean()
rolling_std = train["paed_ed_attends"].rolling(30).std()

plt.figure(figsize=(12, 4))
plt.plot(rolling_mean, label="30-day rolling mean")
plt.plot(rolling_std, label="30-day rolling std")
plt.legend()
plt.title("Rolling mean and variability (training data)")
plt.tight_layout()
plt.show()

3.7 Implications for Forecasting¶

In summary, daily paediatric emergency department attendances show clear seasonal patterns. Winter months tend to be busier than summer months, and across the week attendances are highest on Sundays and Mondays. Day-to-day attendance numbers vary considerably, but this variability appears broadly stable over time. There is no clear evidence of a sustained long-term increase or decrease in attendances over the study period. Taken together, these findings suggest a forecasting approach that makes use of recent observations and seasonally similar days.

4. Forecasting Framework¶

This section outlines the forecasting task, data partitioning strategy, and model evaluation approach used to support short-term staffing decisions in the paediatric emergency department.

4.1 Forecasting Task Definition¶

The forecasting task is to predict daily paediatric emergency department attendances over a 28-day horizon. This horizon reflects the Trust’s operational need to plan staffing levels several weeks in advance, balancing rota preparation with the inherent uncertainty of emergency care demand.

The primary objective of the forecasting exercise is therefore short-term accuracy, rather than long-term trend estimation. Forecasts are required at a daily resolution, and performance is assessed in terms of how closely predicted attendances match observed values over the forecast horizon.

4.2 Train–Test Split¶

The final 28 days of the available time series were reserved as a hold-out test set. This test set represents the future period for which staffing decisions would need to be made in practice.

The test data were not used during exploratory analysis, model fitting, or model selection. All model development and comparison were carried out using the remaining historical data (the training set), ensuring that forecast performance is evaluated on genuinely unseen observations.

In [9]:
# ------------------------------------------------------------
# 4.2 Train–Test split (hold out final 28 days)
# ------------------------------------------------------------

H = 28  # forecast horizon in days

# Split by time
train = df.iloc[:-H].copy()
test = df.iloc[-H:].copy()

# Print summary
print(f"Total observations: {len(df)}")
print(f"Training observations: {len(train)}")
print(f"Test observations: {len(test)}")

print("\nTraining period:")
print(f"  {train.index.min().date()} to {train.index.max().date()}")

print("\nTest period (held out):")
print(f"  {test.index.min().date()} to {test.index.max().date()}")
Total observations: 1056
Training observations: 1028
Test observations: 28

Training period:
  2014-04-01 to 2017-01-22

Test period (held out):
  2017-01-23 to 2017-02-19

4.3 Cross-Validation Strategy¶

Model performance was assessed using rolling origin cross-validation with an expanding training window. Under this approach, models are repeatedly trained on historical data and evaluated on subsequent 28-day forecast periods, mimicking the way forecasts would be generated in routine operational use.

At each iteration, forecast errors were calculated across the full forecast horizon, allowing accuracy to be assessed by lead time as well as on average. This provides insight into how forecast reliability changes as predictions extend further into the future, which is particularly relevant for short-term staffing decisions.

Rolling origin cross-validation offers a robust and realistic evaluation framework for time series forecasting, avoiding information leakage and providing a more reliable estimate of expected forecast accuracy than a single train–test split.

In [10]:
# ------------------------------------------------------------
# 4.3 Rolling origin cross-validation (expanding window)
# ------------------------------------------------------------


def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))


def rolling_origin_cv(
    series: pd.Series,
    horizon: int = 28,
    initial_window: int = 365 * 2,
    step: int = 1,
    forecaster=None,
) -> pd.DataFrame:
    """
    Rolling origin CV with an expanding window.

    Parameters
    ----------
    series : pd.Series
        Time series indexed by date.
    horizon : int
        Forecast horizon (days).
    initial_window : int
        Number of initial observations used in the first training window.
    step : int
        How far to move the origin forward each iteration (e.g. 1 day or 7 days).
    forecaster : callable
        Function that takes (train_series, horizon) and returns a 1D array forecast of length = horizon.

    Returns
    -------
    pd.DataFrame
        Long-format results with one row per origin per horizon step.
    """
    if forecaster is None:
        raise ValueError(
            "You must pass a forecaster function: forecaster(train_series, horizon) -> forecast array"
        )

    series = series.dropna()
    n = len(series)

    # check that we have enough data for at least one origin
    if initial_window + horizon > n:
        raise ValueError(
            "initial_window + horizon is larger than the available series length."
        )

    results = []

    last_origin_start = n - horizon

    # loop through origins
    for origin_end in range(initial_window, last_origin_start + 1, step):
        train_series = series.iloc[:origin_end]
        test_slice = series.iloc[origin_end : origin_end + horizon]

        y_pred = np.asarray(forecaster(train_series, horizon), dtype=float)

        if len(y_pred) != horizon:
            raise ValueError(
                f"Forecaster returned length {len(y_pred)} but expected {horizon}."
            )

        y_true = test_slice.values.astype(float)

        abs_err = np.abs(y_true - y_pred)

        origin_date = test_slice.index[0]  # first day being forecast

        # store horizon-wise results (1..H)
        for h in range(1, horizon + 1):
            results.append(
                {
                    "origin_date": origin_date,
                    "h": h,
                    "y_true": y_true[h - 1],
                    "y_pred": y_pred[h - 1],
                    "abs_error": abs_err[h - 1],
                }
            )

    return pd.DataFrame(results)

5. Benchmark Models (Naïve Methods)¶

5.1 Rationale for Benchmark Models¶

In time series forecasting, naïve methods often perform well, particularly when strong seasonal patterns are present. These methods are easy to understand, quick to implement, and provide a useful reference point for evaluating whether more complex models offer meaningful improvement.

5.2 Benchmark Model Definitions¶

Naïve (Last Observation) Forecast¶

The naïve forecast assumes that future attendances will be equal to the most recently observed value. While simple, this approach can perform reasonably well when demand is stable over short periods.

Seasonal Naïve (Weekly) Forecast¶

The seasonal naïve forecast accounts for the observed weekly pattern in attendances. Forecasts are generated by repeating the values observed in the most recent week, so that, for example, a future Monday is forecast using the most recent Monday’s attendance.

5.3 Benchmark Model Implementation¶

This code defines the naïve and seasonal naïve forecasting functions, which take historical attendance data as input and return a 28-day forecast.

In [11]:
# ------------------------------------------------------------
# 5.3 Benchmark forecasters
# ------------------------------------------------------------


def forecast_naive1(train_series: pd.Series, horizon: int) -> np.ndarray:
    """Naive (NF1): forecast all future days as the last observed value."""
    if len(train_series) == 0:
        raise ValueError("train_series is empty.")
    last_val = float(train_series.iloc[-1])
    return np.repeat(last_val, horizon)


def forecast_snaive_weekly(
    train_series: pd.Series, horizon: int
) -> np.ndarray:
    """Seasonal naive (weekly): repeat the last observed 7 days."""
    season_length = 7

    if len(train_series) == 0:
        raise ValueError("train_series is empty.")

    if len(train_series) < season_length:
        # fallback to naive1 if not enough history
        return forecast_naive1(train_series, horizon)

    last_week = train_series.iloc[-season_length:].to_numpy(dtype=float)
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_week, reps)[:horizon]
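
As a quick sanity check (a toy example, not part of the evaluation), the seasonal naïve forecaster should simply repeat the last observed week:

In [ ]:
# Toy series: 14 consecutive days with values 1..14
toy = pd.Series(
    np.arange(1, 15, dtype=float),
    index=pd.date_range("2017-01-01", periods=14, freq="D"),
)

# The last observed week is [8, 9, 10, 11, 12, 13, 14], so a 10-day
# forecast should be that week repeated and truncated to length 10
print(forecast_snaive_weekly(toy, horizon=10))
# [ 8.  9. 10. 11. 12. 13. 14.  8.  9. 10.]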

5.4 Benchmark Model Evaluation¶

Benchmark models were evaluated using the rolling origin cross-validation framework described in Section 4.3. Forecast accuracy was assessed using mean absolute error (MAE), calculated across the full 28-day forecast horizon and also examined by lead time.

The simple naïve benchmark and the weekly seasonal naïve approach perform very similarly across the 28-day horizon. While the seasonal naïve model produces slightly smoother errors, it does not consistently outperform the simpler naïve approach. In fact, the naïve model achieves marginally lower MAE at several forecast lead times, resulting in a slightly better overall average performance (MAE 9.31 vs 9.33).

These results suggest that very recent attendance levels provide a strong short-term signal in this dataset, and that explicitly repeating the previous week does not yield a substantial improvement in forecast accuracy. Given the small differences observed, the naïve benchmark provides a strong baseline for subsequent model comparison.

In [12]:
# ------------------------------------------------------------
# 5.4.1 Benchmark Model Evaluation (Rolling origin CV on training data)
# ------------------------------------------------------------
y_train = train["paed_ed_attends"]

cv_naive1 = rolling_origin_cv(
    series=y_train,
    horizon=H,
    initial_window=365 * 2,
    step=1,
    forecaster=forecast_naive1,
)

cv_snaive = rolling_origin_cv(
    series=y_train,
    horizon=H,
    initial_window=365 * 2,
    step=1,
    forecaster=forecast_snaive_weekly,
)

# Overall MAE across all origins and horizons
summary = pd.DataFrame(
    {
        "Model": ["Naive1 (last value)", "Seasonal Naive (weekly)"],
        "Mean MAE (h=1..28)": [
            cv_naive1["abs_error"].mean(),
            cv_snaive["abs_error"].mean(),
        ],
    }
).sort_values("Mean MAE (h=1..28)")

display(summary)
                     Model  Mean MAE (h=1..28)
0      Naive1 (last value)            9.308777
1  Seasonal Naive (weekly)            9.329863
In [13]:
# ------------------------------------------------------------
# 5.4.2 Visualise horizon-wise accuracy
# ------------------------------------------------------------

# Horizon-wise MAE
mae_by_h_naive1 = (
    cv_naive1.groupby("h")["abs_error"].mean().reset_index(name="MAE")
)
mae_by_h_snaive = (
    cv_snaive.groupby("h")["abs_error"].mean().reset_index(name="MAE")
)

# Plot horizon-wise MAE
plt.figure(figsize=(8, 4))
plt.plot(mae_by_h_naive1["h"], mae_by_h_naive1["MAE"], label="Naive1")
plt.plot(
    mae_by_h_snaive["h"],
    mae_by_h_snaive["MAE"],
    label="Seasonal Naive (weekly)",
)
plt.xlabel("Forecast lead time (days ahead)")
plt.ylabel("MAE")
plt.title("Benchmark accuracy by forecast horizon")
plt.legend()
plt.tight_layout()
plt.show()

6. Candidate Forecasting Model¶

6.1 Choice of model¶

A seasonal ARIMA (SARIMAX) model was selected as a candidate forecasting approach because it can capture both the short-term autocorrelation and the weekly seasonality identified in the exploratory analysis.

6.2 Model Specification and Fitting¶

Model parameters were selected using a grid search over a small, predefined set of non-seasonal and seasonal orders, applied to the training data only. Candidate models were compared using the Akaike Information Criterion (AIC), with lower values indicating better trade-offs between model fit and complexity.
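
For a fitted model with $k$ estimated parameters and maximised likelihood $\hat{L}$, the criterion is

$$\mathrm{AIC} = 2k - 2\ln\hat{L},$$

so each additional parameter must raise the log-likelihood by at least one unit before it lowers the AIC.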

The selected specification, SARIMAX(1,0,2)(0,1,1)[7], incorporates short-term autoregressive and moving-average components alongside weekly seasonal differencing.

All models were fitted using the statsmodels Python library. The chosen specification was then fixed and carried forward to the cross-validation stage.

In [14]:
# ------------------------------------------------------------
# 6.2.1 SARIMAX specification and fitting (statsmodels)
# ------------------------------------------------------------

y_train = train["paed_ed_attends"].astype(float)

m = 7  # weekly seasonality for daily data

# Small grid so runtime is manageable here
p_values = [0, 1, 2]
d_values = [0, 1]
q_values = [0, 1, 2]

P_values = [0, 1]
D_values = [0, 1]
Q_values = [0, 1]

results = []

# Grid search over specified parameter ranges
for p in p_values:
    for d in d_values:
        for q in q_values:
            for P in P_values:
                for D in D_values:
                    for Q in Q_values:
                        order = (p, d, q)
                        seasonal_order = (P, D, Q, m)
                        try:
                            model = sm.tsa.statespace.SARIMAX(
                                y_train,
                                order=order,
                                seasonal_order=seasonal_order,
                                enforce_stationarity=False,
                                enforce_invertibility=False,
                            )
                            fit = model.fit(
                                disp=False, method="lbfgs", maxiter=200
                            )

                            results.append(
                                {
                                    "order": order,
                                    "seasonal_order": seasonal_order,
                                    "aic": fit.aic,
                                }
                            )
                        except Exception:
                            # ignore failures to converge or other issues
                            continue

grid_df = pd.DataFrame(results).sort_values("aic").reset_index(drop=True)
display(grid_df.head(10))

best_order = tuple(grid_df.loc[0, "order"])
best_seasonal_order = tuple(grid_df.loc[0, "seasonal_order"])

print("Selected order:", best_order)
print("Selected seasonal_order:", best_seasonal_order)
print("Best AIC:", round(grid_df.loc[0, "aic"], 2))
/opt/miniconda3/envs/hds_forecast/lib/python3.11/site-packages/statsmodels/base/model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  warnings.warn("Maximum Likelihood optimization failed to "
       order seasonal_order          aic
0  (1, 0, 2)   (0, 1, 1, 7)  7061.818015
1  (1, 0, 2)   (1, 1, 1, 7)  7063.669564
2  (2, 0, 2)   (0, 1, 1, 7)  7063.692143
3  (2, 0, 2)   (1, 1, 1, 7)  7065.451773
4  (2, 0, 1)   (0, 1, 1, 7)  7067.767247
5  (1, 0, 1)   (0, 1, 1, 7)  7069.371915
6  (2, 0, 1)   (1, 1, 1, 7)  7069.662216
7  (2, 1, 2)   (0, 1, 1, 7)  7070.421185
8  (1, 0, 1)   (1, 1, 1, 7)  7071.257823
9  (2, 1, 2)   (1, 1, 1, 7)  7072.391947
Selected order: (1, 0, 2)
Selected seasonal_order: (0, 1, 1, 7)
Best AIC: 7061.82
In [15]:
# ------------------------------------------------------------
# 6.2.2 Define SARIMAX forecaster using selected specification
# ------------------------------------------------------------


def forecast_sarimax_fixed(
    train_series: pd.Series, horizon: int
) -> np.ndarray:
    """
    Fit SARIMAX(best_order, best_seasonal_order) to train_series and return a horizon-step forecast.
    Designed to be passed into rolling_origin_cv().
    """
    train_series = train_series.astype(float)

    # Fit SARIMAX with the best parameters found in the grid search
    model = sm.tsa.statespace.SARIMAX(
        train_series,
        order=best_order,
        seasonal_order=best_seasonal_order,
        enforce_stationarity=False,
        enforce_invertibility=False,
    )
    fit = model.fit(disp=False, method="lbfgs", maxiter=200)
    fc = fit.forecast(steps=horizon)
    return np.asarray(fc, dtype=float)

6.3 Cross-Validation Results¶

The SARIMAX model was evaluated using the same rolling origin cross-validation framework applied to the benchmark models, with the forecast origin advanced in weekly steps to keep repeated model refitting computationally manageable. Forecast accuracy was assessed using mean absolute error (MAE) across the full 28-day horizon, allowing a direct and fair comparison.

Across all forecast horizons, the SARIMAX model achieves a lower mean absolute error than the naïve benchmark.

In [16]:
# ------------------------------------------------------------
# 6.3 Cross-validation results (SARIMAX)
# ------------------------------------------------------------
y_train = train["paed_ed_attends"].astype(float)

STEP = 7  # advance the origin a week at a time to keep SARIMAX refits manageable

# Run rolling origin CV for SARIMAX forecaster
cv_sarimax = rolling_origin_cv(
    series=y_train,
    horizon=H,
    initial_window=365 * 2,
    step=STEP,
    forecaster=forecast_sarimax_fixed,
)

# Overall MAE across all origins and horizons
mae_by_h_sarimax = (
    cv_sarimax.groupby("h")["abs_error"]
    .mean()
    .reset_index(name="MAE")
    .sort_values("h")
)

# comparison with naive1 benchmark
plt.figure(figsize=(8, 4))
plt.plot(mae_by_h_naive1["h"], mae_by_h_naive1["MAE"], label="Naive")
plt.plot(mae_by_h_sarimax["h"], mae_by_h_sarimax["MAE"], label="SARIMAX")
plt.xlabel("Forecast lead time (days ahead)")
plt.ylabel("MAE")
plt.title("Forecast accuracy by lead time: SARIMAX vs benchmark")
plt.legend()
plt.tight_layout()
plt.show()

7. Model Selection and Final Forecast¶

7.1 Model Selection¶

Based on the cross-validation results, the SARIMAX model was selected as the preferred forecasting approach. It provides improved accuracy compared with the naïve benchmark while remaining interpretable and suitable for operational use.

The model accounts for weekly seasonality and short-term temporal dependencies, both of which are prominent features of paediatric emergency department attendances. These characteristics are clinically and operationally plausible, reflecting differences between weekdays and weekends as well as short-term persistence in demand.

For these reasons, SARIMAX is recommended as the forecasting method to support paediatric emergency department staffing over the next 28 days.

7.2 Final 28-day Forecast¶

The figure below presents the final 28-day forecast of daily paediatric emergency department attendances generated using the selected SARIMAX model. The forecast continues to show clear weekly variation, with higher attendances on certain days of the week, consistent with historical patterns observed in the data.

Overall demand is expected to remain stable over the forecast period, with no evidence of a sustained upward or downward trend. Day-to-day variability remains substantial, reflecting the nature of childhood illness.

The shaded prediction interval illustrates the uncertainty surrounding the point forecast. While most days are expected to fall close to the central forecast, higher-demand days may occur.

In [17]:
# ------------------------------------------------------------
# 7.2 Final 28-day SARIMAX forecast
# ------------------------------------------------------------

final_model = sm.tsa.statespace.SARIMAX(
    y_train,
    order=best_order,
    seasonal_order=best_seasonal_order,
    enforce_stationarity=False,
    enforce_invertibility=False,
)

final_fit = final_model.fit(disp=False)

forecast_28 = final_fit.get_forecast(steps=28)
forecast_mean = forecast_28.predicted_mean
forecast_ci = forecast_28.conf_int()

plt.figure(figsize=(10, 4))
plt.plot(
    train.index[-90:],
    train["paed_ed_attends"].iloc[-90:],
    label="Recent observed",
)
plt.plot(forecast_mean.index, forecast_mean, label="28-day forecast")

plt.fill_between(
    forecast_ci.index,
    forecast_ci.iloc[:, 0],
    forecast_ci.iloc[:, 1],
    color="grey",
    alpha=0.3,
    label="Prediction interval",
)

plt.xlabel("Date")
plt.ylabel("Daily paediatric ED attendances")
plt.title("28-day forecast of paediatric ED attendances")
plt.legend()
plt.tight_layout()
plt.show()
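To support rota planning, the point forecast and interval bounds can also be tabulated day by day. The sketch below reuses the objects created above; the interval returned by `conf_int()` is 95% under the statsmodels default of `alpha=0.05`.

In [ ]:
# Tabulate the 28-day forecast: one row per date with the point forecast
# and 95% prediction interval bounds, rounded to whole attendances
forecast_table = pd.DataFrame(
    {
        "forecast": forecast_mean,
        "lower_95": forecast_ci.iloc[:, 0],
        "upper_95": forecast_ci.iloc[:, 1],
    }
)
forecast_table.index.name = "date"
print(forecast_table.round(0).astype(int).head())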

8. Conclusion¶

This project evaluated approaches for forecasting daily paediatric emergency department attendances to support short-term staffing decisions. Using rolling-origin cross-validation, a seasonal ARIMA (SARIMAX) model was shown to consistently outperform a naïve benchmark. The selected SARIMAX model captures key features of the data, including weekly seasonality and short-term temporal structure, improving forecast accuracy while remaining interpretable and practical for routine use.

A 28-day forecast was produced to inform staffing over the next four weeks, alongside prediction intervals to reflect uncertainty in daily demand.

All analyses were implemented in reproducible Python code within a Jupyter Notebook. The modelling framework can be re-run as new data become available, allowing forecasts to be updated regularly and supporting ongoing, data-driven workforce planning within the Trust.
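
As a minimal sketch of this update workflow (assuming a refreshed extract in the same format as the original file; the path "paediatrics_latest.csv" is hypothetical), the selected specification can be refitted on all available data and a new 28-day forecast produced:

In [ ]:
# Sketch of a routine refresh with updated data (hypothetical file path)
import pandas as pd
import statsmodels.api as sm

latest = pd.read_csv(
    "paediatrics_latest.csv", parse_dates=["date"], index_col="date"
).asfreq("D")

# Refit the selected SARIMAX(1,0,2)(0,1,1)[7] specification on all data
model = sm.tsa.statespace.SARIMAX(
    latest["paed_ed_attends"].astype(float),
    order=(1, 0, 2),
    seasonal_order=(0, 1, 1, 7),
    enforce_stationarity=False,
    enforce_invertibility=False,
)
fit = model.fit(disp=False)

# New 28-day point forecast with 95% prediction intervals
fc = fit.get_forecast(steps=28)
updated = pd.concat(
    [fc.predicted_mean.rename("forecast"), fc.conf_int()], axis=1
)
print(updated.round(1))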