Description
Realized volatility forecasting remains one of the most active research frontiers in financial econometrics, as volatility governs option pricing, risk-parity allocation, Value-at-Risk estimation, and macroprudential stress testing. Classical linear models, especially the Heterogeneous Autoregressive Realized Volatility (HAR-RV) model, remain notoriously difficult to beat at short horizons, yet they cannot capture the nonlinear, regime-dependent dynamics that dominate volatility behavior during market stress. Pure deep-learning approaches, by contrast, typically overfit and fail to exploit the strong linear persistence of volatility. This paper addresses that tension by proposing a disciplined residual-learning hybrid in which a compact Long Short-Term Memory (LSTM) network learns nonlinear corrections to a HAR-RV baseline. The network takes ARIMA log-volatility forecasts as an additional input, and SHAP values are used to interpret the learned corrections. The aim is to test whether nonlinear deep-learning corrections can systematically improve a well-specified linear benchmark in both a developed and an emerging equity market, and to identify economically meaningful drivers behind the improvement.
The empirical analysis uses daily data for the S&P 500 (^GSPC) and the CEE Fund (NYSE: CEE), a U.S.-listed proxy for Central and Eastern European equities, from January 2019 to December 2024. The dependent variable is the natural logarithm of one-step-ahead realized volatility, measured over a five-day rolling window. Features include the three HAR horizons (daily, weekly, monthly), the high-low range, log volume, negative semivariance, absolute returns, and an expanding-window ARIMA forecast. The pipeline is strictly out-of-sample: HAR-RV is fit by OLS on the log-transformed horizon components; the ARIMA(p,d,q) order is selected via AIC; and a compact LSTM (32 units, L2 regularization, gradient clipping, 22-day lookback) is trained to predict the HAR residuals. The hybrid forecast is reconstructed as the HAR forecast plus the LSTM-predicted residual, then exponentiated, and is benchmarked against standalone HAR-RV, ARIMA, GARCH(1,1), and pure LSTM models. Evaluation uses MAE, RMSE, R², QLIKE, and heteroskedasticity-adjusted MSE (HMSE).
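The residual-learning pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`har_design`, `fit_ols`, `hybrid_forecast`, `qlike`, `hmse`) are hypothetical, the series is synthetic, the weekly and monthly components are taken as 5-day and 22-day means of log RV, the LSTM is replaced by a zero placeholder correction, and QLIKE and HMSE follow one common convention (actual-over-forecast ratio) that the abstract does not specify.

```python
import numpy as np

def har_design(log_rv):
    """Build HAR regressors (intercept, daily, weekly, monthly) and next-day targets.

    Assumption: weekly = mean of the last 5 values, monthly = mean of the last 22.
    """
    rows, targets = [], []
    for t in range(21, len(log_rv) - 1):
        daily = log_rv[t]
        weekly = log_rv[t - 4 : t + 1].mean()
        monthly = log_rv[t - 21 : t + 1].mean()
        rows.append([1.0, daily, weekly, monthly])
        targets.append(log_rv[t + 1])
    return np.array(rows), np.array(targets)

def fit_ols(X, y):
    """Ordinary least squares via a least-squares solve."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def hybrid_forecast(har_log_pred, resid_pred):
    """Add the predicted residual to the HAR log forecast, then exponentiate."""
    return np.exp(har_log_pred + resid_pred)

def qlike(actual, forecast):
    """QLIKE loss (assumed form): mean of ratio - log(ratio) - 1."""
    ratio = actual / forecast
    return float(np.mean(ratio - np.log(ratio) - 1.0))

def hmse(actual, forecast):
    """Heteroskedasticity-adjusted MSE (assumed form): mean of (ratio - 1)^2."""
    return float(np.mean((actual / forecast - 1.0) ** 2))

# Synthetic, persistent log-volatility series (illustration only, not real data).
rng = np.random.default_rng(0)
log_rv = np.empty(500)
log_rv[0] = -4.5
for t in range(1, 500):
    log_rv[t] = -0.45 + 0.9 * log_rv[t - 1] + 0.2 * rng.standard_normal()

X, y = har_design(log_rv)
split = 300                                  # fit on the first block only
beta = fit_ols(X[:split], y[:split])
resid_train = y[:split] - X[:split] @ beta   # targets a residual model would learn
har_log = X[split:] @ beta                   # out-of-sample HAR log forecasts

# Placeholder for the LSTM output: a zero correction reproduces plain HAR.
resid_pred = np.zeros_like(har_log)
forecast = hybrid_forecast(har_log, resid_pred)
actual = np.exp(y[split:])

print("QLIKE:", qlike(actual, forecast))
print("HMSE :", hmse(actual, forecast))
```

In a full pipeline, `resid_pred` would come from a sequence model trained on `resid_train` with the additional features; the reconstruction step itself is unchanged.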
Preliminary results show that the hybrid achieves R² = 0.738 on the S&P 500 and R² = 0.825 on CEE. On the highly efficient S&P 500, the hybrid matches HAR-RV and ARIMA on symmetric metrics while delivering the lowest HMSE (0.087), indicating superior proportional accuracy during volatility spikes. On the less efficient CEE market, the hybrid is the best model on R² and RMSE, although the Diebold-Mariano (DM) test against HAR-RV is not significant at conventional levels (p = 0.175). Against the pure LSTM, the hybrid wins decisively in both markets (DM p < 0.001). SHAP reveals a strong recency bias (lags t-1 to t-4) and a distinct cross-market contrast: beyond core volatility persistence, the intraday high-low range is the strongest residual driver on the S&P 500, while recent daily shocks (daily RV and absolute returns) drive CEE forecast adjustments, reflecting fundamental differences in market efficiency.
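For readers unfamiliar with the DM p-values quoted above, the following is a minimal sketch of a Diebold-Mariano test. It assumes squared-error loss, a two-sided normal approximation, and a rectangular-kernel long-run variance with h - 1 lags; the abstract does not state which loss or variance estimator was used, and the function name `diebold_mariano` and the error series are hypothetical.

```python
import math
import numpy as np

def diebold_mariano(err_a, err_b, h=1):
    """Return the DM statistic and two-sided p-value for equal predictive accuracy.

    err_a, err_b: forecast errors of models A and B on the same test dates.
    Assumption: squared-error loss; a negative statistic favors model A.
    """
    d = np.asarray(err_a) ** 2 - np.asarray(err_b) ** 2   # loss differential
    n = d.size
    d_bar = d.mean()
    dc = d - d_bar
    # Long-run variance: lag-0 autocovariance plus 2x autocovariances up to h-1.
    # For h > 1 this estimate is not guaranteed to be positive.
    lrv = dc @ dc / n
    for k in range(1, h):
        lrv += 2.0 * (dc[k:] @ dc[:-k]) / n
    stat = d_bar / math.sqrt(lrv / n)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Synthetic errors (illustration only): model A is clearly more accurate than B.
rng = np.random.default_rng(1)
err_a = rng.normal(0.0, 1.0, 400)
err_b = rng.normal(0.0, 2.0, 400)

stat, p_value = diebold_mariano(err_a, err_b)
print("DM statistic:", stat, "p-value:", p_value)
```

A large p-value, such as the 0.175 reported for CEE against HAR-RV, means the difference in average loss cannot be distinguished from zero at conventional levels, even when the point metrics favor one model.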