Volatility Forecasting and Return Prediction under Market Regimes: Evidence from High-Frequency Chinese Equity Data
Pith reviewed 2026-06-27 13:55 UTC · model grok-4.3
The pith
Regime-aware volatility models outperform standard forecasts on high-frequency Chinese equity data, while return prediction stays weak except in low-volatility states.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using high-frequency CSI 300 Index data from 2005 to 2023, a two-stage framework first models realized volatility with regime-augmented HARQ specifications combined with Markov-switching GJR-GARCH to capture regimes, then inputs the volatility forecasts, regime indicators, and other predictors into an XGBoost model for return prediction under a strict walk-forward out-of-sample setup. Regime-aware volatility forecasting consistently outperforms baseline HARQ models across forecast evaluation metrics and is generally supported by formal forecast comparison tests, while return predictability is weak, state-dependent, and concentrated in low-volatility regimes. Naive trading strategies fail after transaction costs, but versions with volatility scaling, low-volatility gating, threshold calibration, and turnover controls can improve defensive economic performance.
What carries the argument
The sequential two-stage framework that augments HARQ volatility models with Markov-switching GJR-GARCH regime filtering and then uses those outputs plus regime indicators inside an XGBoost return predictor estimated via walk-forward validation.
Load-bearing premise
The Markov-switching GJR-GARCH model correctly identifies distinct market regimes that are stable enough to be useful for out-of-sample forecasting and that the walk-forward procedure with the chosen hyperparameters does not overfit the regime classification or the XGBoost model on the specific Chinese data period.
What would settle it
Re-running the identical walk-forward procedure on a later hold-out period of CSI 300 high-frequency data and finding that the regime-augmented HARQ models no longer outperform baseline HARQ models on volatility forecast accuracy metrics would falsify the main claim.
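The comparison described above could be scored along the following lines. This is a minimal sketch, not the paper's evaluation code: the function names are hypothetical, QLIKE is written in one common normalization, and the Diebold-Mariano statistic omits the HAC variance correction that multi-step forecasts would need.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error between realized and forecast volatility."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def qlike(actual, forecast):
    """Average QLIKE loss, a/f - log(a/f) - 1; requires strictly positive inputs.

    The loss is zero when the forecast equals the realized value.
    """
    n = len(actual)
    return sum(a / f - math.log(a / f) - 1.0 for a, f in zip(actual, forecast)) / n

def dm_statistic(loss_a, loss_b):
    """Diebold-Mariano statistic on per-period losses of two models.

    Assumption: one-step-ahead forecasts, so the plain sample variance of the
    loss differential is used with no autocorrelation (HAC) adjustment.
    A negative value means model A has the lower average loss.
    """
    d = [x - y for x, y in zip(loss_a, loss_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)
```

Under these assumptions, the main claim would be contradicted on a hold-out period if the regime-augmented model's losses were no longer lower than the baseline's, or if the statistic were not clearly negative relative to standard normal critical values.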
Original abstract
This study investigates whether regime-dependent volatility forecasting and machine-learning-based return prediction can be jointly integrated to improve both statistical forecasting performance and economic strategy outcomes in equity markets. Using high-frequency CSI 300 Index data from 2005 to 2023, a sequential two-stage framework is developed. In the first stage, realized volatility is modeled using regime-augmented HARQ specifications combined with Markov-switching GJR-GARCH filtering to capture long-memory dynamics, asymmetry, and structural market regimes. In the second stage, volatility forecasts, regime indicators, and return-related predictors are incorporated into an XGBoost return-prediction model estimated through a strictly walk-forward out-of-sample procedure. The empirical results demonstrate that regime-aware volatility forecasting consistently outperforms baseline HARQ models across forecast evaluation metrics and is generally supported by formal forecast comparison tests. In contrast, return predictability remains weak, state-dependent, and concentrated primarily in low-volatility regimes. Although naive predictive trading strategies generally fail after accounting for realistic transaction costs, carefully designed implementations incorporating volatility scaling, low-volatility gating, threshold calibration, and turnover controls can improve defensive economic performance. The findings suggest that the practical value of predictive systems in financial markets may depend less on generating strong unconditional return forecasts and more on transforming weak state-dependent signals into economically robust portfolio allocation rules. Overall, the study contributes by integrating econometric volatility modeling, regime classification, machine-learning return prediction, and implementation realism within a unified framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a two-stage framework on high-frequency CSI 300 data (2005–2023): Markov-switching GJR-GARCH is used to identify regimes, which are then incorporated into regime-augmented HARQ models for realized-volatility forecasting; the resulting forecasts, regime indicators, and other predictors feed an XGBoost model for returns, all estimated via strictly walk-forward out-of-sample procedures. The central claims are that the regime-aware volatility specifications outperform plain HARQ models on standard forecast metrics and are supported by formal comparison tests, that return predictability is weak overall but stronger in the low-volatility regime, and that naïve trading strategies fail after transaction costs while carefully tuned versions (volatility scaling, low-vol gating, threshold calibration, turnover controls) deliver improved defensive economic performance.
Significance. If the regime labels remain informative out-of-sample and the reported economic improvements survive further robustness checks, the work would usefully illustrate how regime-dependent econometric modeling can be combined with machine-learning return prediction and realistic implementation constraints in an emerging-market setting. The emphasis on the limits of unconditional return forecasts versus the value of state-dependent allocation rules is a constructive contribution to the literature on predictability under structural breaks.
major comments (2)
- [first-stage Markov-switching GJR-GARCH filtering and walk-forward procedure] The claim that regime-aware HARQ specifications consistently outperform baseline HARQ models rests on the stability and out-of-sample informativeness of the two-state Markov-switching GJR-GARCH regime classification. Because the sample contains multiple structural breaks (2008, 2015, 2020), the estimated transition probabilities and volatility parameters may be dominated by crisis episodes; the manuscript does not report whether the MS-GJR-GARCH is re-estimated inside each walk-forward window or fitted once on the full sample, nor does it provide regime-persistence diagnostics or sensitivity checks to the number of regimes. This is load-bearing for both the volatility-forecasting and the return-prediction results.
- [economic strategy results and implementation details] The economic-performance conclusions depend on post-hoc choices of volatility-scaling factors, low-volatility gates, return thresholds, and turnover controls. The manuscript should demonstrate that these choices are either pre-specified or subjected to a formal robustness exercise (e.g., grid search reported in an appendix or out-of-sample validation of the tuning parameters themselves); otherwise the reported improvement in defensive performance after costs cannot be distinguished from in-sample optimization.
minor comments (2)
- [Abstract] The abstract states that regime-aware models “consistently outperform” and are “generally supported by formal forecast comparison tests,” yet supplies no numerical values for RMSE, QLIKE, or Diebold-Mariano statistics; these metrics should be reported in the abstract itself, or at least summarized there with effect sizes.
- [volatility-model section] Notation for the regime-augmented HARQ specification (e.g., how the regime dummy enters the HARQ equation) is introduced only descriptively; an explicit equation would improve reproducibility.
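To illustrate the kind of explicit equation the last comment asks for, the baseline HARQ model of Bollerslev, Patton, and Quaedvlieg can be written as below, followed by one possible regime-augmented form. The augmented version is only an assumption for illustration: the regime indicator $S_{t-1}$ (for example, the filtered high-volatility state from the Markov-switching GJR-GARCH model) and the coefficients $\gamma_0$, $\gamma_1$ are hypothetical and may not match the manuscript's specification.

```latex
% Baseline HARQ: daily, weekly, and monthly realized-variance terms,
% with the daily coefficient adjusted by realized quarticity RQ.
RV_t = \beta_0
     + \left(\beta_1 + \beta_{1Q}\, RQ_{t-1}^{1/2}\right) RV_{t-1}
     + \beta_2\, RV_{t-1|t-5}
     + \beta_3\, RV_{t-1|t-22}
     + u_t

% One possible regime-augmented form (assumption): the regime indicator
% S_{t-1} shifts the intercept and the daily persistence coefficient.
RV_t = \beta_0 + \gamma_0\, S_{t-1}
     + \left(\beta_1 + \gamma_1\, S_{t-1} + \beta_{1Q}\, RQ_{t-1}^{1/2}\right) RV_{t-1}
     + \beta_2\, RV_{t-1|t-5}
     + \beta_3\, RV_{t-1|t-22}
     + u_t
```

Here $RV_{t-1|t-5}$ and $RV_{t-1|t-22}$ denote averages of realized variance over the previous 5 and 22 trading days. Setting $\gamma_0 = \gamma_1 = 0$ recovers the baseline, so the two models are nested in this sketch.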
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. These highlight important aspects of our methodology and implementation that require clarification and additional robustness checks. We address each major comment below and will incorporate the suggested revisions into the manuscript.
Referee: [first-stage Markov-switching GJR-GARCH filtering and walk-forward procedure] The claim that regime-aware HARQ specifications consistently outperform baseline HARQ models rests on the stability and out-of-sample informativeness of the two-state Markov-switching GJR-GARCH regime classification. Because the sample contains multiple structural breaks (2008, 2015, 2020), the estimated transition probabilities and volatility parameters may be dominated by crisis episodes; the manuscript does not report whether the MS-GJR-GARCH is re-estimated inside each walk-forward window or fitted once on the full sample, nor does it provide regime-persistence diagnostics or sensitivity checks to the number of regimes. This is load-bearing for both the volatility-forecasting and the return-prediction results.
Authors: We agree that explicit documentation of the regime-identification procedure is essential. The MS-GJR-GARCH model was re-estimated at the start of each walk-forward window using only data available up to that point, consistent with the strictly out-of-sample protocol described for the overall framework. We will revise the methodology section to state this explicitly. In addition, we will add (i) regime-persistence statistics (average duration and transition probabilities) computed out-of-sample and (ii) a sensitivity table comparing results under two versus three regimes. These diagnostics will be placed in a new appendix and referenced in the main text.
Revision: yes
Referee: [economic strategy results and implementation details] The economic-performance conclusions depend on post-hoc choices of volatility-scaling factors, low-volatility gates, return thresholds, and turnover controls. The manuscript should demonstrate that these choices are either pre-specified or subjected to a formal robustness exercise (e.g., grid search reported in an appendix or out-of-sample validation of the tuning parameters themselves); otherwise the reported improvement in defensive performance after costs cannot be distinguished from in-sample optimization.
Authors: We acknowledge that the specific parameter values used for volatility scaling, gating, thresholds, and turnover controls require stronger justification. While the core predictors and models are estimated strictly out-of-sample, the strategy hyperparameters were calibrated on a preliminary subsample. We will add a formal robustness appendix that reports a grid search over plausible ranges of these parameters and shows the distribution of Sharpe ratios and maximum drawdowns across the grid. Only parameter combinations that would have been feasible at the time of each rebalancing are considered, thereby addressing the concern about ex-post optimization.
Revision: yes
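To make the tuning parameters under discussion concrete, the sketch below shows one way a volatility-scaled, low-volatility-gated rule with a signal threshold and proportional transaction costs might be written. It is an illustration only, not the authors' strategy: the function name, the default values, and the cost model are all hypothetical, and every input for a period is assumed to be known before that period's return is realized.

```python
def strategy_returns(signals, asset_returns, vol_forecasts, low_vol_flags,
                     target_vol=0.01, threshold=0.0,
                     max_leverage=1.0, cost_per_turnover=0.0005):
    """Net per-period returns of a gated, volatility-scaled position rule.

    Assumptions (hypothetical, for illustration):
      - signals[t], vol_forecasts[t], low_vol_flags[t] are formed before
        asset_returns[t] is observed (no look-ahead);
      - vol_forecasts are strictly positive;
      - costs are proportional to the absolute change in position.
    """
    position = 0.0
    net = []
    for s, r, v, low in zip(signals, asset_returns, vol_forecasts, low_vol_flags):
        if low and abs(s) > threshold:
            # Trade only in the low-volatility regime and only on strong signals.
            direction = 1.0 if s > 0 else -1.0
            # Volatility scaling: smaller positions when forecast volatility is high.
            target = direction * min(max_leverage, target_vol / v)
        else:
            target = 0.0
        turnover = abs(target - position)
        net.append(target * r - cost_per_turnover * turnover)
        position = target
    return net
```

A robustness grid of the kind promised in the response would call such a function over ranges of `target_vol`, `threshold`, and `cost_per_turnover`, choosing values only from data available before each evaluation window.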
Circularity Check
No significant circularity: walk-forward OOS framework keeps derivation self-contained
The paper's core claims rest on a two-stage pipeline with Markov-switching GJR-GARCH regime filtering followed by HARQ-XGBoost forecasting, all evaluated via a strictly walk-forward out-of-sample procedure on the 2005-2023 CSI 300 data. No step re-uses fitted parameters or regime labels as both input and output on the same observations; volatility forecasts and return predictions are generated on held-out periods, and economic strategy results are assessed after explicit transaction-cost adjustments. Because the evaluation protocol separates estimation from testing and no self-citation chain or definitional loop is invoked to justify the regime labels or model superiority, the reported outperformance does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (3)
- HARQ parameters
- GJR-GARCH parameters
- XGBoost hyperparameters
axioms (2)
- domain assumption: Market regimes can be adequately captured by a two-state Markov-switching process
- domain assumption: Walk-forward out-of-sample procedure prevents look-ahead bias
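The second assumption can be made concrete with a short sketch of how walk-forward windows might be generated. The function name, window sizes, and the choice between expanding and rolling training windows are assumptions for illustration; the paper's actual window design is not stated in this summary.

```python
def walk_forward_splits(n_obs, initial_train, test_size, expanding=True):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Every training index precedes every test index, so a model fitted on a
    training window never sees the observations it is evaluated on.
    With expanding=True the training window grows from index 0; otherwise
    it rolls forward with a fixed length of initial_train observations.
    """
    start = initial_train
    while start + test_size <= n_obs:
        train_start = 0 if expanding else start - initial_train
        yield range(train_start, start), range(start, start + test_size)
        start += test_size


# Usage: 10 observations, first 4 for the initial fit, then blocks of 3.
for train_idx, test_idx in walk_forward_splits(10, 4, 3):
    print(list(train_idx), list(test_idx))
```

In a pipeline like the one summarized here, the regime model, the volatility model, and the return model would each be re-fitted on `train_idx` before forecasting `test_idx`. The split itself guards against look-ahead only if features and regime labels are also built from past data alone.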