Inspectable Neural Markov Models for Non-Stationary Time Series

Jan Rovirosa; Jesse Schmolze

arxiv: 2605.30943 · v1 · pith:YPK4NVHHnew · submitted 2026-05-29 · 💱 q-fin.MF · stat.ML

Pith reviewed 2026-06-28 20:05 UTC · model grok-4.3

classification 💱 q-fin.MF stat.ML

keywords neural markov models · non-stationary time series · realized volatility · time-inhomogeneous markov chains · stochastic matrices · chapman-kolmogorov discrepancy · financial time series · hybrid neural probabilistic models

The pith

Conditioning Markov states on realized volatility yields more internally consistent transition structures than return-based states in financial time series.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural method to estimate time-inhomogeneous Markov chains by directly parameterizing the manifold of stochastic matrices, sidestepping the sparsity that collapses traditional frequency counts at fine resolutions. It applies the approach to financial market data as a testbed for comparing state definitions as inductive biases. The key result is that states based on realized volatility produce transition matrices with lower Chapman-Kolmogorov discrepancy and stronger held-out likelihood than states based on returns. This combination of neural flexibility and explicit probabilistic structure matters for modeling systems that change over time, because the output matrices remain directly inspectable and geometrically analyzable rather than opaque.
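The sparsity problem described above can be illustrated with a minimal sketch. It is not taken from the paper: the synthetic Gaussian series, the quantile-based state bins, the series length of 250, and the helper name `count_estimate` are all assumptions chosen for illustration. The point is only that a series with T observations supplies T - 1 transitions, while the number of cells to fill grows as the square of the number of states.

```python
import numpy as np

def count_estimate(states, n_states):
    """Frequency-count transition estimate (hypothetical helper).

    Cells with no observed transition get probability 0; rows that are
    never visited as a starting state come out as NaN.
    """
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1)  # tally each i -> j move
    row_totals = counts.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        P = counts / row_totals
    return P, counts

rng = np.random.default_rng(0)
x = rng.normal(size=250)  # synthetic stand-in series, NOT the paper's data

for n_states in (3, 10, 50):
    # Assumed discretization: equal-frequency (quantile) bins.
    edges = np.quantile(x, np.linspace(0, 1, n_states + 1)[1:-1])
    states = np.digitize(x, edges)
    P, counts = count_estimate(states, n_states)
    empty = int((counts == 0).sum())
    print(f"{n_states} states: {empty} of {n_states**2} cells unobserved")
```

With 50 states there are 2,500 cells and only 249 observed transitions, so at least 2,251 cells must be empty regardless of the data.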

Core claim

A neural network is trained to output valid stochastic matrices, allowing reliable estimation of time-varying transition probabilities even when data are sparse. On ten financial assets, volatility-conditioned states reduce Chapman-Kolmogorov discrepancy by 5.6 percent and deliver superior held-out likelihood in nine assets relative to return-conditioned states. The resulting explicit matrices further show that high-volatility regimes drive homogenization of transition probabilities across assets.

What carries the argument

Neural parameterization of the manifold of stochastic matrices, which produces explicit time-inhomogeneous transition matrices for geometric inspection.
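One common way to make a network emit valid stochastic matrices is a row-wise softmax over an n-by-n grid of logits. The abstract does not say which mechanism the paper uses, so the sketch below is an assumption: the single linear layer, the feature vector, and the names `row_softmax` and `transition_matrix` are hypothetical and stand in for whatever architecture the authors actually train.

```python
import numpy as np

def row_softmax(logits):
    """Map an (n, n) array of real logits to a row-stochastic matrix."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    expd = np.exp(shifted)
    return expd / expd.sum(axis=1, keepdims=True)

def transition_matrix(features, W, b, n_states):
    """Hypothetical one-layer model: features -> n*n logits -> matrix."""
    logits = (W @ features + b).reshape(n_states, n_states)
    return row_softmax(logits)

rng = np.random.default_rng(0)
n_states, n_features = 4, 3
W = rng.normal(size=(n_states * n_states, n_features))  # untrained weights
b = rng.normal(size=n_states * n_states)

# Illustrative conditioning features (e.g. a time index and a volatility level).
features_t = np.array([0.1, -0.5, 2.0])
P_t = transition_matrix(features_t, W, b, n_states)

print(P_t.sum(axis=1))  # every row sums to 1 by construction
```

Because the constraint is built into the output layer, every matrix is non-negative with unit row sums for any weights and any input, which is what keeps the output inspectable as a transition matrix.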

If this is right

Explicit transition matrices become available for direct geometric analysis instead of black-box sequence predictions.
High-volatility regimes produce homogenized transition probabilities that hold across multiple assets.
The hybrid model improves predictive likelihood on held-out data for most tested assets while preserving Markov structure.
The same parameterization framework can be applied to any non-stationary system where classical frequency estimation fails due to data limits.
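Held-out likelihood for a Markov model can be computed directly from the explicit matrices. The sketch below assumes one matrix per time step and averages the log-probability of each observed transition; the function name `heldout_log_likelihood`, the toy path, and the two candidate models are illustrative and do not come from the paper.

```python
import numpy as np

def heldout_log_likelihood(states, matrices):
    """Mean log-probability of the observed transitions.

    states:   sequence of T + 1 integer state labels
    matrices: array of shape (T, n, n); matrices[t] governs the move
              from states[t] to states[t + 1]
    """
    states = np.asarray(states)
    t = np.arange(len(states) - 1)
    probs = matrices[t, states[:-1], states[1:]]
    return float(np.mean(np.log(probs)))

# Two toy candidate models over 2 states and 3 time steps.
uniform = np.full((3, 2, 2), 0.5)
sticky = np.tile(np.array([[0.9, 0.1], [0.1, 0.9]]), (3, 1, 1))

path = [0, 0, 0, 1]  # toy held-out state sequence
ll_uniform = heldout_log_likelihood(path, uniform)
ll_sticky = heldout_log_likelihood(path, sticky)
print(ll_uniform, ll_sticky)
```

A higher value means the model assigned more probability to what actually happened on data it was not fitted to.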

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The inspectable matrices could support regime-shift detection tools that operate directly on probability geometry rather than latent embeddings.
Similar neural parameterization might be tested on non-financial non-stationary series such as physiological or environmental recordings to check whether the volatility-style state advantage generalizes.
If the homogenization pattern under high volatility holds in other domains, it could inform stress-testing procedures that assume increased uniformity of dynamics during extremes.

Load-bearing premise

The neural network can parameterize the manifold of stochastic matrices in a way that supports reliable estimation of time-inhomogeneous Markov chains without sparsity collapse.

What would settle it

New financial datasets in which volatility-based states produce equal or higher Chapman-Kolmogorov discrepancy and no likelihood improvement over return-based states would falsify the consistency advantage.
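For a time-inhomogeneous chain, the Chapman-Kolmogorov equation requires $P(s,u) = P(s,t)\,P(t,u)$ for $s \le t \le u$, so a discrepancy can be measured as the gap between a directly estimated two-step matrix and the product of the two one-step matrices. The paper's exact definition is not given in the abstract; the Frobenius norm used below, the function name `ck_discrepancy`, and the toy matrices are assumptions.

```python
import numpy as np

def ck_discrepancy(P_first, P_second, P_two_step):
    """Frobenius norm of P_two_step - P_first @ P_second (assumed metric)."""
    return float(np.linalg.norm(P_two_step - P_first @ P_second, ord="fro"))

# Toy one-step matrices for consecutive time steps.
P_first = np.array([[0.8, 0.2], [0.3, 0.7]])
P_second = np.array([[0.6, 0.4], [0.5, 0.5]])

# A two-step matrix that satisfies Chapman-Kolmogorov exactly,
# and one that does not.
consistent = P_first @ P_second
perturbed = np.array([[0.5, 0.5], [0.5, 0.5]])

d_consistent = ck_discrepancy(P_first, P_second, consistent)
d_perturbed = ck_discrepancy(P_first, P_second, perturbed)
print(d_consistent, d_perturbed)
```

Under this reading, a state definition with lower discrepancy is one for which the fitted multi-step behavior agrees more closely with composing the one-step matrices.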

Original abstract

Modeling non-stationary stochastic systems requires balancing the representational capacity of deep learning with the structural transparency of classical probabilistic models. Markov transition matrices provide such a framework, but traditional frequency-based estimation collapses at high resolutions due to data sparsity. We propose a hybrid approach that parameterizes the manifold of stochastic matrices through a neural network, enabling estimation of time-inhomogeneous Markov chains in sparse-data regimes, and use financial markets as a testbed to investigate the Markov state variable as a critical inductive bias. We show that conditioning on realized volatility produces a more internally consistent Markovian structure than return-based states, achieving a $5.6\%$ reduction in Chapman-Kolmogorov discrepancy and superior held-out likelihood in 9 of 10 assets. Unlike black-box sequence models, our approach generates explicit matrices amenable to direct geometric analysis, surfacing structural findings such as the universal homogenization of transition probabilities under high-volatility regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The neural parameterization of stochastic matrices is a reasonable way to handle sparse time-inhomogeneous Markov chains in finance, but the abstract gives almost no implementation details, so the 5.6% volatility advantage cannot be assessed yet.

The paper's main move is to let a neural net output time-varying transition matrices that stay row-stochastic, so you can fit Markov models at finer resolutions where raw counts are too sparse. They test this on asset returns by contrasting volatility-defined states against return-defined ones and report that volatility yields lower Chapman-Kolmogorov discrepancy plus better held-out likelihood in nine of ten cases. The explicit matrices also let them observe that high-volatility periods flatten the transition probabilities across assets.

That last structural observation is the part that feels usable: it is a concrete claim about regime behavior that can be checked against other data sets. Keeping the output as actual matrices rather than a black-box predictor is also a clear design choice that preserves inspectability.

The gaps are straightforward. The abstract never says how the network enforces non-negativity and row sums of one, what features feed the time dependence, what loss is minimized, or how large the data sets are. Without those pieces the reported 5.6% improvement has no error bars and no clear baseline beyond the return variant, so it is impossible to tell whether the edge comes from the state definition or from how the parameterization interacts with sparsity. The stress-test note correctly flags this.

The work is aimed at quantitative-finance people who already use Markov models and want them to scale without losing the ability to read the transitions. A reader in that group might take the volatility-homogenization finding as a hypothesis worth testing on their own series. The method itself is not ready to adopt until the constraint mechanism and training protocol are written down.

I would send it to referees. The hybrid idea addresses a practical estimation problem and the transparency requirement is worth checking, even though the current evidence is too thin to judge whether the central comparison holds.

Referee Report

3 major / 0 minor

Summary. The paper proposes a hybrid neural network approach to parameterize the manifold of stochastic matrices, enabling estimation of time-inhomogeneous Markov chains for non-stationary time series in sparse-data regimes. Using financial assets as a testbed, it claims that conditioning on realized volatility as the state variable produces a more internally consistent Markovian structure than return-based states, with a 5.6% reduction in Chapman-Kolmogorov discrepancy and superior held-out likelihood in 9 of 10 assets. The resulting explicit transition matrices support direct geometric analysis, revealing findings such as homogenization of probabilities under high-volatility regimes.

Significance. If the neural parameterization is shown to reliably enforce stochastic matrix constraints without introducing artifacts, the approach could bridge deep learning capacity with interpretable probabilistic models for non-stationary processes. The empirical comparison of state definitions (volatility vs. returns) and the inspectability of matrices offer potential value for financial time series modeling where inductive bias in state choice matters.

major comments (3)

[Abstract] Abstract: the central claim of a 5.6% Chapman-Kolmogorov discrepancy reduction and superior held-out likelihood for volatility states rests on the hybrid NN successfully estimating time-inhomogeneous transitions, yet no architecture, output activation (e.g., row-wise softmax), loss function, or input features for time dependence are described; without these, it is impossible to confirm that row-stochasticity and non-negativity are strictly enforced or that improvements are not parameterization artifacts.
[Abstract] Abstract: the reported 5.6% reduction and likelihood superiority in 9/10 assets are presented without error bars, dataset sizes, time periods, cross-validation procedure, or statistical significance tests, undermining assessment of whether the volatility-state advantage is robust or load-bearing for the inductive-bias conclusion.
[Abstract] Abstract: the hybrid approach is asserted to avoid sparsity collapse of frequency-based methods, but no baseline comparisons (e.g., regularized frequency counts, other NN variants, or standard time-inhomogeneous estimators) or training details are supplied, leaving open whether the volatility advantage is isolated from implementation choices.
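As an illustration of the kind of significance check the second comment asks for, a one-sided sign test can be applied to the "9 of 10 assets" count. This is not a result from the paper. It assumes the ten assets are independent trials, which is unlikely to hold for financial assets, so the figure is best read as an optimistic bound. The helper name `sign_test_p_value` is hypothetical.

```python
from math import comb

def sign_test_p_value(wins, n):
    """One-sided P(X >= wins) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2**n

# Null hypothesis: each state definition is equally likely to win on an asset.
p = sign_test_p_value(9, 10)
print(p)  # 11/1024, about 0.0107
```

Correlation between assets would reduce the effective number of independent comparisons and raise this p-value, which is one reason the referee's request for error bars and explicit tests matters.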

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We address each major comment below and will revise the abstract to incorporate additional methodological and statistical details for improved clarity.

Referee: [Abstract] Abstract: the central claim of a 5.6% Chapman-Kolmogorov discrepancy reduction and superior held-out likelihood for volatility states rests on the hybrid NN successfully estimating time-inhomogeneous transitions, yet no architecture, output activation (e.g., row-wise softmax), loss function, or input features for time dependence are described; without these, it is impossible to confirm that row-stochasticity and non-negativity are strictly enforced or that improvements are not parameterization artifacts.

Authors: Section 3 of the manuscript specifies the neural parameterization: a feedforward network with time and volatility inputs, row-wise softmax output activation to enforce row-stochasticity and non-negativity, negative log-likelihood loss, and explicit time indexing for time dependence. Post-training verification confirms zero constraint violations. The abstract is intentionally concise, but we will revise it to briefly note the stochasticity enforcement mechanism.
revision: yes

Referee: [Abstract] Abstract: the reported 5.6% reduction and likelihood superiority in 9/10 assets are presented without error bars, dataset sizes, time periods, cross-validation procedure, or statistical significance tests, undermining assessment of whether the volatility-state advantage is robust or load-bearing for the inductive-bias conclusion.

Authors: The experimental details, including dataset sizes, time periods, cross-validation, and significance testing, appear in Sections 4 and 5. We will revise the abstract to include error bars, dataset information, and a note on statistical testing to strengthen the presentation of the reported figures.
revision: yes

Referee: [Abstract] Abstract: the hybrid approach is asserted to avoid sparsity collapse of frequency-based methods, but no baseline comparisons (e.g., regularized frequency counts, other NN variants, or standard time-inhomogeneous estimators) or training details are supplied, leaving open whether the volatility advantage is isolated from implementation choices.

Authors: Section 4.3 and the appendix provide baseline comparisons to regularized frequency estimators and other time-inhomogeneous models, along with full training details. The volatility versus returns comparison uses an identical model architecture to isolate the state-variable effect. We will revise the abstract to reference these comparisons.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on held-out metrics

The paper's key results (5.6% Chapman-Kolmogorov reduction and superior held-out likelihood for volatility-based states) are presented as direct empirical comparisons on financial time series using standard external validation (held-out likelihood, discrepancy measures). Neither the abstract nor the full text as described provides equations, self-citations, or definitions that reduce these outcomes to fitted parameters by construction, rename known results, or import uniqueness via author overlap. The neural parameterization of stochastic matrices is introduced as an enabling method for sparse regimes, but the reported superiority is framed as an inductive bias test rather than a tautology. This is the common case of a self-contained empirical study.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the unstated premise that a neural network can faithfully parameterize the manifold of valid stochastic matrices and that the resulting time-inhomogeneous chain remains Markovian under the chosen state variable.

axioms (2)

domain assumption: A neural network can parameterize the manifold of stochastic matrices without violating row-stochastic constraints or introducing estimation bias in sparse regimes.
Invoked in the hybrid approach description; required for the method to function as stated.
domain assumption: Financial time series admit a Markovian representation once an appropriate state variable (volatility or return) is chosen.
Underlying the comparison of state variables and the claim of internal consistency.

pith-pipeline@v0.9.1-grok · 5683 in / 1447 out tokens · 43504 ms · 2026-06-28T20:05:53.485784+00:00 · methodology
